The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 11151385 B1

Date of Patent:

Oct. 19, 2021

Filed:

Dec. 20, 2019

System and method for detecting deception in an audio-video response of a user

Applicant:

RTScaleAI Inc, Cedar Park, TX (US);

Inventors:

Vivek Iyer, Austin, TX (US);

Peter Walker, Cedar Park, TX (US);

Assignee:

RTScaleAI Inc, Cedar Park, TX (US);

Attorney:

Ziegler IP Law Group, LLC

Primary Examiner:

Charlotte M Baker

Int. Cl.

CPC ...

G06K 9/00 (2006.01); G10L 15/02 (2006.01); G10L 15/22 (2006.01); G10L 15/18 (2013.01); G06K 9/32 (2006.01); G06N 5/04 (2006.01); G10L 21/0232 (2013.01); G10L 25/63 (2013.01); G10L 25/90 (2013.01); G06K 9/62 (2006.01); G06N 20/00 (2019.01);

U.S. Cl.

CPC ...

G06K 9/00744 (2013.01); G06K 9/00281 (2013.01); G06K 9/00315 (2013.01); G06K 9/00335 (2013.01); G06K 9/3233 (2013.01); G06K 9/6201 (2013.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01); G10L 15/02 (2013.01); G10L 15/18 (2013.01); G10L 15/22 (2013.01); G10L 21/0232 (2013.01); G10L 25/63 (2013.01); G10L 25/90 (2013.01);

Abstract

A method for detecting deception in an Audio-Video response of a user, using a server, in a distributed computing architecture, characterized in that the method includes: enabling an Audio-Video connection with a user device upon receiving a request from a user; obtaining, from the user device, an Audio-Video response of the user corresponding to a first set of questions that are provided to the user by the server; extracting audio signals and video signals from the Audio-Video response; detecting an activity of the user by determining a plurality of Natural Language Processing (NLP) features from the extracted audio signals by (i) performing a speech-to-text translation and (ii) extracting the plurality of NLP features from the translated text, and determining a plurality of speech features from the extracted audio signals by (i) splitting the extracted audio signals into a plurality of short-interval audio signals and (ii) extracting the plurality of speech features from the plurality of short-interval audio signals; aggregating (i) the plurality of NLP features to obtain a plurality of temporal NLP features and (ii) the plurality of speech features to obtain a plurality of temporal speech features; aggregating the plurality of temporal NLP features and the plurality of temporal speech features to obtain first temporal aggregated features; detecting a plurality of micro-expressions of the user by splitting the extracted video signals into a plurality of short fixed-duration video signals, detecting a plurality of Regions of Interest (ROIs) in the plurality of short fixed-duration video signals, and comparing the plurality of detected ROIs with video signals annotated with micro-expression labels that are stored in a database to detect the plurality of micro-expressions of the user in the plurality of short fixed-duration video signals; tracking and determining a gesture of the user from the extracted video signals; aggregating the plurality of micro-expressions and the gesture of the user to obtain second temporal aggregated features; aggregating the first temporal aggregated features and the second temporal aggregated features to obtain final temporal aggregated features; and detecting, using a machine learning model, a deception in the Audio-Video response based on the final temporal aggregated features.
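
The data flow described in the abstract (split a signal into short windows, extract per-window features, aggregate them over time, fuse the feature groups, then score the fused vector) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the two amplitude features, the mean-over-time aggregation, the concatenation used for fusion, and the logistic scoring with made-up weights are all assumptions chosen for illustration. The abstract does not specify any of them, and the NLP, micro-expression, and gesture features below are placeholder numbers rather than real extractor output.

```python
import math
from statistics import mean
from typing import List, Sequence


def split_into_windows(signal: Sequence[float], window: int) -> List[List[float]]:
    """Split a signal into consecutive fixed-length windows (the last may be shorter)."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [list(signal[i:i + window]) for i in range(0, len(signal), window)]


def window_features(chunk: Sequence[float]) -> List[float]:
    """Hypothetical per-window features: mean absolute amplitude and peak amplitude."""
    magnitudes = [abs(x) for x in chunk]
    return [mean(magnitudes), max(magnitudes)]


def aggregate_temporal(per_window: List[List[float]]) -> List[float]:
    """Collapse per-window feature vectors by averaging each feature over time."""
    return [mean(column) for column in zip(*per_window)]


def fuse(*feature_groups: Sequence[float]) -> List[float]:
    """Fuse several feature groups by simple concatenation (an assumed strategy)."""
    fused: List[float] = []
    for group in feature_groups:
        fused.extend(group)
    return fused


def deception_score(features: Sequence[float], weights: Sequence[float],
                    bias: float = 0.0) -> float:
    """Toy stand-in for the machine learning model: logistic score in (0, 1)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative run on made-up data.
audio = [0.0, 0.5, -0.5, 1.0, -1.0, 0.25, 0.75, -0.25]
speech_temporal = aggregate_temporal(
    [window_features(w) for w in split_into_windows(audio, 4)]
)
nlp_temporal = [0.2, 0.1]          # placeholder temporal NLP features
micro_expression = [0.0, 1.0]      # placeholder micro-expression features
gesture = [0.5]                    # placeholder gesture feature

first_aggregated = fuse(nlp_temporal, speech_temporal)
second_aggregated = fuse(micro_expression, gesture)
final_features = fuse(first_aggregated, second_aggregated)

weights = [0.1] * len(final_features)  # made-up weights, not a trained model
print(final_features)
print(deception_score(final_features, weights))
```

Concatenation keeps each modality's features separately visible to the downstream model, which is one common way to fuse them; the abstract only says the features are aggregated and leaves the method open.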
