
A61B 5/16 (2006.01); A61B 5/00 (2006.01); G06F 18/25 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2023.01); G06T 7/00 (2017.01); G06V 10/80 (2022.01); G06V 20/40 (2022.01); G10L 25/30 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G10L 25/66 (2013.01);

U.S. Cl.

CPC ...

A61B 5/165 (2013.01); A61B 5/4803 (2013.01); A61B 5/7275 (2013.01); G06F 18/253 (2023.01); G06N 3/08 (2013.01); G06T 7/0012 (2013.01); G06V 20/46 (2022.01); G06V 20/49 (2022.01); G10L 25/30 (2013.01); G10L 25/57 (2013.01); G10L 25/63 (2013.01); G10L 25/66 (2013.01); G06T 2207/10016 (2013.01);

Abstract

Disclosed is an automatic depression detection method using audio-video, including: acquiring original data containing two modalities, a long-term audio file and a long-term video file, from an audio-video file; dividing the long-term audio file into several audio segments and dividing the long-term video file into a plurality of video segments; inputting each audio segment into an audio feature extraction network and each video segment into a video feature extraction network to obtain in-depth audio features and in-depth video features, respectively; processing the in-depth audio features and the in-depth video features with a multi-head attention mechanism to obtain attention audio features and attention video features; aggregating the attention audio features and the attention video features into audio-video features; and inputting the audio-video features into a decision network to predict a depression level of an individual in the audio-video file.
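The data flow in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the patented implementation. Every function name and parameter below is hypothetical, and the learned networks are replaced by simple stand-ins (chunk means for feature extraction, attention without learned projections, mean pooling plus concatenation for aggregation, a single linear layer for the decision network). Both modalities are assumed to be flat lists of numbers.

```python
import math

def split_into_segments(samples, segment_len):
    """Divide a long sequence into consecutive fixed-length segments (tail remainder dropped)."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, segment_len)]

def extract_features(segment, dim):
    """Stand-in for a feature extraction network: means of `dim` equal chunks.
    Assumes len(segment) >= dim."""
    chunk = len(segment) // dim
    return [sum(segment[k * chunk:(k + 1) * chunk]) / chunk for k in range(dim)]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_self_attention(features, num_heads):
    """Scaled dot-product self-attention across segments.
    Each head works on its own slice of the feature vector; queries, keys and
    values are the raw slices (no learned projections, an assumption of this sketch)."""
    dim = len(features[0])
    head_dim = dim // num_heads
    out = [[] for _ in features]
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        heads = [f[sl] for f in features]
        for i, q in enumerate(heads):
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)
                      for k in heads]
            weights = softmax(scores)
            attended = [sum(w * v[d] for w, v in zip(weights, heads))
                        for d in range(head_dim)]
            out[i].extend(attended)  # concatenate the heads' outputs
    return out

def mean_pool(features):
    """Average the per-segment feature vectors into one vector."""
    n = len(features)
    return [sum(f[d] for f in features) / n for d in range(len(features[0]))]

def predict_depression_level(audio, video, seg_len_a, seg_len_v,
                             dim=4, num_heads=2, weights=None, bias=0.0):
    """Run the full sketch: segment, extract, attend, aggregate, decide."""
    audio_feats = [extract_features(s, dim)
                   for s in split_into_segments(audio, seg_len_a)]
    video_feats = [extract_features(s, dim)
                   for s in split_into_segments(video, seg_len_v)]
    att_audio = multi_head_self_attention(audio_feats, num_heads)
    att_video = multi_head_self_attention(video_feats, num_heads)
    # Aggregation assumed here to be concatenation of the pooled modality features.
    fused = mean_pool(att_audio) + mean_pool(att_video)
    if weights is None:
        # Hypothetical untrained decision layer: uniform weights.
        weights = [1.0 / len(fused)] * len(fused)
    return sum(w * x for w, x in zip(weights, fused)) + bias
```

Calling `predict_depression_level(audio, video, 16, 8)` on two numeric lists returns a single score. In a real system the stand-ins would be replaced by trained networks, and the score would be calibrated against a clinical depression scale.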
