
G10L 25/78 (2013.01); H04S 3/00 (2006.01); H04R 1/40 (2006.01); H04R 3/00 (2006.01); G10L 15/16 (2006.01); G10L 15/02 (2006.01); G10L 25/24 (2013.01); G10L 25/18 (2013.01); G06N 3/04 (2006.01); G06N 3/08 (2006.01); G10L 15/22 (2006.01);

U.S. Cl.

CPC ...

G10L 25/78 (2013.01); G06N 3/049 (2013.01); G06N 3/08 (2013.01); G10L 15/02 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 25/18 (2013.01); G10L 25/24 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01); H04S 3/008 (2013.01); H04S 2400/01 (2013.01);

Abstract

Embodiments of the disclosure provide systems and methods for speech detection. The method may include receiving a multichannel audio input that includes a set of audio signals from a set of audio channels in an audio detection array. The method may further include processing the multichannel audio input using a neural network classifier to generate a series of classification results in a series of time windows for the multichannel audio input. The neural network classifier includes a causal temporal convolutional network (TCN) configured to determine a classification result for each time window based on portions of the multichannel audio input in the corresponding time window and one or more time windows before the corresponding time window. The method may additionally include determining whether the multichannel audio input includes one or more speech segments in the series of time windows based on the series of classification results.
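As an illustration only, the pipeline described in the abstract can be sketched in a few lines of NumPy. This is not the patented implementation: the function names (`causal_dilated_conv`, `tiny_causal_tcn`, `speech_segments`), the layer sizes, the doubling dilation schedule, the random weights, and the 0.5 threshold are all assumptions chosen for the example. The input is assumed to be a feature matrix with one column per time window, formed by stacking per-channel features from the audio detection array. The sketch shows the causal property: the classification result for a window depends only on that window and earlier ones.

```python
import numpy as np


def causal_dilated_conv(x, w, dilation=1):
    """Causal 1-D convolution (hypothetical helper).

    x: (c_in, T) features, one column per time window.
    w: (c_out, c_in, k) kernel.
    Returns (c_out, T); column t depends only on x[:, :t + 1].
    """
    c_out, c_in, k = w.shape
    T = x.shape[1]
    pad = (k - 1) * dilation
    # Pad on the left only, so no output ever sees a future window.
    xp = np.pad(x, ((0, 0), (pad, 0)))
    y = np.zeros((c_out, T))
    for j in range(k):
        # Tap j looks back (k - 1 - j) * dilation windows.
        y += w[:, :, j] @ xp[:, j * dilation : j * dilation + T]
    return y


def tiny_causal_tcn(features, weights, head):
    """Stack causal layers with dilations 1, 2, 4, ... and ReLU (assumed design).

    Returns one speech probability per time window, shape (T,).
    """
    h = features
    for i, w in enumerate(weights):
        h = np.maximum(causal_dilated_conv(h, w, dilation=2 ** i), 0.0)
    logits = (head @ h)[0]
    return 1.0 / (1.0 + np.exp(-logits))


def speech_segments(probs, threshold=0.5):
    """Turn per-window probabilities into [start, end) window index pairs."""
    segments, start = [], None
    for t, active in enumerate(np.asarray(probs) >= threshold):
        if active and start is None:
            start = t
        elif not active and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(probs)))
    return segments


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed input: 4 stacked channel features over 20 time windows.
    feats = rng.standard_normal((4, 20))
    weights = [rng.standard_normal((8, 4, 3)), rng.standard_normal((8, 8, 3))]
    head = rng.standard_normal((1, 8))
    probs = tiny_causal_tcn(feats, weights, head)
    print(speech_segments(probs))
```

With untrained random weights the printed segments are meaningless; the point is the data flow from multichannel features to per-window classification results to segment boundaries. Left-only padding is what makes each layer causal, and the doubling dilation widens the span of past windows each result can draw on without adding look-ahead.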
