The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).

Date of Patent:

Jul. 26, 2022

Filed:

Aug. 28, 2019

Applicant:

Alibaba Group Holding Limited, Grand Cayman, KY;

Inventors:

Biao Tian, Hangzhou, CN;

Zhaowei He, Hangzhou, CN;

Tao Yu, Hangzhou, CN;

Assignee:

ALIBABA GROUP HOLDING LIMITED, Grand Cayman, KY;

Attorneys:
Primary Examiner:
Int. Cl.
CPC ...
G10L 15/26 (2006.01); G06T 1/00 (2006.01); G10L 17/06 (2013.01); G10L 21/0364 (2013.01); G06F 3/01 (2006.01); G10L 15/25 (2013.01); H04R 1/40 (2006.01); G10L 15/02 (2006.01); G10L 15/05 (2013.01); G10L 25/78 (2013.01); G10L 15/22 (2006.01); H04R 3/00 (2006.01); G06V 40/16 (2022.01);
U.S. Cl.
CPC ...
G10L 15/25 (2013.01); G06V 40/171 (2022.01); G10L 15/02 (2013.01); G10L 15/05 (2013.01); G10L 15/22 (2013.01); G10L 25/78 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01);
Abstract

The disclosed embodiments provide methods, apparatuses, systems, devices and computer-readable storage media for processing speech signals. The method comprises: acquiring a real-time image by using an image capturing device, performing facial recognition by using the real-time image, and detecting a period during which a target user makes speech sounds based on a facial recognition result; locating a sound source in an audio signal received by a microphone array, and determining the orientation information of the sound source in the audio signal; and, based on the period during which the target user in the real-time image makes the speech sounds and the orientation information of the sound source, performing a speech sound start and end point analysis to determine start and end time points of the speech sounds in the audio signal. The method for processing speech signals according to one embodiment can perform voice activity detection on the speech signal in noisy environments containing multiple sources of interference, thereby improving the anti-interference capability of the system.
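
The fusion step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: it assumes that an upstream face-recognition stage has already produced the periods during which the target user appears to be speaking, and that a microphone-array localizer has already produced one direction-of-arrival estimate per audio frame. The names `AudioFrame`, `angle_diff`, `speech_endpoints`, `target_deg` and `tolerance_deg` are hypothetical, as is the rule of accepting a frame only when both cues agree.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioFrame:
    time: float      # frame start time in seconds
    duration: float  # frame length in seconds
    doa_deg: float   # direction of arrival from an assumed upstream localizer


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def speech_endpoints(
    frames: List[AudioFrame],
    visual_periods: List[Tuple[float, float]],
    target_deg: float,
    tolerance_deg: float = 15.0,  # assumed tolerance, not from the patent
) -> List[Tuple[float, float]]:
    """Return (start, end) times where both cues point at the target user.

    A frame counts as target speech only if it starts inside a visually
    detected speaking period AND its sound source lies near the target's
    direction. Consecutive accepted frames are merged into one segment.
    """
    segments: List[Tuple[float, float]] = []
    start = None
    end = None
    for f in frames:
        seen_speaking = any(s <= f.time < e for s, e in visual_periods)
        from_target = angle_diff(f.doa_deg, target_deg) <= tolerance_deg
        if seen_speaking and from_target:
            if start is None:
                start = f.time          # speech start point
            end = f.time + f.duration   # extend the current segment
        elif start is not None:
            segments.append((start, end))  # speech end point reached
            start = None
    if start is not None:
        segments.append((start, end))
    return segments


# Example: 0.5 s frames; an interfering source at 200 degrees splits the speech.
doas = [90, 30, 30, 32, 200, 30, 30, 90]
frames = [AudioFrame(0.5 * i, 0.5, d) for i, d in enumerate(doas)]
result = speech_endpoints(frames, [(0.5, 3.0)], target_deg=30.0)
print(result)
```

Requiring both cues to agree is what gives this kind of scheme its robustness to interference in the sketch: a loud source from another direction is rejected by the orientation check, and sound arriving from the target's direction while the target is not visibly speaking is rejected by the visual check.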

