The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Jan. 26, 2021
Filed:
Nov. 15, 2019
International Business Machines Corporation, Armonk, NY (US);
Dimitrios B. Dimitriadis, White Plains, NY (US);
David C. Haws, Yorktown Heights, NY (US);
Michael Picheny, White Plains, NY (US);
George Saon, Stamford, CT (US);
Samuel Thomas, White Plains, NY (US);
International Business Machines Corporation, Armonk, NY (US);
Abstract
Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.