The patent badge is an abbreviated version of the USPTO patent document. It covers the following fields: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Patent No.:

US 9355642 B1

Date of Patent:

May 31, 2016

Filed:

Sep. 04, 2012

Speaker recognition method through emotional model synthesis based on neighbors preserving principle

Applicants:

Zhaohui Wu, Hangzhou, CN;

Yingchun Yang, Hangzhou, CN;

Li Chen, Hangzhou, CN;

Inventors:

Zhaohui Wu, Hangzhou, CN;

Yingchun Yang, Hangzhou, CN;

Li Chen, Hangzhou, CN;

Assignee:

ZHEJIANG UNIVERSITY, Hangzhou, CN;

Attorney:

Jiwen Chen

Primary Examiner:

Douglas Godbold

Int. Cl.

CPC ...

G10L 17/26 (2013.01); G10L 25/63 (2013.01); G10L 15/14 (2006.01); G10L 15/06 (2013.01);

U.S. Cl.

CPC ...

G10L 17/26 (2013.01); G10L 15/14 (2013.01); G10L 25/63 (2013.01); G10L 15/063 (2013.01);

Abstract

A speaker recognition method through emotional model synthesis based on the neighbors preserving principle is disclosed. The method includes the following steps: (1) training the reference speaker's and the user's speech models; (2) extracting the neutral-to-emotion transformation/mapping sets of the GMM reference models; (3) extracting the emotion reference Gaussian components mapped by, or corresponding to, several neutral reference Gaussian components close to the user's neutral training Gaussian component; (4) synthesizing the user's emotion training Gaussian component and then the user's emotion training model; (5) synthesizing all of the user's GMM training models; (6) inputting test speech and conducting the identification. The invention uses a neighbors preserving principle based on KL divergence to extract, from a speech library, several reference speeches similar to the user's neutral training speech, and synthesizes the user's emotion training speech from the emotion reference speech in those reference speeches. This improves the performance of the speaker recognition system when the training speech and the test speech are mismatched, and increases the robustness of the system.
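
Steps (3) and (4) of the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes diagonal-covariance Gaussian components, an index-aligned neutral-to-emotion mapping, and plain averaging of the selected emotion means; the abstract does not state how the components are combined. The function names `kl_diag_gauss` and `synthesize_emotion_mean` and the parameter `k` are hypothetical.

```python
import math


def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL(N0 || N1) for two Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )


def synthesize_emotion_mean(user_mu, user_var, neutral_refs, emotion_refs, k=2):
    """Average the emotion means mapped from the k nearest neutral components.

    neutral_refs: list of (mean, variance) pairs for neutral reference components.
    emotion_refs: list of emotion means; emotion_refs[i] is assumed to be the
                  emotion counterpart of neutral_refs[i] (hypothetical mapping).
    """
    # Rank neutral reference components by KL divergence from the user's component.
    ranked = sorted(
        range(len(neutral_refs)),
        key=lambda i: kl_diag_gauss(user_mu, user_var, *neutral_refs[i]),
    )
    nearest = ranked[:k]
    # Assumption: combine the mapped emotion means by an unweighted average.
    return [
        sum(emotion_refs[i][d] for i in nearest) / len(nearest)
        for d in range(len(user_mu))
    ]


# Toy data, invented for illustration only.
neutral_refs = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([0.5, 0.5], [1.0, 1.0]),
    ([5.0, 5.0], [1.0, 1.0]),
]
emotion_refs = [[1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
synthesized = synthesize_emotion_mean([0.1, 0.1], [1.0, 1.0], neutral_refs, emotion_refs)
```

In this toy example the two neutral components nearest to the user's component are the first two, so the synthesized emotion mean is the average of their emotion counterparts.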
