The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The patent badge also contains a link to the full patent document (in Adobe Acrobat format, i.e., PDF).
Patent No.:
Date of Patent:
May 31, 2016
Filed:
Sep. 04, 2012
Speaker recognition method through emotional model synthesis based on neighbors preserving principle
Zhaohui Wu, Hangzhou, CN;
Yingchun Yang, Hangzhou, CN;
LI Chen, Hangzhou, CN;
ZHEJIANG UNIVERSITY, Hangzhou, CN;
Abstract
A speaker recognition method through emotional model synthesis based on the Neighbors Preserving Principle is disclosed. The method includes the following steps: (1) training the reference speakers' and the user's speech models; (2) extracting the neutral-to-emotion transformation/mapping sets of the GMM reference models; (3) extracting the emotion reference Gaussian components that are mapped from, or correspond to, the several neutral reference Gaussian components closest to the user's neutral training Gaussian component; (4) synthesizing the user's emotion training Gaussian component and then synthesizing the user's emotion training model; (5) synthesizing all of the user's GMM training models; and (6) inputting test speech and conducting the identification. The invention extracts several reference speeches similar to the user's neutral training speech from a speech library by applying the neighbor preserving principle based on KL divergence, and synthesizes an emotion training speech for the user from the emotion reference speech in those reference speeches. This improves the performance of the speaker recognition system when the training speech and the test speech are mismatched and increases the system's robustness.
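Steps (3) and (4) of the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes diagonal-covariance Gaussian components, uses the closed-form KL divergence between two such Gaussians to rank neighbors, and combines the neighbors' emotion means with a plain unweighted average. The abstract does not state the combination rule, so the averaging is an assumption, and the names `kl_diag_gaussian`, `synthesize_emotion_mean`, and the parameter `k` are hypothetical.

```python
import math


def kl_diag_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance matrices."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def synthesize_emotion_mean(user_neutral, reference_pairs, k=2):
    """Estimate the mean of a user's emotion Gaussian component.

    user_neutral:    (mean, var) of one neutral component of the user.
    reference_pairs: list of ((neutral_mean, neutral_var), emotion_mean),
                     i.e. neutral reference components together with the
                     emotion component mean each one maps to.
    k:               number of nearest neutral neighbors to keep.

    Assumption: the synthesized mean is the unweighted average of the
    emotion means mapped from the k nearest neutral reference components.
    """
    user_mean, user_var = user_neutral
    # Rank reference neutral components by KL divergence from the user's component.
    ranked = sorted(
        reference_pairs,
        key=lambda pair: kl_diag_gaussian(user_mean, user_var, pair[0][0], pair[0][1]),
    )
    neighbors = ranked[:k]
    dim = len(user_mean)
    return [
        sum(emotion_mean[d] for _, emotion_mean in neighbors) / len(neighbors)
        for d in range(dim)
    ]


# Toy data: two nearby reference components and one distant one.
user = ([0.0, 0.0], [1.0, 1.0])
references = [
    (([0.1, 0.0], [1.0, 1.0]), [1.0, 1.0]),
    (([-0.1, 0.0], [1.0, 1.0]), [3.0, 1.0]),
    (([5.0, 5.0], [1.0, 1.0]), [100.0, 100.0]),
]
synthesized = synthesize_emotion_mean(user, references, k=2)
```

In this toy data the distant reference component is excluded from the average, so the synthesized emotion mean depends only on reference speakers whose neutral speech resembles the user's. A fuller sketch would also synthesize variances and mixture weights and repeat the procedure for every component of the user's neutral GMM.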