The patent badge is an abbreviated version of the USPTO patent document. It covers the following: patent number, date the patent was issued, date the patent was filed, title of the patent, applicant, inventor, assignee, attorney firm, primary examiner, assistant examiner, CPCs, and abstract. The badge also contains a link to the full patent document (in Adobe Acrobat format, i.e. PDF).

Date of Patent:
Jun. 14, 2022

Filed:
Dec. 24, 2021

Applicants:

Sandeep Dhawan, Westbury, NY (US);

Kapil Dhawan, Weston, FL (US);

Dennis Reutter, Holbrook, NY (US);

Chris Beckman, Shoreham, NY (US);

Ahsan Memon, Lahore, PK;

Inventors:

Sandeep Dhawan, Westbury, NY (US);

Kapil Dhawan, Weston, FL (US);

Dennis Reutter, Holbrook, NY (US);

Chris Beckman, Shoreham, NY (US);

Ahsan Memon, Lahore, PK;

Assignee:

Other;

Attorney:
Primary Examiner:
Int. Cl.
CPC ...
G10L 21/02 (2013.01); G10L 15/26 (2006.01); G10L 13/02 (2013.01); G10L 21/10 (2013.01);
U.S. Cl.
CPC ...
G10L 21/02 (2013.01); G10L 13/02 (2013.01); G10L 15/26 (2013.01); G10L 21/10 (2013.01);
Abstract

Information loss in speech-to-text conversion and the inability to preserve vocal emotion information without changing the artificial intelligence model infrastructure are essential drawbacks of conventional speech-to-speech translation systems. Embodiments of the invention provide a direct speech-to-speech translation system. The system uses a one-tier approach, creating a unified model for the whole application. The single-model ecosystem takes in audio (mel spectrogram) as input and gives out audio (mel spectrogram) as output. This solves the bottleneck problem by not converting speech directly to text but having text as a byproduct of speech-to-speech translation, preserving phonetic information along the way. The model also uses pre-processing and post-processing scripts, but only for the whole model. The model needs parallel audio samples in two languages. The training methodology involves augmenting or changing both sides of the audio equally.
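
The last sentence of the abstract, on augmenting both sides of the audio equally, can be illustrated with a minimal sketch. The abstract does not say which augmentations are used, so the gain and speed changes below are assumptions chosen only for illustration, and the names `draw_params`, `apply_params`, and `augment_pair` are hypothetical. The sketch also works on raw waveform samples held in plain Python lists (assumed non-empty) rather than on mel spectrograms. The point it shows is that one set of parameters is drawn once and then applied identically to the source-language and target-language clips of a parallel pair.

```python
import random


def draw_params(rng, max_gain_db=6.0, speeds=(0.9, 1.0, 1.1)):
    """Draw one set of augmentation parameters (illustrative choices only)."""
    return {
        "gain_db": rng.uniform(-max_gain_db, max_gain_db),
        "speed": rng.choice(speeds),
    }


def apply_params(wave, gain_db, speed):
    """Change speed by linear-interpolation resampling, then apply a gain."""
    n_in = len(wave)
    n_out = max(1, round(n_in / speed))
    scale = 10 ** (gain_db / 20)  # decibels to linear amplitude factor
    out = []
    for i in range(n_out):
        # Position in the input signal that output sample i reads from.
        pos = i * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(scale * ((1 - frac) * wave[lo] + frac * wave[hi]))
    return out


def augment_pair(source, target, rng):
    """Apply the same randomly drawn augmentation to both sides of a pair."""
    params = draw_params(rng)
    return apply_params(source, **params), apply_params(target, **params), params


if __name__ == "__main__":
    rng = random.Random(0)
    source_clip = [0.5] * 1000  # stand-in for source-language audio
    target_clip = [0.5] * 2000  # stand-in for target-language audio
    aug_source, aug_target, used = augment_pair(source_clip, target_clip, rng)
    print(used, len(aug_source), len(aug_target))
```

Drawing the parameters once per pair, rather than once per clip, is what keeps the two sides changed by the same amount; a real pipeline would substitute whatever augmentations the full patent text specifies.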

