The patent badge is an abbreviated version of the USPTO patent document. The patent badge does contain a link to the full patent document.
The patent badge is an abbreviated version of the USPTO patent document. The patent badge covers the following: Patent number, Date patent was issued, Date patent was filed, Title of the patent, Applicant, Inventor, Assignee, Attorney firm, Primary examiner, Assistant examiner, CPCs, and Abstract. The patent badge does contain a link to the full patent document (in Adobe Acrobat format, aka pdf). To download or print any patent click here.
Patent No.:
Date of Patent:
Jul. 30, 2024
Filed:
Nov. 23, 2021
Baidu Usa, Llc, Sunnyvale, CA (US);
Renjie Zheng, San Jose, CA (US);
Junkun Chen, Corvallis, OR (US);
Mingbo Ma, Milpitas, CA (US);
Liang Huang, Mountain View, CA (US);
Baidu USA LLC, Sunnyvale, CA (US);
Abstract
Representation learning for text and speech has improved many language-related tasks. However, existing methods only learn from one input modality, while a unified representation for both speech and text is needed for tasks such as end-to-end speech translation. Consequently, these methods cannot exploit various large-scale text and speech data and their performance is limited by the scarcity of parallel speech translation data. To address these problems, embodiments of a fused acoustic and text masked language model (FAT-MLM) are disclosed. FAT-MLM embodiments jointly learn a unified representation for both acoustic and text input from various types of corpora including parallel data for speech recognition and machine translation, and pure speech and text data. Within this cross-modal representation learning framework, an end-to-end model is further presented for fused acoustic and text speech translation. Experiments show that by fine-tuning from FAT-MLM, the speech translation model embodiments substantially improve translation quality.