The patent badge is an abbreviated version of the USPTO patent document and contains a link to the full patent document.

G10L 15/22 (2006.01); G10L 15/26 (2006.01); G10L 15/16 (2006.01); G10L 15/06 (2013.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01);

U.S. Cl.

CPC ...

G10L 15/063 (2013.01); G06N 3/0445 (2013.01); G06N 3/08 (2013.01);

Abstract

Techniques performed by a data processing system for training a Recurrent Neural Network Transducer (RNN-T) herein include encoder pretraining by training a neural network-based token classification model using first token-aligned training data representing a plurality of utterances, where each utterance is associated with a plurality of frames of audio data and tokens representing each utterance are aligned with frame boundaries of the plurality of audio frames; obtaining a first cross-entropy (CE) criterion from the token classification model, wherein the CE criterion represents a divergence between expected outputs and reference outputs of the model; pretraining an encoder of an RNN-T based on the first CE criterion; and training the RNN-T with second training data after pretraining the encoder of the RNN-T. These techniques also include whole-network pretraining of the RNN-T. An RNN-T pretrained using these techniques may be used to process audio data that includes spoken content to obtain a textual representation.
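
The frame-level cross-entropy criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names `log_softmax` and `frame_cross_entropy`, the list-of-lists input layout, and the averaging over frames are assumptions made for illustration. It assumes that each audio frame has exactly one aligned token id and that the token classification model emits one vector of unnormalized scores per frame.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's token scores."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def frame_cross_entropy(frame_logits, frame_tokens):
    """Mean cross-entropy over frames (hypothetical helper, not from the patent).

    frame_logits[t] holds the model's scores for every token at frame t.
    frame_tokens[t] is the reference token id aligned with frame t.
    """
    if len(frame_logits) != len(frame_tokens):
        raise ValueError("each frame needs exactly one aligned token")
    if not frame_tokens:
        raise ValueError("at least one frame is required")
    total = 0.0
    for logits, token in zip(frame_logits, frame_tokens):
        # Negative log-probability assigned to the reference token.
        total -= log_softmax(logits)[token]
    return total / len(frame_tokens)


# Three frames, four possible tokens, uniform scores: the loss equals log(4).
uniform_loss = frame_cross_entropy([[0.0] * 4] * 3, [0, 1, 2])

# Scores that favor the aligned token at every frame give a lower loss.
confident_loss = frame_cross_entropy(
    [[5.0, 0.0, 0.0, 0.0], [0.0, 5.0, 0.0, 0.0], [0.0, 0.0, 5.0, 0.0]],
    [0, 1, 2],
)
```

In a pretraining setup of the kind the abstract describes, a loss like this would be minimized over the token-aligned data before the encoder is reused in the RNN-T. The sketch covers only the loss computation, not the model or the optimizer.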
