John Miller is an inventor based in Palo Alto, CA, known for his work on neural text-to-speech (TTS) systems. He holds 6 patents, which cover technologies that improve the quality and efficiency of speech synthesis.

Latest Patents

Miller's latest patents include a real-time neural text-to-speech system built on deep neural networks. The system consists of five major components: a segmentation model for phoneme boundary detection, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency prediction model, and an audio synthesis model. The segmentation model uses Connectionist Temporal Classification (CTC) loss to detect phoneme boundaries, while the audio synthesis model is a variant of WaveNet that requires fewer parameters and trains faster. Together, these components support faster-than-real-time inference.
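The way these components hand data to one another at inference time can be sketched as follows. This is a minimal illustration, not the patented implementation: every function here is a hypothetical stand-in for a trained neural model, the lexicon, durations, pitch values, and sample rate are invented, and a sine generator replaces the WaveNet-style synthesizer. The segmentation model is left out on the assumption that it is used to label phoneme boundaries in training data rather than to run at inference.

```python
import math
from dataclasses import dataclass
from typing import List

SAMPLE_RATE = 16000  # assumed sample rate for this toy example


@dataclass
class PhonemeSpec:
    phoneme: str
    duration_s: float  # predicted duration in seconds
    f0_hz: float       # predicted fundamental frequency (0.0 means unvoiced)


# Hypothetical toy lexicon standing in for a learned grapheme-to-phoneme model.
TOY_LEXICON = {"hi": ["HH", "AY"], "there": ["DH", "EH", "R"]}
VOWELS = {"AY", "EH"}


def grapheme_to_phoneme(text: str) -> List[str]:
    """Stand-in for the grapheme-to-phoneme model: dictionary lookup only."""
    phonemes: List[str] = []
    for word in text.lower().split():
        phonemes.extend(TOY_LEXICON.get(word, []))
    return phonemes


def predict_duration(phoneme: str) -> float:
    """Stand-in for the duration model: vowels last longer (invented values)."""
    return 0.12 if phoneme in VOWELS else 0.06


def predict_f0(phoneme: str) -> float:
    """Stand-in for the F0 model: only vowels are voiced (invented values)."""
    return 120.0 if phoneme in VOWELS else 0.0


def synthesize_audio(specs: List[PhonemeSpec]) -> List[float]:
    """Stand-in for the audio synthesis model: a sine tone per voiced phoneme."""
    samples: List[float] = []
    for spec in specs:
        n_samples = round(spec.duration_s * SAMPLE_RATE)
        for i in range(n_samples):
            if spec.f0_hz > 0:
                samples.append(math.sin(2 * math.pi * spec.f0_hz * i / SAMPLE_RATE))
            else:
                samples.append(0.0)  # silence for unvoiced phonemes
    return samples


def text_to_speech(text: str) -> List[float]:
    """Chain the stages: text -> phonemes -> (duration, F0) -> waveform."""
    phonemes = grapheme_to_phoneme(text)
    specs = [PhonemeSpec(p, predict_duration(p), predict_f0(p)) for p in phonemes]
    return synthesize_audio(specs)


audio = text_to_speech("hi there")
```

The point of the sketch is the staging: each model consumes the output of the previous one, so the duration and fundamental frequency predictions condition the final waveform generation.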

Another patent covers multi-speaker neural text-to-speech. The technique augments neural speech synthesis networks with low-dimensional trainable speaker embeddings, so that a single model can generate speech in many different voices. The patent describes an improved single-speaker model, referred to as Deep Voice 2, and a post-processing neural vocoder for Tacotron, and shows that neural TTS systems can learn hundreds of unique voices from a small amount of audio data.
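The speaker-embedding idea can be illustrated with a small sketch. It assumes one simple way of injecting the embedding (projecting it and adding it as a bias inside a shared layer); the patent may condition the network differently. The class name, dimensions, and random initialization are all hypothetical, and no training is performed here, whereas in a real system the embedding table and weights would be learned jointly.

```python
import math
import random
from typing import List

EMBED_DIM = 4     # low-dimensional speaker embedding (size is an assumption)
FEATURE_DIM = 3   # toy size of the per-frame feature vector
NUM_SPEAKERS = 5  # toy number of voices


def random_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class MultiSpeakerLayer:
    """One shared layer whose output is shifted by a per-speaker embedding."""

    def __init__(self, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One embedding row per speaker; trainable in a real model,
        # randomly initialized and left untrained in this sketch.
        self.speaker_table = random_matrix(NUM_SPEAKERS, EMBED_DIM, rng)
        # Weights shared by every speaker.
        self.w_input = random_matrix(FEATURE_DIM, FEATURE_DIM, rng)
        self.w_speaker = random_matrix(FEATURE_DIM, EMBED_DIM, rng)

    def forward(self, features: List[float], speaker_id: int) -> List[float]:
        embedding = self.speaker_table[speaker_id]
        shared = matvec(self.w_input, features)          # speaker-independent part
        speaker_bias = matvec(self.w_speaker, embedding)  # speaker-dependent shift
        return [math.tanh(s + b) for s, b in zip(shared, speaker_bias)]


layer = MultiSpeakerLayer()
frame = [0.1, 0.2, 0.3]
out_speaker_0 = layer.forward(frame, speaker_id=0)
out_speaker_1 = layer.forward(frame, speaker_id=1)
```

The same input frame yields different outputs for different speaker IDs, even though the layer weights are shared. Only the small embedding row differs per voice, which is what lets a single model cover many speakers.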

Career Highlights

Miller works at Baidu USA LLC, where he develops speech synthesis technology. His work there applies neural networks to TTS systems, with an emphasis on making them more efficient and more accessible.

Collaborations

Miller's collaborators include Jonathan Raiman and Sercan Omer Arik, with whom he works on neural speech synthesis technologies.

Conclusion

John Miller's patents cover real-time and multi-speaker neural text-to-speech. They share a focus on improving the efficiency and quality of speech synthesis.

This text is generated by artificial intelligence and may not be accurate.