Patents by Inventor Khe Chai Sim

Khe Chai Sim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL

Publication number: 20210104223

Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.

Type: Application

Filed: October 2, 2019

Publication date: April 8, 2021

Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION

Publication number: 20200160836

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer-readable media, for speech recognition using multi-dialect and multilingual models. In some implementations, audio data indicating audio characteristics of an utterance is received. Input features determined based on the audio data are provided to a speech recognition model that has been trained to output score indicating the likelihood of linguistic units for each of multiple different language or dialects. The speech recognition model can be one that has been trained using cluster adaptive training. Output that the speech recognition model generated in response to receiving the input features determined based on the audio data is received. A transcription of the utterance generated based on the output of the speech recognition model is provided.

Type: Application

Filed: November 14, 2019

Publication date: May 21, 2020

Inventors: Zhifeng Chen, Bo Li, Eugene Weinstein, Yonghui Wu, Pedro J. Moreno Mengibar, Ron J. Weiss, Khe Chai Sim, Tara N. Sainath, Patrick An Phu Nguyen

prev 1 2

ON-DEVICE SPEECH SYNTHESIS OF TEXTUAL SEGMENTS FOR TRAINING OF ON-DEVICE SPEECH RECOGNITION MODEL

MULTI-DIALECT AND MULTILINGUAL SPEECH RECOGNITION