Patents by Inventor Gabor Simko

Gabor Simko has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Joint Endpointing And Automatic Speech Recognition

Publication number: 20200335091

Abstract: A method includes receiving audio data of an utterance and processing the audio data to obtain, as output from a speech recognition model configured to jointly perform speech decoding and endpointing of utterances: partial speech recognition results for the utterance; and an endpoint indication indicating when the utterance has ended. While processing the audio data, the method also includes detecting, based on the endpoint indication, the end of the utterance. In response to detecting the end of the utterance, the method also includes terminating the processing of any subsequent audio data received after the end of the utterance was detected.

Type: Application

Filed: March 4, 2020

Publication date: October 22, 2020

Applicant: Google LLC

Inventors: Shuo-yiin Chang, Rohit Prakash Prabhavalkar, Gabor Simko, Tara N. Sainath, Bo Li, Yangzhang He
DETECTING CONTINUING CONVERSATIONS WITH COMPUTING DEVICES

Publication number: 20200272690

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting a continued conversation are disclosed. In one aspect, a method includes the actions of receiving first audio data of a first utterance. The actions further include obtaining a first transcription of the first utterance. The actions further include receiving second audio data of a second utterance. The actions further include obtaining a second transcription of the second utterance. The actions further include determining whether the second utterance includes a query directed to a query processing system based on analysis of the second transcription and the first transcription or a response to the first query. The actions further include configuring the data routing component to provide the second transcription of the second utterance to the query processing system as a second query or bypass routing the second transcription.

Type: Application

Filed: November 27, 2019

Publication date: August 27, 2020

Inventors: Nathan David Howard, Gabor Simko, Andrei Giurgiu, Behshad Behzadi, Marcin M. Nowak-Przygodzki
END OF QUERY DETECTION

Publication number: 20200168242

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining the confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.

Type: Application

Filed: January 31, 2020

Publication date: May 28, 2020

Applicant: Google LLC

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
Unified Endpointer Using Multitask and Multidomain Learning

Publication number: 20200117996

Abstract: A method for training an endpointer model includes short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes, generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.

Type: Application

Filed: December 11, 2019

Publication date: April 16, 2020

Applicant: Google LLC

Inventors: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
End of query detection

Patent number: 10593352

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining the confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.

Type: Grant

Filed: June 6, 2018

Date of Patent: March 17, 2020

Assignee: Google LLC

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
UTTERANCE CLASSIFIER

Publication number: 20190304459

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for classification using neural networks. One method includes receiving audio data corresponding to an utterance. Obtaining a transcription of the utterance. Generating a representation of the audio data. Generating a representation of the transcription of the utterance. Providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistance or is likely not directed to an automated assistant.

Type: Application

Filed: May 2, 2019

Publication date: October 3, 2019

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
Utterance classifier

Patent number: 10311872

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for classification using neural networks. One method includes receiving audio data corresponding to an utterance. Obtaining a transcription of the utterance. Generating a representation of the audio data. Generating a representation of the transcription of the utterance. Providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistance or is likely not directed to an automated assistant.

Type: Grant

Filed: July 25, 2017

Date of Patent: June 4, 2019

Assignee: Google LLC

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
Voice activity detection

Patent number: 10229700

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method include actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform, processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech, and provide, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.

Type: Grant

Filed: January 4, 2016

Date of Patent: March 12, 2019

Assignee: Google LLC

Inventors: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin, Ruben Zazo Candil
UTTERANCE CLASSIFIER

Publication number: 20190035390

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for classification using neural networks. One method includes receiving audio data corresponding to an utterance. Obtaining a transcription of the utterance. Generating a representation of the audio data. Generating a representation of the transcription of the utterance. Providing (i) the representation of the audio data and (ii) the representation of the transcription of the utterance to a classifier that, based on a given representation of the audio data and a given representation of the transcription of the utterance, is trained to output an indication of whether the utterance associated with the given representation is likely directed to an automated assistance or is likely not directed to an automated assistant.

Type: Application

Filed: July 25, 2017

Publication date: January 31, 2019

Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
END OF QUERY DETECTION

Publication number: 20180350395

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting an end of a query are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance spoken by a user. The actions further include applying, to the audio data, an end of query model. The actions further include determining the confidence score that reflects a likelihood that the utterance is a complete utterance. The actions further include comparing the confidence score that reflects the likelihood that the utterance is a complete utterance to a confidence score threshold. The actions further include determining whether the utterance is likely complete or likely incomplete. The actions further include providing, for output, an instruction to (i) maintain a microphone that is receiving the utterance in an active state or (ii) deactivate the microphone that is receiving the utterance.

Type: Application

Filed: June 6, 2018

Publication date: December 6, 2018

Inventors: Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
Voice Activity Detection

Publication number: 20170092297

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method include actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform, processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech, and provide, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.

Type: Application

Filed: January 4, 2016

Publication date: March 30, 2017

Inventors: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin

prev 1 2