Patents by Inventor Alexander H. Gruenstein

Alexander H. Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Training multiple neural networks with different accuracy

Patent number: 9484022

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.

Type: Grant

Filed: May 23, 2014

Date of Patent: November 1, 2016

Assignee: Google Inc.

Inventor: Alexander H. Gruenstein
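
The two-network flow described in the abstract above can be illustrated with a minimal sketch. The scorer callables, the threshold values, and the function name `cascade_detect` are all hypothetical stand-ins and do not come from the patent; any function that maps a feature vector to a list of scores could play the role of either network.

```python
from typing import Callable, List, Optional, Sequence

# A "network" here is any callable mapping a feature vector to posterior scores.
Scorer = Callable[[Sequence[float]], List[float]]


def cascade_detect(
    features: Sequence[Sequence[float]],
    small_net: Scorer,
    large_net: Scorer,
    first_threshold: float,
    second_threshold: float,
) -> Optional[int]:
    """Return the index of the feature vector at which the larger network
    confirms a detection, or None if no detection is confirmed."""
    triggered = False
    for i, vec in enumerate(features):
        if not triggered:
            # Stage 1: the smaller (cheaper) network screens each vector.
            if max(small_net(vec)) >= first_threshold:
                triggered = True
            continue
        # Stage 2: the larger network scores each subsequent vector.
        if max(large_net(vec)) >= second_threshold:
            return i
    return None
```

In this sketch the larger network runs only after the smaller one has fired, which is one plausible way to keep the more expensive model idle most of the time.
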
PREDICTING AND LEARNING CARRIER PHRASES FOR SPEECH INPUT

Publication number: 20160314786

Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.

Type: Application

Filed: July 5, 2016

Publication date: October 27, 2016

Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20160300571

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include, based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Application

Filed: June 23, 2016

Publication date: October 13, 2016

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
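
As a rough illustration of the loudness-to-delay step in the abstract above, the sketch below computes a root-mean-square loudness score and maps it to a wait time. The abstract does not say how the score is computed or in which direction it affects the delay; the RMS measure, the linear mapping (louder audio gives a shorter delay), the 500 ms ceiling, and both function names are assumptions made for this example.

```python
import math
from typing import Sequence


def loudness_score(samples: Sequence[float]) -> float:
    """RMS level of audio samples assumed to lie in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def delay_ms(loudness: float, max_delay_ms: float = 500.0) -> float:
    """Assumed mapping: the louder the audio, the shorter the delay
    before the device announces that it will handle the utterance."""
    clamped = min(max(loudness, 0.0), 1.0)
    return max_delay_ms * (1.0 - clamped)
```

Under these assumptions, a device that hears the utterance more loudly would wait less and so signal first.
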
Hotword detection on multiple devices

Patent number: 9424841

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include, based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Grant

Filed: March 17, 2015

Date of Patent: August 23, 2016

Assignee: Google Inc.

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
Multi-stage hotword detection

Patent number: 9418656

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multi-stage hotword detection are disclosed. In one aspect, a method includes the actions of receiving, by a second stage hotword detector of a multi-stage hotword detection system that includes at least a first stage hotword detector and the second stage hotword detector, audio data that corresponds to an initial portion of an utterance. The actions further include determining a likelihood that the initial portion of the utterance includes a hotword. The actions further include determining that the likelihood that the initial portion of the utterance includes the hotword satisfies a threshold. The actions further include, in response to determining that the likelihood satisfies the threshold, transmitting a request for the first stage hotword detector to cease providing additional audio data that corresponds to one or more subsequent portions of the utterance.

Type: Grant

Filed: March 13, 2015

Date of Patent: August 16, 2016

Assignee: Google Inc.

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein, Diego Melendo Casado
Predicting and learning carrier phrases for speech input

Patent number: 9412360

Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.

Type: Grant

Filed: April 15, 2014

Date of Patent: August 9, 2016

Assignee: Google Inc.

Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
MULTI-STAGE HOTWORD DETECTION

Publication number: 20160125877

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multi-stage hotword detection are disclosed. In one aspect, a method includes the actions of receiving, by a second stage hotword detector of a multi-stage hotword detection system that includes at least a first stage hotword detector and the second stage hotword detector, audio data that corresponds to an initial portion of an utterance. The actions further include determining a likelihood that the initial portion of the utterance includes a hotword. The actions further include determining that the likelihood that the initial portion of the utterance includes the hotword satisfies a threshold. The actions further include, in response to determining that the likelihood satisfies the threshold, transmitting a request for the first stage hotword detector to cease providing additional audio data that corresponds to one or more subsequent portions of the utterance.

Type: Application

Filed: March 13, 2015

Publication date: May 5, 2016

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein, Diego Melendo Casado
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20160104483

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include, based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Application

Filed: March 17, 2015

Publication date: April 14, 2016

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
Key phrase detection

Patent number: 9202462

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for key phrase detection. One of the methods includes receiving a plurality of audio frame vectors that each model an audio waveform during a different period of time, generating an output feature vector for each of the audio frame vectors, wherein each output feature vector includes a set of scores that characterize an acoustic match between the corresponding audio frame vector and a set of expected event vectors, each of the expected event vectors corresponding to one of the scores and defining acoustic properties of at least a portion of a keyword, and providing each of the output feature vectors to a posterior handling module.

Type: Grant

Filed: September 30, 2013

Date of Patent: December 1, 2015

Assignee: Google Inc.

Inventors: Maria Carolina Parada San Martin, Alexander H. Gruenstein, Guoguo Chen
TRAINING MULTIPLE NEURAL NETWORKS WITH DIFFERENT ACCURACY

Publication number: 20150340032

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.

Type: Application

Filed: May 23, 2014

Publication date: November 26, 2015

Applicant: Google Inc.

Inventor: Alexander H. Gruenstein
SELECTING ALTERNATES IN SPEECH RECOGNITION

Publication number: 20150127346

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more feature scores are determined, the feature scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.

Type: Application

Filed: November 4, 2014

Publication date: May 7, 2015

Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
KEY PHRASE DETECTION

Publication number: 20150095027

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for key phrase detection. One of the methods includes receiving a plurality of audio frame vectors that each model an audio waveform during a different period of time, generating an output feature vector for each of the audio frame vectors, wherein each output feature vector includes a set of scores that characterize an acoustic match between the corresponding audio frame vector and a set of expected event vectors, each of the expected event vectors corresponding to one of the scores and defining acoustic properties of at least a portion of a keyword, and providing each of the output feature vectors to a posterior handling module.

Type: Application

Filed: September 30, 2013

Publication date: April 2, 2015

Applicant: Google Inc.

Inventors: Maria Carolina Parada San Martin, Alexander H. Gruenstein, Guoguo Chen
Enhanced stability prediction for incrementally generated speech recognition hypotheses based on an age of a hypothesis

Patent number: 8909512

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting the stability of speech recognition results. In one aspect, a method includes determining a length of time, or a number of occasions, in which a word has remained in an incremental speech recognizer's top hypothesis, and assigning a stability metric to the word based on the length of time or number of occasions.

Type: Grant

Filed: May 1, 2012

Date of Patent: December 9, 2014

Assignee: Google Inc.

Inventors: Ian C. McGraw, Alexander H. Gruenstein
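
The "number of occasions" idea in the abstract above can be sketched as follows. This is an illustration only: matching words by position, the saturation constant, and the names `word_ages` and `stability` are assumptions for the example, not details taken from the patent.

```python
from typing import List, Tuple


def word_ages(partials: List[List[str]]) -> List[Tuple[str, int]]:
    """For each word in the latest top hypothesis, count how many consecutive
    most-recent hypotheses carry the same word at the same position."""
    if not partials:
        return []
    latest = partials[-1]
    ages = []
    for i, word in enumerate(latest):
        age = 0
        for hyp in reversed(partials):
            if i < len(hyp) and hyp[i] == word:
                age += 1
            else:
                break
        ages.append((word, age))
    return ages


def stability(age: int, saturation: int = 4) -> float:
    """Assumed metric: grows linearly with age and saturates at 1.0."""
    return min(1.0, age / saturation)
```

For example, given the partial results `["call"]`, `["call", "mom"]`, `["call", "tom"]`, `["call", "tom", "now"]`, the word "call" has survived four updates while "now" has only just appeared, so "call" would receive the higher stability value.
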
PREDICTING AND LEARNING CARRIER PHRASES FOR SPEECH INPUT

Publication number: 20140229185

Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.

Type: Application

Filed: April 15, 2014

Publication date: August 14, 2014

Applicant: Google Inc.

Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
Predicting and learning carrier phrases for speech input

Patent number: 8738377

Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.

Type: Grant

Filed: June 7, 2010

Date of Patent: May 27, 2014

Assignee: Google Inc.

Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
Multi-dimensional disambiguation of voice commands

Patent number: 8626511

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing voice commands. In one aspect, a method includes receiving an audio signal at a server, performing, by the server, speech recognition on the audio signal to identify one or more candidate terms that match one or more portions of the audio signal, identifying one or more possible intended actions for each candidate term, providing information for display on a client device, the information specifying the candidate terms and the actions for each candidate term, receiving from the client device an indication of an action selected by a user, where the action was selected from among the actions included in the provided information, and invoking the action selected by the user.

Type: Grant

Filed: January 22, 2010

Date of Patent: January 7, 2014

Assignee: Google Inc.

Inventors: Michael J. LeBeau, William J. Byrne, Nicholas Jitkoff, Alexander H. Gruenstein
MIXED MODEL SPEECH RECOGNITION

Publication number: 20130346078

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.

Type: Application

Filed: March 15, 2013

Publication date: December 26, 2013

Inventors: Alexander H. Gruenstein, Petar Aleksic
Disambiguation of spoken proper names

Patent number: 8600742

Abstract: A method is performed by a communication device that is configured to communicate with a server over a network. The method includes outputting, to the server, speech data for spoken words; receiving, from the server, speech recognition candidates for a spoken word in the speech data; checking the speech recognition candidates against a database on the communication device; and selecting one or more of the speech recognition candidates for use by the communication device based on the checking.

Type: Grant

Filed: September 30, 2011

Date of Patent: December 3, 2013

Assignee: Google Inc.

Inventor: Alexander H. Gruenstein
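
The device-side check in the abstract above can be pictured with a short sketch. The on-device database is modeled here as a plain list of contact names; the case-insensitive comparison, the fallback to the server's full list when nothing matches, and the name `select_candidates` are assumptions for illustration and are not stated in the abstract.

```python
from typing import Iterable, List


def select_candidates(candidates: Iterable[str], contacts: Iterable[str]) -> List[str]:
    """Keep the server's recognition candidates that also appear in the
    local database, preserving the server's ordering."""
    candidate_list = list(candidates)
    known = {name.casefold() for name in contacts}
    matches = [c for c in candidate_list if c.casefold() in known]
    # Assumed fallback: if nothing matches locally, keep all candidates.
    return matches or candidate_list
```

With candidates `["Jon Smith", "John Smith", "Joan Smyth"]` and a contact list containing "John Smith", only "John Smith" would be kept.
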
Disambiguation of spoken proper names

Patent number: 8489398

Abstract: A method is performed by a communication device that is configured to communicate with a server over a network. The method includes outputting, to the server, speech data for spoken words; receiving, from the server, speech recognition candidates for a spoken word in the speech data; checking the speech recognition candidates against a database on the communication device; and selecting one or more of the speech recognition candidates for use by the communication device based on the checking.

Type: Grant

Filed: January 14, 2011

Date of Patent: July 16, 2013

Assignee: Google Inc.

Inventor: Alexander H. Gruenstein
Enhanced stability prediction for incrementally generated speech recognition hypotheses

Publication number: 20130110492

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting the stability of speech recognition results. In one aspect, a method includes determining a length of time, or a number of occasions, in which a word has remained in an incremental speech recognizer's top hypothesis, and assigning a stability metric to the word based on the length of time or number of occasions.

Type: Application

Filed: May 1, 2012

Publication date: May 2, 2013

Applicant: Google Inc.

Inventors: Ian C. McGraw, Alexander H. Gruenstein
