Patents by Inventor Alexander H. Gruenstein
Alexander H. Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9484022
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
Type: Grant
Filed: May 23, 2014
Date of Patent: November 1, 2016
Assignee: Google Inc.
Inventor: Alexander H. Gruenstein
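The two-network cascade in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name, the callables `small_net` and `large_net`, and the reading of "satisfies a threshold" as "greater than or equal to" are all assumptions made for illustration.

```python
def detect_keyword(feature_vectors, small_net, large_net,
                   first_threshold, second_threshold):
    """Hypothetical two-stage cascade.

    small_net and large_net are assumed to be callables that map one
    feature vector to a posterior probability vector (a list of scores).
    """
    for i, features in enumerate(feature_vectors):
        # Stage 1: the smaller network screens every feature vector.
        if max(small_net(features)) >= first_threshold:
            # Stage 2: the larger network scores each subsequent vector.
            for later in feature_vectors[i + 1:]:
                if max(large_net(later)) >= second_threshold:
                    return True
            return False
    return False
```

The design idea sketched here is that the cheaper network runs continuously and the larger one runs only after the first threshold is met.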
-
Publication number: 20160314786
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Application
Filed: July 5, 2016
Publication date: October 27, 2016
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
-
Publication number: 20160300571
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.
Type: Application
Filed: June 23, 2016
Publication date: October 13, 2016
Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
-
Patent number: 9424841
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.
Type: Grant
Filed: March 17, 2015
Date of Patent: August 23, 2016
Assignee: Google Inc.
Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
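As a rough illustration of deriving a delay from a loudness score, the sketch below uses RMS amplitude as the loudness score and a linear mapping in which louder audio yields a shorter delay. Both choices, along with the function names and the `max_delay` parameter, are assumptions; the abstract does not state how the score or the delay is computed.

```python
import math


def loudness_score(samples):
    """Assumed loudness measure: RMS of samples normalized to [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def delay_seconds(loudness, max_delay=0.5):
    """Assumed mapping: louder audio waits less before signaling.

    A loudness of 1.0 gives no delay; a loudness of 0.0 gives max_delay.
    """
    clamped = min(max(loudness, 0.0), 1.0)
    return max_delay * (1.0 - clamped)
```

Under these assumptions, the device that hears the utterance most loudly would transmit its signal first, which is one plausible way for several nearby devices to avoid all responding at once.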
-
Patent number: 9418656
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multi-stage hotword detection are disclosed. In one aspect, a method includes the actions of receiving, by a second stage hotword detector of a multi-stage hotword detection system that includes at least a first stage hotword detector and the second stage hotword detector, audio data that corresponds to an initial portion of an utterance. The actions further include determining a likelihood that the initial portion of the utterance includes a hotword. The actions further include determining that the likelihood that the initial portion of the utterance includes the hotword satisfies a threshold. The actions further include, in response to determining that the likelihood satisfies the threshold, transmitting a request for the first stage hotword detector to cease providing additional audio data that corresponds to one or more subsequent portions of the utterance.
Type: Grant
Filed: March 13, 2015
Date of Patent: August 16, 2016
Assignee: Google Inc.
Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein, Diego Melendo Casado
-
Patent number: 9412360
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Grant
Filed: April 15, 2014
Date of Patent: August 9, 2016
Assignee: Google Inc.
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
-
Publication number: 20160125877
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multi-stage hotword detection are disclosed. In one aspect, a method includes the actions of receiving, by a second stage hotword detector of a multi-stage hotword detection system that includes at least a first stage hotword detector and the second stage hotword detector, audio data that corresponds to an initial portion of an utterance. The actions further include determining a likelihood that the initial portion of the utterance includes a hotword. The actions further include determining that the likelihood that the initial portion of the utterance includes the hotword satisfies a threshold. The actions further include, in response to determining that the likelihood satisfies the threshold, transmitting a request for the first stage hotword detector to cease providing additional audio data that corresponds to one or more subsequent portions of the utterance.
Type: Application
Filed: March 13, 2015
Publication date: May 5, 2016
Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein, Diego Melendo Casado
-
Publication number: 20160104483
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.
Type: Application
Filed: March 17, 2015
Publication date: April 14, 2016
Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
-
Patent number: 9202462
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for key phrase detection. One of the methods includes receiving a plurality of audio frame vectors that each model an audio waveform during a different period of time, generating an output feature vector for each of the audio frame vectors, wherein each output feature vector includes a set of scores that characterize an acoustic match between the corresponding audio frame vector and a set of expected event vectors, each of the expected event vectors corresponding to one of the scores and defining acoustic properties of at least a portion of a keyword, and providing each of the output feature vectors to a posterior handling module.
Type: Grant
Filed: September 30, 2013
Date of Patent: December 1, 2015
Assignee: Google Inc.
Inventors: Maria Carolina Parada San Martin, Alexander H. Gruenstein, Guoguo Chen
-
Publication number: 20150340032
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
Type: Application
Filed: May 23, 2014
Publication date: November 26, 2015
Applicant: Google Inc.
Inventor: Alexander H. Gruenstein
-
Publication number: 20150127346
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
Type: Application
Filed: November 4, 2014
Publication date: May 7, 2015
Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
-
Publication number: 20150095027
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for key phrase detection. One of the methods includes receiving a plurality of audio frame vectors that each model an audio waveform during a different period of time, generating an output feature vector for each of the audio frame vectors, wherein each output feature vector includes a set of scores that characterize an acoustic match between the corresponding audio frame vector and a set of expected event vectors, each of the expected event vectors corresponding to one of the scores and defining acoustic properties of at least a portion of a keyword, and providing each of the output feature vectors to a posterior handling module.
Type: Application
Filed: September 30, 2013
Publication date: April 2, 2015
Applicant: Google Inc.
Inventors: Maria Carolina Parada San Martin, Alexander H. Gruenstein, Guoguo Chen
-
Patent number: 8909512
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting the stability of speech recognition results. In one aspect, a method includes determining a length of time, or a number of occasions, in which a word has remained in an incremental speech recognizer's top hypothesis, and assigning a stability metric to the word based on the length of time or number of occasions.
Type: Grant
Filed: May 1, 2012
Date of Patent: December 9, 2014
Assignee: Google Inc.
Inventors: Ian C. McGraw, Alexander H. Gruenstein
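The "number of occasions" variant of the stability metric can be sketched as follows. This is an illustrative assumption rather than the patented method: the function name is hypothetical, hypotheses are taken to be whitespace-separated strings, and a word is considered to have "remained" only if it held the same position in consecutive top hypotheses.

```python
def stability_counts(top_hypotheses):
    """For each word in the latest top hypothesis, count how many
    consecutive most-recent hypotheses contained that same word at the
    same position. A higher count suggests a more stable word.
    """
    if not top_hypotheses:
        return []
    latest = top_hypotheses[-1].split()
    counts = []
    for position, word in enumerate(latest):
        occasions = 0
        # Walk backwards from the newest hypothesis until the word changes.
        for hypothesis in reversed(top_hypotheses):
            words = hypothesis.split()
            if position < len(words) and words[position] == word:
                occasions += 1
            else:
                break
        counts.append((word, occasions))
    return counts
```

For the incremental results "call", "call mom", "call mom on", "call mom and dad", the early words accumulate higher counts than the words that only appeared in the final update.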
-
Publication number: 20140229185
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Application
Filed: April 15, 2014
Publication date: August 14, 2014
Applicant: Google Inc.
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
-
Patent number: 8738377
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Grant
Filed: June 7, 2010
Date of Patent: May 27, 2014
Assignee: Google Inc.
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
-
Patent number: 8626511
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing voice commands. In one aspect, a method includes receiving an audio signal at a server, performing, by the server, speech recognition on the audio signal to identify one or more candidate terms that match one or more portions of the audio signal, identifying one or more possible intended actions for each candidate term, providing information for display on a client device, the information specifying the candidate terms and the actions for each candidate term, receiving from the client device an indication of an action selected by a user, where the action was selected from among the actions included in the provided information, and invoking the action selected by the user.
Type: Grant
Filed: January 22, 2010
Date of Patent: January 7, 2014
Assignee: Google Inc.
Inventors: Michael J. LeBeau, William J. Byrne, Nicholas Jitkoff, Alexander H. Gruenstein
-
Publication number: 20130346078
Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.
Type: Application
Filed: March 15, 2013
Publication date: December 26, 2013
Inventors: Alexander H. Gruenstein, Petar Aleksic
-
Patent number: 8600742
Abstract: A method is performed by a communication device that is configured to communicate with a server over a network. The method includes outputting, to the server, speech data for spoken words; receiving, from the server, speech recognition candidates for a spoken word in the speech data; checking the speech recognition candidates against a database on the communication device; and selecting one or more of the speech recognition candidates for use by the communication device based on the checking.
Type: Grant
Filed: September 30, 2011
Date of Patent: December 3, 2013
Assignee: Google Inc.
Inventor: Alexander H. Gruenstein
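A minimal sketch of checking server-provided candidates against an on-device database is shown below. The function name, the use of a simple list of strings (for example, contact names) as the local database, the case-insensitive comparison, and the fallback to the first candidate when nothing matches are all assumptions made for illustration.

```python
def select_candidates(candidates, local_entries):
    """Keep the server's candidates that appear in the local database.

    candidates: recognition candidates from the server, best first.
    local_entries: strings stored on the device (assumed to be names).
    If nothing matches, fall back to the server's first candidate
    (an assumed policy, not one stated in the abstract).
    """
    known = {entry.casefold() for entry in local_entries}
    matches = [c for c in candidates if c.casefold() in known]
    return matches or candidates[:1]
```

For instance, if the server returns "Jon Smith" and "John Smith" and only "John Smith" is stored on the device, this sketch selects "John Smith".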
-
Patent number: 8489398
Abstract: A method is performed by a communication device that is configured to communicate with a server over a network. The method includes outputting, to the server, speech data for spoken words; receiving, from the server, speech recognition candidates for a spoken word in the speech data; checking the speech recognition candidates against a database on the communication device; and selecting one or more of the speech recognition candidates for use by the communication device based on the checking.
Type: Grant
Filed: January 14, 2011
Date of Patent: July 16, 2013
Assignee: Google Inc.
Inventor: Alexander H. Gruenstein
-
Publication number: 20130110492
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting the stability of speech recognition results. In one aspect, a method includes determining a length of time, or a number of occasions, in which a word has remained in an incremental speech recognizer's top hypothesis, and assigning a stability metric to the word based on the length of time or number of occasions.
Type: Application
Filed: May 1, 2012
Publication date: May 2, 2013
Applicant: GOOGLE INC.
Inventors: Ian C. McGraw, Alexander H. Gruenstein