Patents by Inventor Alexander H. Gruenstein

Alexander H. Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech recognition using two language models

Patent number: 11341972

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.

Type: Grant

Filed: October 22, 2020

Date of Patent: May 24, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic
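The selection step described in the abstract of patent 11341972 can be illustrated with a minimal sketch. It is not the patented implementation: the recognizers are stand-in callables, the function name `select_transcription` is hypothetical, whole-word matching is a simplification, and the fallback of returning the second transcription when no term matches is an assumption the abstract does not state.

```python
from typing import Callable, Iterable

# A recognizer is modeled as any callable that maps audio bytes to text.
Recognizer = Callable[[bytes], str]


def select_transcription(
    audio: bytes,
    personalized: Recognizer,
    general: Recognizer,
    trigger_terms: Iterable[str],
) -> str:
    """Pick which transcription to output for one utterance."""
    # First transcription: language model based on user-specific data.
    first = personalized(audio)
    # Second transcription: language model independent of user-specific data.
    second = general(audio)

    # Check whether the second transcription contains a predefined term.
    second_words = set(second.lower().split())
    if any(term.lower() in second_words for term in trigger_terms):
        # Per the abstract, a match leads to outputting the first transcription.
        return first

    # Assumption: with no match, fall back to the second transcription.
    return second
```

With a trigger set such as `{"call"}`, an utterance that the general recognizer hears as "call cat arena" would be output using the personalized recognizer's text instead.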
Detecting and suppressing voice queries

Patent number: 11341969

Abstract: A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.

Type: Grant

Filed: May 27, 2020

Date of Patent: May 24, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Aleksandar Kracun, Matthew Sharifi
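The flow in the abstract of patent 11341969 (volume check, grouping of common queries, fingerprinting, later detection) might be sketched as follows. This is only an illustration under stated assumptions: queries are represented as text, the fingerprint is a SHA-256 hash of the normalized text standing in for the "distinctive model" the abstract mentions, and the names `volume_threshold` and `min_common` are hypothetical parameters.

```python
import hashlib
from collections import Counter
from typing import Iterable, List, Set


def fingerprint(query: str) -> str:
    """Stand-in fingerprint: hash of the case- and whitespace-normalized text."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_suspicious_fingerprints(
    queries_in_interval: List[str],
    volume_threshold: int,
    min_common: int,
) -> Set[str]:
    """Return fingerprints of queries that recur within a high-volume interval."""
    # Analysis is triggered only when the request volume satisfies the criterion.
    if len(queries_in_interval) < volume_threshold:
        return set()
    # Group requests that identify a common query and keep the frequent ones.
    counts = Counter(fingerprint(q) for q in queries_in_interval)
    return {fp for fp, n in counts.items() if n >= min_common}


def is_illegitimate(query: str, blocked: Iterable[str]) -> bool:
    """Check a later request against previously generated fingerprints."""
    return fingerprint(query) in set(blocked)
```

A production system would presumably fingerprint audio rather than text; the text hash here only keeps the example self-contained.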
RECORDED MEDIA HOTWORD TRIGGER SUPPRESSION

Publication number: 20220157312

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Application

Filed: February 7, 2022

Publication date: May 19, 2022

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
MULTI-USER AUTHENTICATION ON A DEVICE

Publication number: 20220148577

Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.

Type: Application

Filed: January 26, 2022

Publication date: May 12, 2022

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Hotword detection on multiple devices

Patent number: 11276406

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Grant

Filed: May 28, 2020

Date of Patent: March 15, 2022

Assignee: Google LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster
Recorded media hotword trigger suppression

Patent number: 11257498

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Grant

Filed: November 20, 2020

Date of Patent: February 22, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
Multi-user authentication on a device

Patent number: 11238848

Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.

Type: Grant

Filed: December 10, 2019

Date of Patent: February 1, 2022

Assignee: Google LLC

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Automatic Hotword Threshold Tuning

Publication number: 20210390948

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Application

Filed: June 10, 2020

Publication date: December 16, 2021

Applicant: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
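The tuning loop in the abstract of publication 20210390948 can be sketched roughly as below. The class name `ThresholdTuner`, the sliding-window bookkeeping, and the default values are all hypothetical. The sketch also assumes that "satisfies the false acceptance rate threshold" means the count reaches a maximum and that "adjusting" means raising the first-stage threshold; the abstract specifies neither.

```python
from collections import deque


class ThresholdTuner:
    """Tracks first-stage false acceptances and tunes the detection threshold."""

    def __init__(
        self,
        threshold: float = 0.5,
        max_false_accepts: int = 3,
        window_seconds: float = 3600.0,
        step: float = 0.05,
    ) -> None:
        self.threshold = threshold
        self.max_false_accepts = max_false_accepts
        self.window_seconds = window_seconds
        self.step = step
        self._false_accept_times = deque()

    def report(self, timestamp: float, second_stage_detected: bool) -> float:
        """Record one second-stage result and return the current threshold."""
        if second_stage_detected:
            # Both stages agree, so this is not a false acceptance.
            return self.threshold

        # The first stage fired but the second stage did not: false acceptance.
        self._false_accept_times.append(timestamp)

        # Drop false acceptances that fall outside the time window.
        while timestamp - self._false_accept_times[0] > self.window_seconds:
            self._false_accept_times.popleft()

        # Assumption: too many recent false acceptances raises the threshold.
        if len(self._false_accept_times) >= self.max_false_accepts:
            self.threshold = min(1.0, self.threshold + self.step)
            self._false_accept_times.clear()
        return self.threshold
```

Clearing the history after each adjustment is a design choice made here so that one burst of false acceptances triggers a single adjustment rather than several.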
SERVER SIDE HOTWORDING

Publication number: 20210287678

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

Type: Application

Filed: June 2, 2021

Publication date: September 16, 2021

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
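The two-threshold check described in the abstract of publication 20210287678 could be sketched as follows. The function `handle_audio`, its score inputs, and the default thresholds are hypothetical; the detector scores are assumed to be precomputed, and the plain transcript string stands in for the "tagged text data" the abstract mentions.

```python
from typing import Optional


def handle_audio(
    client_score: float,
    server_score: float,
    transcript: str,
    first_threshold: float = 0.5,
    second_threshold: float = 0.9,
) -> Optional[str]:
    """Return text only when both the client and server checks pass."""
    # The server-side threshold must be the more restrictive of the two.
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must exceed first_threshold")

    # Client side: a lenient check decides whether audio is sent at all.
    if client_score < first_threshold:
        return None

    # Server side: a stricter check on the same utterance.
    if server_score < second_threshold:
        return None

    # Stand-in for the tagged text data returned by the server.
    return transcript
```

The lenient first check keeps the on-device work cheap, and the stricter second check filters out the false triggers that the first one lets through.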
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20210249016

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Application

Filed: April 28, 2021

Publication date: August 12, 2021

Applicant: Google LLC

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
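The loudness-based delay in the abstract of publication 20210249016 can be illustrated with a small sketch. The abstract says only that the delay is determined from the loudness score; the inverse linear mapping, the 0 to 1 loudness scale, the `max_delay_ms` parameter, and the helper `first_to_announce` are all assumptions made for illustration.

```python
from typing import Dict


def delay_for_loudness(loudness: float, max_delay_ms: float = 500.0) -> float:
    """Map a loudness score in [0, 1] to a delay in milliseconds.

    Assumption: louder audio yields a shorter delay, so the device that
    heard the utterance best announces itself first.
    """
    clamped = min(max(loudness, 0.0), 1.0)
    return (1.0 - clamped) * max_delay_ms


def first_to_announce(loudness_by_device: Dict[str, float]) -> str:
    """Return the device whose delay elapses first."""
    return min(
        loudness_by_device,
        key=lambda device: delay_for_loudness(loudness_by_device[device]),
    )
```

For example, with loudness scores of 0.4 for a phone, 0.9 for a speaker, and 0.2 for a watch, the speaker has the shortest delay and would transmit its signal first.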
Server side hotwording

Patent number: 11049504

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

Type: Grant

Filed: May 27, 2020

Date of Patent: June 29, 2021

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
Hotword detection on multiple devices

Patent number: 11024313

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Grant

Filed: April 28, 2020

Date of Patent: June 1, 2021

Assignee: Google LLC

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
TRAINING MULTIPLE NEURAL NETWORKS WITH DIFFERENT ACCURACY

Publication number: 20210117797

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.

Type: Application

Filed: December 29, 2020

Publication date: April 22, 2021

Applicant: Google LLC

Inventor: Alexander H. Gruenstein
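The two-network cascade in the abstract of publication 20210117797 might look like the sketch below. The networks are stand-in callables that return a posterior probability vector, and the name `cascade_detect` is hypothetical. Returning a boolean, and stopping at the first subsequent vector that passes, are simplifications rather than details from the abstract.

```python
from typing import Callable, Sequence

# A network is modeled as a callable from a feature vector to posterior scores.
PosteriorFn = Callable[[Sequence[float]], Sequence[float]]


def cascade_detect(
    feature_vectors: Sequence[Sequence[float]],
    small_net: PosteriorFn,
    large_net: PosteriorFn,
    first_threshold: float,
    second_threshold: float,
) -> bool:
    """Run the small network first and the large network only if needed."""
    if not feature_vectors:
        return False

    # Stage 1: the smaller network scores the first feature vector.
    first_posteriors = small_net(feature_vectors[0])
    if max(first_posteriors) < first_threshold:
        return False

    # Stage 2: the larger network scores each subsequent feature vector.
    for vector in feature_vectors[1:]:
        if max(large_net(vector)) >= second_threshold:
            return True
    return False
```

The benefit of this arrangement is that the larger, more expensive network runs only on audio the smaller network has already flagged.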
RECORDED MEDIA HOTWORD TRIGGER SUPPRESSION

Publication number: 20210074292

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Application

Filed: November 20, 2020

Publication date: March 11, 2021

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
MIXED MODEL SPEECH RECOGNITION

Publication number: 20210043212

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.

Type: Application

Filed: October 22, 2020

Publication date: February 11, 2021

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic
Training multiple neural networks with different accuracy

Patent number: 10909456

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.

Type: Grant

Filed: October 21, 2019

Date of Patent: February 2, 2021

Assignee: Google LLC

Inventor: Alexander H. Gruenstein
Recorded media hotword trigger suppression

Patent number: 10867600

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Grant

Filed: October 31, 2017

Date of Patent: December 15, 2020

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
Using two automated speech recognizers for speech recognition

Patent number: 10847160

Abstract: A method includes receiving audio data corresponding to an utterance and generating, by an automated speech recognizer, a personalized transcription associated with a voice action. The personalized transcription includes one or more terms that are not included in a vocabulary of a cloud-based automated speech recognizer. The method also includes transmitting the audio data to the cloud-based automated speech recognizer. The cloud-based automated speech recognizer is configured to generate a mistranscription of the utterance and transmit the mistranscription of the utterance to a mobile computing device or a digital assistant device. When the mistranscription of the utterance includes a term associated with the voice action, the method also includes providing a search results page that includes a control for initiating the voice action and one or more search results that are generated based on the mistranscription of the utterance generated by the cloud-based automated speech recognizer.

Type: Grant

Filed: March 6, 2018

Date of Patent: November 24, 2020

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20200365159

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Application

Filed: May 28, 2020

Publication date: November 19, 2020

Applicant: Google LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster
SERVER SIDE HOTWORDING

Publication number: 20200365158

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

Type: Application

Filed: May 27, 2020

Publication date: November 19, 2020

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
