Patents by Inventor Alexander H. Gruenstein

Alexander H. Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20240153507

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Application

Filed: January 18, 2024

Publication date: May 9, 2024

Applicant: GOOGLE LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster
Hotword suppression

Patent number: 11967323

Abstract: A method includes adding, by a first computing device, a first audio watermark to first speech data corresponding to playback of a first utterance including a hotword used to invoke an attention of a second computing device. The method includes outputting, by the first computing device, the playback of the first utterance corresponding to the watermarked first speech data. The second computing device is configured to receive the watermarked first speech data and determine to cease processing of the watermarked first speech data.

Type: Grant

Filed: June 24, 2022

Date of Patent: April 23, 2024

Assignee: GOOGLE LLC

Inventors: Alexander H. Gruenstein, Taral Pradeep Joglekar, Vijayaditya Peddinti, Michiel A. U. Bacchiani
Hotword detection on multiple devices

Patent number: 11955121

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include based on the loudness score, determining an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.

Type: Grant

Filed: April 28, 2021

Date of Patent: April 9, 2024

Assignee: GOOGLE LLC

Inventors: Jakob Nicolaus Foerster, Alexander H. Gruenstein
Hotword detection on multiple devices

Patent number: 11887603

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed, In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Grant

Filed: March 10, 2022

Date of Patent: January 30, 2024

Assignee: GOOGLE LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster
SERVER SIDE HOTWORDING

Publication number: 20230343340

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

Type: Application

Filed: June 30, 2023

Publication date: October 26, 2023

Applicant: GOOGLE LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
Recorded media hotword trigger suppression

Patent number: 11798557

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Grant

Filed: February 7, 2022

Date of Patent: October 24, 2023

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
MULTI-USER AUTHENTICATION ON A DEVICE

Publication number: 20230335116

Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, and in response to determining that the speaker not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device response to the utterance, or can cause an action to be performed by the user device responsive to the utterance.

Type: Application

Filed: June 16, 2023

Publication date: October 19, 2023

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Device Leadership Negotiation Among Voice Interface Devices

Publication number: 20230274741

Abstract: The various implementations described herein include methods and systems for determining device leadership among voice interface devices. In one aspect, a method is performed at a first electronic device of a plurality of electronic devices, each having microphones, a speaker, processors, and memory storing programs for execution by the processors. The first device detects a voice input. It determines a device state and a relevance of the voice input. It identifies a subset of electronic devices from the plurality to which the voice input is relevant. In accordance with a determination that the subset includes the first device, the first device determines a first score of a criterion associated with the voice input and receives second scores of the criterion from other devices in the subset. In accordance with a determination that the first score is higher than the second scores, the first device responds to the detected input.

Type: Application

Filed: May 4, 2023

Publication date: August 31, 2023

Applicant: Google LLC

Inventors: Kenneth Mixter, Diego Melendo Casado, Alexander H. Gruenstein, Terry Tai, Christopher Thaddeus Hughes, Matthew Nirvan Sharifi
Multi-user authentication on a device

Patent number: 11721326

Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, and in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device response to the utterance, or can cause an action to be performed by the user device responsive to the utterance.

Type: Grant

Filed: January 26, 2022

Date of Patent: August 8, 2023

Assignee: GOOGLE LLC

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Server side hotwording

Patent number: 11699443

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

Type: Grant

Filed: June 2, 2021

Date of Patent: July 11, 2023

Assignee: GOOGLE LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar
Automatic Hotword Threshold Tuning

Publication number: 20230206908

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Application

Filed: March 10, 2023

Publication date: June 29, 2023

Applicant: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
Cascade Architecture for Noise-Robust Keyword Spotting

Publication number: 20230097197

Abstract: A method (400) includes receiving, at a first processor (110) of a user device (102), streaming multi-channel audio (118) captured by an array of microphones (107), each channel (119) including respective audio features. For each channel, the method also includes processing, by the first processor, using a first stage hotword detector (210), the respective audio features to determine whether a hotword is detected. When the first stage hotword detector detects the hotword, the method also includes the first processor providing chomped raw audio data (212) to a second processor that processes, using a first noise cleaning algorithm (250), the chomped raw audio data to generate a clean monophonic audio chomp (260). The method also includes processing, by the second processor using a second stage hotword detector (220), the clean monophonic audio chomp to detect the hotword.

Type: Application

Filed: April 8, 2020

Publication date: March 30, 2023

Applicant: Google LLC

Inventors: Yiteng Huang, Alexander H. Gruenstein
Automatic hotword threshold tuning

Patent number: 11610578

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Grant

Filed: June 10, 2020

Date of Patent: March 21, 2023

Assignee: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. Mcgraw
Training multiple neural networks with different accuracy

Patent number: 11556793

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.

Type: Grant

Filed: December 29, 2020

Date of Patent: January 17, 2023

Assignee: Google LLC

Inventor: Alexander H. Gruenstein
HOTWORD SUPPRESSION

Publication number: 20220319519

Abstract: A method includes adding, by a first computing device, a first audio watermark to first speech data corresponding to playback of a first utterance including a hotword used to invoke an attention of a second computing device. The method includes outputting, by the first computing device, the playback of the first utterance corresponding to the watermarked first speech data. The second computing device is configured to receive the watermarked first speech data and determine to cease processing of the watermarked first speech data.

Type: Application

Filed: June 24, 2022

Publication date: October 6, 2022

Applicant: GOOGLE LLC

Inventors: Alexander H. GRUENSTEIN, Taral Pradeep JOGLEKAR, Vijayaditya PEDDINTI, Michiel A.U. BACCHIANI
DETECTING AND SUPPRESSING VOICE QUERIES

Publication number: 20220284899

Abstract: A computing system receives requests from client devices to process voice queries that have been detected in local environments of the client devices. The system identifies that a value that is based on a number of requests to process voice queries received by the system during a specified time interval satisfies one or more criteria. In response, the system triggers analysis of at least some of the requests received during the specified time interval to trigger analysis of at least some received requests to determine a set of requests that each identify a common voice query. The system can generate an electronic fingerprint that indicates a distinctive model of the common voice query. The fingerprint can then be used to detect an illegitimate voice query identified in a request from a client device at a later time.

Type: Application

Filed: May 20, 2022

Publication date: September 8, 2022

Applicant: GOOGLE LLC

Inventors: Alexander H. GRUENSTEIN, Aleksandar KRACUN, Matthew SHARIFI
Predicting and learning carrier phrases for speech input

Patent number: 11423888

Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.

Type: Grant

Filed: April 4, 2019

Date of Patent: August 23, 2022

Assignee: Google LLC

Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas H. Beeferman
MIXED MODEL SPEECH RECOGNITION

Publication number: 20220262365

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.

Type: Application

Filed: May 3, 2022

Publication date: August 18, 2022

Applicant: Google LLC

Inventors: Alexander H. Gruenstein, Petar Aleksic
Hotword suppression

Patent number: 11373652

Abstract: A method includes obtaining, by data processing hardware, a plurality of non-watermarked speech samples. Each non-watermarked speech does not include an audio watermark sample. The method includes, from each non-watermarked speech sample of the plurality of non-watermarked speech samples, generating one or more corresponding watermarked speech samples that each include at least one audio watermark. The method includes training, using the plurality of non-watermarked speech samples and corresponding watermarked speech samples, a model to determine whether a given audio data sample includes an audio watermark, and after training the model, transmitting the trained model to a user computing device.

Type: Grant

Filed: May 14, 2020

Date of Patent: June 28, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Taral Pradeep Joglekar, Vijayaditya Peddinti, Michiel A. u. Bacchiani
HOTWORD DETECTION ON MULTIPLE DEVICES

Publication number: 20220199090

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed, In one aspect, a method includes the actions of receiving audio data that corresponds to an utterance. The actions further include determining that the utterance likely includes a particular, predefined hotword. The actions further include transmitting (i) data indicating that the computing device likely received the particular, predefined hotword, (ii) data identifying the computing device, and (iii) data identifying a group of nearby computing devices that includes the computing device. The actions further include receiving an instruction to commence speech recognition processing on the audio data. The actions further include in response to receiving the instruction to commence speech recognition processing on the audio data, processing at least a portion of the audio data using an automated speech recognizer on the computing device.

Type: Application

Filed: March 10, 2022

Publication date: June 23, 2022

Applicant: GOOGLE LLC

Inventors: Diego Melendo Casado, Alexander H. Gruenstein, Jakob Nicolaus Foerster

1 2 3 4 5 … next