Patents by Inventor Ian C. McGraw

Ian C. McGraw has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automatic hotword threshold tuning

Patent number: 12277928

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Grant

Filed: March 10, 2023

Date of Patent: April 15, 2025

Assignee: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20240363122

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: July 5, 2024

Publication date: October 31, 2024

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Voice shortcut detection with speaker verification

Patent number: 12033641

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Grant

Filed: January 30, 2023

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
KEY PHRASE SPOTTING

Publication number: 20240221750

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

Type: Application

Filed: March 19, 2024

Publication date: July 4, 2024

Applicant: Google LLC

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Compressed recurrent neural network models

Patent number: 11948062

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.

Type: Grant

Filed: December 4, 2020

Date of Patent: April 2, 2024

Assignee: Google LLC

Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
Key phrase spotting

Patent number: 11948570

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

Type: Grant

Filed: March 9, 2022

Date of Patent: April 2, 2024

Assignee: Google LLC

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. Mcgraw, Anton Bakhtin
Automatic Hotword Threshold Tuning

Publication number: 20230206908

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Application

Filed: March 10, 2023

Publication date: June 29, 2023

Applicant: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20230169984

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: January 30, 2023

Publication date: June 1, 2023

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Automatic hotword threshold tuning

Patent number: 11610578

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Grant

Filed: June 10, 2020

Date of Patent: March 21, 2023

Assignee: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. Mcgraw
Voice shortcut detection with speaker verification

Patent number: 11568878

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Grant

Filed: April 16, 2021

Date of Patent: January 31, 2023

Assignee: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20220335953

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: April 16, 2021

Publication date: October 20, 2022

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
TWO-PASS END TO END SPEECH RECOGNITION

Publication number: 20220310072

Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transformer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.

Type: Application

Filed: June 3, 2020

Publication date: September 29, 2022

Inventors: Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian C. McGraw, Chung-Cheng Chiu
KEY PHRASE SPOTTING

Publication number: 20220199084

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers, generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

Type: Application

Filed: March 9, 2022

Publication date: June 23, 2022

Applicant: Google LLC

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Key phrase spotting

Patent number: 11295739

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

Type: Grant

Filed: July 31, 2019

Date of Patent: April 5, 2022

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Automatic Hotword Threshold Tuning

Publication number: 20210390948

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.

Type: Application

Filed: June 10, 2020

Publication date: December 16, 2021

Applicant: Google LLC

Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. Mcgraw
COMPRESSED RECURRENT NEURAL NETWORK MODELS

Publication number: 20210089916

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.

Type: Application

Filed: December 4, 2020

Publication date: March 25, 2021

Applicant: Google LLC

Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
Compressed recurrent neural network models

Patent number: 10878319

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.

Type: Grant

Filed: December 29, 2016

Date of Patent: December 29, 2020

Assignee: Google LLC

Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
KEY PHRASE SPOTTING

Publication number: 20200066271

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

Type: Application

Filed: July 31, 2019

Publication date: February 27, 2020

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
Selecting alternates in speech recognition

Patent number: 10140978

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.

Type: Grant

Filed: September 13, 2017

Date of Patent: November 27, 2018

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
SELECTING ALTERNATES IN SPEECH RECOGNITION

Publication number: 20180012592

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.

Type: Application

Filed: September 13, 2017

Publication date: January 11, 2018

Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw

1 2 next