Patents by Inventor Ian C. McGraw

Ian C. McGraw has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11948062
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: April 2, 2024
    Assignee: Google LLC
    Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
  • Patent number: 11948570
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
    Type: Grant
    Filed: March 9, 2022
    Date of Patent: April 2, 2024
    Assignee: Google LLC
    Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
  • Publication number: 20230206908
    Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.
    Type: Application
    Filed: March 10, 2023
    Publication date: June 29, 2023
    Applicant: Google LLC
    Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
  • Publication number: 20230169984
    Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.
    Type: Application
    Filed: January 30, 2023
    Publication date: June 1, 2023
    Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
  • Patent number: 11610578
    Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: March 21, 2023
    Assignee: Google LLC
    Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
  • Patent number: 11568878
    Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: January 31, 2023
    Assignee: Google LLC
    Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
  • Publication number: 20220335953
    Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.
    Type: Application
    Filed: April 16, 2021
    Publication date: October 20, 2022
    Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
  • Publication number: 20220310072
    Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
    Type: Application
    Filed: June 3, 2020
    Publication date: September 29, 2022
    Inventors: Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian C. McGraw, Chung-Cheng Chiu
  • Publication number: 20220199084
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
    Type: Application
    Filed: March 9, 2022
    Publication date: June 23, 2022
    Applicant: Google LLC
    Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
  • Patent number: 11295739
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: April 5, 2022
    Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
  • Publication number: 20210390948
    Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.
    Type: Application
    Filed: June 10, 2020
    Publication date: December 16, 2021
    Applicant: Google LLC
    Inventors: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw
  • Publication number: 20210089916
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.
    Type: Application
    Filed: December 4, 2020
    Publication date: March 25, 2021
    Applicant: Google LLC
    Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
  • Patent number: 10878319
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.
    Type: Grant
    Filed: December 29, 2016
    Date of Patent: December 29, 2020
    Assignee: Google LLC
    Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
  • Publication number: 20200066271
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
    Type: Application
    Filed: July 31, 2019
    Publication date: February 27, 2020
    Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
  • Patent number: 10140978
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Grant
    Filed: September 13, 2017
    Date of Patent: November 27, 2018
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
  • Publication number: 20180012592
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Application
    Filed: September 13, 2017
    Publication date: January 11, 2018
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
  • Patent number: 9779724
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Grant
    Filed: November 4, 2014
    Date of Patent: October 3, 2017
    Assignee: Google Inc.
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
  • Publication number: 20170220925
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.
    Type: Application
    Filed: December 29, 2016
    Publication date: August 3, 2017
    Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
  • Publication number: 20150127346
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more features scores are determined, the features scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
    Type: Application
    Filed: November 4, 2014
    Publication date: May 7, 2015
    Inventors: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw
  • Patent number: 8909512
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting the stability of speech recognition results. In one aspect, a method includes determining a length of time, or a number of occasions, in which a word has remained in an incremental speech recognizer's top hypothesis, and assigning a stability metric to the word based on the length of time or number of occasions.
    Type: Grant
    Filed: May 1, 2012
    Date of Patent: December 9, 2014
    Assignee: Google Inc.
    Inventors: Ian C. McGraw, Alexander H. Gruenstein
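
The abstract of patent 8909512 above describes assigning a stability metric to a word based on the number of occasions on which it has remained in an incremental speech recognizer's top hypothesis. The following is a minimal sketch of that idea, not the patented method: it assumes each partial result is a list of words, that words are matched by position, and that the metric is simply the raw count. The function name `stability_counts` and the example data are hypothetical.

```python
from typing import List


def stability_counts(partials: List[List[str]]) -> List[int]:
    """For each word in the latest top hypothesis, count how many consecutive
    partial results (ending with the latest one) held the same word at the
    same position. Position-based matching is an assumption of this sketch."""
    if not partials:
        return []
    latest = partials[-1]
    counts = []
    for i, word in enumerate(latest):
        n = 0
        # Walk backwards through earlier top hypotheses until the word changes.
        for hyp in reversed(partials):
            if i < len(hyp) and hyp[i] == word:
                n += 1
            else:
                break
        counts.append(n)
    return counts


# Hypothetical sequence of top hypotheses emitted while audio streams in.
partials = [
    ["I"],
    ["I", "won't"],
    ["I", "want", "to"],
    ["I", "want", "two"],
]
print(stability_counts(partials))
```

In this sketch, a word that has survived more partial results gets a higher count, so a caller could, for example, display only words whose count exceeds some chosen cutoff.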