Patents by Inventor Alexander Gruenstein

Alexander Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11854533
    Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: December 26, 2023
    Assignee: GOOGLE LLC
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
  • Publication number: 20220157298
    Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
    Type: Application
    Filed: January 28, 2022
    Publication date: May 19, 2022
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
  • Patent number: 11238847
    Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: February 1, 2022
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
  • Publication number: 20210312907
    Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
    Type: Application
    Filed: December 4, 2019
    Publication date: October 7, 2021
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
  • Publication number: 20150279354
    Abstract: An apparatus to personalize voice recognition on a client device includes a microphone, an embedded speech recognizer, a tag comparator, a client query manager, a user interface and a tag generator. An embedded speech recognizer receives an audio input from a user and generates recognition candidates, selecting one recognition candidate from the generated candidates. A tag comparator compares the audio stream with a first stored audio tag. The client query manager receives the selected recognition candidate and if the tag comparator matches the audio stream with the first audio tag then the client query manager executes an associated query. If no tag match is found, then the client query manager executes a query using the selected recognition candidate. After an indication from the user of a selected result, a tag generator stores a second audio tag in the storage based on the selected recognition candidate and the selected result.
    Type: Application
    Filed: September 30, 2011
    Publication date: October 1, 2015
    Applicant: Google Inc.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Patent number: 8868428
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Grant
    Filed: August 14, 2012
    Date of Patent: October 21, 2014
    Assignee: Google Inc.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Patent number: 8412532
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Grant
    Filed: November 2, 2011
    Date of Patent: April 2, 2013
    Assignee: Google Inc.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Publication number: 20120310645
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: August 14, 2012
    Publication date: December 6, 2012
    Applicant: GOOGLE INC.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Publication number: 20120084079
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: November 2, 2011
    Publication date: April 5, 2012
    Applicant: Google Inc.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Publication number: 20110184740
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: June 7, 2010
    Publication date: July 28, 2011
    Applicant: Google Inc.
    Inventors: Alexander GRUENSTEIN, William J. Byrne
  • Patent number: 7716056
    Abstract: A system and method to interactively converse with a cognitively overloaded user of a device, includes maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper names to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: May 11, 2010
    Assignees: Robert Bosch Corporation, Volkswagen of America
    Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters
  • Publication number: 20060074670
    Abstract: A system and method to interactively converse with a cognitively overloaded user of a device, includes maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper names to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.
    Type: Application
    Filed: September 27, 2004
    Publication date: April 6, 2006
    Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters