Patents by Inventor Alexander Gruenstein

Alexander Gruenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speaker awareness using speaker dependent speech model(s)

Patent number: 11854533

Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Type: Grant

Filed: January 28, 2022

Date of Patent: December 26, 2023

Assignee: Google LLC

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
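
The personalization idea in the abstract above can be illustrated with a toy sketch: one shared model is conditioned on whichever speaker embedding is passed in with the audio, so switching users requires no retraining. This is a minimal illustration under stated assumptions, not the patented implementation. `ToySDModel`, the hand-picked weights, the concatenation of audio features with the embedding, and the cross-entropy form of the teacher-student loss are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToySDModel:
    """Hypothetical stand-in for a speaker dependent (SD) model.

    The same weights serve every user; only the embedding input changes.
    """

    def __init__(self, weights):
        self.weights = weights  # one row of weights per output class

    def predict(self, audio_frame, speaker_embedding):
        # Assumption: conditioning is done by simple concatenation.
        x = list(audio_frame) + list(speaker_embedding)
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        return softmax(logits)


def distillation_loss(teacher_probs, student_probs):
    """Cross-entropy of the student output against the teacher output.

    Assumption: a generic teacher-student objective, where the teacher
    would be a speaker independent model and the student the SD model.
    """
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


model = ToySDModel(weights=[[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
frame = [0.5, 0.5]  # the same audio features for both users

# Same model, same audio, different target user embeddings.
probs_user_a = model.predict(frame, speaker_embedding=[1.0, 0.0])
probs_user_b = model.predict(frame, speaker_embedding=[0.0, 1.0])
```

With these toy weights, the output distribution shifts toward a different class for each embedding even though the audio input and the model parameters are unchanged.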
Speaker awareness using speaker dependent speech model(s)

Publication number: 20220157298

Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Type: Application

Filed: January 28, 2022

Publication date: May 19, 2022

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan

Speaker awareness using speaker dependent speech model(s)

Patent number: 11238847

Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Type: Grant

Filed: December 4, 2019

Date of Patent: February 1, 2022

Assignee: Google LLC

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan

Speaker awareness using speaker dependent speech model(s)

Publication number: 20210312907

Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Type: Application

Filed: December 4, 2019

Publication date: October 7, 2021

Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan

Personalization and Latency Reduction for Voice-Activated Commands

Publication number: 20150279354

Abstract: An apparatus to personalize voice recognition on a client device includes a microphone, an embedded speech recognizer, a tag comparator, a client query manager, a user interface and a tag generator. An embedded speech recognizer receives an audio input from a user and generates recognition candidates, selecting one recognition candidate from the generated candidates. A tag comparator compares the audio stream with a first stored audio tag. The client query manager receives the selected recognition candidate and if the tag comparator matches the audio stream with the first audio tag then the client query manager executes an associated query. If no tag match is found, then the client query manager executes a query using the selected recognition candidate. After an indication from the user of a selected result, a tag generator stores a second audio tag in the storage based on the selected recognition candidate and the selected result.

Type: Application

Filed: September 30, 2011

Publication date: October 1, 2015

Applicant: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne
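
The control flow in the abstract above (check stored audio tags first, fall back to the recognition candidate, then store a new tag once the user picks a result) can be sketched as follows. This is an illustrative sketch only. Representing an audio tag as a short feature vector, comparing tags by cosine similarity, and the 0.95 threshold are assumptions made for this example, and `ClientQueryManager.resolve` and `store_tag` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ClientQueryManager:
    """Hypothetical manager combining the tag comparator and tag generator."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed match threshold
        self.tags = []  # stored (audio fingerprint, associated query) pairs

    def resolve(self, fingerprint, candidate):
        """Return (query to execute, whether a stored tag matched)."""
        for tag, query in self.tags:
            if cosine(fingerprint, tag) >= self.threshold:
                return query, True
        # No tag matched: fall back to the recognizer's selected candidate.
        return candidate, False

    def store_tag(self, fingerprint, selected_result):
        """Remember the audio so a similar utterance maps to this result."""
        self.tags.append((list(fingerprint), selected_result))


manager = ClientQueryManager()

# First utterance: nothing is stored yet, so the candidate is used.
first = manager.resolve([0.9, 0.1, 0.3], "call mom")

# The user picks a result, so a tag is generated for next time.
manager.store_tag([0.9, 0.1, 0.3], "call Mom mobile")

# A similar utterance now matches the tag, even if it was misrecognized.
second = manager.resolve([0.88, 0.12, 0.31], "call tom")
```

In this sketch the second lookup bypasses the misrecognized candidate entirely, which is the latency and personalization benefit the title refers to.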
Integration of embedded and network speech recognizers

Patent number: 8868428

Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.

Type: Grant

Filed: August 14, 2012

Date of Patent: October 21, 2014

Assignee: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne
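
The method in the abstract above pairs an on-device recognizer that queries a client database with a server-side recognizer that returns its own query result, and shows both. A minimal sketch of that flow follows. The recognizers here are stand-ins that simply decode text from bytes, `CLIENT_DB` and every function name are hypothetical, and running the two paths concurrently is an assumption of this example, not something the abstract states.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical on-device data, e.g. contacts.
CLIENT_DB = {"call alice": "Contact: Alice"}


def embedded_recognize(audio):
    """Stand-in for the first (on-device) speech recognizer."""
    return audio.decode("utf-8").lower().strip()


def query_client_db(command):
    """Look up the machine-readable command in the client database."""
    return CLIENT_DB.get(command)


def remote_recognize_and_query(audio):
    """Stand-in for the server: second recognizer plus its own query."""
    command = audio.decode("utf-8").lower().strip()
    return f"Web results for '{command}'"


def handle_voice_command(audio):
    """Collect the local and remote query results for display."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        remote = pool.submit(remote_recognize_and_query, audio)
        local = pool.submit(lambda: query_client_db(embedded_recognize(audio)))
        # Local result first, then the server result; skip empty lookups.
        return [r for r in (local.result(), remote.result()) if r is not None]


results = handle_voice_command(b"Call Alice")
```

A real client would likely show the local result as soon as it is ready instead of waiting for the server; the sketch waits for both only to keep the example short.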
Integration of embedded and network speech recognizers

Patent number: 8412532

Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.

Type: Grant

Filed: November 2, 2011

Date of Patent: April 2, 2013

Assignee: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne

Integration of embedded and network speech recognizers

Publication number: 20120310645

Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.

Type: Application

Filed: August 14, 2012

Publication date: December 6, 2012

Applicant: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne

Integration of Embedded and Network Speech Recognizers

Publication number: 20120084079

Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.

Type: Application

Filed: November 2, 2011

Publication date: April 5, 2012

Applicant: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne

Integration of Embedded and Network Speech Recognizers

Publication number: 20110184740

Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.

Type: Application

Filed: June 7, 2010

Publication date: July 28, 2011

Applicant: Google Inc.

Inventors: Alexander Gruenstein, William J. Byrne

Method and system for interactive conversational dialogue for cognitively overloaded device users

Patent number: 7716056

Abstract: A system and method to interactively converse with a cognitively overloaded user of a device, includes maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper names to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.

Type: Grant

Filed: September 27, 2004

Date of Patent: May 11, 2010

Assignees: Robert Bosch Corporation, Volkswagen of America

Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters

Method and system for interactive conversational dialogue for cognitively overloaded device users

Publication number: 20060074670

Abstract: A system and method to interactively converse with a cognitively overloaded user of a device, includes maintaining a knowledge base of information regarding the device and a domain, organizing the information in at least one of a relational manner and an ontological manner, receiving speech from the user, converting the speech into a word sequence, recognizing a partial proper name in the word sequence, identifying meaning structures from the word sequence using a model of the domain information, adjusting a boundary of the partial proper names to enhance an accuracy of the meaning structures, interpreting the meaning structures in a context of the conversation with the cognitively overloaded user using the knowledge base, selecting a content for a response to the cognitively overloaded user, generating the response based on the selected content, the context of the conversation, and grammatical rules, and synthesizing speech wave forms for the response.

Type: Application

Filed: September 27, 2004

Publication date: April 6, 2006

Inventors: Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Laura Hiatt, Hauke Schmidt, Alexander Gruenstein, Stanley Peters
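
One step in the dialogue abstract above, recognizing a partial proper name and adjusting its boundary against a knowledge base, can be shown with a small text-only sketch. Speech recognition and speech synthesis are left out, so the input is already a word sequence and the response is a string. The song-title knowledge base, the rule of growing the span while it still matches part of a known name, and all function names are assumptions made for this example, not details taken from the patent.

```python
# Hypothetical domain knowledge: names the device knows about.
KNOWLEDGE_BASE = {"songs": ["Yesterday Once More", "Hey Jude"]}


def contains(seq, sub):
    """True if `sub` occurs as a contiguous run inside `seq`."""
    n = len(sub)
    return any(seq[i:i + n] == sub for i in range(len(seq) - n + 1))


def find_partial_name(words, names):
    """Return (start, end, full_name) for the longest matching word span."""
    best = None
    for name in names:
        name_words = name.lower().split()
        for start in range(len(words)):
            end = start
            # Adjust the boundary: grow the span while it still matches.
            while end < len(words) and contains(name_words, words[start:end + 1]):
                end += 1
            if end > start and (best is None or end - start > best[1] - best[0]):
                best = (start, end, name)
    return best


def respond(utterance):
    """Interpret the word sequence and generate a text response."""
    words = utterance.lower().split()
    match = find_partial_name(words, KNOWLEDGE_BASE["songs"])
    if match is None:
        return "Sorry, I could not find that song."
    start, _end, full_name = match
    # Assumption: the word before the name span is the requested action.
    action = words[0] if start > 0 else "play"
    return f"OK, I will {action} {full_name}."


reply = respond("play yesterday once")
```

Here the user gives only part of the title, and the span is extended from "yesterday" to "yesterday once" before being resolved to the full name in the knowledge base.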