Patents by Inventor Pu-sen Chao

Pu-sen Chao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Assessing Speaker Recognition Performance

Publication number: 20240079013

Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.

Type: Application

Filed: November 9, 2023

Publication date: March 7, 2024

Applicant: Google LLC

Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE

Publication number: 20240054997

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Application

Filed: October 23, 2023

Publication date: February 15, 2024

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
NON-WAKE WORD INVOCATION OF AN AUTOMATED ASSISTANT FROM CERTAIN UTTERANCES RELATED TO DISPLAY CONTENT

Publication number: 20240038246

Abstract: Implementations relate to an automated assistant that is responsive, without requiring an invocation phrase or other invocation input(s), to certain spoken utterances when certain display content is being accessed by a user. The display content can be processed to identify certain inputs and/or other intents and parameters that are associated with assistant operations and are relevant to the display content. Thereafter, the automated assistant can determine whether any spoken utterances from the user correspond to those certain inputs, intents, and/or parameters. In response to receiving such a spoken utterance, the automated assistant can initialize performance of the relevant operation without necessitating that the user provides a preceding invocation phrase or other invocation input(s). When other display content is being accessed, the automated assistant can repeat the process for other inputs and operations.

Type: Application

Filed: July 28, 2022

Publication date: February 1, 2024

Inventors: Pu-sen Chao, Alex Fandrianto, Muhammad Umair
HOT-WORD FREE PRE-EMPTION OF AUTOMATED ASSISTANT RESPONSE PRESENTATION

Publication number: 20230395066

Abstract: The presentation of an automated assistant response may be selectively pre-empted in response to a hot-word free utterance that is received during the presentation and that is determined to be likely directed to the automated assistant. The determination that the utterance is likely directed to the automated assistant may be performed, for example, using an utterance classification operation that is performed on audio data received during presentation of the response, and based upon such a determination, the response may be pre-empted with another response associated with the later-received utterance. In addition, the duration that is used to determine when a session should be terminated at the conclusion of a conversation between a user and an automated assistant may be dynamically controlled based upon when the presentation of a response has completed.

Type: Application

Filed: August 18, 2023

Publication date: December 7, 2023

Inventors: Pu-sen Chao, Alex Fandrianto
Assessing speaker recognition performance

Patent number: 11837238

Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.

Type: Grant

Filed: October 21, 2020

Date of Patent: December 5, 2023

Assignee: Google LLC

Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE

Publication number: 20230368784

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Application

Filed: July 28, 2023

Publication date: November 16, 2023

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, William Zhang
Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface

Patent number: 11817085

Abstract: Implementations relate to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.

Type: Grant

Filed: December 14, 2020

Date of Patent: November 14, 2023

Assignee: GOOGLE LLC

Inventors: Pu-Sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
Adaptive interface in a voice-based networked system

Patent number: 11817084

Abstract: The present disclosure relates generally to determining a language for speech recognition of a spoken utterance, received via an automated assistant interface, for interacting with an automated assistant. The system can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Selection of a speech recognition model for a particular language can based on one or more interaction characteristics exhibited during a dialog session between a user and an automated assistant. Such interaction characteristics can include anticipated user input types, anticipated user input durations, a duration for monitoring for a user response, and/or an actual duration of a provided user response.

Type: Grant

Filed: May 21, 2020

Date of Patent: November 14, 2023

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface

Patent number: 11798541

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Grant

Filed: November 16, 2020

Date of Patent: October 24, 2023

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
MULTI-USER AUTHENTICATION ON A DEVICE

Publication number: 20230335116

Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, and in response to determining that the speaker not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device response to the utterance, or can cause an action to be performed by the user device responsive to the utterance.

Type: Application

Filed: June 16, 2023

Publication date: October 19, 2023

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Hot-word free pre-emption of automated assistant response presentation

Patent number: 11756533

Abstract: The presentation of an automated assistant response may be selectively pre-empted in response to a hot-word free utterance that is received during the presentation and that is determined to be likely directed to the automated assistant. The determination that the utterance is likely directed to the automated assistant may be performed, for example, using an utterance classification operation that is performed on audio data received during presentation of the response, and based upon such a determination, the response may be pre-empted with another response associated with the later-received utterance. In addition, the duration that is used to determine when a session should be terminated at the conclusion of a conversation between a user and an automated assistant may be dynamically controlled based upon when the presentation of a response has completed.

Type: Grant

Filed: May 15, 2020

Date of Patent: September 12, 2023

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Alex Fandrianto
Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface

Patent number: 11735173

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Grant

Filed: May 24, 2021

Date of Patent: August 22, 2023

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, William Zhang
Multi-user authentication on a device

Patent number: 11721326

Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, and in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device response to the utterance, or can cause an action to be performed by the user device responsive to the utterance.

Type: Grant

Filed: January 26, 2022

Date of Patent: August 8, 2023

Assignee: GOOGLE LLC

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
TEXT INDEPENDENT SPEAKER RECOGNITION

Publication number: 20230113617

Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.

Type: Application

Filed: December 9, 2022

Publication date: April 13, 2023

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
Text independent speaker recognition

Patent number: 11527235

Abstract: Text independent speaker recognition models can be utilized by an automated assistant to verify a particular user spoke a spoken utterance and/or to identify the user who spoke a spoken utterance. Implementations can include automatically updating a speaker embedding for a particular user based on previous utterances by the particular user. Additionally or alternatively, implementations can include verifying a particular user spoke a spoken utterance using output generated by both a text independent speaker recognition model as well as a text dependent speaker recognition model. Furthermore, implementations can additionally or alternatively include prefetching content for several users associated with a spoken utterance prior to determining which user spoke the spoken utterance.

Type: Grant

Filed: December 2, 2019

Date of Patent: December 13, 2022

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, Quan Wang
MULTI-USER AUTHENTICATION ON A DEVICE

Publication number: 20220148577

Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.

Type: Application

Filed: January 26, 2022

Publication date: May 12, 2022

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
Assessing Speaker Recognition Performance

Publication number: 20220122614

Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.

Type: Application

Filed: October 21, 2020

Publication date: April 21, 2022

Applicant: Google LLC

Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
Multi-user authentication on a device

Patent number: 11238848

Abstract: In some implementations, authentication tokens corresponding to known users of a device are stored on the device. An utterance from a speaker is received. The speaker of the utterance is classified as not a known user of the device. A query that includes the authentication tokens that correspond to known users of the device, a representation of the utterance, and an indication that the speaker was classified as not a known user of the device is provided to the server. A response to the query is received at the device and from the server based on the query.

Type: Grant

Filed: December 10, 2019

Date of Patent: February 1, 2022

Assignee: Google LLC

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
HOT-WORD FREE PRE-EMPTION OF AUTOMATED ASSISTANT RESPONSE PRESENTATION

Publication number: 20210358483

Abstract: The presentation of an automated assistant response may be selectively pre-empted in response to a hot-word free utterance that is received during the presentation and that is determined to be likely directed to the automated assistant. The determination that the utterance is likely directed to the automated assistant may be performed, for example, using an utterance classification operation that is performed on audio data received during presentation of the response, and based upon such a determination, the response may be pre-empted with another response associated with the later-received utterance. In addition, the duration that is used to determine when a session should be terminated at the conclusion of a conversation between a user and an automated assistant may be dynamically controlled based upon when the presentation of a response has completed.

Type: Application

Filed: May 15, 2020

Publication date: November 18, 2021

Inventors: Pu-sen Chao, Alex Fandrianto
AUTOMATICALLY DETERMINING LANGUAGE FOR SPEECH RECOGNITION OF SPOKEN UTTERANCE RECEIVED VIA AN AUTOMATED ASSISTANT INTERFACE

Publication number: 20210280177

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without necessitating a user explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Application

Filed: May 24, 2021

Publication date: September 9, 2021

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno, William Zhang

1 2 next