Patents by Inventor Rajeev Rikhye

Rajeev Rikhye has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20240363122

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: July 5, 2024

Publication date: October 31, 2024

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Voice shortcut detection with speaker verification

Patent number: 12033641

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Grant

Filed: January 30, 2023

Date of Patent: July 9, 2024

Assignee: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Optimizing Personal VAD for On-Device Speech Recognition

Publication number: 20230298591

Abstract: A computer-implemented method includes receiving a sequence of acoustic frames corresponding to an utterance and generating a reference speaker embedding for the utterance. The method also includes receiving a target speaker embedding for a target speaker and generating feature-wise linear modulation (FiLM) parameters including a scaling vector and a shifting vector based on the target speaker embedding. The method also includes generating an affine transformation output that scales and shifts the reference speaker embedding based on the FiLM parameters. The method also includes generating a classification output indicating whether the utterance was spoken by the target speaker based on the affine transformation output.

Type: Application

Filed: March 17, 2023

Publication date: September 21, 2023

Applicant: Google LLC

Inventors: Shaojin Ding, Rajeev Rikhye, Qiao Liang, Yanzhang He, Quan Wang, Arun Narayanan, Tom O'Malley, Ian McGraw
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20230169984

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: January 30, 2023

Publication date: June 1, 2023

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Voice shortcut detection with speaker verification

Patent number: 11568878

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Grant

Filed: April 16, 2021

Date of Patent: January 31, 2023

Assignee: GOOGLE LLC

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Publication number: 20220335953

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.

Type: Application

Filed: April 16, 2021

Publication date: October 20, 2022

Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw

VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Voice shortcut detection with speaker verification

Optimizing Personal VAD for On-Device Speech Recognition

VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Voice shortcut detection with speaker verification

VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION