Patents by Inventor Yash Sheth

Yash Sheth has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speaker diarization using an end-to-end model

Patent number: 11545157

Abstract: Techniques are described for training and/or utilizing an end-to-end speaker diarization model. In various implementations, the model is a recurrent neural network (RNN) model, such as an RNN model that includes at least one memory layer, such as a long short-term memory (LSTM) layer. Audio features of audio data can be applied as input to an end-to-end speaker diarization model trained according to implementations disclosed herein, and the model utilized to process the audio features to generate, as direct output over the model, speaker diarization results. Further, the end-to-end speaker diarization model can be a sequence-to-sequence model, where the sequence can have variable length. Accordingly, the model can be utilized to generate speaker diarization results for any of various length audio segments.

Type: Grant

Filed: April 15, 2019

Date of Patent: January 3, 2023

Assignee: GOOGLE LLC

Inventors: Quan Wang, Yash Sheth, Ignacio Lopez Moreno, Li Wan

SPEAKER DIARIZATION USING AN END-TO-END MODEL

Publication number: 20200152207

Abstract: Techniques are described for training and/or utilizing an end-to-end speaker diarization model. In various implementations, the model is a recurrent neural network (RNN) model, such as an RNN model that includes at least one memory layer, such as a long short-term memory (LSTM) layer. Audio features of audio data can be applied as input to an end-to-end speaker diarization model trained according to implementations disclosed herein, and the model utilized to process the audio features to generate, as direct output over the model, speaker diarization results. Further, the end-to-end speaker diarization model can be a sequence-to-sequence model, where the sequence can have variable length. Accordingly, the model can be utilized to generate speaker diarization results for any of various length audio segments.

Type: Application

Filed: April 15, 2019

Publication date: May 14, 2020

Inventors: Quan Wang, Yash Sheth, Ignacio Lopez Moreno, Li Wan
