Patents by Inventor David Grangier

David Grangier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

END-TO-END SPEECH DIARIZATION VIA ITERATIVE SPEAKER EMBEDDING

Publication number: 20240144957

Abstract: A method includes receiving an input audio signal corresponding to utterances spoken by multiple speakers. The method also includes encoding the input audio signal into a sequence of T temporal embeddings. During each of a plurality of iterations each corresponding to a respective speaker of the multiple speakers, the method includes selecting a respective speaker embedding for the respective speaker by determining a probability that the corresponding temporal embedding includes a presence of voice activity by a single new speaker for which a speaker embedding was not previously selected during a previous iteration and selecting the respective speaker embedding for the respective speaker as the temporal embedding. The method also includes, at each time step, predicting a respective voice activity indicator for each respective speaker of the multiple speakers based on the respective speaker embeddings selected during the plurality of iterations and the temporal embedding.

Type: Application

Filed: December 19, 2023

Publication date: May 2, 2024

Applicant: Google LLC

Inventors: David Grangier, Neil Zeghidour, Olivier Teboul

GENERATING AUDIO USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Publication number: 20240078412

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal; obtaining a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.

Type: Application

Filed: September 7, 2023

Publication date: March 7, 2024

Inventors: Neil Zeghidour, David Grangier, Marco Tagliasacchi, Raphaël Marinier, Olivier Teboul, Zalán Borsos

End-to-end speech diarization via iterative speaker embedding

Patent number: 11887623

Abstract: A method includes receiving an input audio signal corresponding to utterances spoken by multiple speakers. The method also includes encoding the input audio signal into a sequence of T temporal embeddings. During each of a plurality of iterations each corresponding to a respective speaker of the multiple speakers, the method includes selecting a respective speaker embedding for the respective speaker by determining a probability that the corresponding temporal embedding includes a presence of voice activity by a single new speaker for which a speaker embedding was not previously selected during a previous iteration and selecting the respective speaker embedding for the respective speaker as the temporal embedding. The method also includes, at each time step, predicting a respective voice activity indicator for each respective speaker of the multiple speakers based on the respective speaker embeddings selected during the plurality of iterations and the temporal embedding.

Type: Grant

Filed: June 22, 2021

Date of Patent: January 30, 2024

Assignee: Google LLC

Inventors: David Grangier, Neil Zeghidour, Olivier Teboul

Minimum Bayes Risk Decoding with Neural Quality Metrics

Publication number: 20230259759

Abstract: Provided are systems and methods for sequence-to-sequence modeling with neural quality metrics. More particularly, example aspects of the present disclosure relate to minimum Bayes risk (MBR) decoding with neural metrics for machine translation. According to example aspects of the present disclosure, a set of candidate outputs can be sampled from a machine translation model given a source sequence. Given the set of candidate outputs, systems and methods according to example aspects of the present disclosure can select a hypothesis with high expected utility with respect to the distribution over a set of pseudo-references from the machine translation model.

Type: Application

Filed: February 16, 2022

Publication date: August 17, 2023

Inventors: Qijun Tan, Markus Freitag, David Grangier
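
The selection step described in the abstract of publication 20230259759 can be illustrated with a minimal sketch. It is not the claimed implementation: the utility here is a toy unigram-overlap F1 standing in for a neural quality metric, and the names `unigram_f1` and `mbr_select` are hypothetical. The sampled candidates double as the pseudo-references, which is one common way to set up MBR decoding.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility: unigram-overlap F1 (a stand-in for a neural quality metric)."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: List[str],
    pseudo_references: List[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the candidate with the highest average utility over the pseudo-references."""

    def expected_utility(hypothesis: str) -> float:
        total = sum(utility(hypothesis, ref) for ref in pseudo_references)
        return total / len(pseudo_references)

    return max(candidates, key=expected_utility)


# Hypothetical samples from a translation model for one source sentence.
samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "the cat is on the mat",
    "dogs bark loudly",
]
best = mbr_select(samples, samples)
print(best)  # the cat sat on the mat
```

The chosen hypothesis is the one that agrees most with the other samples on average, so an outlier such as the last sample is never selected even if the model assigned it a high probability.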

SEPARATING SPEECH BY SOURCE IN AUDIO RECORDINGS BY PREDICTING ISOLATED AUDIO SIGNALS CONDITIONED ON SPEAKER REPRESENTATIONS

Publication number: 20230112265

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

Type: Application

Filed: October 17, 2022

Publication date: April 13, 2023

Inventors: Neil Zeghidour, David Grangier

End-To-End Speech Diarization Via Iterative Speaker Embedding

Publication number: 20220375492

Abstract: A method includes receiving an input audio signal corresponding to utterances spoken by multiple speakers. The method also includes encoding the input audio signal into a sequence of T temporal embeddings. During each of a plurality of iterations each corresponding to a respective speaker of the multiple speakers, the method includes selecting a respective speaker embedding for the respective speaker by determining a probability that the corresponding temporal embedding includes a presence of voice activity by a single new speaker for which a speaker embedding was not previously selected during a previous iteration and selecting the respective speaker embedding for the respective speaker as the temporal embedding. The method also includes, at each time step, predicting a respective voice activity indicator for each respective speaker of the multiple speakers based on the respective speaker embeddings selected during the plurality of iterations and the temporal embedding.

Type: Application

Filed: June 22, 2021

Publication date: November 24, 2022

Applicant: Google LLC

Inventors: David Grangier, Neil Zeghidour, Olivier Teboul

Separating speech by source in audio recordings by predicting isolated audio signals conditioned on speaker representations

Patent number: 11475909

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

Type: Grant

Filed: February 8, 2021

Date of Patent: October 18, 2022

Assignee: Google LLC

Inventors: Neil Zeghidour, David Grangier

TRAINING NEURAL NETWORKS USING AUXILIARY TASK UPDATE DECOMPOSITION

Publication number: 20220108174

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network having a plurality of model parameters to perform a main task. In one aspect, a method comprises: determining an auxiliary task update to the model parameters of the neural network that, if applied to the model parameters, is predicted to increase a performance of the neural network on an auxiliary task; determining a decomposition of the auxiliary task update into multiple constituent updates that, if applied to the model parameters, are each predicted to have a different impact on a performance of the neural network on the main task; determining a new auxiliary task update to the model parameters of the neural network as a function of the plurality of constituent updates; and applying the new auxiliary task update to the model parameters of the neural network.

Type: Application

Filed: October 1, 2021

Publication date: April 7, 2022

Inventors: Yann Dauphin, David Grangier, Lucio Dery
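
As a rough illustration of the decomposition described in the abstract of publication 20220108174, the sketch below splits an auxiliary-task update into a component along a single main-task update direction and an orthogonal remainder, then recombines them with separate weights. This is a simplifying assumption, not the claimed method: the abstract does not specify how the constituent updates are computed, and the function names and the weights `w_helpful`, `w_harmful`, and `w_neutral` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def decompose(aux_update: Vector, main_update: Vector) -> Tuple[Vector, Vector]:
    """Split aux_update into a part along main_update and an orthogonal remainder."""
    denom = dot(main_update, main_update)
    if denom == 0.0:
        # No main-task signal: treat the whole auxiliary update as neutral.
        return [0.0] * len(aux_update), list(aux_update)
    scale = dot(aux_update, main_update) / denom
    parallel = [scale * m for m in main_update]
    orthogonal = [a - p for a, p in zip(aux_update, parallel)]
    return parallel, orthogonal


def new_aux_update(
    aux_update: Vector,
    main_update: Vector,
    w_helpful: float = 1.0,
    w_harmful: float = 0.0,
    w_neutral: float = 1.0,
) -> Vector:
    """Recombine the constituent updates with a weight per predicted impact."""
    parallel, orthogonal = decompose(aux_update, main_update)
    # A parallel part pointing the same way as the main-task update is predicted
    # to help the main task; one pointing the opposite way is predicted to hurt.
    helps = dot(parallel, main_update) > 0.0
    w_parallel = w_helpful if helps else w_harmful
    return [w_parallel * p + w_neutral * o for p, o in zip(parallel, orthogonal)]


main = [1.0, 0.0]
conflicting = new_aux_update([-2.0, 3.0], main)  # harmful part is dropped
aligned = new_aux_update([2.0, 3.0], main)       # helpful part is kept
print(conflicting, aligned)  # [0.0, 3.0] [2.0, 3.0]
```

With the default weights, the part of the auxiliary update that opposes the main task is discarded while the orthogonal part, which is neutral to first order, is kept.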

SEPARATING SPEECH BY SOURCE IN AUDIO RECORDINGS BY PREDICTING ISOLATED AUDIO SIGNALS CONDITIONED ON SPEAKER REPRESENTATIONS

Publication number: 20210249027

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

Type: Application

Filed: February 8, 2021

Publication date: August 12, 2021

Inventors: Neil Zeghidour, David Grangier

Feature set embedding for incomplete data

Patent number: 8706668

Abstract: Methods and systems for classifying incomplete data are disclosed. In accordance with one method, pairs of features and values are generated based upon feature measurements on the incomplete data. In addition, a transformation function is applied on the pairs of features and values to generate a set of vectors by mapping each of the pairs to a corresponding vector in an embedding space. Further, a hardware processor applies a prediction function to the set of vectors to generate at least one confidence assessment for at least one class that indicates whether the incomplete data is of the at least one class. The method further includes outputting the at least one confidence assessment.

Type: Grant

Filed: June 2, 2011

Date of Patent: April 22, 2014

Assignee: NEC Laboratories America, Inc.

Inventors: Iain Melvin, David Grangier
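
The pipeline in the abstract of patent 8706668 (feature/value pairs, a mapping into an embedding space, and a prediction over the resulting set of vectors) can be sketched as follows. Everything concrete here is an assumption made for illustration: the hand-set `FEATURE_VECTORS` and `CLASS_WEIGHTS` stand in for learned parameters, and scaling by the value, mean pooling, and a sigmoid are just one plausible choice of transformation and prediction functions.

```python
import math
from typing import Dict, List

# Hypothetical, hand-set parameters; in practice these would be learned.
FEATURE_VECTORS: Dict[str, List[float]] = {
    "height": [1.0, 0.0],
    "weight": [0.0, 1.0],
    "age": [0.5, 0.5],
}
CLASS_WEIGHTS: List[float] = [2.0, -1.0]


def embed_pair(feature: str, value: float) -> List[float]:
    """Map one (feature, value) pair to a vector in the embedding space."""
    return [value * x for x in FEATURE_VECTORS[feature]]


def confidence(observed: Dict[str, float]) -> float:
    """Confidence that the example belongs to the class, from observed features only."""
    vectors = [embed_pair(f, v) for f, v in observed.items()]
    if not vectors:
        return 0.5  # nothing observed: no evidence either way
    # Mean-pool the set so the result does not depend on how many features exist.
    pooled = [sum(column) / len(vectors) for column in zip(*vectors)]
    score = sum(w * x for w, x in zip(CLASS_WEIGHTS, pooled))
    return 1.0 / (1.0 + math.exp(-score))


# Missing measurements are simply left out; no imputation is needed.
print(confidence({"height": 1.0}))
print(confidence({"height": 1.0, "age": 2.0}))
```

Because the input is a set of pairs rather than a fixed-length vector, the same function handles any subset of observed features.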

Supervised semantic indexing and its extensions

Patent number: 8359282

Abstract: A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.

Type: Grant

Filed: September 18, 2009

Date of Patent: January 22, 2013

Assignee: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier

Supervised semantic indexing and its extensions

Patent number: 8341095

Abstract: A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.

Type: Grant

Filed: September 18, 2009

Date of Patent: December 25, 2012

Assignee: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier
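
The scoring rule in the abstract of patent 8341095, a product of the query weight vector, the weight matrix, and the document weight vector, can be written as score(q, d) = qᵀ W d. The sketch below pairs it with one possible gradient step on a margin ranking loss over (query, relevant document, lower-ranked document) tuples. The loss, learning rate, margin, and function names are assumptions for illustration; the abstract only says that a gradient step approach is used.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def score(W: Matrix, q: Vector, d: Vector) -> float:
    """Similarity score q^T W d."""
    return sum(q[i] * W[i][j] * d[j] for i in range(len(q)) for j in range(len(d)))


def gradient_step(
    W: Matrix, q: Vector, d_pos: Vector, d_neg: Vector,
    lr: float = 0.1, margin: float = 1.0,
) -> Matrix:
    """Update W in place if d_pos does not outscore d_neg by the margin."""
    if margin - score(W, q, d_pos) + score(W, q, d_neg) > 0.0:
        # Gradient of the hinge loss with respect to W is -q (d_pos - d_neg)^T.
        for i in range(len(q)):
            for j in range(len(d_pos)):
                W[i][j] += lr * q[i] * (d_pos[j] - d_neg[j])
    return W


# Start from the identity, which scores documents by plain word overlap.
W = [[1.0, 0.0], [0.0, 1.0]]
query, relevant, irrelevant = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
for _ in range(20):
    gradient_step(W, query, relevant, irrelevant)
print(score(W, query, relevant) > score(W, query, irrelevant))  # True
```

In this toy case the relevant document shares no words with the query, so the identity matrix ranks it last; the off-diagonal entries learned by the updates let the model relate different words.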

FEATURE SET EMBEDDING FOR INCOMPLETE DATA

Publication number: 20110302118

Abstract: Methods and systems for classifying incomplete data are disclosed. In accordance with one method, pairs of features and values are generated based upon feature measurements on the incomplete data. In addition, a transformation function is applied on the pairs of features and values to generate a set of vectors by mapping each of the pairs to a corresponding vector in an embedding space. Further, a hardware processor applies a prediction function to the set of vectors to generate at least one confidence assessment for at least one class that indicates whether the incomplete data is of the at least one class. The method further includes outputting the at least one confidence assessment.

Type: Application

Filed: June 2, 2011

Publication date: December 8, 2011

Applicant: NEC Laboratories America, Inc.

Inventors: Iain Melvin, David Grangier

SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS

Publication number: 20100185659

Abstract: A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.

Type: Application

Filed: September 18, 2009

Publication date: July 22, 2010

Applicant: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier

SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS

Publication number: 20100179933

Abstract: A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.

Type: Application

Filed: September 18, 2009

Publication date: July 15, 2010

Applicant: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier