Patents by Inventor Olivier Siohan
Olivier Siohan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230267922
Abstract: Implementations relate to an application that can bias automatic speech recognition for meetings using data that may be associated with the meeting and/or meeting participants. A transcription of inputs provided during a meeting can additionally and/or alternatively be processed to determine whether the inputs should be incorporated into a meeting document, which can provide a summary for the meeting. In some instances, entries into a meeting document can be designated as action items, and those action items can optionally have conditions for reminding meeting participants about the action items and/or for determining whether an action item has been fulfilled. In this way, various tasks that may typically be manually performed by meeting participants, such as creating a meeting summary, can be automated in a more accurate manner. This can preserve resources that may otherwise be wasted during video conferences, in-person meetings, and/or other gatherings.
Type: Application
Filed: February 23, 2022
Publication date: August 24, 2023
Inventors: Olivier Siohan, Takaki Makino, Joshua Maynez, Ryan Mcdonald, Benyah Shaparenko, Joseph Nelson, Kishan Sachdeva, Basilio Garcia
-
Patent number: 11527248
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: May 27, 2020
Date of Patent: December 13, 2022
Assignee: GOOGLE LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
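The flow described in the abstract above (start several recognizers in parallel, stop once a result is confident enough, abort the rest) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `fake_srs` is a hypothetical stand-in for a recognizer, `recognize` and `Result` are invented names, and the rule "stop as soon as a single result meets the threshold" is a simplification, not the claimed method.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    text: str
    confidence: float


async def fake_srs(text, confidence, delay):
    # Hypothetical stand-in for one speech recognition system (SRS):
    # it "decodes" for `delay` seconds, then reports a result.
    await asyncio.sleep(delay)
    return Result(text, confidence)


async def recognize(recognizers, threshold):
    # Start every recognition task at once.
    tasks = [asyncio.create_task(r) for r in recognizers]
    finished = []
    try:
        for next_done in asyncio.as_completed(tasks):
            result = await next_done
            finished.append(result)
            # Simplified stopping rule (an assumption): one result
            # at or above the threshold is enough.
            if result.confidence >= threshold:
                break
    finally:
        # Abort the recognizers that have not produced a result yet.
        for task in tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
    # Output the most confident of the results that did complete.
    return max(finished, key=lambda r: r.confidence)


if __name__ == "__main__":
    best = asyncio.run(recognize(
        [fake_srs("slow but sure", 0.99, 0.5),
         fake_srs("fast guess", 0.90, 0.01)],
        threshold=0.8,
    ))
    print(best)
```

In this toy run the fast recognizer clears the threshold first, so the slow one is cancelled before it finishes. If no result reaches the threshold, the sketch waits for all recognizers and returns the most confident result.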
-
Publication number: 20220392439
Abstract: A method (400) includes receiving audio data (112) corresponding to an utterance (101) spoken by a user (10), receiving video data (114) representing motion of lips of the user while the user was speaking the utterance, and obtaining multiple candidate transcriptions (135) for the utterance based on the audio data. For each candidate transcription of the multiple candidate transcriptions, the method also includes generating a synthesized speech representation (145) of the corresponding candidate transcription and determining an agreement score (155) indicating a likelihood that the synthesized speech representation matches the motion of the lips of the user while the user speaks the utterance. The method also includes selecting one of the multiple candidate transcriptions for the utterance as a speech recognition output (175) based on the agreement scores determined for the multiple candidate transcriptions for the utterance.
Type: Application
Filed: November 18, 2019
Publication date: December 8, 2022
Applicant: Google LLC
Inventors: Olivier Siohan, Takaki Makino, Richard Rose, Otavio Braga, Hank Liao, Basillo Garcia Castillo
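The selection step in the abstract above can be sketched as a rescoring loop: score every candidate transcription against the lip-motion data and keep the highest-scoring one. The sketch below is illustrative only. The `synthesize` callable, the feature vectors, and the use of cosine similarity as the agreement score are all assumptions; the abstract does not say how the score is computed.

```python
import math


def cosine(a, b):
    # Cosine similarity, used here as a placeholder agreement score.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_transcription(candidates, lip_features, synthesize, score=cosine):
    # `synthesize` is a hypothetical callable that maps a candidate
    # transcription to a feature vector comparable with `lip_features`.
    scored = [(score(synthesize(c), lip_features), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


if __name__ == "__main__":
    # Toy feature vectors standing in for synthesized speech representations.
    toy_synth = {
        "recognize speech": [0.9, 0.1],
        "wreck a nice beach": [0.1, 0.9],
    }
    choice, agreement = select_transcription(
        list(toy_synth), [1.0, 0.0], toy_synth.__getitem__
    )
    print(choice, round(agreement, 3))
```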
-
Publication number: 20200357413
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: May 27, 2020
Publication date: November 12, 2020
Applicant: Google LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
-
Patent number: 10699714
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: July 20, 2018
Date of Patent: June 30, 2020
Assignee: Google LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
-
Patent number: 10204619
Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
Type: Grant
Filed: February 22, 2016
Date of Patent: February 12, 2019
Assignee: Google LLC
Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
-
Publication number: 20180330735
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: July 20, 2018
Publication date: November 15, 2018
Applicant: Google LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
-
Patent number: 10049672
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: June 2, 2016
Date of Patent: August 14, 2018
Assignee: Google LLC
Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
-
Patent number: 9472187
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
Type: Grant
Filed: May 25, 2016
Date of Patent: October 18, 2016
Assignee: Google Inc.
Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
-
Publication number: 20160275951
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: June 2, 2016
Publication date: September 22, 2016
Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
-
Publication number: 20160267903
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
Type: Application
Filed: May 25, 2016
Publication date: September 15, 2016
Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
-
Patent number: 9378731
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
Type: Grant
Filed: April 22, 2015
Date of Patent: June 28, 2016
Assignee: Google Inc.
Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
-
Patent number: 9373329
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: October 28, 2013
Date of Patent: June 21, 2016
Assignee: Google Inc.
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
-
Publication number: 20160171977
Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
Type: Application
Filed: February 22, 2016
Publication date: June 16, 2016
Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
-
Publication number: 20160093294
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer.
Type: Application
Filed: April 22, 2015
Publication date: March 31, 2016
Inventors: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
-
Patent number: 9299347
Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
Type: Grant
Filed: April 14, 2015
Date of Patent: March 29, 2016
Assignee: Google Inc.
Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
-
Patent number: 9263033
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a set of training utterances. The methods, systems, and apparatus include actions of obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances and selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances. Additional actions include selecting a particular candidate utterance from the subset of the initial set of utterances and determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution. Further actions include adding the particular candidate utterance to the set of training utterances.
Type: Grant
Filed: June 25, 2014
Date of Patent: February 16, 2016
Assignee: Google Inc.
Inventors: Olivier Siohan, Pedro J. Mengibar
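The selection procedure in the abstract above can be illustrated with a small greedy loop: filter candidates by confidence, then add a candidate only if doing so moves the training set's distribution closer to the target. This is a sketch under stated assumptions: the abstract says only "divergence", so the smoothed KL divergence used here is a stand-in, and `kl_divergence`, `build_training_set`, the tuple layout of a candidate, and the single confidence cut-off are all hypothetical.

```python
import math
from collections import Counter


def kl_divergence(target, counts, eps=1e-6):
    # KL(target || empirical) with additive smoothing. An empty training
    # set is treated as infinitely far from the target (an assumption).
    total = sum(counts.values())
    if total == 0:
        return math.inf
    divergence = 0.0
    for key, p in target.items():
        q = (counts.get(key, 0) + eps) / (total + eps * len(target))
        if p > 0:
            divergence += p * math.log(p / q)
    return divergence


def build_training_set(candidates, target, min_confidence):
    # candidates: iterable of (utterance_id, characteristics, confidence),
    # where `characteristics` is a hashable key such as ("female", "noisy").
    pool = [c for c in candidates if c[2] >= min_confidence]
    chosen, counts = [], Counter()
    for utt_id, characteristics, _ in pool:
        before = kl_divergence(target, counts)
        counts[characteristics] += 1
        if kl_divergence(target, counts) < before:
            chosen.append(utt_id)          # keeping it reduces the divergence
        else:
            counts[characteristics] -= 1   # undo the trial addition
    return chosen


if __name__ == "__main__":
    target = {("female", "quiet"): 0.5, ("male", "quiet"): 0.5}
    candidates = [
        ("u1", ("female", "quiet"), 0.9),
        ("u2", ("female", "quiet"), 0.9),
        ("u3", ("male", "quiet"), 0.9),
        ("u4", ("male", "quiet"), 0.2),
    ]
    print(build_training_set(candidates, target, min_confidence=0.5))
```

In the toy data, `u4` is dropped by the confidence filter and `u2` is rejected because a second utterance with the same characteristics would pull the set further from the balanced target.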
-
Publication number: 20150379983
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a set of training utterances. The methods, systems, and apparatus include actions of obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances and selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances. Additional actions include selecting a particular candidate utterance from the subset of the initial set of utterances and determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution. Further actions include adding the particular candidate utterance to the set of training utterances.
Type: Application
Filed: June 25, 2014
Publication date: December 31, 2015
Inventors: Olivier Siohan, Pedro J. Mengibar
-
Publication number: 20140058728
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: October 28, 2013
Publication date: February 27, 2014
Applicant: Google Inc.
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
-
Patent number: 8571860
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: January 25, 2013
Date of Patent: October 29, 2013
Assignee: Google Inc.
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan