Patents by Inventor Françoise Beaufays

Françoise Beaufays has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LEARNING PRONUNCIATIONS FROM ACOUSTIC SEQUENCES

Publication number: 20160351188

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for learning pronunciations from acoustic sequences. One method includes receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the time steps processing the acoustic feature representation through each of one or more recurrent neural network layers to generate a recurrent output; processing the recurrent output for the time step using a phoneme output layer to generate a phoneme representation for the acoustic feature representation for the time step; and processing the recurrent output for the time step using a grapheme output layer to generate a grapheme representation for the acoustic feature representation for the time step; and extracting, from the phoneme and grapheme representations for the acoustic feature representations at each time step, a respective pronunciation for each of one or more words.

Type: Application

Filed: July 29, 2015

Publication date: December 1, 2016

Inventors: Kanury Kanishka Rao, Francoise Beaufays, Hasim Sak, Ouais Alsharif
Personalized Speech Synthesis for Voice Actions

Publication number: 20160307569

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method include actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include, accessing user information specific to a user and determining that the trigger criteria is satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.

Type: Application

Filed: April 14, 2015

Publication date: October 20, 2016

Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays
NEURAL NETWORK FOR KEYBOARD INPUT DECODING

Publication number: 20160299685

Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.

Type: Application

Filed: April 10, 2015

Publication date: October 13, 2016

Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
Business listing search

Patent number: 9460712

Abstract: A method of operating a voice-enabled business directory search system includes receiving category-business pairs, each category-business pair including a business category and a specific business, and establishing a data structure having nodes based on the category-business pairs. Each node of the data structure is associated with one or more business categories and a speech recognition language model for recognizing specific businesses associated with the one or more businesses categories.

Type: Grant

Filed: August 7, 2014

Date of Patent: October 4, 2016

Assignee: GOOGLE INC.

Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
Speech Recognition with Parallel Recognition Tasks

Publication number: 20160275951

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.

Type: Application

Filed: June 2, 2016

Publication date: September 22, 2016

Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
Speech recognition with parallel recognition tasks

Patent number: 9373329

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.

Type: Grant

Filed: October 28, 2013

Date of Patent: June 21, 2016

Assignee: Google Inc.

Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
Recognizing different versions of a language

Patent number: 9275635

Abstract: Speech recognition systems may perform the following operations: receiving audio at a computing device; identifying a language associated with the audio; recognizing the audio using recognition models for different versions of the language to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding information; comparing the information of the recognition candidates to identify agreement between at least two of the recognition models; selecting a recognition candidate based on information of the recognition candidate and agreement between the at least two of the recognition models; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.

Type: Grant

Filed: November 9, 2012

Date of Patent: March 1, 2016

Assignee: Google Inc.

Inventors: Francoise Beaufays, Brian Strope, Yun-hsuan Sung
Computing Device with Remote Contact Lists

Publication number: 20160057099

Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.

Type: Application

Filed: November 4, 2015

Publication date: February 25, 2016

Inventors: Brian Patrick Strope, Francoise Beaufays, Hy Murveit
GENERATING REPRESENTATIONS OF INPUT SEQUENCES USING NEURAL NETWORKS

Publication number: 20150356075

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of input sequences. One of the methods includes receiving a grapheme sequence, the grapheme sequence comprising a plurality of graphemes arranged according to an input order; processing the sequence of graphemes using a long short-term memory (LSTM) neural network to generate an initial phoneme sequence from the grapheme sequence, the initial phoneme sequence comprising a plurality of phonemes arranged according to an output order; and generating a phoneme representation of the grapheme sequence from the initial phoneme sequence generated by the LSTM neural network, wherein generating the phoneme representation comprises removing, from the initial phoneme sequence, phonemes in one or more positions in the output order.

Type: Application

Filed: June 2, 2015

Publication date: December 10, 2015

Inventors: Kanury Kanishka Rao, Fuchun Peng, Hasim Sak, Francoise Beaufays
Computing device with remote contact lists

Patent number: 9210258

Abstract: In one implementation a computer-implemented method includes generating a group of telephone contacts for a first user, wherein the generating includes identifying a second user as a contact of the first user based upon a determination that the second user has at least a threshold email-based association with the first user; and adding the identified second user to the group of telephone contacts for the first user. The method further includes receiving a first request to connect a first telephone device associated with the first user to a second telephone device associated with the second user. The method also includes identifying a contact identifier of the second telephone device using the generated group of telephone contacts for the first user, and initiating a connection between the first telephone device and the second telephone device using the identified contact identifier.

Type: Grant

Filed: July 3, 2013

Date of Patent: December 8, 2015

Assignee: Google Inc.

Inventors: Brian Strope, Francoise Beaufays, Hy Murveit
RECOGNIZING SPEECH USING NEURAL NETWORKS

Publication number: 20150340034

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.

Type: Application

Filed: May 22, 2015

Publication date: November 26, 2015

Inventors: Johan Schalkwyk, Francoise Beaufays, Hasim Sak, John Giannandrea
Recognizing speech in multiple languages

Patent number: 9129591

Abstract: Speech recognition systems may perform the following operations: receiving audio; recognizing the audio using language models for different languages to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding recognition scores; identifying a candidate language for the audio; selecting a recognition candidate based on the recognition scores and the candidate language; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.

Type: Grant

Filed: December 26, 2012

Date of Patent: September 8, 2015

Assignee: Google Inc.

Inventors: Yun-hsuan Sung, Francoise Beaufays, Brian Strope, Hui Lin, Jui-Ting Huang
Acoustically informed pruning for language modeling

Patent number: 9110880

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for pruning a language model are disclosed. The methods, systems, and apparatus include actions of selecting a candidate portion of the language model to evaluate for pruning, obtaining an entropy score representing information loss that would result from pruning the candidate portion of the language model, obtaining an acoustic score representing acoustic confusability of one or more words modeled by the candidate portion of the language model, and evaluating whether to prune the candidate portion of the language model using the entropy score and the acoustic score.

Type: Grant

Filed: March 15, 2013

Date of Patent: August 18, 2015

Assignee: Google Inc.

Inventors: Brian Strope, Francoise Beaufays
IDENTIFYING SUBSTITUTE PRONUNCIATIONS

Publication number: 20150170642

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, including selecting terms; obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the terms; receiving audio data corresponding to a particular user speaking the terms in the natural language; obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the terms in the natural language; aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user; identifying, based on the aligning, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic tr

Type: Application

Filed: December 17, 2013

Publication date: June 18, 2015

Applicant: Google Inc.

Inventors: Fuchun Peng, Francoise Beaufays, Pedro J. Moreno Mengibar, Brian Patrick Strope
PRONUNCIATION VERIFICATION

Publication number: 20150161985

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying pronunciations. In one aspect, a method includes obtaining a first transcription for an utterance. A second transcription for the utterance is obtained. The second transcription is different from the first transcription. One or more feature scores are determined based on the first transcription and the second transcription. The one or more feature scores are input to a trained classifier. An output of the classifier is received. The output indicates which of the first transcription and the second transcription is more likely to be a correct transcription of the utterance.

Type: Application

Filed: February 21, 2014

Publication date: June 11, 2015

Applicant: Google Inc.

Inventors: Fuchun Peng, Kanury Kanishka Rao, Francoise Beaufays
DYNAMIC SELECTION AMONG ACOUSTIC TRANSFORMS

Publication number: 20150149167

Abstract: Aspects of this disclosure are directed to accurately transforming speech data into one or more word strings that represent the speech data. A speech recognition device may receive the speech data from a user device and an indication of the user device. The speech recognition device may execute a speech recognition algorithm using one or more user and acoustic condition specific transforms that are specific to the user device and an acoustic condition of the speech data. The execution of the speech recognition algorithm may transform the speech data into one or more word strings that represent the speech data. The speech recognition device may estimate which one of the one or more word strings more accurately represents the received speech data.

Type: Application

Filed: September 30, 2011

Publication date: May 28, 2015

Applicant: GOOGLE INC.

Inventors: Françoise Beaufays, Johan Schalkwyk, Vincent Olivier Vanhoucke, Petar Stanisa Aleksic
Discovery of problematic pronunciations for automatic speech recognition systems

Patent number: 8959020

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for discovery of problematic pronunciations for automatic speech recognition systems. One of the methods includes determining a frequency of occurrences of one or more n-grams in transcribed text and a frequency of occurrences of the n-grams in typed text and classifying a system pronunciation of a word included in the n-grams as correct or incorrect based on the frequencies. The n-grams may comprise one or more words and at least one of the words is classified as incorrect based on the frequencies. The frequencies of the specific n-grams may be determined across a domain using one or more n-grams that typically appear adjacent to the specific n-grams.

Type: Grant

Filed: March 29, 2013

Date of Patent: February 17, 2015

Assignee: Google Inc.

Inventors: Brian Strope, Francoise Beaufays, Trevor D. Strohman
DATA DRIVEN PRONUNCIATION LEARNING WITH CROWD SOURCING

Publication number: 20150006178

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining pronunciations for particular terms. The methods, systems, and apparatus include actions of obtaining audio samples of speech corresponding to a particular term and obtaining candidate pronunciations for the particular term. Further actions include generating, for each candidate pronunciation for the particular term and audio sample of speech corresponding to the particular term, a score reflecting a level of similarity between of the candidate pronunciation and the audio sample. Additional actions include aggregating the scores for each candidate pronunciation and adding one or more candidate pronunciations for the particular term to a pronunciation lexicon based on the aggregated scores for the candidate pronunciations.

Type: Application

Filed: June 28, 2013

Publication date: January 1, 2015

Inventors: Fuchun Peng, Francoise Beaufays, Brian Strope, Xin Lei, Pedro J. Moreno Mengibar, Trevor D. Strohman
Business listing search

Patent number: 8831930

Abstract: A method of operating a voice-enabled business directory search system includes receiving category-business pairs, each category-business pair including a business category and a specific business, and establishing a data structure having nodes based on the category-business pairs. Each node of the data structure is associated with one or more business categories and a speech recognition language model for recognizing specific businesses associated with the one or more businesses categories.

Type: Grant

Filed: October 27, 2010

Date of Patent: September 9, 2014

Assignee: Google Inc.

Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
SPEECH TRANSCRIPTION INCLUDING WRITTEN TEXT

Publication number: 20140149119

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. Further includes generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.

Type: Application

Filed: March 14, 2013

Publication date: May 29, 2014

Applicant: Google Inc.

Inventors: Hasim Sak, Francoise Beaufays

prev 1 2 3 4 5 6 7 next