Patents by Inventor Francoise Beaufays
Francoise Beaufays has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20180330735
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks, including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: July 20, 2018
Publication date: November 15, 2018
Applicant: Google LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
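The early-abort flow this abstract describes can be sketched as follows. This is an illustrative toy, not Google's implementation: the two stand-in recognizers, the threshold value, and the function names are all invented for the example.

```python
# Run several hypothetical speech recognition systems (SRS's) concurrently,
# accept once a result meets the confidence threshold, and cancel the rest.
from concurrent.futures import ThreadPoolExecutor, as_completed

def recognize_with_early_abort(audio, systems, threshold=0.8):
    """Each system is a callable: audio -> (transcript, confidence)."""
    results = []
    with ThreadPoolExecutor(max_workers=len(systems)) as pool:
        futures = {pool.submit(srs, audio): srs for srs in systems}
        for future in as_completed(futures):
            transcript, confidence = future.result()
            results.append((transcript, confidence))
            if confidence >= threshold:
                # Abort the remaining recognition tasks that have not started.
                for f in futures:
                    f.cancel()
                break
    # Final result: the highest-confidence transcript generated so far.
    return max(results, key=lambda r: r[1])

# Toy SRS's with fixed outputs stand in for real recognizers.
fast_srs = lambda audio: ("recognize speech", 0.9)
slow_srs = lambda audio: ("wreck a nice beach", 0.4)
print(recognize_with_early_abort(b"...", [fast_srs, slow_srs]))
```

Whichever recognizer finishes first, the final answer is the confident one; the slower, low-confidence hypothesis never has to complete.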
-
Patent number: 10127904
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning pronunciations from acoustic sequences. One method includes receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the time steps, processing the acoustic feature representation through each of one or more recurrent neural network layers to generate a recurrent output; processing the recurrent output for the time step using a phoneme output layer to generate a phoneme representation for the acoustic feature representation for the time step; processing the recurrent output for the time step using a grapheme output layer to generate a grapheme representation for the acoustic feature representation for the time step; and extracting, from the phoneme and grapheme representations for the acoustic feature representations at each time step, a respective pronunciation for each of one or more words.
Type: Grant
Filed: July 29, 2015
Date of Patent: November 13, 2018
Assignee: Google LLC
Inventors: Kanury Kanishka Rao, Francoise Beaufays, Hasim Sak, Ouais Alsharif
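The two-headed output structure described here, one shared recurrent output feeding both a phoneme layer and a grapheme layer, can be sketched in a few lines. Everything below is a placeholder: the label sets, the random weights, and the tiny linear-plus-softmax "layers" merely illustrate the shape of the computation, not the patented model.

```python
import math, random

PHONEMES = ["k", "ae", "t"]
GRAPHEMES = ["c", "a", "t"]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, h):
    return [sum(w * x for w, x in zip(row, h)) for row in weights]

def run_step(h, w_phone, w_graph):
    """Map one time step's recurrent output to its best phoneme and grapheme."""
    phone = softmax(linear(w_phone, h))
    graph = softmax(linear(w_graph, h))
    return (PHONEMES[phone.index(max(phone))], GRAPHEMES[graph.index(max(graph))])

rng = random.Random(0)
w_phone = [[rng.uniform(-1, 1) for _ in range(8)] for _ in PHONEMES]
w_graph = [[rng.uniform(-1, 1) for _ in range(8)] for _ in GRAPHEMES]
# One recurrent output per time step; in the patent these come from RNN layers.
steps = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
pairs = [run_step(h, w_phone, w_graph) for h in steps]
print(pairs)
```

Each time step yields an aligned (phoneme, grapheme) pair, which is what lets per-word pronunciations be read off from the two output sequences together.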
-
Publication number: 20180322877
Abstract: A method of providing a personal directory service includes receiving, over the Internet, from a user terminal, a query spoken by a user, where the query spoken by the user includes a speech utterance representing a category of persons. The method also includes determining a geographic location of the user terminal, recognizing the category of persons with the speech recognition engine based on the speech utterance representing the category of persons, searching a listing of persons within or near the determined geographic location matching the query to select persons responsive to the query spoken by the user, and sending to the user terminal information related to at least some of the responsive persons.
Type: Application
Filed: July 16, 2018
Publication date: November 8, 2018
Inventors: Brian Strope, Francoise Beaufays, William J. Byrne
-
Patent number: 10102852
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include accessing user information specific to a user and determining that the trigger criteria are satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.
Type: Grant
Filed: April 14, 2015
Date of Patent: October 16, 2018
Assignee: Google LLC
Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays
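The template split the abstract describes, trigger criteria deciding *whether* to notify and content rules deciding *what* the notification says, can be sketched like this. The field names, the meeting data, and the callables are hypothetical examples, not the patented schema.

```python
# A template pairs a trigger predicate with a content-building rule.
template = {
    "type": "meeting_reminder",
    # Trigger criteria evaluated over enterprise resources and user information.
    "trigger": lambda resources, user: any(
        m["attendee"] == user["name"] for m in resources["meetings"]),
    # Content rules for building the notification body.
    "content": lambda resources, user: "Upcoming: " + ", ".join(
        m["title"] for m in resources["meetings"] if m["attendee"] == user["name"]),
}

def maybe_notify(template, resources, user):
    """Generate a notification only when the trigger criteria are satisfied."""
    if template["trigger"](resources, user):
        return {"type": template["type"], "body": template["content"](resources, user)}
    return None

resources = {"meetings": [{"title": "Design review", "attendee": "Ada"}]}
print(maybe_notify(template, resources, {"name": "Ada"}))
```

For a user with no matching meetings the trigger fails and no notification is produced, which is the point of separating the trigger from the content rule.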
-
Patent number: 10049672
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks, including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: June 2, 2016
Date of Patent: August 14, 2018
Assignee: Google LLC
Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
-
Publication number: 20180217749
Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different, the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
Type: Application
Filed: February 1, 2017
Publication date: August 2, 2018
Inventors: Ouais Alsharif, Peter Ciccotto, Francoise Beaufays, Dragan Zivkovic
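The decoder-switching flow above can be sketched as a small state machine. The stopword heuristic standing in for language identification, the length threshold, and the placeholder decoders are all invented for illustration; a real keyboard would use a trained language-ID model.

```python
def identify_language(text):
    """Trivial stand-in for a language identification model."""
    return "es" if any(w in text.split() for w in ("hola", "gracias", "que")) else "en"

class Keyboard:
    def __init__(self, decoders, active="en", min_chars=5):
        self.decoders, self.active, self.min_chars = decoders, active, min_chars

    def candidates(self, text):
        # Only re-check the language once the text is long enough
        # (the "characteristic satisfies a threshold" condition).
        if len(text) >= self.min_chars:
            target = identify_language(text)
            if target != self.active and target in self.decoders:
                self.active = target  # enable the second decoder
        return self.decoders[self.active](text)

kb = Keyboard({
    "en": lambda t: [t, t + "s"],      # placeholder English decoder
    "es": lambda t: [t, t + " bien"],  # placeholder Spanish decoder
})
print(kb.candidates("hola que tal"))
```

Typing Spanish text into the English-configured keyboard flips the active decoder to Spanish, so the candidate words come from the matching decoder.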
-
Patent number: 10026402
Abstract: A method of searching a business listing with voice commands includes receiving, over the Internet, from a user terminal, a query spoken by a user, which includes a speech utterance representing a category of merchandise, a speech utterance representing a merchandise item, and a speech utterance representing a geographic location.
Type: Grant
Filed: October 3, 2016
Date of Patent: July 17, 2018
Assignee: Google LLC
Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
-
Publication number: 20180188948
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer, and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term, and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Application
Filed: December 29, 2016
Publication date: July 5, 2018
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Patent number: 9837070
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying pronunciations. In one aspect, a method includes obtaining a first transcription for an utterance. A second transcription for the utterance is obtained. The second transcription is different from the first transcription. One or more feature scores are determined based on the first transcription and the second transcription. The one or more feature scores are input to a trained classifier. An output of the classifier is received. The output indicates which of the first transcription and the second transcription is more likely to be a correct transcription of the utterance.
Type: Grant
Filed: February 21, 2014
Date of Patent: December 5, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Kanury Kanishka Rao, Francoise Beaufays
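The feature-scores-into-classifier step can be sketched concretely. The features, the hand-set linear weights, and the example transcriptions below are all fabricated; the patent's actual features and trained classifier are not specified in the abstract.

```python
def feature_scores(first, second):
    """Toy features comparing two competing transcriptions."""
    return [
        len(second.split()) - len(first.split()),          # word-count difference
        1.0 if second.lower() == first.lower() else 0.0,   # case-only difference
        sum(w in {"the", "a", "to"} for w in second.split()),  # function words
    ]

def classify(scores, weights=(0.2, -1.0, 0.5), bias=0.0):
    """Linear stand-in for a trained classifier.

    Returns True when the second transcription is more likely correct."""
    return sum(w * s for w, s in zip(weights, scores)) + bias > 0

first, second = "call mom", "call my mom"
better = second if classify(feature_scores(first, second)) else first
print(better)
```

The classifier's output picks between the two hypotheses rather than scoring them independently, which matches the pairwise framing of the abstract.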
-
Patent number: 9747897
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, including selecting terms; obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the terms; receiving audio data corresponding to a particular user speaking the terms in the natural language; obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the terms in the natural language; aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user; identifying, based on the aligning, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and, based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
Type: Grant
Filed: December 17, 2013
Date of Patent: August 29, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Francoise Beaufays, Pedro J. Moreno Mengibar, Brian Patrick Strope
-
Patent number: 9741339
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining pronunciations for particular terms. The methods, systems, and apparatus include actions of obtaining audio samples of speech corresponding to a particular term and obtaining candidate pronunciations for the particular term. Further actions include generating, for each candidate pronunciation for the particular term and audio sample of speech corresponding to the particular term, a score reflecting a level of similarity between the candidate pronunciation and the audio sample, wherein the score for the particular term is obtained by using a minimum of the individual scores of the phonemes comprising the term. Additional actions include aggregating the scores for each candidate pronunciation and adding one or more candidate pronunciations for the particular term to a pronunciation lexicon based on the aggregated scores for the candidate pronunciations.
Type: Grant
Filed: June 28, 2013
Date of Patent: August 22, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Francoise Beaufays, Brian Strope, Xin Lei, Pedro J. Moreno Mengibar, Trevor D. Strohman
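The min-of-phoneme-scores rule stated in this abstract is easy to show in miniature. The per-phoneme scores and the mean aggregation across samples below are invented for illustration; only the "score = minimum over phonemes" rule comes from the abstract.

```python
def pronunciation_score(phoneme_scores):
    # A pronunciation is only as good as its worst-matching phoneme.
    return min(phoneme_scores)

def rank_candidates(candidates):
    """candidates: {pronunciation: [per-sample phoneme-score lists]}"""
    aggregated = {
        pron: sum(pronunciation_score(s) for s in samples) / len(samples)
        for pron, samples in candidates.items()
    }
    return sorted(aggregated.items(), key=lambda kv: -kv[1])

candidates = {
    "k ae t": [[0.9, 0.8, 0.9], [0.8, 0.9, 0.7]],  # strong on both samples
    "k aa t": [[0.9, 0.2, 0.9], [0.8, 0.3, 0.9]],  # weak middle phoneme
}
ranking = rank_candidates(candidates)
print(ranking[0][0])  # top-ranked candidate for the pronunciation lexicon
```

Taking the minimum, rather than the mean, of phoneme scores means one badly matched phoneme sinks the whole candidate even when the rest align well, which is what separates the two candidates here.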
-
Patent number: 9728185
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.
Type: Grant
Filed: May 22, 2015
Date of Patent: August 8, 2017
Assignee: Google Inc.
Inventors: Johan Schalkwyk, Francoise Beaufays, Hasim Sak, John Giannandrea
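The three-stage scoring chain in this abstract (acoustic model, then inverse pronunciation model, then language model) can be sketched with toy tables. Every probability below is fabricated; the sketch only shows how scores flow from phonemes to graphemes to text labels.

```python
phoneme_scores = {"k": 0.7, "ae": 0.2, "t": 0.1}   # from the acoustic model
inverse_pron = {"k": {"c": 0.8, "k": 0.2}}         # phoneme -> grapheme probabilities
language_model = {"c": 0.6, "k": 0.4}              # toy unigram grapheme LM

def text_label_scores(phoneme):
    """Chain the three stages for one phoneme's score."""
    acoustic = phoneme_scores[phoneme]
    # Inverse pronunciation model: phoneme scores -> grapheme scores.
    grapheme = {g: acoustic * p for g, p in inverse_pron[phoneme].items()}
    # Language model: grapheme scores -> text label scores.
    return {g: s * language_model[g] for g, s in grapheme.items()}

scores = text_label_scores("k")
print(max(scores, key=scores.get))
```

The phoneme /k/ could map to either "c" or "k"; the inverse pronunciation model and the language model together resolve it toward "c" in this toy setup.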
-
Publication number: 20170221475
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing a pronunciation dictionary that stores entity name pronunciations. In one aspect, a method includes actions of receiving audio data corresponding to an utterance that includes a command and an entity name. Additional actions may include generating, by an automated speech recognizer, an initial transcription for a portion of the audio data that is associated with the entity name, receiving a corrected transcription for the portion of the utterance that is associated with the entity name, obtaining a phonetic pronunciation that is associated with the portion of the audio data that is associated with the entity name, updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name, receiving a subsequent utterance that includes the entity name, and transcribing the subsequent utterance based at least in part on the updated pronunciation dictionary.
Type: Application
Filed: February 3, 2016
Publication date: August 3, 2017
Inventors: Antoine Jean Bruguier, Fuchun Peng, Francoise Beaufays
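The correction loop described above can be sketched with a plain dictionary. The phoneme strings and entity names are invented examples; a real system would key on richer phonetic representations.

```python
pronunciation_dictionary = {}  # phonetic pronunciation -> entity name

def learn_correction(heard_phonemes, corrected_name):
    """After a user correction, associate the heard pronunciation with the name."""
    pronunciation_dictionary[heard_phonemes] = corrected_name

def transcribe(heard_phonemes, default_guess):
    # Subsequent utterances consult the updated dictionary first.
    return pronunciation_dictionary.get(heard_phonemes, default_guess)

# "Call Zosia" is initially transcribed as "Sasha"; the user corrects it,
# and the pronunciation heard in the audio is stored under the corrected name.
learn_correction("z ow sh ah", "Zosia")
print(transcribe("z ow sh ah", "Sasha"))
```

The next time the same pronunciation is heard, the dictionary overrides the recognizer's default guess; unknown pronunciations still fall through to the recognizer.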
-
Publication number: 20170199665
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
Type: Application
Filed: March 29, 2017
Publication date: July 13, 2017
Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
-
Publication number: 20170185286
Abstract: Methods, systems, and apparatus for receiving data indicating a location of a particular touchpoint representing a latest received touchpoint in a sequence of received touchpoints; identifying candidate characters associated with the particular touchpoint; generating, for each of the candidate characters, a confidence score; identifying different candidate sequences of characters, each including, for each received touchpoint, one candidate character associated with a location of the received touchpoint, and one of the candidate characters associated with the particular touchpoint; for each different candidate sequence of characters, determining a language model score and generating a transcription score based at least on the confidence score for one or more of the candidate characters in the candidate sequence of characters and the language model score for the candidate sequence of characters; and selecting, and providing for output, a representative sequence of characters from among the candidate sequences of characters.
Type: Application
Filed: December 29, 2015
Publication date: June 29, 2017
Inventors: Francoise Beaufays, Yu Ouyang, David Rybach, Michael D. Riley, Lars Hellsten
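The transcription-score combination the abstract describes can be sketched by enumerating candidate sequences exhaustively. The per-touchpoint confidences and the toy bigram language model below are invented numbers; a real decoder would prune with a beam rather than enumerate.

```python
import itertools, math

touchpoint_candidates = [  # one {character: confidence} dict per touchpoint
    {"h": 0.9, "g": 0.1},
    {"i": 0.6, "o": 0.4},
]
bigrams = {("h", "i"): 0.7, ("h", "o"): 0.2, ("g", "i"): 0.05, ("g", "o"): 0.05}

def transcription_score(seq):
    """Combine per-character confidence scores with a language model score."""
    confidence = sum(math.log(tp[c]) for tp, c in zip(touchpoint_candidates, seq))
    lm = math.log(bigrams[tuple(seq)])
    return confidence + lm

sequences = list(itertools.product(*touchpoint_candidates))
best = max(sequences, key=transcription_score)
print("".join(best))
```

With these numbers, "hi" wins over "ho" even though "o" has a plausible confidence, because the language model score tips the combined transcription score.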
-
Patent number: 9678664
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
Type: Grant
Filed: April 10, 2015
Date of Patent: June 13, 2017
Assignee: Google Inc.
Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
-
Patent number: 9594744
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. The actions further include generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text by composing the lexicon model, the inverse of the transducer, and the language model.
Type: Grant
Filed: March 14, 2013
Date of Patent: March 14, 2017
Assignee: Google Inc.
Inventors: Hasim Sak, Francoise Beaufays
-
Publication number: 20170025123
Abstract: A method of operating a voice-enabled business directory search system includes receiving category-business pairs, each category-business pair including a business category and a specific business, and establishing a data structure having nodes based on the category-business pairs. Each node of the data structure is associated with one or more business categories and a speech recognition language model for recognizing specific businesses associated with the one or more business categories.
Type: Application
Filed: October 3, 2016
Publication date: January 26, 2017
Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
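The node structure described above, categories paired with per-node recognition models, can be sketched as a small tree. The categories and business names are invented, and a set of recognizable names stands in for a real speech recognition language model.

```python
class Node:
    """A node associating business categories with a recognition model."""
    def __init__(self, categories, businesses, children=()):
        self.categories = categories
        self.language_model = set(businesses)  # stand-in for a real LM
        self.children = list(children)

root = Node(["all"], ["Acme Pizza", "Lakeside Books"], children=[
    Node(["restaurants"], ["Acme Pizza"]),
    Node(["bookstores"], ["Lakeside Books"]),
])

def model_for(node, category):
    """Find the recognition model for a spoken business category."""
    if category in node.categories:
        return node.language_model
    for child in node.children:
        found = model_for(child, category)
        if found is not None:
            return found
    return None

print(model_for(root, "restaurants"))
```

Scoping the language model to the spoken category shrinks the recognition vocabulary at each node, which is the practical payoff of the category-business data structure.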
-
Publication number: 20160351188
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning pronunciations from acoustic sequences. One method includes receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the time steps, processing the acoustic feature representation through each of one or more recurrent neural network layers to generate a recurrent output; processing the recurrent output for the time step using a phoneme output layer to generate a phoneme representation for the acoustic feature representation for the time step; processing the recurrent output for the time step using a grapheme output layer to generate a grapheme representation for the acoustic feature representation for the time step; and extracting, from the phoneme and grapheme representations for the acoustic feature representations at each time step, a respective pronunciation for each of one or more words.
Type: Application
Filed: July 29, 2015
Publication date: December 1, 2016
Inventors: Kanury Kanishka Rao, Francoise Beaufays, Hasim Sak, Ouais Alsharif
-
Publication number: 20160307569
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include accessing user information specific to a user and determining that the trigger criteria are satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.
Type: Application
Filed: April 14, 2015
Publication date: October 20, 2016
Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays