Patents by Inventor Francoise Beaufays
Francoise Beaufays has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20180330735
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks, including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Application
Filed: July 20, 2018
Publication date: November 15, 2018
Applicant: Google LLC
Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
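The early-abort flow this abstract describes can be sketched as follows. This is an illustrative toy, not Google's implementation: the two stand-in recognizers, the threshold value, and the function names are all invented for the example.

```python
# Run several hypothetical speech recognition systems (SRS's) concurrently,
# accept once a result meets the confidence threshold, and cancel the rest.
from concurrent.futures import ThreadPoolExecutor, as_completed

def recognize_with_early_abort(audio, systems, threshold=0.8):
    """Each system is a callable: audio -> (transcript, confidence)."""
    results = []
    with ThreadPoolExecutor(max_workers=len(systems)) as pool:
        futures = {pool.submit(srs, audio): srs for srs in systems}
        for future in as_completed(futures):
            transcript, confidence = future.result()
            results.append((transcript, confidence))
            if confidence >= threshold:
                # Abort the remaining recognition tasks that have not started.
                for f in futures:
                    f.cancel()
                break
    # Final result: the highest-confidence transcript generated so far.
    return max(results, key=lambda r: r[1])

# Toy SRS's with fixed outputs stand in for real recognizers.
fast_srs = lambda audio: ("recognize speech", 0.9)
slow_srs = lambda audio: ("wreck a nice beach", 0.4)
print(recognize_with_early_abort(b"...", [fast_srs, slow_srs]))
```

Whichever recognizer finishes first, the final answer is the confident one; the slower, low-confidence hypothesis never has to complete.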
-
Patent number: 10127904
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning pronunciations from acoustic sequences. One method includes receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the time steps, processing the acoustic feature representation through each of one or more recurrent neural network layers to generate a recurrent output; processing the recurrent output for the time step using a phoneme output layer to generate a phoneme representation for the acoustic feature representation for the time step; processing the recurrent output for the time step using a grapheme output layer to generate a grapheme representation for the acoustic feature representation for the time step; and extracting, from the phoneme and grapheme representations for the acoustic feature representations at each time step, a respective pronunciation for each of one or more words.
Type: Grant
Filed: July 29, 2015
Date of Patent: November 13, 2018
Assignee: Google LLC
Inventors: Kanury Kanishka Rao, Francoise Beaufays, Hasim Sak, Ouais Alsharif
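The two-headed output structure described here, one shared recurrent output feeding both a phoneme layer and a grapheme layer, can be sketched in a few lines. Everything below is a placeholder: the label sets, the random weights, and the tiny linear-plus-softmax "layers" merely illustrate the shape of the computation, not the patented model.

```python
import math, random

PHONEMES = ["k", "ae", "t"]
GRAPHEMES = ["c", "a", "t"]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, h):
    return [sum(w * x for w, x in zip(row, h)) for row in weights]

def run_step(h, w_phone, w_graph):
    """Map one time step's recurrent output to its best phoneme and grapheme."""
    phone = softmax(linear(w_phone, h))
    graph = softmax(linear(w_graph, h))
    return (PHONEMES[phone.index(max(phone))], GRAPHEMES[graph.index(max(graph))])

rng = random.Random(0)
w_phone = [[rng.uniform(-1, 1) for _ in range(8)] for _ in PHONEMES]
w_graph = [[rng.uniform(-1, 1) for _ in range(8)] for _ in GRAPHEMES]
# One recurrent output per time step; in the patent these come from RNN layers.
steps = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
pairs = [run_step(h, w_phone, w_graph) for h in steps]
print(pairs)
```

Each time step yields an aligned (phoneme, grapheme) pair, which is what lets per-word pronunciations be read off from the two output sequences together.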
-
Publication number: 20180322877
Abstract: A method of providing a personal directory service includes receiving, over the Internet, from a user terminal, a query spoken by a user, where the query spoken by the user includes a speech utterance representing a category of persons. The method also includes determining a geographic location of the user terminal, recognizing the category of persons with the speech recognition engine based on the speech utterance representing the category of persons, searching a listing of persons within or near the determined geographic location matching the query to select persons responsive to the query spoken by the user, and sending to the user terminal information related to at least some of the responsive persons.
Type: Application
Filed: July 16, 2018
Publication date: November 8, 2018
Inventors: Brian Strope, Francoise Beaufays, William J. Byrne
-
Patent number: 10102852
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include accessing user information specific to a user and determining that the trigger criteria are satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.
Type: Grant
Filed: April 14, 2015
Date of Patent: October 16, 2018
Assignee: Google LLC
Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays
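The template split the abstract describes, trigger criteria deciding *whether* to notify and content rules deciding *what* the notification says, can be sketched like this. The field names, the meeting data, and the callables are hypothetical examples, not the patented schema.

```python
# A template pairs a trigger predicate with a content-building rule.
template = {
    "type": "meeting_reminder",
    # Trigger criteria evaluated over enterprise resources and user information.
    "trigger": lambda resources, user: any(
        m["attendee"] == user["name"] for m in resources["meetings"]),
    # Content rules for building the notification body.
    "content": lambda resources, user: "Upcoming: " + ", ".join(
        m["title"] for m in resources["meetings"] if m["attendee"] == user["name"]),
}

def maybe_notify(template, resources, user):
    """Generate a notification only when the trigger criteria are satisfied."""
    if template["trigger"](resources, user):
        return {"type": template["type"], "body": template["content"](resources, user)}
    return None

resources = {"meetings": [{"title": "Design review", "attendee": "Ada"}]}
print(maybe_notify(template, resources, {"name": "Ada"}))
```

For a user with no matching meetings the trigger fails and no notification is produced, which is the point of separating the trigger from the content rule.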
-
Patent number: 10049672
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks, including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
Type: Grant
Filed: June 2, 2016
Date of Patent: August 14, 2018
Assignee: Google LLC
Inventors: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
-
Publication number: 20180217749
Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different, the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
Type: Application
Filed: February 1, 2017
Publication date: August 2, 2018
Inventors: Ouais Alsharif, Peter Ciccotto, Francoise Beaufays, Dragan Zivkovic
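The decoder-switching flow above can be sketched as a small state machine. The stopword heuristic standing in for language identification, the length threshold, and the placeholder decoders are all invented for illustration; a real keyboard would use a trained language-ID model.

```python
def identify_language(text):
    """Trivial stand-in for a language identification model."""
    return "es" if any(w in text.split() for w in ("hola", "gracias", "que")) else "en"

class Keyboard:
    def __init__(self, decoders, active="en", min_chars=5):
        self.decoders, self.active, self.min_chars = decoders, active, min_chars

    def candidates(self, text):
        # Only re-check the language once the text is long enough
        # (the "characteristic satisfies a threshold" condition).
        if len(text) >= self.min_chars:
            target = identify_language(text)
            if target != self.active and target in self.decoders:
                self.active = target  # enable the second decoder
        return self.decoders[self.active](text)

kb = Keyboard({
    "en": lambda t: [t, t + "s"],      # placeholder English decoder
    "es": lambda t: [t, t + " bien"],  # placeholder Spanish decoder
})
print(kb.candidates("hola que tal"))
```

Typing Spanish text into the English-configured keyboard flips the active decoder to Spanish, so the candidate words come from the matching decoder.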
-
Patent number: 10026402
Abstract: A method of searching a business listing with voice commands includes receiving, over the Internet, from a user terminal, a query spoken by a user, which includes a speech utterance representing a category of merchandise, a speech utterance representing a merchandise item, and a speech utterance representing a geographic location.
Type: Grant
Filed: October 3, 2016
Date of Patent: July 17, 2018
Assignee: Google LLC
Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
-
Publication number: 20180188948
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer, and receiving a user input by way of the first modality. The method includes obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term, and generating an input context data structure that references at least the particular term. The method further includes transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
Type: Application
Filed: December 29, 2016
Publication date: July 5, 2018
Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
-
Patent number: 9837070
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying pronunciations. In one aspect, a method includes obtaining a first transcription for an utterance. A second transcription for the utterance is obtained. The second transcription is different from the first transcription. One or more feature scores are determined based on the first transcription and the second transcription. The one or more feature scores are input to a trained classifier. An output of the classifier is received. The output indicates which of the first transcription and the second transcription is more likely to be a correct transcription of the utterance.
Type: Grant
Filed: February 21, 2014
Date of Patent: December 5, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Kanury Kanishka Rao, Francoise Beaufays
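The feature-scores-into-classifier step can be sketched concretely. The features, the hand-set linear weights, and the example transcriptions below are all fabricated; the patent's actual features and trained classifier are not specified in the abstract.

```python
def feature_scores(first, second):
    """Toy features comparing two competing transcriptions."""
    return [
        len(second.split()) - len(first.split()),          # word-count difference
        1.0 if second.lower() == first.lower() else 0.0,   # case-only difference
        sum(w in {"the", "a", "to"} for w in second.split()),  # function words
    ]

def classify(scores, weights=(0.2, -1.0, 0.5), bias=0.0):
    """Linear stand-in for a trained classifier.

    Returns True when the second transcription is more likely correct."""
    return sum(w * s for w, s in zip(weights, scores)) + bias > 0

first, second = "call mom", "call my mom"
better = second if classify(feature_scores(first, second)) else first
print(better)
```

The classifier's output picks between the two hypotheses rather than scoring them independently, which matches the pairwise framing of the abstract.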
-
Patent number: 9747897
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, including selecting terms; obtaining an expected phonetic transcription of an idealized native speaker of a natural language speaking the terms; receiving audio data corresponding to a particular user speaking the terms in the natural language; obtaining, based on the audio data, an actual phonetic transcription of the particular user speaking the terms in the natural language; aligning the expected phonetic transcription of the idealized native speaker of the natural language with the actual phonetic transcription of the particular user; identifying, based on the aligning, a portion of the expected phonetic transcription that is different than a corresponding portion of the actual phonetic transcription; and, based on identifying the portion of the expected phonetic transcription, designating the expected phonetic transcription as a substitute pronunciation for the corresponding portion of the actual phonetic transcription.
Type: Grant
Filed: December 17, 2013
Date of Patent: August 29, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Francoise Beaufays, Pedro J. Moreno Mengibar, Brian Patrick Strope
-
Patent number: 9741339
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining pronunciations for particular terms. The methods, systems, and apparatus include actions of obtaining audio samples of speech corresponding to a particular term and obtaining candidate pronunciations for the particular term. Further actions include generating, for each candidate pronunciation for the particular term and audio sample of speech corresponding to the particular term, a score reflecting a level of similarity between the candidate pronunciation and the audio sample, wherein the score for the particular term is obtained by using a minimum of the individual scores of the phonemes comprising the term. Additional actions include aggregating the scores for each candidate pronunciation and adding one or more candidate pronunciations for the particular term to a pronunciation lexicon based on the aggregated scores for the candidate pronunciations.
Type: Grant
Filed: June 28, 2013
Date of Patent: August 22, 2017
Assignee: Google Inc.
Inventors: Fuchun Peng, Francoise Beaufays, Brian Strope, Xin Lei, Pedro J. Moreno Mengibar, Trevor D. Strohman
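The min-of-phoneme-scores rule stated in this abstract is easy to show in miniature. The per-phoneme scores and the mean aggregation across samples below are invented for illustration; only the "score = minimum over phonemes" rule comes from the abstract.

```python
def pronunciation_score(phoneme_scores):
    # A pronunciation is only as good as its worst-matching phoneme.
    return min(phoneme_scores)

def rank_candidates(candidates):
    """candidates: {pronunciation: [per-sample phoneme-score lists]}"""
    aggregated = {
        pron: sum(pronunciation_score(s) for s in samples) / len(samples)
        for pron, samples in candidates.items()
    }
    return sorted(aggregated.items(), key=lambda kv: -kv[1])

candidates = {
    "k ae t": [[0.9, 0.8, 0.9], [0.8, 0.9, 0.7]],  # strong on both samples
    "k aa t": [[0.9, 0.2, 0.9], [0.8, 0.3, 0.9]],  # weak middle phoneme
}
ranking = rank_candidates(candidates)
print(ranking[0][0])  # top-ranked candidate for the pronunciation lexicon
```

Taking the minimum, rather than the mean, of phoneme scores means one badly matched phoneme sinks the whole candidate even when the rest align well, which is what separates the two candidates here.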
-
Patent number: 9728185
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.
Type: Grant
Filed: May 22, 2015
Date of Patent: August 8, 2017
Assignee: Google Inc.
Inventors: Johan Schalkwyk, Francoise Beaufays, Hasim Sak, John Giannandrea
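The three-stage scoring chain in this abstract (acoustic model, then inverse pronunciation model, then language model) can be sketched with toy tables. Every probability below is fabricated; the sketch only shows how scores flow from phonemes to graphemes to text labels.

```python
phoneme_scores = {"k": 0.7, "ae": 0.2, "t": 0.1}   # from the acoustic model
inverse_pron = {"k": {"c": 0.8, "k": 0.2}}         # phoneme -> grapheme probabilities
language_model = {"c": 0.6, "k": 0.4}              # toy unigram grapheme LM

def text_label_scores(phoneme):
    """Chain the three stages for one phoneme's score."""
    acoustic = phoneme_scores[phoneme]
    # Inverse pronunciation model: phoneme scores -> grapheme scores.
    grapheme = {g: acoustic * p for g, p in inverse_pron[phoneme].items()}
    # Language model: grapheme scores -> text label scores.
    return {g: s * language_model[g] for g, s in grapheme.items()}

scores = text_label_scores("k")
print(max(scores, key=scores.get))
```

The phoneme /k/ could map to either "c" or "k"; the inverse pronunciation model and the language model together resolve it toward "c" in this toy setup.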
-
Publication number: 20170221475
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing a pronunciation dictionary that stores entity name pronunciations. In one aspect, a method includes actions of receiving audio data corresponding to an utterance that includes a command and an entity name. Additional actions may include generating, by an automated speech recognizer, an initial transcription for a portion of the audio data that is associated with the entity name, receiving a corrected transcription for the portion of the utterance that is associated with the entity name, obtaining a phonetic pronunciation that is associated with the portion of the audio data that is associated with the entity name, updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name, receiving a subsequent utterance that includes the entity name, and transcribing the subsequent utterance based at least in part on the updated pronunciation dictionary.
Type: Application
Filed: February 3, 2016
Publication date: August 3, 2017
Inventors: Antoine Jean Bruguier, Fuchun Peng, Francoise Beaufays
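The correction loop described above can be sketched with a plain dictionary. The phoneme strings and entity names are invented examples; a real system would key on richer phonetic representations.

```python
pronunciation_dictionary = {}  # phonetic pronunciation -> entity name

def learn_correction(heard_phonemes, corrected_name):
    """After a user correction, associate the heard pronunciation with the name."""
    pronunciation_dictionary[heard_phonemes] = corrected_name

def transcribe(heard_phonemes, default_guess):
    # Subsequent utterances consult the updated dictionary first.
    return pronunciation_dictionary.get(heard_phonemes, default_guess)

# "Call Zosia" is initially transcribed as "Sasha"; the user corrects it,
# and the pronunciation heard in the audio is stored under the corrected name.
learn_correction("z ow sh ah", "Zosia")
print(transcribe("z ow sh ah", "Sasha"))
```

The next time the same pronunciation is heard, the dictionary overrides the recognizer's default guess; unknown pronunciations still fall through to the recognizer.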
-
Publication number: 20170199665
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
Type: Application
Filed: March 29, 2017
Publication date: July 13, 2017
Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
-
Publication number: 20170185286
Abstract: Methods, systems, and apparatus for receiving data indicating a location of a particular touchpoint representing a latest received touchpoint in a sequence of received touchpoints; identifying candidate characters associated with the particular touchpoint; generating, for each of the candidate characters, a confidence score; identifying different candidate sequences of characters, each including, for each received touchpoint, one candidate character associated with a location of the received touchpoint, and one of the candidate characters associated with the particular touchpoint; for each different candidate sequence of characters, determining a language model score and generating a transcription score based at least on the confidence score for one or more of the candidate characters in the candidate sequence of characters and the language model score for the candidate sequence of characters; and selecting, and providing for output, a representative sequence of characters from among the candidate sequences of characters.
Type: Application
Filed: December 29, 2015
Publication date: June 29, 2017
Inventors: Francoise Beaufays, Yu Ouyang, David Rybach, Michael D. Riley, Lars Hellsten
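The transcription-score combination the abstract describes can be sketched by enumerating candidate sequences exhaustively. The per-touchpoint confidences and the toy bigram language model below are invented numbers; a real decoder would prune with a beam rather than enumerate.

```python
import itertools, math

touchpoint_candidates = [  # one {character: confidence} dict per touchpoint
    {"h": 0.9, "g": 0.1},
    {"i": 0.6, "o": 0.4},
]
bigrams = {("h", "i"): 0.7, ("h", "o"): 0.2, ("g", "i"): 0.05, ("g", "o"): 0.05}

def transcription_score(seq):
    """Combine per-character confidence scores with a language model score."""
    confidence = sum(math.log(tp[c]) for tp, c in zip(touchpoint_candidates, seq))
    lm = math.log(bigrams[tuple(seq)])
    return confidence + lm

sequences = list(itertools.product(*touchpoint_candidates))
best = max(sequences, key=transcription_score)
print("".join(best))
```

With these numbers, "hi" wins over "ho" even though "o" has a plausible confidence, because the language model score tips the combined transcription score.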
-
Patent number: 9678664
Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
Type: Grant
Filed: April 10, 2015
Date of Patent: June 13, 2017
Assignee: Google Inc.
Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
-
Patent number: 9594744
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. The actions further include generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text by composing the lexicon model, the inverse of the transducer, and the language model.
Type: Grant
Filed: March 14, 2013
Date of Patent: March 14, 2017
Assignee: Google Inc.
Inventors: Hasim Sak, Francoise Beaufays
-
Publication number: 20170025123
Abstract: A method of operating a voice-enabled business directory search system includes receiving category-business pairs, each category-business pair including a business category and a specific business, and establishing a data structure having nodes based on the category-business pairs. Each node of the data structure is associated with one or more business categories and a speech recognition language model for recognizing specific businesses associated with the one or more business categories.
Type: Application
Filed: October 3, 2016
Publication date: January 26, 2017
Inventors: Brian Strope, William J. Byrne, Francoise Beaufays
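The node structure described above, categories paired with per-node recognition models, can be sketched as a small tree. The categories and business names are invented, and a set of recognizable names stands in for a real speech recognition language model.

```python
class Node:
    """A node associating business categories with a recognition model."""
    def __init__(self, categories, businesses, children=()):
        self.categories = categories
        self.language_model = set(businesses)  # stand-in for a real LM
        self.children = list(children)

root = Node(["all"], ["Acme Pizza", "Lakeside Books"], children=[
    Node(["restaurants"], ["Acme Pizza"]),
    Node(["bookstores"], ["Lakeside Books"]),
])

def model_for(node, category):
    """Find the recognition model for a spoken business category."""
    if category in node.categories:
        return node.language_model
    for child in node.children:
        found = model_for(child, category)
        if found is not None:
            return found
    return None

print(model_for(root, "restaurants"))
```

Scoping the language model to the spoken category shrinks the recognition vocabulary at each node, which is the practical payoff of the category-business data structure.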
-
Publication number: 20160351188
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for learning pronunciations from acoustic sequences. One method includes receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the time steps, processing the acoustic feature representation through each of one or more recurrent neural network layers to generate a recurrent output; processing the recurrent output for the time step using a phoneme output layer to generate a phoneme representation for the acoustic feature representation for the time step; processing the recurrent output for the time step using a grapheme output layer to generate a grapheme representation for the acoustic feature representation for the time step; and extracting, from the phoneme and grapheme representations for the acoustic feature representations at each time step, a respective pronunciation for each of one or more words.
Type: Application
Filed: July 29, 2015
Publication date: December 1, 2016
Inventors: Kanury Kanishka Rao, Francoise Beaufays, Hasim Sak, Ouais Alsharif
-
Publication number: 20160307569
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining a template that defines (i) trigger criteria for presenting a notification type and (ii) content rules for determining content to include in a notification of the notification type. Additional actions include accessing enterprise resources of an enterprise, the enterprise resources including data describing entities related to the enterprise and relationships among the entities. Further actions include accessing user information specific to a user and determining that the trigger criteria are satisfied by the enterprise resources and the user information. Additional actions include generating a particular notification of the notification type based at least on the content rules and providing the particular notification to the user.
Type: Application
Filed: April 14, 2015
Publication date: October 20, 2016
Inventors: Fuchun Peng, Jakob Nicolaus Foerster, Diego Melendo Casado, Fei Huang, Francoise Beaufays