Patents by Inventor Françoise Beaufays

Françoise Beaufays has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230156248
    Abstract: Implementations disclosed herein are directed to ephemeral learning of machine learning (“ML”) model(s) based on gradient(s) generated at a remote system (e.g., remote server(s)). Processor(s) of the remote system can receive stream(s) of audio data capturing spoken utterance(s) from a client device of a user. A fulfillment pipeline can process the stream(s) of audio data to cause certain fulfillment(s) of the spoken utterance(s) to be performed. Meanwhile, a training pipeline can process the stream(s) of audio data to generate gradient(s) using unsupervised learning techniques. Subsequent to the processing by the fulfillment pipeline and/or the training pipeline, the stream(s) of audio data are discarded by the remote system. Accordingly, the ML model(s) can be trained at the remote system without the stream(s) of audio data being stored or logged in non-transient memory thereof, thereby providing a more efficient mechanism for training the ML model(s) while also increasing the security of user data.
    Type: Application
    Filed: November 23, 2021
    Publication date: May 18, 2023
    Inventors: Françoise Beaufays, Khe Chai Sim, Trevor Strohman, Oren Litvin
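
A minimal sketch of the ephemeral-learning flow described in the entry above: the utterance is processed once by a fulfillment path and a training path, an unsupervised gradient updates the model, and the audio is then discarded rather than logged. The toy model, the `run_fulfillment` stub, and the reconstruction-style objective are illustrative assumptions, not details from the filing.

```python
# A toy, numpy-only sketch of ephemeral learning at a remote system.
import numpy as np

def run_fulfillment(audio: np.ndarray) -> None:
    """Stand-in for the fulfillment pipeline (e.g., executing the spoken request)."""
    print(f"fulfilled utterance of {audio.shape[0]} samples")

def unsupervised_grad(weights: np.ndarray, audio: np.ndarray) -> np.ndarray:
    """Toy unsupervised objective: gradient of a reconstruction-style loss."""
    feats = audio[: weights.shape[0]]
    return 2.0 * (weights - feats) / weights.shape[0]

def handle_utterance(audio_chunks, weights, lr=1e-3):
    """Process a streamed utterance once, update the model, then discard the audio."""
    audio = np.concatenate(list(audio_chunks))   # held only in transient memory
    run_fulfillment(audio)                       # fulfillment pipeline
    weights = weights - lr * unsupervised_grad(weights, audio)  # training pipeline
    del audio                                    # nothing is logged or stored durably
    return weights

weights = np.zeros(16)
weights = handle_utterance([np.random.randn(8000) for _ in range(3)], weights)
```
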
  • Publication number: 20230107475
    Abstract: A computer-implemented method includes obtaining a multi-domain (MD) dataset and training a neural network model using the MD dataset with short-form data withheld (MD-SF). The neural network model includes a plurality of layers, each having a plurality of parameters. The method also includes resetting each respective layer in the trained neural network model one at a time. For each respective layer in the trained neural network model, and after resetting the respective layer, the method also includes determining a corresponding word error rate of the trained neural network model and identifying the respective layer as corresponding to an ambient layer when the corresponding word error rate satisfies a word error rate threshold. The method also includes transmitting an on-device neural network model to execute on one or more client devices for generating gradients based on the withheld domain (SF) of the MD dataset.
    Type: Application
    Filed: October 4, 2022
    Publication date: April 6, 2023
    Applicant: Google LLC
    Inventors: Dhruv Guliani, Lillian Zhou, Andreas Kebel, Giovanni Motta, Francoise Beaufays
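
A minimal sketch of the layer-reset probe from publication 20230107475: each layer of a trained model is reset one at a time, the word error rate is re-measured, and layers whose reset keeps WER under a threshold are flagged as ambient. The dict-of-arrays model and the stand-in `evaluate_wer` are assumptions for illustration.

```python
import copy
import numpy as np

def evaluate_wer(model: dict) -> float:
    """Stand-in WER evaluator; a real system would decode a held-out set."""
    return float(np.mean([np.abs(w).mean() for w in model.values()]))

def find_ambient_layers(trained: dict, initial: dict, wer_threshold: float) -> list:
    """Reset one layer at a time; layers whose reset keeps WER under threshold are 'ambient'."""
    ambient = []
    for name in trained:
        probe = copy.deepcopy(trained)
        probe[name] = initial[name].copy()        # reset just this layer
        if evaluate_wer(probe) <= wer_threshold:  # the model barely depends on this layer
            ambient.append(name)
    return ambient

trained = {"enc.0": np.random.randn(4, 4), "enc.1": np.random.randn(4, 4)}
initial = {k: np.zeros_like(v) for k, v in trained.items()}
print(find_ambient_layers(trained, initial, wer_threshold=0.5))
```
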
  • Publication number: 20230068897
    Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
    Type: Application
    Filed: November 9, 2022
    Publication date: March 2, 2023
    Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
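
A minimal sketch of the on-device TTS-to-ASR loop described above: a locally stored textual segment is synthesized to audio, decoded by a toy ASR model, and the mismatch with the original text yields a gradient that can update local weights or be sent to a remote system. The scalar model and squared-error loss are illustrative only.

```python
import numpy as np

def tts_generate(text: str) -> np.ndarray:
    """Toy TTS: map characters to a synthetic 'audio' vector."""
    return np.array([ord(c) / 128.0 for c in text])

def asr_logits(weights: np.ndarray, audio: np.ndarray) -> np.ndarray:
    """Toy ASR: a single scalar weight scales the audio back toward character scores."""
    return weights * audio

def gradient_from_textual_segment(text: str, weights: np.ndarray, lr=0.1):
    target = np.array([ord(c) / 128.0 for c in text])     # ground truth from the text itself
    audio = tts_generate(text)                            # synthesized speech audio data
    pred = asr_logits(weights, audio)                     # predicted ASR output
    grad = 2.0 * np.mean((pred - target) * audio)         # d/dw of the squared error
    weights = weights - lr * grad                         # local update ...
    return weights, grad                                  # ... and/or send grad to the server

w, g = gradient_from_textual_segment("hello device", np.array(0.5))
print(w, g)
```
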
  • Patent number: 11573698
    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: February 7, 2023
    Assignee: Google LLC
    Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk
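
A minimal sketch of decoding a keyboard gesture from its spatial features with a small neural network, as in the grant above; the two-layer network, the particular features, and the toy vocabulary are assumptions rather than the patented architecture.

```python
import numpy as np

VOCAB = ["the", "hello", "keyboard"]   # candidate character strings (toy vocabulary)

def spatial_features(points: np.ndarray) -> np.ndarray:
    """Physical properties of the gesture: start/end position and total path length."""
    deltas = np.diff(points, axis=0)
    path_len = np.sum(np.linalg.norm(deltas, axis=1))
    return np.concatenate([points[0], points[-1], [path_len]])

def decode_gesture(points: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> str:
    """Feed spatial features through a tiny network and pick the best-scoring string."""
    h = np.tanh(spatial_features(points) @ w1)
    scores = h @ w2
    return VOCAB[int(np.argmax(scores))]

rng = np.random.default_rng(0)
points = rng.random((20, 2))                       # touch trajectory on the keyboard
print(decode_gesture(points, rng.normal(size=(5, 8)), rng.normal(size=(8, len(VOCAB)))))
```
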
  • Patent number: 11545133
    Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: January 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
  • Publication number: 20220413696
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Application
    Filed: August 31, 2022
    Publication date: December 29, 2022
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
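
A minimal sketch of cross-modality learning as described in this publication: the speech recognizer packages a recognized term into an input-context data structure and hands it to the keyboard recognizer, which updates its own model. The dataclass fields and the lexicon update are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class InputContext:
    terms: list            # particular term(s) recognized via the first modality
    source_modality: str
    app: str = "unknown"

class KeyboardRecognizer:
    """Second-modality recognizer whose language model learns new terms."""
    def __init__(self):
        self.lexicon = {"hello", "world"}

    def update_from_context(self, ctx: InputContext) -> None:
        self.lexicon.update(ctx.terms)          # update the second modality recognition model

def on_voice_transcription(transcription: str, keyboard: KeyboardRecognizer) -> None:
    ctx = InputContext(terms=transcription.split(), source_modality="voice", app="messaging")
    keyboard.update_from_context(ctx)           # transmit the context to the second recognizer

kb = KeyboardRecognizer()
on_voice_transcription("brunch at Zeitgeist", kb)
print("Zeitgeist" in kb.lexicon)   # True: the keyboard can now suggest the new term
```
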
  • Publication number: 20220405549
    Abstract: Techniques are disclosed that enable generating jointly probable output by processing input using a multi-stream recurrent neural network transducer (MS RNN-T) model. Various implementations include generating a first output sequence and a second output sequence by processing a single input sequence using the MS RNN-T, where the first output sequence is jointly probable with the second output sequence. Additional or alternative techniques are disclosed that enable generating output by processing multiple input sequences using the MS RNN-T. Various implementations include processing a first input sequence and a second input sequence using the MS RNN-T to generate output. In some implementations, the MS RNN-T can be used to process two or more input sequences to generate two or more jointly probable output sequences.
    Type: Application
    Filed: December 15, 2020
    Publication date: December 22, 2022
    Inventors: Khe Chai Sim, Françoise Beaufays
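
A loose, minimal sketch of the multi-stream idea only (this is not a real RNN-T; there is no prediction network or blank symbol): two output heads read the same encoder state, so the two emitted sequences are tied to a shared representation. All shapes and the greedy per-frame decoding are assumptions for illustration.

```python
import numpy as np

def encode(x_seq, w):
    """Toy recurrent encoder: h_t = tanh(W [h_{t-1}; x_t])."""
    h = np.zeros(w.shape[0])
    states = []
    for x in x_seq:
        h = np.tanh(w @ np.concatenate([h, x]))
        states.append(h)
    return states

def ms_decode(x_seq, w_enc, w_head_a, w_head_b):
    """Emit two aligned output streams from the shared encoder states."""
    out_a, out_b = [], []
    for h in encode(x_seq, w_enc):
        out_a.append(int(np.argmax(w_head_a @ h)))   # stream A token
        out_b.append(int(np.argmax(w_head_b @ h)))   # stream B token, tied to the same state
    return out_a, out_b

rng = np.random.default_rng(1)
x = [rng.normal(size=4) for _ in range(6)]
print(ms_decode(x, rng.normal(size=(8, 12)), rng.normal(size=(5, 8)), rng.normal(size=(3, 8))))
```
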
  • Patent number: 11527248
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in the correctness of the recognition result. The method also includes completing a portion of the speech recognition tasks, including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meet a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated recognition results.
    Type: Grant
    Filed: May 27, 2020
    Date of Patent: December 13, 2022
    Assignee: GOOGLE LLC
    Inventors: Brian Strope, Francoise Beaufays, Olivier Siohan
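
A minimal sketch of racing several speech recognition systems (SRS's) and aborting the rest once one result clears a confidence threshold, per the grant above; the fake recognizers, delays, and scores are illustrative.

```python
import concurrent.futures as cf
import time

def make_srs(name, delay, confidence):
    def recognize(audio):
        time.sleep(delay)                       # simulate decoding latency
        return name, f"transcript-from-{name}", confidence
    return recognize

def recognize_with_pool(audio, systems, threshold=0.8):
    with cf.ThreadPoolExecutor(max_workers=len(systems)) as pool:
        futures = [pool.submit(srs, audio) for srs in systems]
        for fut in cf.as_completed(futures):
            name, text, conf = fut.result()
            if conf >= threshold:               # good enough: abort the remaining tasks
                for other in futures:
                    other.cancel()
                return text, conf
    return None, 0.0                            # no SRS met the threshold

systems = [make_srs("fast", 0.1, 0.9), make_srs("slow", 1.0, 0.95)]
print(recognize_with_pool(b"...audio...", systems))
```
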
  • Publication number: 20220383204
    Abstract: Implementations relate to ascertaining to what extent predictions, generated using a machine learning model, can be effectively reconstructed from model updates, where the model updates are generated based on those predictions and based on applying a particular loss technique (e.g., a particular cross-entropy loss technique). Some implementations disclosed generate measures that each indicate a degree of conformity between a corresponding reconstruction, generated using a corresponding model update, and a corresponding prediction. In some of those implementations, the measures are utilized in determining whether to utilize the particular loss technique (utilized in generating the model updates) in federated learning of the machine learning model and/or of additional machine learning model(s).
    Type: Application
    Filed: November 24, 2021
    Publication date: December 1, 2022
    Inventors: Om Dipakbhai Thakkar, Trung Dang, Swaroop Indra Ramaswamy, Rajiv Mathews, Françoise Beaufays
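
A minimal sketch of the reconstruction idea behind this publication, using the standard fact that for a softmax output with cross-entropy loss the single-example gradient with respect to the output bias equals (prediction − one-hot label); cosine similarity is used here as one possible conformity measure, not necessarily the one in the filing.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def bias_gradient(logits, label, num_classes):
    """Single-example cross-entropy gradient w.r.t. the output bias."""
    y = np.eye(num_classes)[label]
    return softmax(logits) - y

def reconstruct_prediction(bias_grad):
    """Recover the prediction from the update: add back the (inferred) one-hot label."""
    label = int(np.argmin(bias_grad))       # the true class has the only negative entry
    return bias_grad + np.eye(len(bias_grad))[label]

def conformity(p_true, p_rec):
    """Cosine similarity between the actual prediction and its reconstruction."""
    return float(p_true @ p_rec / (np.linalg.norm(p_true) * np.linalg.norm(p_rec)))

logits = np.array([2.0, 0.5, -1.0, 0.1])
grad = bias_gradient(logits, label=1, num_classes=4)
print(conformity(softmax(logits), reconstruct_prediction(grad)))   # close to 1.0
```
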
  • Publication number: 20220368343
    Abstract: Systems and methods for compression of data that exhibits mixed compressibility, such as floating-point data, are provided. As one example, aspects of the present disclosure can be used to compress floating-point data that represents the values of parameters of a machine-learned model. Therefore, aspects of the present disclosure can be used to compress machine-learned models (e.g., for reducing storage requirements associated with the model, reducing the bandwidth expended to transmit the model, etc.).
    Type: Application
    Filed: September 9, 2019
    Publication date: November 17, 2022
    Inventors: Giovanni Motta, Francoise Beaufays, Petr Zadrazil
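
A minimal sketch of exploiting mixed compressibility in float32 parameters: the high byte of each value (sign plus most of the exponent) repeats heavily and compresses well, while the mantissa bytes look close to random and are left raw. The byte-plane split and the use of zlib are assumptions, not the patented codec.

```python
import numpy as np
import zlib

def compress_params(params: np.ndarray):
    raw = params.astype(np.float32).tobytes()
    planes = [raw[i::4] for i in range(4)]             # 4 byte planes per float32 (little-endian)
    header = planes[3]                                  # sign + most of the exponent
    return zlib.compress(header), b"".join(planes[:3])  # compress only the compressible plane

def decompress_params(compressed_header, mantissa_bytes, n):
    header = zlib.decompress(compressed_header)
    planes = [mantissa_bytes[i * n:(i + 1) * n] for i in range(3)] + [header]
    raw = bytes(b for i in range(n)
                for b in (planes[0][i], planes[1][i], planes[2][i], planes[3][i]))
    return np.frombuffer(raw, dtype=np.float32)

params = np.random.randn(1024).astype(np.float32) * 0.01
hdr, mant = compress_params(params)
restored = decompress_params(hdr, mant, params.size)
print(np.allclose(params, restored), len(hdr) + len(mant), params.nbytes)
```
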
  • Publication number: 20220309389
    Abstract: Implementations disclosed herein are directed to systems and methods for evaluating on-device machine learning (ML) model(s) based on performance measure(s) of client device(s) and/or the on-device ML model(s). The client device(s) can include on-device memory that stores the on-device ML model(s) and a plurality of testing instances for the on-device ML model(s). When certain condition(s) are satisfied, the client device(s) can process, using the on-device ML model(s), the plurality of testing instances to generate the performance measure(s). The performance measure(s) can include, for example, latency measure(s), memory consumption measure(s), CPU usage measure(s), ML model measure(s) (e.g., precision and/or recall), and/or other measures. In some implementations, the on-device ML model(s) can be activated (or kept active) for use locally at the client device(s) based on the performance measure(s). In other implementations, the on-device ML model(s) can be sparsified based on the performance measure(s).
    Type: Application
    Filed: March 29, 2021
    Publication date: September 29, 2022
    Inventors: Dragan Zivkovic, Akash Agrawal, Françoise Beaufays, Tamar Lucassen
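
A minimal sketch of gating an on-device model on locally measured performance, as described above: stored testing instances are run through the model, latency and accuracy measures are collected, and the model is only kept active if both meet thresholds. The toy model, instances, and thresholds are illustrative.

```python
import time
import numpy as np

def benchmark(model_fn, testing_instances):
    """Run the stored testing instances and collect performance measures."""
    latencies, correct = [], 0
    for features, expected in testing_instances:
        start = time.perf_counter()
        pred = model_fn(features)
        latencies.append(time.perf_counter() - start)
        correct += int(pred == expected)
    return {"p50_latency_s": float(np.median(latencies)),
            "accuracy": correct / len(testing_instances)}

def should_activate(measures, max_latency_s=0.01, min_accuracy=0.9):
    """Keep the on-device model active only if both measures clear their thresholds."""
    return measures["p50_latency_s"] <= max_latency_s and measures["accuracy"] >= min_accuracy

model_fn = lambda x: int(np.sum(x) > 0)                    # toy on-device model
instances = [(np.random.randn(8), 1) for _ in range(20)]   # stored testing instances
measures = benchmark(model_fn, instances)
print(measures, should_activate(measures))
```
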
  • Publication number: 20220293093
    Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, generate corresponding remote gradients based on the additional corresponding predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof.
    Type: Application
    Filed: March 10, 2021
    Publication date: September 15, 2022
    Inventors: Françoise Beaufays, Andrew Hard, Swaroop Indra Ramaswamy, Om Dipakbhai Thakkar, Rajiv Mathews
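
A minimal sketch of the hybrid round described in this publication, in which gradients computed on client data and gradients computed on remote data are combined into one update of the global model; the toy linear model, squared loss, and equal weighting are assumptions.

```python
import numpy as np

def local_gradient(weights, features, target):
    """Gradient of a squared-error loss for a toy linear model (runs on-device or remotely)."""
    pred = features @ weights
    return 2.0 * (pred - target) * features

def federated_round(global_w, client_batches, remote_batches, lr=0.05):
    client_grads = [local_gradient(global_w, x, y) for x, y in client_batches]   # from devices
    remote_grads = [local_gradient(global_w, x, y) for x, y in remote_batches]   # from server data
    combined = np.mean(client_grads + remote_grads, axis=0)
    return global_w - lr * combined

rng = np.random.default_rng(2)
w = np.zeros(4)
clients = [(rng.normal(size=4), 1.0) for _ in range(5)]
remote = [(rng.normal(size=4), 1.0) for _ in range(3)]
for _ in range(10):
    w = federated_round(w, clients, remote)
print(w)
```
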
  • Patent number: 11435898
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross input modality learning in a mobile device are disclosed. In one aspect, a method includes activating a first modality user input mode in which user inputs by way of a first modality are recognized using a first modality recognizer; and receiving a user input by way of the first modality. The method includes, obtaining, as a result of the first modality recognizer recognizing the user input, a transcription that includes a particular term; and generating an input context data structure that references at least the particular term. The method further includes, transmitting, by the first modality recognizer, the input context data structure to a second modality recognizer for use in updating a second modality recognition model associated with the second modality recognizer.
    Type: Grant
    Filed: October 6, 2020
    Date of Patent: September 6, 2022
    Assignee: Google LLC
    Inventors: Yu Ouyang, Diego Melendo Casado, Mohammadinamul Hasan Sheik, Francoise Beaufays, Dragan Zivkovic, Meltem Oktem
  • Publication number: 20220270590
    Abstract: Implementations disclosed herein are directed to unsupervised federated training of global machine learning (“ML”) model layers that, after the federated training, can be combined with additional layer(s), thereby resulting in a combined ML model. Processor(s) can: detect audio data that captures a spoken utterance of a user of a client device; process, using a local ML model, the audio data to generate predicted output(s); generate, using unsupervised learning locally at the client device, a gradient based on the predicted output(s); transmit the gradient to a remote system; update weight(s) of the global ML model layers based on the gradient; subsequent to updating the weight(s), train, using supervised learning remotely at the remote system, a combined ML model that includes the updated global ML model layers and additional layer(s); transmit the combined ML model to the client device; and use the combined ML model to make prediction(s) at the client device.
    Type: Application
    Filed: July 20, 2020
    Publication date: August 25, 2022
    Inventors: Françoise Beaufays, Khe Chai Sim, Johan Schalkwyk
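
A minimal sketch of the two-phase flow described above: shared encoder layers are first updated from device gradients computed with an unsupervised objective, then combined with an additional output layer and trained with labels at the remote system. The toy objectives and shapes are illustrative only.

```python
import numpy as np

def unsupervised_device_grad(enc_w, audio_feats):
    """On-device: reconstruction-style gradient, no labels needed."""
    recon = enc_w @ (enc_w.T @ audio_feats)
    return 2.0 * np.outer(recon - audio_feats, enc_w.T @ audio_feats)

def federated_unsupervised(enc_w, device_batches, lr=0.01):
    """Remote aggregation of unsupervised gradients into the global encoder layers."""
    grads = [unsupervised_device_grad(enc_w, x) for x in device_batches]
    return enc_w - lr * np.mean(grads, axis=0)

def supervised_finetune(enc_w, head_w, labeled, lr=0.05):
    """Remote: train the combined model (frozen encoder + new head) with labels."""
    for x, y in labeled:
        h = np.tanh(enc_w.T @ x)
        grad_head = 2.0 * (head_w @ h - y) * h
        head_w = head_w - lr * grad_head
    return enc_w, head_w

rng = np.random.default_rng(3)
enc_w = rng.normal(size=(8, 4)) * 0.1          # global encoder layers
enc_w = federated_unsupervised(enc_w, [rng.normal(size=8) for _ in range(6)])
enc_w, head_w = supervised_finetune(enc_w, np.zeros(4), [(rng.normal(size=8), 1.0) for _ in range(6)])
print(head_w)
```
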
  • Publication number: 20220229548
    Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text, and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different: the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
    Type: Application
    Filed: April 6, 2022
    Publication date: July 21, 2022
    Applicant: Google LLC
    Inventors: Ouais Alsharif, Peter Ciccotto, Francoise Beaufays, Dragan Zivkovic
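
A minimal sketch of the decoder-switching logic from this publication: once the typed text satisfies a length threshold, a language identifier runs, and a second decoder is enabled when its language matches the detected target language. The keyword-based identifier and the decoder table are stand-ins for the keyboard's real models.

```python
def identify_language(text: str) -> str:
    """Toy language ID: keyword spotting stands in for the keyboard's language model."""
    return "es" if any(w in text.lower() for w in ("hola", "gracias", "mañana")) else "en"

DECODERS = {
    "en": lambda text: [text + " (en suggestion)"],
    "es": lambda text: [text + " (sugerencia es)"],
}

def suggest(text: str, active_lang: str, min_chars: int = 6):
    if len(text) >= min_chars:                       # characteristic satisfies the threshold
        target = identify_language(text)
        if target != active_lang and target in DECODERS:
            active_lang = target                     # enable the matching second decoder
    return active_lang, DECODERS[active_lang](text)

print(suggest("hola amigo", active_lang="en"))       # switches to the Spanish decoder
```
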
  • Patent number: 11341970
    Abstract: A method of providing navigation directions includes receiving, at a user terminal, a query spoken by a user, wherein the query spoken by the user includes a speech utterance indicating (i) a category of business, (ii) a name of the business, and (iii) a location at which or near which the business is disposed; identifying, by processing hardware, the business based on the speech utterance; and providing navigation directions to the business via the user terminal.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: May 24, 2022
    Assignee: GOOGLE LLC
    Inventors: Brian Strope, Francoise Beaufays, William J. Byrne
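
A minimal sketch of resolving a spoken query that names a business category, name, and location into a navigation target, per the grant above; the phrase pattern, the tiny in-memory directory, and the directions string are all hypothetical stand-ins.

```python
import re

DIRECTORY = [
    {"name": "Patxi's Pizza", "category": "pizza restaurant", "location": "palo alto"},
    {"name": "Blue Bottle", "category": "coffee shop", "location": "san francisco"},
]

def parse_query(utterance: str):
    """Pull out (category, name, location) from a 'the CATEGORY NAME near LOCATION' utterance."""
    m = re.match(r"(?:directions to )?the (.+) (\S+) near (.+)", utterance.lower())
    return m.groups() if m else (None, None, None)

def identify_business(category, name, location):
    for biz in DIRECTORY:
        if (name or "").lower() in biz["name"].lower() and (location or "") in biz["location"]:
            return biz
    return None

def navigate(utterance: str) -> str:
    category, name, location = parse_query(utterance)
    biz = identify_business(category, name, location)
    return f"Navigating to {biz['name']}, {biz['location']}" if biz else "No match found"

print(navigate("directions to the pizza restaurant Patxi's near Palo Alto"))
```
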
  • Patent number: 11327652
    Abstract: A keyboard is described that determines, using a first decoder and based on a selection of keys of a graphical keyboard, text. Responsive to determining that a characteristic of the text satisfies a threshold, a model of the keyboard identifies the target language of the text, and determines whether the target language is different than a language associated with the first decoder. If the target language of the text is not different than the language associated with the first decoder, the keyboard outputs, for display, an indication of first candidate words determined by the first decoder from the text. If the target language of the text is different: the keyboard enables a second decoder, where a language associated with the second decoder matches the target language of the text, and outputs, for display, an indication of second candidate words determined by the second decoder from the text.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: May 10, 2022
    Assignee: Google LLC
    Inventors: Ouais Alsharif, Peter Ciccotto, Francoise Beaufays, Dragan Zivkovic
  • Publication number: 20220115000
    Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using an on-device TTS generator model, to generate synthesized speech audio data that includes synthesized speech of the textual segment; process the synthesized speech, using an on-device ASR model to generate predicted ASR output; and generate a gradient based on comparing the predicted ASR output to ground truth output corresponding to the textual segment. Processor(s) of the client device can also: process the synthesized speech audio data using an on-device TTS generator model to make a prediction; and generate a gradient based on the prediction. In these implementations, the generated gradient(s) can be used to update weight(s) of the respective on-device model(s) and/or transmitted to a remote system for use in remote updating of respective global model(s). The updated weight(s) and/or the updated model(s) can be transmitted to client device(s).
    Type: Application
    Filed: October 28, 2020
    Publication date: April 14, 2022
    Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
  • Publication number: 20220005458
    Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
    Type: Application
    Filed: September 20, 2021
    Publication date: January 6, 2022
    Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim
  • Publication number: 20210405868
    Abstract: In some examples, a computing device includes at least one processor; and at least one module, operable by the at least one processor to: output, for display at an output device, a graphical keyboard; receive an indication of a gesture detected at a location of a presence-sensitive input device, wherein the location of the presence-sensitive input device corresponds to a location of the output device that outputs the graphical keyboard; determine, based on at least one spatial feature of the gesture that is processed by the computing device using a neural network, at least one character string, wherein the at least one spatial feature indicates at least one physical property of the gesture; and output, for display at the output device, based at least in part on the processing of the at least one spatial feature of the gesture using the neural network, the at least one character string.
    Type: Application
    Filed: September 8, 2021
    Publication date: December 30, 2021
    Applicant: Google LLC
    Inventors: Shumin Zhai, Thomas Breuel, Ouais Alsharif, Yu Ouyang, Francoise Beaufays, Johan Schalkwyk