Patents by Inventor Christian Fuegen

Christian Fuegen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion

Patent number: 12087306

Abstract: In one embodiment, a method includes receiving a user's utterance comprising a word in a custom vocabulary list of the user, generating a previous token to represent a previous audio portion of the utterance, and generating a current token to represent a current audio portion of the utterance by generating a bias embedding by using the previous token to query a trie of wordpieces representing the custom vocabulary list, generating first probabilities of respective first candidate tokens likely uttered in the current audio portion based on the bias embedding and the current audio portion, generating second probabilities of respective second candidate tokens likely uttered after the previous token based on the previous token and the bias embedding, and generating the current token to represent the current audio portion of the utterance based on the first probabilities of the first candidate tokens and the second probabilities of the second candidate tokens.
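The abstract above combines two ideas: querying a trie of wordpieces with the previous token, and merging two probability distributions to pick the current token. The following is a minimal sketch of that flow, not the patented method. It swaps the learned bias embedding for a simple multiplicative boost on trie continuations, and every name in it (`WordpieceTrie`, `apply_bias`, `pick_token`, the `boost` and `weight` values, the toy probabilities) is a hypothetical chosen for illustration.

```python
import math
from typing import Dict, List, Sequence


class WordpieceTrie:
    """Trie over wordpiece sequences from a custom vocabulary (hypothetical helper)."""

    def __init__(self) -> None:
        self.root: Dict[str, dict] = {}

    def add(self, pieces: Sequence[str]) -> None:
        # Walk down the trie, creating child nodes as needed.
        node = self.root
        for piece in pieces:
            node = node.setdefault(piece, {})

    def next_pieces(self, prefix: Sequence[str]) -> List[str]:
        # Wordpieces that can extend `prefix`; empty if the prefix is not in the trie.
        node = self.root
        for piece in prefix:
            if piece not in node:
                return []
            node = node[piece]
        return sorted(node)


def apply_bias(probs: Dict[str, float], favored: Sequence[str],
               boost: float = 4.0) -> Dict[str, float]:
    # Stand-in for the bias embedding: scale up favored tokens, then renormalize.
    scaled = {tok: p * (boost if tok in favored else 1.0) for tok, p in probs.items()}
    total = sum(scaled.values())
    return {tok: p / total for tok, p in scaled.items()}


def pick_token(first: Dict[str, float], second: Dict[str, float],
               weight: float = 0.5) -> str:
    # Log-linear combination of the two distributions (assumed fusion rule).
    def score(tok: str) -> float:
        return math.log(first[tok]) + weight * math.log(second.get(tok, 1e-9))
    return max(first, key=score)


trie = WordpieceTrie()
trie.add(["fu", "##egen"])               # custom vocabulary entry, split into wordpieces
favored = trie.next_pieces(["fu"])       # query the trie with the previous token

audio_probs = {"##egen": 0.4, "##gen": 0.6}   # toy "first probabilities"
text_probs = {"##egen": 0.3, "##gen": 0.7}    # toy "second probabilities"

biased = pick_token(apply_bias(audio_probs, favored), apply_bias(text_probs, favored))
unbiased = pick_token(audio_probs, text_probs)
```

With these toy numbers the biased pick is `##egen`, while the unbiased pick is `##gen`, which shows how the trie lookup can steer decoding toward a custom vocabulary word.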

Type: Grant

Filed: November 24, 2021

Date of Patent: September 10, 2024

Assignee: Meta Platforms, Inc.

Inventors: Duc Hoang Le, FNU Mahaveer, Gil Keren, Christian Fuegen, Yatharth Saraf

Stylizing Text-to-Speech (TTS) Voice Response for Assistant Systems

Publication number: 20230118412

Abstract: In one embodiment, a method includes receiving a voice input having first audio features at a client system, generating a text response corresponding to the voice input, wherein the text response is associated with style features, generating an output audio waveform of the text response by a text-to-speech model on the client system, wherein the output audio waveform is generated based on the first audio features and the style features, wherein the output audio waveform comprises second audio features, and rendering the output audio waveform at the client system in response to the voice input.

Type: Application

Filed: December 21, 2022

Publication date: April 20, 2023

Inventors: Yang Gao, Weiyi Zheng, Zhaojun Yang, Thilo Wolfgang Koehler, Christian Fuegen, Qing He

Stylizing text-to-speech (TTS) voice response for assistant systems

Patent number: 11562744

Abstract: In one embodiment, a method includes receiving a voice input from a user and determining a first style of the voice input, based on first features extracted from the voice input. A second style for a voice response having second features may then be determined based on the first style. Finally, the voice response may be generated based on the second features of the second style, and this voice response may be provided in response to the voice input.

Type: Grant

Filed: February 13, 2020

Date of Patent: January 24, 2023

Assignee: Meta Platforms Technologies, LLC

Inventors: Yang Gao, Weiyi Zheng, Zhaojun Yang, Thilo Wolfgang Koehler, Christian Fuegen, Qing He

Methods and systems for performing end-to-end spoken language analysis

Patent number: 11107462

Abstract: Exemplary embodiments relate to improvements in spoken language understanding (SLU) systems. Conventionally, SLU systems include an automatic speech recognition (ASR) component configured to receive an input of audio data and to generate a textual representation of the audio data. Conventional SLU systems also include a natural language understanding (NLU) component configured to receive a text-based transcript and perform language-based tasks such as domain classification, intent determination, and slot-filling. However, these two components are typically trained separately based on different metrics. In real-world situations, errors in the ASR component propagate to the NLU component, which degrades the performance of the overall system. Exemplary embodiments described herein perform SLU in an end-to-end manner that infers semantic meaning directly from audio features without an intermediate text representation.

Type: Grant

Filed: October 30, 2018

Date of Patent: August 31, 2021

Assignee: FACEBOOK, INC.

Inventors: Christian Fuegen, Yongqiang Wang, Anuj Kumar, Baiyang Liu, Dmitrii Serdiuk

Social hash for language models

Patent number: 10902215

Abstract: Components of language processing engines, such as translation models and language models, can be customized for groups of users or based on user type values. Users can be organized into groups or assigned a value on a continuum based on factors such as interests, biographical characteristics, social media interactions, etc. In some implementations, translation engine components can be customized for groups of users by selecting the training data from content created by users in that group. In some implementations, the group identifier or continuum value can be part of the input into a general translation component allowing the translation component to take a language style of that user group into account when performing language processing tasks.
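The two customization routes named in the abstract above (selecting training data by user group, or passing a group identifier as part of the input) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `(author, text)` post format, and the `<group:...>` token convention are all hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple


def select_training_data(posts: List[Tuple[str, str]],
                         user_group: Dict[str, str],
                         group: str) -> List[str]:
    # Route 1: keep only content written by users assigned to `group`.
    return [text for author, text in posts if user_group.get(author) == group]


def with_group_token(text: str, group: str) -> str:
    # Route 2: make the group identifier part of the input to a general model.
    return f"<group:{group}> {text}"


user_group = {"ana": "gamers", "ben": "cooks", "cy": "gamers"}
posts = [("ana", "gg wp"), ("ben", "fold in the butter"), ("cy", "nerf incoming")]

gamer_corpus = select_training_data(posts, user_group, "gamers")
tagged = with_group_token("gg wp", "gamers")
```

The first route yields one specialized component per group. The second keeps a single general component and lets it condition on the group token.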

Type: Grant

Filed: August 23, 2016

Date of Patent: January 26, 2021

Assignee: FACEBOOK, INC.

Inventors: Ying Zhang, Christian Fuegen, Guillaume Lample, Jing Zheng

Social hash for language models

Patent number: 10902221

Abstract: Components of language processing engines, such as translation models and language models, can be customized for groups of users or based on user type values. Users can be organized into groups or assigned a value on a continuum based on factors such as interests, biographical characteristics, social media interactions, etc. In some implementations, translation engine components can be customized for groups of users by selecting the training data from content created by users in that group. In some implementations, the group identifier or continuum value can be part of the input into a general translation component allowing the translation component to take a language style of that user group into account when performing language processing tasks.

Type: Grant

Filed: June 30, 2016

Date of Patent: January 26, 2021

Assignee: FACEBOOK, INC.

Inventors: Ying Zhang, Christian Fuegen, Guillaume Lample, Jing Zheng

Hybrid, offline/online speech translation system

Patent number: 10331794

Abstract: A hybrid speech translation system whereby a wireless-enabled client computing device can, in an offline mode, translate input speech utterances from one language to another locally, and also, in an online mode when there is wireless network connectivity, have a remote computer perform the translation and transmit it back to the client computing device via the wireless network for audible outputting by the client computing device. The user of the client computing device can transition between modes or the transition can be automatic based on user preferences or settings. The back-end speech translation server system can adapt the various recognition and translation models used by the client computing device in the offline mode based on analysis of user data over time, to thereby configure the client computing device with scaled-down, yet more efficient and faster, models than the back-end speech translation server system, while still being adapted for the user's domain.
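The mode switching described in this abstract can be pictured as a small dispatcher. The sketch below is an assumption-laden illustration, not the patented system: `hybrid_translate`, the `preference` values, and the stub translators are hypothetical, and falling back to the device on a `ConnectionError` is one possible policy among several.

```python
from typing import Callable, Tuple

Translator = Callable[[str], str]


def hybrid_translate(utterance: str, local: Translator, remote: Translator,
                     has_network: bool, preference: str = "auto") -> Tuple[str, str]:
    # Use the remote server only when connectivity exists and the user
    # has not forced offline mode.
    if has_network and preference != "offline":
        try:
            return "online", remote(utterance)
        except ConnectionError:
            pass  # connectivity dropped mid-request; fall back to the device
    return "offline", local(utterance)


def on_device(text: str) -> str:
    return "local:" + text      # stub for the scaled-down on-device models


def on_server(text: str) -> str:
    return "remote:" + text     # stub for the back-end translation server


def flaky_server(text: str) -> str:
    raise ConnectionError("network lost")


online_result = hybrid_translate("hello", on_device, on_server, has_network=True)
offline_result = hybrid_translate("hello", on_device, on_server, has_network=False)
forced_result = hybrid_translate("hello", on_device, on_server, True, "offline")
fallback_result = hybrid_translate("hello", on_device, flaky_server, has_network=True)
```

Returning the mode alongside the translation lets a caller show the user which path produced the result.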

Type: Grant

Filed: August 26, 2016

Date of Patent: June 25, 2019

Assignee: Facebook, Inc.

Inventors: Naomi Aoki Waibel, Alexander Waibel, Christian Fuegen, Kay Rottmann

User-specific pronunciations in a social networking system

Patent number: 10061855

Abstract: A social networking system obtains user pronunciations of terms whose pronunciations might vary among different users, such as names of users. The social networking system additionally obtains demographic information about the users from whom the pronunciations were obtained, as well as social graph information for those users, such as information about connections of those users in the social graph. Based on the obtained pronunciations, the demographic information, and the social graph information, the social networking system determines, for a user having that name (or other term in question), one or more suggested pronunciations for the name that are likely to be the pronunciations that that user would use.
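One way to picture the ranking step in this abstract is a weighted vote over recorded pronunciations, where recordings from users who share a demographic trait or a social graph connection with the target user count for more. The sketch below assumes exactly that. The `Recording` fields, the weights, and the use of a single `region` attribute as the demographic signal are hypothetical simplifications rather than details from the patent.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Recording(NamedTuple):
    user: str            # who supplied the pronunciation
    region: str          # stand-in for demographic information
    pronunciation: str   # how that user pronounces the name


def suggest(recordings: List[Recording], target_region: str,
            target_connections: Set[str], top_n: int = 2,
            w_region: float = 1.0, w_connection: float = 2.0) -> List[str]:
    scores: Dict[str, float] = defaultdict(float)
    for rec in recordings:
        vote = 1.0
        if rec.region == target_region:
            vote += w_region          # same demographic bucket as the target user
        if rec.user in target_connections:
            vote += w_connection      # connected to the target user in the graph
        scores[rec.pronunciation] += vote
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


recordings = [
    Recording("a", "DE", "FYOO-gen"),
    Recording("b", "US", "FEW-jen"),
    Recording("c", "US", "FEW-jen"),
]
for_german_user = suggest(recordings, "DE", {"a"})
for_us_user = suggest(recordings, "US", set())
```

In this toy data the suggestion order flips with the target user's region and connections, even though the raw counts stay the same.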

Type: Grant

Filed: December 31, 2014

Date of Patent: August 28, 2018

Assignee: Facebook, Inc.

Inventors: Alexander Waibel, Christian Fuegen, Thilo Wolfgang Koehler

Hybrid, Offline/Online Speech Translation System

Publication number: 20160364385

Abstract: A hybrid speech translation system whereby a wireless-enabled client computing device can, in an offline mode, translate input speech utterances from one language to another locally, and also, in an online mode when there is wireless network connectivity, have a remote computer perform the translation and transmit it back to the client computing device via the wireless network for audible outputting by the client computing device. The user of the client computing device can transition between modes or the transition can be automatic based on user preferences or settings. The back-end speech translation server system can adapt the various recognition and translation models used by the client computing device in the offline mode based on analysis of user data over time, to thereby configure the client computing device with scaled-down, yet more efficient and faster, models than the back-end speech translation server system, while still being adapted for the user's domain.

Type: Application

Filed: August 26, 2016

Publication date: December 15, 2016

Inventors: Naomi Aoki Waibel, Alexander Waibel, Christian Fuegen, Kay Rottmann

Hybrid, offline/online speech translation system

Patent number: 9430465

Abstract: A hybrid speech translation system whereby a wireless-enabled client computing device can, in an offline mode, translate input speech utterances from one language to another locally, and also, in an online mode when there is wireless network connectivity, have a remote computer perform the translation and transmit it back to the client computing device via the wireless network for audible outputting by the client computing device. The user of the client computing device can transition between modes or the transition can be automatic based on user preferences or settings. The back-end speech translation server system can adapt the various recognition and translation models used by the client computing device in the offline mode based on analysis of user data over time, to thereby configure the client computing device with scaled-down, yet more efficient and faster, models than the back-end speech translation server system, while still being adapted for the user's domain.

Type: Grant

Filed: June 12, 2013

Date of Patent: August 30, 2016

Assignee: Facebook, Inc.

Inventors: Naomi Aoki Waibel, Alexander Waibel, Christian Fuegen, Kay Rottmann

USER-SPECIFIC PRONUNCIATIONS IN A SOCIAL NETWORKING SYSTEM

Publication number: 20160188727

Abstract: A social networking system obtains user pronunciations of terms whose pronunciations might vary among different users, such as names of users. The social networking system additionally obtains demographic information about the users from whom the pronunciations were obtained, as well as social graph information for those users, such as information about connections of those users in the social graph. Based on the obtained pronunciations, the demographic information, and the social graph information, the social networking system determines, for a user having that name (or other term in question), one or more suggested pronunciations for the name that are likely to be the pronunciations that that user would use.

Type: Application

Filed: December 31, 2014

Publication date: June 30, 2016

Inventors: Alexander Waibel, Christian Fuegen, Thilo Wolfgang Koehler

HYBRID, OFFLINE/ONLINE SPEECH TRANSLATION SYSTEM

Publication number: 20140337007

Abstract: A hybrid speech translation system whereby a wireless-enabled client computing device can, in an offline mode, translate input speech utterances from one language to another locally, and also, in an online mode when there is wireless network connectivity, have a remote computer perform the translation and transmit it back to the client computing device via the wireless network for audible outputting by the client computing device. The user of the client computing device can transition between modes or the transition can be automatic based on user preferences or settings. The back-end speech translation server system can adapt the various recognition and translation models used by the client computing device in the offline mode based on analysis of user data over time, to thereby configure the client computing device with scaled-down, yet more efficient and faster, models than the back-end speech translation server system, while still being adapted for the user's domain.

Type: Application

Filed: June 12, 2013

Publication date: November 13, 2014

Applicant: Facebook, Inc.

Inventors: Naomi Aoki Waibel, Alexander Waibel, Christian Fuegen, Kay Rottmann

Simultaneous translation of open domain lectures and speeches

Patent number: 8504351

Abstract: A real-time open domain speech translation system for simultaneous translation of a spoken presentation that is a spoken monologue comprising one of a lecture, a speech, a presentation, a colloquium, and a seminar. The system includes an automatic speech recognition unit configured for accepting sound comprising the spoken presentation in a first language and for continuously creating word hypotheses, and a machine translation unit that receives the hypotheses, wherein the machine translation unit outputs a translation, into a second language, from the spoken presentation.
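The pipeline in this abstract, a recognizer that continuously emits word hypotheses feeding a translation unit, maps naturally onto a streaming generator. The sketch below is a hedged illustration only: the abstract does not say how hypotheses are grouped before translation, so the segmentation rule (flush on sentence-final punctuation or after `max_segment` words) is an assumption, and `simultaneous_translate` and the upper-casing stub translator are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def simultaneous_translate(hypotheses: Iterable[str],
                           translate: Callable[[List[str]], str],
                           max_segment: int = 4) -> Iterator[str]:
    # Consume word hypotheses as they arrive and emit translations segment
    # by segment, without waiting for the whole talk to finish.
    segment: List[str] = []
    for word in hypotheses:
        segment.append(word)
        # Assumed segmentation rule: flush at sentence-final punctuation
        # or once the segment reaches `max_segment` words.
        if word.endswith((".", "?", "!")) or len(segment) >= max_segment:
            yield translate(segment)
            segment = []
    if segment:
        yield translate(segment)  # flush whatever remains at end of input


def fake_mt(words: List[str]) -> str:
    # Stub machine translation unit: upper-casing stands in for translation.
    return " ".join(w.upper() for w in words)


stream = ["good", "morning.", "today", "we", "look", "at", "speech"]
output = list(simultaneous_translate(stream, fake_mt))
```

Because the function is a generator, each translated segment becomes available as soon as its last word is recognized, which is the property that makes the translation simultaneous rather than consecutive.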

Type: Grant

Filed: December 2, 2011

Date of Patent: August 6, 2013

Assignee: Mobile Technologies, LLC

Inventors: Alexander Waibel, Christian Fuegen

SIMULTANEOUS TRANSLATION OF OPEN DOMAIN LECTURES AND SPEECHES

Publication number: 20120078608

Abstract: A real-time open domain speech translation system for simultaneous translation of a spoken presentation that is a spoken monologue comprising one of a lecture, a speech, a presentation, a colloquium, and a seminar. The system includes an automatic speech recognition unit configured for accepting sound comprising the spoken presentation in a first language and for continuously creating word hypotheses, and a machine translation unit that receives the hypotheses, wherein the machine translation unit outputs a translation, into a second language, from the spoken presentation.

Type: Application

Filed: December 2, 2011

Publication date: March 29, 2012

Applicant: MOBILE TECHNOLOGIES, LLC

Inventors: Alexander Waibel, Christian Fuegen

Simultaneous translation of open domain lectures and speeches

Patent number: 8090570

Abstract: A real-time open domain speech translation system for simultaneous translation of a spoken presentation that is a spoken monologue comprising one of a lecture, a speech, a presentation, a colloquium, and a seminar. The system includes an automatic speech recognition unit configured for accepting sound comprising the spoken presentation in a first language and for continuously creating word hypotheses, and a machine translation unit that receives the hypotheses, wherein the machine translation unit outputs a translation, into a second language, from the spoken presentation.

Type: Grant

Filed: October 26, 2007

Date of Patent: January 3, 2012

Assignee: Mobile Technologies, LLC

Inventors: Alexander Waibel, Christian Fuegen

SIMULTANEOUS TRANSLATION OF OPEN DOMAIN LECTURES AND SPEECHES

Publication number: 20080120091

Abstract: A real-time open domain speech translation system for simultaneous translation of a spoken presentation that is a spoken monologue comprising one of a lecture, a speech, a presentation, a colloquium, and a seminar. The system includes an automatic speech recognition unit configured for accepting sound comprising the spoken presentation in a first language and for continuously creating word hypotheses, and a machine translation unit that receives the hypotheses, wherein the machine translation unit outputs a translation, into a second language, from the spoken presentation.

Type: Application

Filed: October 26, 2007

Publication date: May 22, 2008

Inventors: Alexander Waibel, Christian Fuegen