Patents by Inventor Yating Sasha Sheng

Yating Sasha Sheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SPEECH TRANSCRIPTION USING MULTIPLE DATA SOURCES

Publication number: 20220139400

Abstract: This disclosure describes transcribing speech using audio, image, and other data. A system is described that includes an audio capture system configured to capture audio data associated with a plurality of speakers, an image capture system configured to capture images of one or more of the plurality of speakers, and a speech processing engine. The speech processing engine may be configured to recognize a plurality of speech segments in the audio data, identify, for each speech segment of the plurality of speech segments and based on the images, a speaker associated with the speech segment, transcribe each of the plurality of speech segments to produce a transcription of the plurality of speech segments including, for each speech segment in the plurality of speech segments, an indication of the speaker associated with the speech segment, and analyze the transcription to produce additional data derived from the transcription.

Type: Application

Filed: January 14, 2022

Publication date: May 5, 2022

Inventors: Vincent Charles Cheung, Chengxuan Bai, Yating Sasha Sheng

Speech transcription using multiple data sources

Patent number: 11227602

Abstract: This disclosure describes transcribing speech using audio, image, and other data. A system is described that includes an audio capture system configured to capture audio data associated with a plurality of speakers, an image capture system configured to capture images of one or more of the plurality of speakers, and a speech processing engine. The speech processing engine may be configured to recognize a plurality of speech segments in the audio data, identify, for each speech segment of the plurality of speech segments and based on the images, a speaker associated with the speech segment, transcribe each of the plurality of speech segments to produce a transcription of the plurality of speech segments including, for each speech segment in the plurality of speech segments, an indication of the speaker associated with the speech segment, and analyze the transcription to produce additional data derived from the transcription.

Type: Grant

Filed: November 20, 2019

Date of Patent: January 18, 2022

Assignee: Facebook Technologies, LLC

Inventors: Vincent Charles Cheung, Chengxuan Bai, Yating Sasha Sheng

SPEECH TRANSCRIPTION USING MULTIPLE DATA SOURCES

Publication number: 20210151058

Abstract: This disclosure describes transcribing speech using audio, image, and other data. A system is described that includes an audio capture system configured to capture audio data associated with a plurality of speakers, an image capture system configured to capture images of one or more of the plurality of speakers, and a speech processing engine. The speech processing engine may be configured to recognize a plurality of speech segments in the audio data, identify, for each speech segment of the plurality of speech segments and based on the images, a speaker associated with the speech segment, transcribe each of the plurality of speech segments to produce a transcription of the plurality of speech segments including, for each speech segment in the plurality of speech segments, an indication of the speaker associated with the speech segment, and analyze the transcription to produce additional data derived from the transcription.

Type: Application

Filed: November 20, 2019

Publication date: May 20, 2021

Inventors: Vincent Charles Cheung, Chengxuan Bai, Yating Sasha Sheng
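
The abstracts above describe a pipeline of four steps: recognize speech segments, attribute each segment to a speaker using images, produce a speaker-labeled transcription, and derive additional data from it. The following is a minimal sketch of that flow in Python, not the method claimed in the patents. Every name in it (SpeechSegment, FaceObservation, identify_speaker, transcribe, words_per_speaker) is hypothetical, the text field stands in for the output of a real speech recognizer, and the lips_moving flag and majority-vote rule are assumptions chosen only to illustrate image-based speaker attribution.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechSegment:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str     # placeholder for a real recognizer's output (assumption)


@dataclass
class FaceObservation:
    time: float        # timestamp of the image, in seconds
    speaker: str       # identity recognized in the image (assumption)
    lips_moving: bool  # hypothetical visual cue that this person is talking


def identify_speaker(segment: SpeechSegment,
                     observations: list[FaceObservation]) -> Optional[str]:
    """Pick the person most often seen talking during the segment."""
    votes = Counter(
        obs.speaker
        for obs in observations
        if obs.lips_moving and segment.start <= obs.time <= segment.end
    )
    return votes.most_common(1)[0][0] if votes else None


def transcribe(segments: list[SpeechSegment],
               observations: list[FaceObservation]) -> list[tuple[str, str]]:
    """Return (speaker, text) pairs, one per speech segment."""
    return [
        (identify_speaker(seg, observations) or "unknown", seg.text)
        for seg in segments
    ]


def words_per_speaker(transcript: list[tuple[str, str]]) -> dict[str, int]:
    """One example of data derived from the transcription."""
    counts: Counter = Counter()
    for speaker, text in transcript:
        counts[speaker] += len(text.split())
    return dict(counts)


segments = [
    SpeechSegment(0.0, 2.0, "hello everyone"),
    SpeechSegment(2.5, 4.0, "hi there how are you"),
]
observations = [
    FaceObservation(0.5, "Alice", True),
    FaceObservation(1.0, "Bob", False),
    FaceObservation(3.0, "Bob", True),
]

transcript = transcribe(segments, observations)
stats = words_per_speaker(transcript)
```

In this toy input, the first segment is attributed to Alice and the second to Bob, and the derived data is a per-speaker word count. A segment with no matching image evidence is labeled "unknown" rather than guessed.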
