Patents by Inventor Yixin Gao

Yixin Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VIDEO INTERACTION METHOD AND APPARATUS, AND DEVICE AND MEDIUM

Publication number: 20240129592

Abstract: The present disclosure relates to a video interaction method and apparatus, and a device and a medium. The video interaction method comprises: presenting a video list in a video presentation page; when a first operation on the video list is detected, playing, on the video presentation page and by means of a floating window, a video corresponding to the first operation; and when a second operation on the floating window is detected, jumping, from the video presentation page, to display a target video detail page, wherein the target video detail page is a video detail page corresponding to the target video, which is played in the floating window.

Type: Application

Filed: December 22, 2023

Publication date: April 18, 2024

Inventors: Sainan GUAN, Ding GAO, Qiongxing HONG, Yixin ZHENG
Wakeword and acoustic event detection

Patent number: 11670299

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
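
The post-processing described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the per-frame scores are made up, a trailing moving average stands in for the smoothing step, and a plain threshold crossing stands in for both the spike detector and the classifier. All function names and thresholds are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Map a raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def moving_average(values, window):
    """Smooth a sequence with a trailing window (assumed smoothing method)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def detect_spikes(values, threshold):
    """Return the frame indices where the value first rises above threshold."""
    spikes, above = [], False
    for i, v in enumerate(values):
        if v > threshold and not above:
            spikes.append(i)
        above = v > threshold
    return spikes

# Hypothetical per-frame outputs [background, wakeword] of the first model.
frame_scores = [[2.0, 0.0], [2.0, 0.0], [0.0, 3.0],
                [0.0, 3.0], [0.0, 3.0], [2.0, 0.0]]
wake_probs = [softmax(s)[1] for s in frame_scores]
smoothed = moving_average(wake_probs, window=2)
wake_frames = detect_spikes(smoothed, threshold=0.8)

# Hypothetical per-frame outputs of the second (acoustic-event) model.
event_logits = [-2.0, 0.5, 3.0]
event_flags = [sigmoid(z) > 0.5 for z in event_logits]
```

Softmax suits the wakeword output because the classes compete with each other, whereas sigmoid scores each acoustic event independently, so several events could be flagged in the same frame.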

Type: Grant

Filed: May 17, 2021

Date of Patent: June 6, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Wakeword detection using multi-word model

Patent number: 11308939

Abstract: A system and method performs wakeword detection and automatic speech recognition using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.
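
As a rough illustration of the idea in this abstract, the sketch below scores several wakewords against the same per-frame phone posteriors using a left-to-right HMM-style Viterbi pass. It is an assumption-laden toy: the phone labels, posteriors, floor value, and threshold are invented, transition probabilities are ignored, and the "mapping" step is reduced to looking up only the phones on each wakeword path.

```python
import math

FLOOR = 1e-6  # assumed posterior for phones missing from a frame

def viterbi_score(frames, phone_path):
    """Best log-probability of aligning frames to a left-to-right phone path.

    Each frame is a dict of phone -> posterior. A state may repeat or advance
    by one; the alignment must start in the first phone and end in the last.
    """
    neg_inf = float("-inf")
    n = len(phone_path)
    scores = [neg_inf] * n
    scores[0] = math.log(frames[0].get(phone_path[0], FLOOR))
    for frame in frames[1:]:
        updated = [neg_inf] * n
        for s in range(n):
            best_prev = scores[s] if s == 0 else max(scores[s], scores[s - 1])
            if best_prev > neg_inf:
                updated[s] = best_prev + math.log(frame.get(phone_path[s], FLOOR))
        scores = updated
    return scores[-1]

def detect_wakeword(frames, wakewords, log_threshold):
    """Score one path per wakeword and return the best name, or None."""
    name, score = max(
        ((w, viterbi_score(frames, path)) for w, path in wakewords.items()),
        key=lambda item: item[1],
    )
    return (name, score) if score >= log_threshold else (None, score)

# Hypothetical wakewords and phone posteriors from a shared acoustic model.
wakewords = {"alpha": ["AE", "L", "F", "AH"], "echo": ["EH", "K", "OW"]}
frames = [{"EH": 0.9, "AE": 0.1}, {"EH": 0.8, "K": 0.2}, {"K": 0.9}, {"OW": 0.9}]
best, score = detect_wakeword(frames, wakewords, log_threshold=math.log(0.1))
```

Because every wakeword path reads from the same frame posteriors, the acoustic model runs once regardless of how many wakewords are supported.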

Type: Grant

Filed: September 25, 2018

Date of Patent: April 19, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Yixin Gao, Ming Sun, Varun Nagaraja, Gengshen Fu, Chao Wang, Shiv Naga Prasad Vitaladevuni
MULTILINGUAL WAKEWORD DETECTION

Publication number: 20210398533

Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
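
The two-stage flow in this abstract can be sketched as a simple cascade. This is only an illustration under stated assumptions: the per-language scores, the verifier callables, and both thresholds are hypothetical stand-ins for the lightweight first component and the more accurate second component.

```python
from typing import Callable, Dict, Optional

def first_stage(lang_scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Lightweight detector: return the best-scoring language if it clears the threshold."""
    language, score = max(lang_scores.items(), key=lambda kv: kv[1])
    return language if score >= threshold else None

def second_stage(audio, language: str,
                 verifiers: Dict[str, Callable[[object], float]],
                 threshold: float) -> bool:
    """More accurate detector: run only the verifier for the chosen language."""
    return verifiers[language](audio) >= threshold

def detect(audio, lang_scores, verifiers, t1: float = 0.5, t2: float = 0.9) -> Optional[str]:
    """Return the confirmed language, or None if either stage rejects the audio."""
    language = first_stage(lang_scores, t1)
    if language is None:
        return None
    if not second_stage(audio, language, verifiers, t2):
        return None
    return language  # at this point the audio could be sent on for further processing

# Hypothetical verifiers that return a fixed confidence for any audio.
verifiers = {"en": lambda audio: 0.95, "de": lambda audio: 0.2}
confirmed = detect("audio-bytes", {"en": 0.7, "de": 0.3}, verifiers)
rejected_early = detect("audio-bytes", {"en": 0.3, "de": 0.4}, verifiers)
rejected_late = detect("audio-bytes", {"en": 0.1, "de": 0.8}, verifiers)
```

Passing the language from the first stage to the second means only one language-specific verifier has to run per candidate detection.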

Type: Application

Filed: June 28, 2021

Publication date: December 23, 2021

Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
WAKEWORD AND ACOUSTIC EVENT DETECTION

Publication number: 20210358497

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

Type: Application

Filed: May 17, 2021

Publication date: November 18, 2021

Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Wakeword and acoustic event detection

Patent number: 11132990

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

Type: Grant

Filed: June 26, 2019

Date of Patent: September 28, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Multilingual wakeword detection

Patent number: 11069353

Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.

Type: Grant

Filed: May 6, 2019

Date of Patent: July 20, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
Wakeword and acoustic event detection

Patent number: 11043218

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

Type: Grant

Filed: June 26, 2019

Date of Patent: June 22, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Method and apparatus for identifying questionable line break characters in an application

Patent number: 10713437

Abstract: Embodiments of the present invention provide a method and an apparatus for word detection in an application program. The method includes extracting a resource file from a multilingual application program installation package and converting the resource file into a text file. The method further includes disassembling the text file according to a language version to acquire a corresponding language text file; invoking a language detection tool according to the language version; and checking the language text file by using the language detection tool to identify questionable character information. The apparatus for word detection includes a file processing module, configured to extract a resource file from a multilingual application program installation package, and convert the resource file into a text file; and a disassembling module, configured to disassemble the text file according to a language version to acquire a corresponding language text file.
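
The split-then-check steps in this abstract can be sketched as follows. The sketch assumes the resource file has already been converted to text with one tab-separated `language, key, value` record per line, and it uses a deliberately crude stand-in for the language detection tool: a letter is flagged as questionable when its Unicode name does not start with the script expected for that language. The record format, the script table, and the function names are all hypothetical.

```python
import unicodedata

# Assumed mapping from language version to the expected Unicode script prefix.
EXPECTED_SCRIPT = {"en": "LATIN", "ru": "CYRILLIC"}

def split_by_language(text):
    """Group 'lang<TAB>key<TAB>value' lines into one dict of entries per language."""
    per_language = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        lang, key, value = line.split("\t", 2)
        per_language.setdefault(lang, {})[key] = value
    return per_language

def questionable_chars(value, script):
    """Return letters whose Unicode name does not start with the expected script."""
    return [ch for ch in value
            if ch.isalpha() and not unicodedata.name(ch, "").startswith(script)]

def check(per_language):
    """Run the per-language check and report entries with questionable characters."""
    report = {}
    for lang, entries in per_language.items():
        script = EXPECTED_SCRIPT[lang]
        for key, value in entries.items():
            bad = questionable_chars(value, script)
            if bad:
                report[(lang, key)] = bad
    return report

# Hypothetical converted resource text.
text = "en\tgreeting\tHello\nru\tgreeting\tПривет\nru\tfarewell\tПока bye\n"
report = check(split_by_language(text))
```

Splitting by language version first lets each entry be checked against a rule chosen for that language, which is the role the abstract assigns to the language detection tool.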

Type: Grant

Filed: August 14, 2017

Date of Patent: July 14, 2020

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Rumin Ding, Juzhen Huo, Yixin Gao
Binary target acoustic trigger detection

Patent number: 10460729

Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by using a neural network to determine an indicator of presence of the acoustic trigger. In some examples, the neural network combines a “time delay” structure, in which intermediate results of computations are reused at various time delays, thereby avoiding computing new results, and decomposition of certain transformations to require fewer arithmetic operations without sacrificing significant performance.
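
To make the "decomposition of certain transformations" concrete, the sketch below assumes a low-rank factorization: a 4x4 transform W is replaced by a 4x1 factor U and a 1x4 factor V with W = U V. The abstract does not say which decomposition is used, so this choice, the matrices, and the helper names are assumptions made for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector; also return the multiplication count."""
    result = [sum(a * x for a, x in zip(row, vector)) for row in matrix]
    return result, len(matrix) * len(vector)

def matmul(a, b):
    """Plain matrix product, used here only to build the full matrix from its factors."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]

# Hypothetical rank-1 factors of a 4x4 transform.
U = [[1.0], [2.0], [3.0], [4.0]]  # 4x1
V = [[1.0, 0.0, -1.0, 2.0]]       # 1x4
W = matmul(U, V)                  # the full 4x4 transform
x = [1.0, 2.0, 3.0, 4.0]

full_output, full_mults = matvec(W, x)      # one large multiply
hidden, mults_v = matvec(V, x)              # project down to rank 1
fact_output, mults_u = matvec(U, hidden)    # project back up
factored_mults = mults_v + mults_u
```

In general an m x n transform costs m*n multiplications per input, while rank-r factors cost r*(m + n), so the saving grows as r shrinks relative to m and n. In practice a factorization usually only approximates the original transform, which is where the trade-off against accuracy arises.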

Type: Grant

Filed: June 30, 2017

Date of Patent: October 29, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, Aaron Lee Mathers Challenner, Yixin Gao, Shiv Naga Prasad Vitaladevuni
Acoustic trigger detection

Patent number: 10460722

Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by combining a “time delay” structure, in which intermediate results of computations are reused at various time delays, thereby avoiding computing new results, and decomposition of certain transformations to require fewer arithmetic operations without sacrificing significant performance. For a given amount of computation capacity, the combination of these two techniques provides improved accuracy as compared to current approaches.
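
The reuse of intermediate results at several time delays can be sketched with a small cache. This is a toy under stated assumptions: the lower layer is a trivial stand-in for an expensive transform, the upper layer simply sums lower-layer outputs at delays 0, 1, and 2, and the names are hypothetical.

```python
def make_cached_layer(layer_fn):
    """Wrap a per-frame layer so each frame index is computed at most once."""
    cache = {}
    computed = []  # records which frame indices triggered real computation

    def cached(t, frames):
        if t not in cache:
            computed.append(t)
            cache[t] = layer_fn(frames[t])
        return cache[t]

    return cached, computed

def lower_layer(frame):
    """Hypothetical stand-in for an expensive per-frame transform."""
    return 2.0 * frame

def upper_layer(t, frames, lower, delays=(0, 1, 2)):
    """Combine lower-layer outputs at t, t-1, t-2, skipping frames before the start."""
    return sum(lower(t - d, frames) for d in delays if t - d >= 0)

frames = [1.0, 2.0, 3.0, 4.0, 5.0]
lower, computed = make_cached_layer(lower_layer)
outputs = [upper_layer(t, frames, lower) for t in range(len(frames))]
```

Across the five frames the upper layer requests twelve lower-layer outputs, but only five are actually computed; the rest come from the cache.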

Type: Grant

Filed: June 30, 2017

Date of Patent: October 29, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, David Snyder, Yixin Gao, Nikko Strom, Spyros Matsoukas, Shiv Naga Prasad Vitaladevuni
METHOD AND APPARATUS FOR WORD DETECTION IN APPLICATION PROGRAM

Publication number: 20170364501

Abstract: Embodiments of the present invention provide a method and an apparatus for word detection in an application program. The method includes extracting a resource file from a multilingual application program installation package and converting the resource file into a text file. The method further includes disassembling the text file according to a language version to acquire a corresponding language text file; invoking a language detection tool according to the language version; and checking the language text file by using the language detection tool to identify questionable character information. The apparatus for word detection includes a file processing module, configured to extract a resource file from a multilingual application program installation package, and convert the resource file into a text file; and a disassembling module, configured to disassemble the text file according to a language version to acquire a corresponding language text file.

Type: Application

Filed: August 14, 2017

Publication date: December 21, 2017

Inventors: RUMIN DING, JUZHEN HUO, YIXIN GAO
Correcting questionable line breaks after an OCR

Patent number: 9767090

Abstract: Embodiments of the present invention provide a method and an apparatus for word detection in an application program. The method includes extracting a resource file from a multilingual application program installation package and converting the resource file into a text file. The method further includes disassembling the text file according to a language version to acquire a corresponding language text file; invoking a language detection tool according to the language version; and checking the language text file by using the language detection tool to identify questionable character information. The apparatus for word detection includes a file processing module, configured to extract a resource file from a multilingual application program installation package, and convert the resource file into a text file; and a disassembling module, configured to disassemble the text file according to a language version to acquire a corresponding language text file.

Type: Grant

Filed: June 24, 2015

Date of Patent: September 19, 2017

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Rumin Ding, Juzhen Huo, Yixin Gao
METHOD AND APPARATUS FOR WORD DETECTION IN APPLICATION PROGRAM

Publication number: 20150293898

Abstract: Embodiments of the present invention provide a method and an apparatus for word detection in an application program. The method includes extracting a resource file from a multilingual application program installation package and converting the resource file into a text file. The method further includes disassembling the text file according to a language version to acquire a corresponding language text file; invoking a language detection tool according to the language version; and checking the language text file by using the language detection tool to identify questionable character information. The apparatus for word detection includes a file processing module, configured to extract a resource file from a multilingual application program installation package, and convert the resource file into a text file; and a disassembling module, configured to disassemble the text file according to a language version to acquire a corresponding language text file.

Type: Application

Filed: June 24, 2015

Publication date: October 15, 2015

Inventors: RUMIN DING, JUZHEN HUO, YIXIN GAO
METHOD AND SYSTEM FOR ANALYZING A TASK TRAJECTORY

Publication number: 20140378995

Abstract: A computer-implemented method of analyzing a sample task trajectory including obtaining, with one or more computers, position information of an instrument in the sample task trajectory, obtaining, with the one or more computers, pose information of the instrument in the sample task trajectory, comparing, with the one or more computers, the position information and the pose information for the sample task trajectory with reference position information and reference pose information of the instrument for a reference task trajectory, determining, with the one or more computers, a skill assessment for the sample task trajectory based on the comparison, and outputting, with the one or more computers, the determined skill assessment for the sample task trajectory.
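
A minimal sketch of the comparison described in this abstract follows. It assumes the sample and reference trajectories are already time-aligned and of equal length, represents pose as a unit direction vector, and turns the two errors into a label using arbitrary tolerances. None of these choices comes from the abstract; they are illustrative assumptions.

```python
import math

def position_error(sample, reference):
    """Mean Euclidean distance between corresponding instrument positions."""
    return sum(math.dist(p, q) for p, q in zip(sample, reference)) / len(reference)

def pose_error(sample, reference):
    """Mean angle in radians between corresponding unit direction vectors."""
    total = 0.0
    for u, v in zip(sample, reference):
        dot = sum(a * b for a, b in zip(u, v))
        total += math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding
    return total / len(reference)

def skill_assessment(pos_err, ang_err, pos_tol=1.0, ang_tol=0.5):
    """Hypothetical rule: both errors must stay within their tolerances."""
    return "proficient" if pos_err <= pos_tol and ang_err <= ang_tol else "needs practice"

# Hypothetical reference and sample trajectories (positions and poses).
ref_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ref_poses = [(1.0, 0.0, 0.0)] * 3
sample_positions = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
sample_poses = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

pos_err = position_error(sample_positions, ref_positions)
ang_err = pose_error(sample_poses, ref_poses)
assessment = skill_assessment(pos_err, ang_err)
```

Here the sample stays close to the reference path but points the instrument 90 degrees off at one step, so the pose error alone pushes the result to "needs practice". This shows why the abstract compares pose as well as position.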

Type: Application

Filed: May 7, 2012

Publication date: December 25, 2014

Applicants: Intuitive Surgical Operations, Inc., The Johns Hopkins University

Inventors: Rajesh Kumar, Gregory D. Hager, Amod S. Jog, Yixin Gao, May Liu, Simon Peter DiMaio, Brandon Itkowitz, Myriam Curet