Patents by Inventor Junkun CHEN

Junkun CHEN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240395240
    Abstract: A computer implemented method includes receiving speech data representative of speech in a first language. The speech data is divided into chunks of speech data, each chunk comprising multiple temporally consecutive frames of acoustic information. Each temporally consecutive chunk of data is processed using beam search on each frame to identify candidate language tokens representing a second language different from the first language. A best candidate language token or tokens is selected for each chunk as processed. The selected best candidate language token or tokens for each chunk of data are committed as a prefix for a next temporally consecutive chunk of data.
    Type: Application
    Filed: May 23, 2023
    Publication date: November 28, 2024
    Inventors: Junkun Chen, Jinyu Li, Peidong Wang, Jian Xue
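The chunk-wise decoding scheme described in the abstract above can be sketched as follows. This is an illustrative outline only, not the patented implementation: `score_fn` is a hypothetical stand-in for a trained translation model that, given the committed context and one acoustic frame, proposes (token, log-probability) candidates.

```python
def chunked_beam_search(chunks, score_fn, beam_width=3):
    """Translate streamed speech chunk by chunk.

    For each chunk of temporally consecutive frames, run a beam search
    over candidate target-language tokens, then commit the best
    hypothesis as a fixed prefix for the next chunk (so earlier output
    is never revised).
    """
    prefix = []  # tokens committed so far across all previous chunks
    for chunk in chunks:
        # Each beam entry is (tokens proposed within this chunk, log-prob).
        beam = [([], 0.0)]
        for frame in chunk:
            candidates = []
            for tokens, lp in beam:
                for tok, tok_lp in score_fn(prefix + tokens, frame):
                    candidates.append((tokens + [tok], lp + tok_lp))
            # Keep only the beam_width highest-scoring partial hypotheses.
            beam = sorted(candidates, key=lambda c: -c[1])[:beam_width]
        best_tokens, _ = max(beam, key=lambda c: c[1])
        prefix = prefix + best_tokens  # commit as prefix for the next chunk
    return prefix
```

With a toy `score_fn` that deterministically prefers one token per frame, the committed output is simply the concatenation of each chunk's best hypothesis.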
  • Patent number: 12050882
    Abstract: Representation learning for text and speech has improved many language-related tasks. However, existing methods only learn from one input modality, while a unified representation for both speech and text is needed for tasks such as end-to-end speech translation. Consequently, these methods cannot exploit various large-scale text and speech data and their performance is limited by the scarcity of parallel speech translation data. To address these problems, embodiments of a fused acoustic and text masked language model (FAT-MLM) are disclosed. FAT-MLM embodiments jointly learn a unified representation for both acoustic and text input from various types of corpora including parallel data for speech recognition and machine translation, and pure speech and text data. Within this cross-modal representation learning framework, an end-to-end model is further presented for fused acoustic and text speech translation.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: July 30, 2024
    Assignee: Baidu USA LLC
    Inventors: Renjie Zheng, Junkun Chen, Mingbo Ma, Liang Huang
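The fused-input masking idea behind a FAT-MLM-style objective can be illustrated with a minimal sketch. The separator symbols, frame representation, and function name below are assumptions for illustration, not the preprocessing used in the patent: acoustic frames and text tokens are concatenated into one sequence, a fraction of positions is masked, and the unmasked sequence serves as the reconstruction target.

```python
import random

MASK = "<mask>"

def fuse_and_mask(acoustic_frames, text_tokens, mask_prob=0.15, seed=0):
    """Build one fused (masked input, target) pair from both modalities.

    The speech and text segments are joined with modality separators so
    a single model can learn a unified representation; each position is
    independently masked with probability mask_prob.
    """
    rng = random.Random(seed)
    target = ["<speech>"] + list(acoustic_frames) + ["<text>"] + list(text_tokens)
    masked = [MASK if rng.random() < mask_prob else tok for tok in target]
    return masked, target
```

Training would then ask the model to predict the original tokens (and acoustic features) at the masked positions, letting paired and unpaired speech/text corpora share one objective.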
  • Publication number: 20240204101
    Abstract: A semiconductor device and a manufacturing method thereof are disclosed in the present invention. The semiconductor device includes a source structure; a gate structure disposed above the source structure; a first opening penetrates through the gate structure in a vertical direction; a semiconductor structure; a gate dielectric layer; an insulation structure; and a void. The semiconductor structure is partially disposed in the first opening, and at least a portion of the gate structure is located at two opposite sides of the semiconductor structure in a horizontal direction. The gate dielectric layer is disposed in the first opening and located between the semiconductor structure and the gate structure. At least a portion of the insulation structure is disposed in the first opening, at least a portion of the semiconductor structure is located between the insulation structure and the gate dielectric layer, and the void is located in the insulation structure.
    Type: Application
    Filed: March 30, 2023
    Publication date: June 20, 2024
    Applicant: FUJIAN JINHUA INTEGRATED CIRCUIT CO., LTD.
    Inventors: GUOGUO KONG, GANG WU, MINGRU GE, SHIWEI HE, HSIEN-SHIH CHU, JUNKUN CHEN
  • Publication number: 20230169281
    Abstract: Representation learning for text and speech has improved many language-related tasks. However, existing methods only learn from one input modality, while a unified representation for both speech and text is needed for tasks such as end-to-end speech translation. Consequently, these methods cannot exploit various large-scale text and speech data and their performance is limited by the scarcity of parallel speech translation data. To address these problems, embodiments of a fused acoustic and text masked language model (FAT-MLM) are disclosed. FAT-MLM embodiments jointly learn a unified representation for both acoustic and text input from various types of corpora including parallel data for speech recognition and machine translation, and pure speech and text data. Within this cross-modal representation learning framework, an end-to-end model is further presented for fused acoustic and text speech translation.
    Type: Application
    Filed: November 23, 2021
    Publication date: June 1, 2023
    Applicant: Baidu USA LLC
    Inventors: Renjie ZHENG, Junkun CHEN, Mingbo MA, Liang HUANG
  • Patent number: 11537798
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample including the input information, the target response information, and a discrete hidden variable; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: December 27, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Siqi Bao, Huang He, Junkun Chen, Fan Wang, Hua Wu, Jingzhou He
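The training-set construction step in the abstract above — classifying corpus samples and attaching a discrete hidden variable based on the classification result — can be sketched as below. The `classify` callable is a hypothetical stand-in; the patent does not commit to a particular classifier here.

```python
def build_training_samples(corpus_samples, classify):
    """Turn (input, target response) pairs into training triples.

    Each corpus sample is classified, and the resulting class index is
    stored as a discrete hidden variable alongside the sample, yielding
    the (input, target response, latent) triples used to train the
    dialogue model.
    """
    training_set = []
    for input_info, response in corpus_samples:
        z = classify(input_info, response)  # discrete hidden variable
        training_set.append({"input": input_info,
                             "response": response,
                             "latent": z})
    return training_set
```

A neural dialogue model trained on these triples can then condition its response generation on the latent variable, representing the one-to-many mapping from an input to its plausible responses.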
  • Publication number: 20210200957
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample including the input information, the target response information, and a discrete hidden variable; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.
    Type: Application
    Filed: June 8, 2020
    Publication date: July 1, 2021
    Inventors: Siqi BAO, Huang HE, Junkun CHEN, Fan WANG, Hua WU, Jingzhou HE