Patents by Inventor Baosheng Yuan

Baosheng Yuan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (LVCSR) system

Patent number: 7587321

Abstract: According to one aspect of the invention, a method is provided in which a set of multiple mixture monophone models is created and trained to generate a set of multiple mixture context independent models. A set of single mixture triphone models is created and trained to generate a set of context dependent models. Corresponding states of the triphone models are clustered to obtain a set of tied states based on a decision tree clustering process. Parameters of the context dependent models are estimated using a data dependent maximum a posteriori (MAP) adaptation method in which parameters of the tied states of the context dependent models are derived by adapting corresponding parameters of the context independent models using the training data associated with the respective tied states.
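Illustrative note (not part of the patent record): the abstract does not give the update formula, so the sketch below uses the common textbook form of MAP mean adaptation, in which a prior mean is interpolated with the data assigned to a tied state. The function name `map_adapt_mean` and the prior weight `tau` are assumptions made for illustration only.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """Textbook-style MAP update of a Gaussian mean (illustrative sketch).

    prior_mean: mean vector of the context independent model (the prior).
    frames:     feature vectors assigned to one tied state.
    tau:        hypothetical prior weight; larger values trust the prior more.
    """
    dim = len(prior_mean)
    count = len(frames)

    # Accumulate the per-dimension sum of the training frames.
    sums = [0.0] * dim
    for frame in frames:
        for d in range(dim):
            sums[d] += frame[d]

    # With little data the result stays near the prior mean;
    # with a lot of data it approaches the sample mean.
    return [(tau * prior_mean[d] + sums[d]) / (tau + count) for d in range(dim)]


if __name__ == "__main__":
    prior = [0.0, 0.0]
    data = [[2.0, 4.0]] * 10
    print(map_adapt_mean(prior, data, tau=10.0))
```

With no frames the function returns the prior mean unchanged, which is the behavior that makes this kind of adaptation attractive for tied states with sparse training data.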

Type: Grant

Filed: May 8, 2001

Date of Patent: September 8, 2009

Assignee: Intel Corporation

Inventors: Xiaoxing Liu, Baosheng Yuan, Yonghong Yan

Method and system to scale down a decision tree-based hidden markov model (HMM) for speech recognition

Patent number: 7472064

Abstract: A method and system are provided in which a decision tree-based model (“general model”) is scaled down (“trim-down”) for a given task. The trim-down model can be adapted for the given task using task specific data. The general model can be based on a hidden Markov model (HMM). By allowing a decision tree-based acoustic model (“general model”) to be scaled according to the vocabulary of the given task, the general model can be configured dynamically into a trim-down model, which can be used to improve speech recognition performance and reduce system resource utilization. Furthermore, the trim-down model can be adapted/adjusted according to task specific data, e.g., task vocabulary, model size, or other like task specific data.

Type: Grant

Filed: September 30, 2000

Date of Patent: December 30, 2008

Assignee: Intel Corporation

Inventors: Qing Guo, Yonghong Yan, Baosheng Yuan

Method, apparatus, and system for building a compact model for large vocabulary continuous speech recognition (LVCSR) system

Patent number: 7454341

Abstract: According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.
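Illustrative note (not part of the patent record): the sketch below shows the general idea of splitting Gaussian mean vectors into sub-vector sets and building one codebook per set. It uses plain K-means rather than the modified process described in the abstract, so the dynamic merging and splitting of clusters is deliberately omitted. All function names and the choice of dimension groups are assumptions for illustration.

```python
def split_into_subvectors(vectors, stream_dims):
    """Split each vector into sub-vectors, one sub-vector set per dimension group."""
    return [[[v[d] for d in dims] for v in vectors] for dims in stream_dims]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_codebook(points, k, iterations=20):
    """Plain K-means (no merge/split step); assumes len(points) >= k."""
    # Deterministic initialization from the first k points keeps the sketch simple.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each non-empty centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


if __name__ == "__main__":
    # Four hypothetical 4-dimensional mean vectors, split into two 2-dimensional sets.
    means = [
        [0.0, 10.0, 1.0, 5.0],
        [0.2, 10.2, 1.0, 5.0],
        [4.0, 20.0, 1.0, 5.0],
        [4.2, 20.2, 1.0, 5.0],
    ]
    subsets = split_into_subvectors(means, [[0, 1], [2, 3]])
    codebooks = [kmeans_codebook(subset, k=2) for subset in subsets]
    print(codebooks[0])
```

After clustering, each Gaussian can be stored as a short list of codebook indices (one per sub-vector set) instead of a full mean vector, which is where the size reduction in this family of techniques comes from.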

Type: Grant

Filed: September 30, 2000

Date of Patent: November 18, 2008

Assignee: Intel Corporation

Inventors: Jielin Pan, Baosheng Yuan

Method, apparatus, and system for bottom-up tone integration to Chinese continuous speech recognition system

Patent number: 7181391

Abstract: According to one aspect of the invention, a method is provided in which knowledge about tone characteristics of a tonal syllabic language is used to model speech at various levels in a bottom-up speech recognition structure. The various levels in the bottom-up recognition structure include the acoustic level, the phonetic level, the word level, and the sentence level. At the acoustic level, pitch is treated as a continuous acoustic variable and pitch information extracted from the speech signal is included as a feature component of feature vectors. At the phonetic level, main vowels having the same phonetic structure but different tones are defined and modeled as different phonemes. At the word level, a set of tone change rules is used to build transcription for training data and pronunciation lattice for decoding. At the sentence level, a set of sentence ending words with light tone is also added to the system vocabulary.

Type: Grant

Filed: September 30, 2000

Date of Patent: February 20, 2007

Assignee: Intel Corporation

Inventors: Ying Jia, Yonghong Yan, Baosheng Yuan

Search method based on single triphone tree for large vocabulary continuous speech recognizer

Patent number: 6980954

Abstract: A search method based on a single triphone tree for a large vocabulary continuous speech recognizer is disclosed in which speech signals are received. Tokens are propagated in a phonetic tree to integrate a language model to recognize the received speech signals. By propagating tokens, which are preserved in tree nodes and record the path history, a single triphone tree can be used in a one pass searching process, thereby reducing speech recognition processing time and system resource use.

Type: Grant

Filed: September 30, 2000

Date of Patent: December 27, 2005

Assignee: Intel Corporation

Inventors: Quingwei Zhao, Zhiwei Lin, Yonghong Yan, Baosheng Yuan

Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (lvcsr) system

Publication number: 20050228666

Abstract: According to one aspect of the invention, a method is provided in which a set of multiple mixture monophone models is created and trained to generate a set of multiple mixture context independent models. A set of single mixture triphone models is created and trained to generate a set of context dependent models. Corresponding states of the triphone models are clustered to obtain a set of tied states based on a decision tree clustering process. Parameters of the context dependent models are estimated using a data dependent maximum a posteriori (MAP) adaptation method in which parameters of the tied states of the context dependent models are derived by adapting corresponding parameters of the context independent models using the training data associated with the respective tied states.

Type: Application

Filed: May 8, 2001

Publication date: October 13, 2005

Inventors: Xiaoxing Liu, Baosheng Yuan, Yonghong Yan

Method and apparatus for tone-sensitive acoustic modeling

Patent number: 5884261

Abstract: Tone-sensitive acoustic models are generated by first generating acoustic vectors which represent the input data. The input data is separated into multiple frames and an acoustic vector is generated for each frame which represents the input data over its corresponding frame. A tone-sensitive parameter is then generated for each of the frames which indicates the tone of the input data at its corresponding frame. Tone-sensitive parameters are generated in accordance with two embodiments. First, a pitch detector may be used to calculate a pitch for each of the frames. If a pitch cannot be detected for a particular frame, then a pitch is created for that frame based on the pitch values of surrounding frames. Second, the cross covariance between the autocorrelation coefficients for each frame and its successive frame may be generated and used as the tone-sensitive parameter.
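Illustrative note (not part of the patent record): the abstract says that a pitch value is created for frames where none can be detected, based on surrounding frames, but it does not say how. The sketch below assumes linear interpolation between the nearest frames with a detected pitch, and copies the nearest detected value at the edges. The function name `fill_missing_pitch` and the use of `None` for undetected frames are assumptions for illustration.

```python
def fill_missing_pitch(pitch):
    """Fill frames with no detected pitch (None) from neighboring frames.

    Assumption: linear interpolation between the nearest detected frames;
    leading and trailing gaps copy the nearest detected value.
    """
    filled = list(pitch)
    detected = [i for i, p in enumerate(pitch) if p is not None]
    if not detected:
        return filled  # nothing to interpolate from

    for i, value in enumerate(pitch):
        if value is not None:
            continue
        left = max((j for j in detected if j < i), default=None)
        right = min((j for j in detected if j > i), default=None)
        if left is None:
            filled[i] = pitch[right]
        elif right is None:
            filled[i] = pitch[left]
        else:
            # Linear interpolation between the two surrounding detected frames.
            span = right - left
            filled[i] = pitch[left] + (pitch[right] - pitch[left]) * (i - left) / span
    return filled


if __name__ == "__main__":
    track = [None, 100.0, None, None, 130.0, None]
    print(fill_missing_pitch(track))
```

Filling the gaps gives every frame a pitch value, so the tone-sensitive parameter can be appended to each frame's acoustic vector without special handling for undetected frames.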

Type: Grant

Filed: July 7, 1994

Date of Patent: March 16, 1999

Assignee: Apple Computer, Inc.

Inventors: Peter V. de Souza, Adam B. Fineberg, Hsiao-Wuen Hon, Baosheng Yuan