Patents Assigned to ATR Interpreting Telephony Research Laboratories
  • Patent number: 6058365
    Abstract: Continuous speech is recognized by selecting among hypotheses, consisting of candidate symbol strings formed by connecting the phonemes whose corresponding Hidden Markov Models (HMMs) have the highest probability, by referring to a phoneme-context-dependent HMM from the input speech using an HMM phoneme verification portion. A phoneme-context-dependent LR (left-to-right) parser portion predicts the subsequent phoneme by referring to an action-specifying item stored in the LR parsing table, and predicts the phoneme context around the predicted phoneme using an action-specifying item of the same table.
    Type: Grant
    Filed: July 6, 1993
    Date of Patent: May 2, 2000
    Assignee: ATR Interpreting Telephony Research Laboratories
    Inventors: Akito Nagai, Kenji Kita, Shigeki Sagayama
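    The abstract describes a predict-and-verify loop: the LR parser proposes which phonemes the grammar allows next, and the HMM verifier scores each proposal against the speech. A minimal Python sketch of that idea follows; the toy table, phoneme names, and log-likelihood scores are illustrative stand-ins, not taken from the patent.

```python
import heapq

# Hypothetical LR-style parsing table: parse state -> {predicted phoneme: next state}.
# In the patent this is derived from a context-free grammar; here it is hand-made.
LR_TABLE = {
    0: {"k": 1, "s": 2},
    1: {"a": 3},
    2: {"a": 3},
    3: {"i": 4, "o": 4},
    4: {},  # accepting state
}

def recognize(frame_scores, table=LR_TABLE, beam=3):
    """Predict-and-verify beam search: the table proposes phonemes, and
    per-step phoneme log-likelihoods (standing in for HMM verification
    scores) rank the surviving hypotheses."""
    hyps = [(0.0, "", 0)]  # (accumulated cost, phoneme string, parse state)
    for scores in frame_scores:            # one dict of log-likelihoods per step
        new_hyps = []
        for cost, string, state in hyps:
            for ph, nxt in table[state].items():      # parser predicts phonemes
                new_hyps.append((cost - scores.get(ph, -1e9), string + ph, nxt))
        hyps = heapq.nsmallest(beam, new_hyps)        # keep the best hypotheses
    best = min(h for h in hyps if not table[h[2]])    # must reach accepting state
    return best[1]
```

Run on three steps of scores, the grammar constrains the search so only strings the table can derive (here "kai", "kao", "sai", "sao") are ever hypothesized.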
  • Patent number: 5677988
    Abstract: An automated method of generating a phoneme-context-dependent subword model for speech recognition, in which static and dynamic features of speech are modeled as a chain of a plurality of output probability density distributions of a Hidden Markov Model. The method determines, for each model, the phoneme context class serving as its model unit, the number of states used to represent it, the relationship of state sharing among the plurality of models, and its output probability density distribution, by repeatedly splitting the small number of states provided in an initial Hidden Markov Model according to a prescribed criterion on the probabilistic model.
    Type: Grant
    Filed: September 21, 1995
    Date of Patent: October 14, 1997
    Assignee: ATR Interpreting Telephony Research Laboratories
    Inventors: Jun-ichi Takami, Shigeki Sagayama
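    The core idea above is top-down model growth: start from a few states and repeatedly split the state that fits its data worst. A highly simplified Python sketch follows; it uses 1-D data and a maximum-variance splitting criterion as a stand-in for the patent's likelihood-based criterion, and models each state's output density as a single Gaussian summarized by (mean, variance). All names and choices here are illustrative assumptions.

```python
import statistics

def split_states(data, max_states=4):
    """Successive state splitting, heavily simplified: begin with one state
    holding all samples, then repeatedly split the state with the largest
    variance (stand-in criterion) by dividing its samples at the mean."""
    states = [sorted(data)]
    while len(states) < max_states:
        # pick the state whose output distribution fits worst (max variance)
        i = max(range(len(states)),
                key=lambda j: statistics.pvariance(states[j]) if len(states[j]) > 1 else 0.0)
        s = states.pop(i)
        if len(s) < 2:
            states.append(s)
            break
        m = statistics.mean(s)
        left = [x for x in s if x <= m]
        right = [x for x in s if x > m]
        if not right:                      # all samples identical: cannot split
            states.append(s)
            break
        states.extend([left, right])
    # summarize each state's output density as (mean, variance)
    return [(statistics.mean(s), statistics.pvariance(s)) for s in states]
```

On well-separated data the splits recover the underlying clusters, mirroring how repeated splitting lets the model topology grow to match the data.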
  • Patent number: 5555345
    Abstract: The present invention is a learning method for a neural network that identifies N categories using a data set consisting of N categories. One learning sample is extracted from the learning sample set in step SP1, and the distances between that sample and all the learning samples are obtained in step SP2. The n closest samples are obtained for each category in step SP3, and a similarity for each category is computed from those distances using the similarity conversion function f(d) = exp(−α·d²). In step SP4, the similarity for each category is used as a target signal for the extracted learning sample, and the process returns to the initial step until target signals have been determined for all the learning samples. Once target signals are determined for all the learning samples, in step SP5 the neural network is trained by back-propagation using the learning samples and the obtained target signals.
    Type: Grant
    Filed: March 3, 1992
    Date of Patent: September 10, 1996
    Assignee: ATR Interpreting Telephony Research Laboratories
    Inventors: Yasuhiro Komori, Shigeki Sagayama
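    Steps SP1 through SP4 of this abstract (distances to all samples, n nearest per category, similarity conversion f(d) = exp(−α·d²), similarities as soft targets) can be sketched directly. A minimal NumPy version follows; the function name, parameter defaults, and averaging of the n similarities are assumptions for illustration.

```python
import numpy as np

def make_soft_targets(samples, labels, n_neighbors=2, alpha=0.5):
    """Compute per-category soft target signals for each learning sample.

    For every sample, distances to all samples are taken (step SP2), the n
    closest samples per category are found (step SP3), and each distance is
    converted to a similarity via f(d) = exp(-alpha * d**2); the mean
    similarity per category becomes the sample's target vector (step SP4).
    """
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    categories = np.unique(labels)
    targets = np.zeros((len(samples), len(categories)))
    for i, x in enumerate(samples):
        d = np.linalg.norm(samples - x, axis=1)           # distances to all samples
        for c, cat in enumerate(categories):
            d_cat = np.sort(d[labels == cat])[:n_neighbors]    # n closest in category
            targets[i, c] = np.exp(-alpha * d_cat**2).mean()   # similarity conversion
    return targets
```

The resulting target vectors would then replace one-hot labels when training the network by back-propagation (step SP5); samples near a category boundary get graded targets rather than hard 0/1 values.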
  • Patent number: 5307442
    Abstract: Input speech of a reference speaker, who wants to convert his/her voice quality, and speech of a target speaker are converted into digital signals by an analog-to-digital (A/D) converter. The digital signals are then subjected to speech analysis by a linear predictive coding (LPC) analyzer. Speech data of the reference speaker is processed into speech segments by a speech segmentation unit. A speech segment correspondence unit makes a dynamic programming (DP) based correspondence between the obtained speech segments and training speech data of the target speaker, thereby producing a speech segment correspondence table. A speaker individuality conversion is then made on the basis of the speech segment correspondence table by a speech individuality conversion and synthesis unit.
    Type: Grant
    Filed: September 17, 1991
    Date of Patent: April 26, 1994
    Assignee: ATR Interpreting Telephony Research Laboratories
    Inventors: Masanobu Abe, Shigeki Sagayama
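    The DP-based correspondence step above is essentially a dynamic-time-warping alignment between the two speakers' segment sequences. A minimal Python sketch follows; the squared-Euclidean local distance and the tiny feature tuples are illustrative assumptions, and the returned index pairs stand in for the patent's speech segment correspondence table.

```python
def dp_correspondence(ref, tgt):
    """DP (DTW-style) alignment between a reference speaker's speech
    segments and a target speaker's segments, each segment given as a
    feature vector; returns aligned (ref_index, tgt_index) pairs."""
    n, m = len(ref), len(tgt)
    INF = float("inf")
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    # D[i][j] = minimal accumulated distance aligning ref[:i] with tgt[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(ref[i - 1], tgt[j - 1]) + min(D[i - 1][j - 1],
                                                         D[i - 1][j],
                                                         D[i][j - 1])
    # backtrack to recover the aligned segment pairs (the correspondence table)
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j - 1), (i - 1, j), (i, j - 1)],
                   key=lambda p: D[p[0]][p[1]])
    return path[::-1]
```

Each pair in the table links a reference segment to its best-matching target segment, which is what the conversion-and-synthesis stage would consult to impose the target speaker's individuality.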