Patents Examined by Michelle Doerrler
  • Patent number: 5485543
    Abstract: A method for speech analysis and synthesis for obtaining synthesized speech of a high quality includes the steps of determining a short-period power spectrum by performing an FFT operation on a speech wave, sampling the spectrum at the positions corresponding to the multiples of a basic frequency, applying a cosine polynomial model to the thus obtained sample points to determine the spectrum envelope thereat, then calculating the mel cepstrum coefficients from the spectrum envelope, and effecting speech synthesis, utilizing the mel cepstrum coefficients as the filter coefficients in a synthesizing (logarithmic mel spectrum approximation) filter.
    Type: Grant
    Filed: June 8, 1994
    Date of Patent: January 16, 1996
    Assignee: Canon Kabushiki Kaisha
    Inventor: Takashi Aso
  • Patent number: 5481642
    Abstract: In Code Excited Linear Predictive (CELP) coding, stochastic (noise-like) excitation is used in exciting a cascade of long-term and short-term all-pole linear synthesis filters. This approach is based on the observation that the ideal excitation, obtained by inverse-filtering the speech signal, can be modeled for simplicity as Gaussian white noise. Although such stochastic excitation resembles the ideal excitation in its global statistical properties, it contains a noisy component that is irrelevant to the synthesis process. This component introduces some roughness and noisiness in the synthesized speech. The present invention reduces this effect by adaptively controlling the level of the stochastic excitation. The proposed control mechanism links the stochastic excitation to the long-term predictor in such a way that the excitation level is inversely related to the efficiency of the predictor.
    Type: Grant
    Filed: August 8, 1994
    Date of Patent: January 2, 1996
    Assignee: AT&T Corp.
    Inventor: Yair Shoham
  • Patent number: 5479563
    Abstract: The present invention extracts boundaries from a sentence with no need for linguistic knowledge or complicated grammatical rules. Upon extracting a clause/phrase boundary, words are classified according to part-of-speech numbers of words which form inputted sentence information. Then, an input pattern representing part-of-speech numbers of a target word is checked to determine whether a clause/phrase boundary exists before or after the target word; a plurality of words before and after the target words is then applied to a neural network. Among units in the output layer of the neural network, a unit having the output larger than a threshold is determined to refer to a clause/phrase boundary of the target word. Upon extracting a subject-predicate boundary, words are classified in word number, and an input pattern corresponding to a plurality of words is applied to the neural network.
    Type: Grant
    Filed: December 28, 1993
    Date of Patent: December 26, 1995
    Assignee: Fujitsu Limited
    Inventor: Yukiko Yamaguchi
  • Patent number: 5479559
    Abstract: A method for excitation synchronous time encoding of speech signals. The method includes steps of providing an input speech signal, processing the input speech signal to characterize qualities including linear predictive coding (LPC) coefficients, epoch length and voicing and characterizing the input speech signals on a single epoch time domain basis when the input speech signals comprise voiced speech to provide a parameterized voiced excitation function. The method further includes steps of characterizing the input speech signals for at least a portion of a frame when the input speech signals comprise unvoiced speech to provide a parameterized unvoiced excitation function and encoding a composite excitation function including the parameterized unvoiced excitation function and the parameterized voiced excitation function to provide a digital output signal representing the input speech signal.
    Type: Grant
    Filed: May 28, 1993
    Date of Patent: December 26, 1995
    Assignee: Motorola, Inc.
    Inventors: Bruce A. Fette, Chad S. Bergstrom, Sean S. You
  • Patent number: 5475791
    Abstract: A method for recognizing a spoken word in the presence of interfering speech, such as a system-generated voice prompt, begins by echo cancelling the voice prompt and any detected speech signal to produce a residual signal. Portions of the residual signal that have been most recently echo-cancelled are then continuously stored in a buffer. The energy in the residual signal is also continuously processed to determine onset of the spoken word. Upon detection of word onset, the portion of the residual signal then currently in the buffer is retained, the voice prompt is terminated, and the recognizer begins realtime recognition of subsequent portions of the residual signal. Upon detection of word completion, the method retrieves the portion of the residual signal that was retained in the buffer upon detection of word onset and performs recognition of that portion. The recognized portions of the word are then reconstructed to determine the spoken word.
    Type: Grant
    Filed: August 13, 1993
    Date of Patent: December 12, 1995
    Assignee: Voice Control Systems, Inc.
    Inventors: Thomas Schalk, Fadi Kaake
  • Patent number: 5475798
    Abstract: A device for assisting communication that comprises a generally rectangular enclosure of a size constructed and adapted to be held in a user's hand. A microphone is positioned within the enclosure for receiving speech acoustics and converting such acoustics into corresponding electrical signals. Information correlating speech with alphanumeric text is prestored in electronic memory positioned within the enclosure, and the speech signals received from the microphone are correlated with corresponding text in memory. A liquid crystal display (LCD) on one wall of the enclosure displays the alphanumeric text to the user substantially in real time.
    Type: Grant
    Filed: January 6, 1992
    Date of Patent: December 12, 1995
    Assignee: Handlos, L.L.C.
    Inventor: Thomas A. Handlos
  • Patent number: 5475792
    Abstract: A telephony channel simulation process is disclosed for training a speech recognizer to respond to speech obtained from telephone systems. An input speech data set is provided to a speech recognition training processor, whose bandwidth is higher than a telephone bandwidth. The process performs a series of alterations to the input speech data set to obtain a modified speech data set. The modified speech data set enables the speech recognition processor to perform speech recognition on voice signals from a telephone system.
    Type: Grant
    Filed: February 24, 1994
    Date of Patent: December 12, 1995
    Assignee: International Business Machines Corporation
    Inventors: Vince M. Stanford, Norman F. Brickman
  • Patent number: 5475790
    Abstract: In order to reduce the number of update operations for determining LPC (viz., reflection) data from incoming sampled data, a plurality of matrices representative of autocorrelation functions are set in memory. Subsequently, data representative of the upper triangular portions of the matrices (virtual upper triangular matrices) are extracted from the memory and arranged into an array. This array is then updated using a j-th reflection coefficient, after which the value of j is incremented and the updating is repeated until a predetermined number of updates is completed.
    Type: Grant
    Filed: March 4, 1994
    Date of Patent: December 12, 1995
    Assignee: NEC Corporation
    Inventor: Mayumi Nagasaki
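For orientation, the relationship between autocorrelation values and reflection coefficients mentioned in the abstract above can be sketched with the textbook Levinson-Durbin recursion. This is a swapped-in standard technique, not the array-update scheme the patent claims; the function name `reflection_coefficients` and the sign convention are assumptions made for illustration.

```python
def reflection_coefficients(r, order):
    """Textbook Levinson-Durbin recursion (illustrative, not the patented method).

    r: autocorrelation values r[0..order]
    returns: list of `order` reflection coefficients (sign conventions vary
    between references; here k_j is the last coefficient of the order-j
    prediction-error polynomial).
    """
    a = [1.0] + [0.0] * order  # prediction-error polynomial coefficients
    err = float(r[0])          # prediction error energy
    ks = []
    for j in range(1, order + 1):
        if err == 0.0:
            raise ValueError("zero prediction error; cannot continue")
        acc = r[j] + sum(a[i] * r[j - i] for i in range(1, j))
        k = -acc / err
        ks.append(k)
        updated = a[:]
        for i in range(1, j):
            updated[i] = a[i] + k * a[j - i]
        updated[j] = k
        a = updated
        err *= 1.0 - k * k
    return ks


# Autocorrelation of an idealized first-order process: r[n] = 0.5 ** n
ks = reflection_coefficients([1.0, 0.5, 0.25], order=2)
```

With this input only the first coefficient is non-zero, which is what a first-order process should produce.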
  • Patent number: 5473729
    Abstract: A portable, continuous loop, microchip recording device which stores sound prior to and for a preset period after a timer is triggered by either a manual switch, a preprogrammed acoustical signature, or when a preset decibel or pressure level is reached.
    Type: Grant
    Filed: September 30, 1992
    Date of Patent: December 5, 1995
    Inventors: David P. Bryant, Gene M. Nitschke
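The continuous-loop behavior described in the abstract above (keep the most recent sound, then record for a preset period after a trigger) can be sketched in software with a fixed-length buffer. This is only an illustration under assumed names (`LoopRecorder`, `feed`) and sample-count parameters; the patent describes a hardware device, and none of these identifiers come from it.

```python
from collections import deque


class LoopRecorder:
    """Hypothetical sketch: retain pre-trigger samples plus a fixed post-trigger run."""

    def __init__(self, pre_samples, post_samples):
        self.pre = deque(maxlen=pre_samples)  # oldest samples fall off automatically
        self.post_samples = post_samples
        self.post = []
        self.triggered = False

    def feed(self, sample, trigger=False):
        """Accept one sample; return the full clip once it is complete, else None."""
        if not self.triggered:
            self.pre.append(sample)
            if trigger:
                self.triggered = True
            return None
        self.post.append(sample)
        if len(self.post) == self.post_samples:
            return list(self.pre) + self.post
        return None


recorder = LoopRecorder(pre_samples=3, post_samples=2)
clip = None
for sample in range(10):
    result = recorder.feed(sample, trigger=(sample == 5))
    if result is not None:
        clip = result
        break
```

Here the trigger fires on sample 5, so the clip holds the three samples up to and including the trigger followed by the next two.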
  • Patent number: 5473728
    Abstract: A method for training a speech recognizer in a speech recognition system is described. The method of the present invention comprises the steps of providing a data base containing acoustic speech units, generating a homoscedastic hidden Markov model from the acoustic speech units in the data base, and loading the homoscedastic hidden Markov model into the speech recognizer. The hidden Markov model loaded into the speech recognizer has a single covariance matrix which represents the tied covariance matrix of every Gaussian probability density function (PDF) for every state of every hidden Markov model structure in the homoscedastic hidden Markov model.
    Type: Grant
    Filed: February 24, 1993
    Date of Patent: December 5, 1995
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventors: Tod E. Luginbuhl, Michael L. Rosseau, Roy L. Streit
  • Patent number: 5467425
    Abstract: The present invention is an n-gram language modeler which significantly reduces the memory storage requirement and convergence time for language modelling systems and methods. The present invention aligns each n-gram with one of "n" number of non-intersecting classes. A count is determined for each n-gram representing the number of times each n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts factors are determined, one factor for each class, using an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is determined using these factors.
    Type: Grant
    Filed: February 26, 1993
    Date of Patent: November 14, 1995
    Assignee: International Business Machines Corporation
    Inventors: Raymond Lau, Ronald Rosenfeld, Salim Roukos
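As a point of reference for the abstract above, the n-gram counts it starts from, and the quantity it ultimately estimates (the probability of a word given the previous two words), can be sketched as follows. This uses a plain relative-frequency estimate, not the class-based iterative scaling the patent describes; the function names are hypothetical.

```python
from collections import Counter


def trigram_counts(tokens):
    """Count how often each consecutive word triple occurs in the training data."""
    return Counter(zip(tokens, tokens[1:], tokens[2:]))


def trigram_probability(counts, w1, w2, w3):
    """Relative-frequency estimate of P(w3 | w1, w2); returns 0.0 for unseen contexts."""
    context_total = sum(c for (a, b, _), c in counts.items() if (a, b) == (w1, w2))
    if context_total == 0:
        return 0.0
    return counts[(w1, w2, w3)] / context_total


tokens = "the cat sat on the cat mat".split()
counts = trigram_counts(tokens)
p = trigram_probability(counts, "the", "cat", "sat")
```

The context "the cat" occurs twice in this toy corpus, once followed by "sat", so the estimate is one half.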
  • Patent number: 5465318
    Abstract: The method disclosed herein facilitates the generation of a recognition model for a non-standard word uttered by a user in the context of a large vocabulary speech recognition system in which standard vocabulary models are represented by sequences of probability distributions for various acoustic symbols. Along with the probability distributions, a corresponding plurality of converse probability functions are precalculated which represent the likelihood that a particular probability distribution would correspond to a given input acoustic symbol. For a non-standard word uttered, a corresponding sequence of acoustic symbols is generated and, for each such symbol in the sequence, the most likely probability distribution is selected using the converse probability functions.
    Type: Grant
    Filed: June 18, 1993
    Date of Patent: November 7, 1995
    Assignee: Kurzweil Applied Intelligence, Inc.
    Inventor: Vladimir Sejnoha
  • Patent number: 5463713
    Abstract: An apparatus for synthesizing speech from text includes a language processing section which determines an accent environment of each mora of the text. In a basic accent pattern table, a basic accent pattern is classified according to the accent environment of the mora. The basic accent pattern includes pitch data which is edited from real voice data according to the accent environment. A basic accent pattern processing section selects the basic accent pattern of each mora from the basic accent pattern table according to the accent environment and processes the basic accent pattern in pitch according to the accent environment. A correcting section receives the corrected pitch data in the basic accent pattern processing section and corrects the corrected pitch data according to the number of mora in each phrase and the position of the mora in the phrase so as to correct the data into the corrected accent component.
    Type: Grant
    Filed: April 21, 1994
    Date of Patent: October 31, 1995
    Assignee: Kabushiki Kaisha Meidensha
    Inventor: Kazsuya Hasegawa
  • Patent number: 5463716
    Abstract: A frequency bandwidth of a speech signal is divided into a plurality of partial bandwidths. Formant information is extracted on the basis of LPC information developed for the respective partial bandwidths. At least one partial bandwidth may overlap upon the preceding bandwidth. The boundary frequencies of the partial bandwidths can be determined based on the frequency envelope of the speech signal.
    Type: Grant
    Filed: January 18, 1994
    Date of Patent: October 31, 1995
    Assignee: NEC Corporation
    Inventor: Tetsu Taguchi
  • Patent number: 5463715
    Abstract: Speech generation from phonetic code is carried out by a microcomputer based system which stores digitized waveform segments and appropriately joins the segments and outputs them to a digital to analog converter and then to a speaker. An allophone is generated for each phoneme designated by the phonetic codes according to the articulation type of each adjacent phoneme. Each phoneme is classified as neutral, labial, glottal, or medial according to its effect on the articulation of adjacent phonemes. Each phoneme is characterized by at least one center waveform dependent on the phonetic code, and an initial waveform and a final waveform, each of which depend on the phonetic code and the articulation type of the neighboring phoneme. Tables of waveform pointers are accessed according to phonetic code and articulation type, and other tables provide articulation types, times of each waveform portion, transition rate, fricative state, and pitch for each phonetic code.
    Type: Grant
    Filed: December 30, 1992
    Date of Patent: October 31, 1995
    Assignee: Innovation Technologies
    Inventor: Richard T. Gagnon
  • Patent number: 5461697
    Abstract: A speaker recognition system for recognizing a speaker from an input voice using a neural network, in which a feature quantity extracted from the input voice is timewise averaged to create an input pattern to the neural network. The averaging technique is such that the input voice is equally divided timewise into a plurality of blocks in a simple manner and that such feature quantity is averaged every block. The feature quantity includes a frequency characteristic, pitch frequency, linear prediction coefficient, and partial self-correlation (PARCOR) coefficient of the voice.
    Type: Grant
    Filed: November 12, 1993
    Date of Patent: October 24, 1995
    Assignee: Sekisui Kagaku Kogyo Kabushiki Kaisha
    Inventors: Shingo Nishimura, Masashi Miyakawa, Masayuki Umino, Shigenobu Nonaka
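The timewise averaging step in the abstract above (divide the input equally into blocks, then average the feature quantity within each block) can be sketched as below. The function name `block_average` and the list-of-lists frame layout are assumptions for illustration; the patent does not specify a data format, and the feature values here are made up.

```python
def block_average(frames, num_blocks):
    """Average per-frame feature vectors over equal-length time blocks.

    frames: list of feature vectors, one per analysis frame
    returns: list of num_blocks averaged vectors (a fixed-size input pattern)
    """
    n = len(frames)
    if num_blocks <= 0 or n < num_blocks:
        raise ValueError("need at least one frame per block")
    dims = len(frames[0])
    pattern = []
    for b in range(num_blocks):
        # Integer boundaries keep blocks as equal as possible when n is not divisible
        start = b * n // num_blocks
        end = (b + 1) * n // num_blocks
        block = frames[start:end]
        pattern.append([sum(f[d] for f in block) / len(block) for d in range(dims)])
    return pattern


# Six frames with two hypothetical features each, reduced to three blocks
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0], [10.0, 11.0]]
pattern = block_average(frames, 3)
```

Whatever the utterance length, the result always has `num_blocks` vectors, which is what lets it serve as a fixed-size network input.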
  • Patent number: 5457782
    Abstract: A digital voice processing circuit board having use in a voice processing system wherein voice processing functions are run in software. This application of software allows a modular structure since the application software resides in boards that are coupled to a host computer. With this structure, the software can be updated as required and the capacity of the system can be expanded readily to meet increased needs. The digital voice processing board has an interface chip to which random access memories are connected for temporary storage of data and storage of the operating code for the voice processing card. The interface is in communication with an application processor which runs the application programming and database management. The application processor is in communication with and controls a pair of signal processors which are in communication with a time division multiplexer chip which in turn is in communication with the bus.
    Type: Grant
    Filed: September 29, 1993
    Date of Patent: October 10, 1995
    Assignee: Dictaphone Corporation
    Inventors: Daniel F. Daly, Thomas C. Grandy, Mark N. Harris, Salvatore J. Morlando, Mark Sekas, Shamla V. Sharma
  • Patent number: 5454062
    Abstract: A method for identifying any one of a plurality of utterances using a programmed digital computing system, each utterance having an audible form representable by a sequence of speech elements each having a respective position in the sequence. In the computing system, a digital representation corresponding to each of the plurality of utterances is stored, and a respective identifying designation is assigned to each utterance.
    Type: Grant
    Filed: December 31, 1992
    Date of Patent: September 26, 1995
    Assignee: Audio Navigation Systems, Inc.
    Inventor: Charles La Rue
  • Patent number: 5450525
    Abstract: A voice controlled vehicle accessory system is responsive to voice commands and/or manual commands, wherein the manual commands are entered via a single pushbutton having multiple functions depending upon the instantaneous state of a system controller. The present invention provides an accessory control which avoids false activation of the system by the voice recognition unit while improving accessibility of the accessory to a user by providing manual control.
    Type: Grant
    Filed: November 12, 1992
    Date of Patent: September 12, 1995
    Inventors: Donald P. Russell, Paul E. Duffy
  • Patent number: 5450522
    Abstract: A method and system are provided for alleviating the harmful effects of convolutional distortions of speech, such as the effect of a telecommunication channel, on the performance of an automatic speech recognizer (ASR). The technique is based on the filtering of time trajectories of an auditory-like spectrum derived from the Perceptual Linear Predictive (PLP) method of speech parameter estimation.
    Type: Grant
    Filed: August 19, 1991
    Date of Patent: September 12, 1995
    Assignees: U S West Advanced Technologies, Inc., International Computer Science Institute
    Inventors: Hynek Hermansky, Nelson H. Morgan, Philip D. Kohn