Patents Examined by Robert C. Mattson
  • Patent number: 5940794
    Abstract: A boundary estimation method capable of readily learning the probability of existence of a boundary in speech and a speech recognition apparatus with high precision and less model calculation. In a learning mode, an estimator estimates distributions of boundary samples and non-boundary samples. In an estimation mode, a likelihood calculator calculates a likelihood of a boundary from a boundary probability density and a non-boundary probability density.
    Type: Grant
    Filed: July 15, 1996
    Date of Patent: August 17, 1999
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Yoshiharu Abe
  • Patent number: 5845092
    Abstract: A stand-alone, real-time voice recognition system that converts an analog voice signal into a serial digital signal, then preprocesses the digital signal in parallel to detect the end-point and output fixed multi-order prediction coefficients. In this recognition system, these multi-order prediction coefficients are stored as the reference pattern in the training mode. In recognition mode, these multi-order prediction coefficients are adapted by a dynamic time warping method, which is modified by a symmetric form. This symmetric form is implemented with a one-dimensional circular buffer for dynamic programming matching instead of the traditional two-dimensional buffer to save memory space. Finally, these adapted coefficients are compared with the reference pattern to output the result of recognition.
    Type: Grant
    Filed: April 14, 1995
    Date of Patent: December 1, 1998
    Assignee: Industrial Technology Research Institute
    Inventor: Chau-Kai Hsieh
  • Patent number: 5841945
    Abstract: In the present invention, the voice signals are A/D converted at a predetermined period of t and are divided into two frequency sides, namely a high frequency side and a low frequency side. The high frequency side voice signals are then converted into data having a lower sampling frequency than that of the A/D conversion after performing a voice tone level conversion toward a lower frequency side and without performing thinned out sampling of the high frequency side voice signals. On the other hand, the sampling data of the low frequency side are thinned out, so the sampling frequency is equivalently reduced. Thereby, the amount of data to be stored is reduced and the reduced data are stored in a memory.
    Type: Grant
    Filed: December 22, 1994
    Date of Patent: November 24, 1998
    Assignee: Rohm Co., Ltd.
    Inventor: Shuji Nishitani
  • Patent number: 5832181
    Abstract: A speech-recognition system for recognizing isolated words includes pre-processing circuitry for performing analog-to-digital conversion and cepstral analysis, and a plurality of neural networks which compute discriminant functions based on polynomial expansions. The system may be implemented using hardware, software, or any combination of hardware and software components. The speech wave-form of a spoken word is analyzed and converted into a sequence of data frames. The sequence of frames is partitioned into data blocks, and the data blocks are then broadcast to a plurality of neural networks. Using the data blocks, the neural networks compute polynomial expansions. The output of the neural networks is used to determine the identity of the spoken word. The neural networks utilize a training algorithm which does not require repetitive training and which yields a global minimum to each given set of training examples.
    Type: Grant
    Filed: June 17, 1996
    Date of Patent: November 3, 1998
    Assignee: Motorola Inc.
    Inventor: Shay-Ping Thomas Wang
  • Patent number: 5796916
    Abstract: In a synthetic speech system intonation of a natural utterance is automatically applied to a synthesized utterance. The present invention applies the desired intonation of the natural utterance to the synthesized utterance by aligning voicing sections of the natural utterance to the synthesized utterance. The voicing sections are initially delineated by voiced versus unvoiced, based on default voicing specifications for the synthetic utterance and on pitch tracker analysis of the natural utterance, and an attempt is made to align individual sections thereby. If no initial alignment occurs then a further attempt is made by varying the default voicing specifications of the synthesized utterance. If alignment is still not achieved, then each of the utterances, natural and synthetic, is considered a single large voicing section, which thus forces alignment therebetween.
    Type: Grant
    Filed: May 26, 1995
    Date of Patent: August 18, 1998
    Assignee: Apple Computer, Inc.
    Inventor: Scott E. Meredith
  • Patent number: 5787231
    Abstract: A voice enunciation system and method provides a user with the capability to sound out text files. As the files are audibly played, if the user is not satisfied with the pronunciation of a particular word, the system provides the user with the means of replacing the word with his own particular pronunciation. The preferred pronunciation is also stored in an override dictionary so that any subsequent encounter with that particular word is pronounced correctly.
    Type: Grant
    Filed: February 2, 1995
    Date of Patent: July 28, 1998
    Assignee: International Business Machines Corporation
    Inventors: William Johnson, Owen Weber
  • Patent number: 5751903
    Abstract: The present invention provides a multi-mode CELP encoding and decoding method and device for digitized speech signals providing improvements over prior art codecs and coding methods by selectively utilizing backward prediction for the short-term predictor parameters and fixed codebook gain of a speech signal. In order to achieve these improvements, the present invention provides a coding method comprising the steps of classifying a segment of the digitized speech signal as one of a plurality of predetermined modes, determining a set of unquantized line spectral frequencies to represent the short-term predictor parameters for that segment, and quantizing the determined set of unquantized line spectral frequencies using a mode-specific combination of scalar quantization and vector quantization, which utilizes backward prediction for modes with voiced speech signals.
    Type: Grant
    Filed: December 19, 1994
    Date of Patent: May 12, 1998
    Assignee: Hughes Electronics
    Inventors: Kumar Swaminathan, Murthy Vemuganti
  • Patent number: 5751902
    Abstract: An adaptive prediction filter which provides recursive calculation of prediction coefficients from sampled values of segments of a sampled audio signal. In order to obtain improved computation accuracy the coefficients are in block floating point format, and are recursively calculated from reflection coefficients. Upon calculation of a kth reflection coefficient the k-1 previously computed prediction coefficients are recomputed based thereon. In the event that results in an overflow of the number of bits in the mantissa of a prediction coefficient which is being calculated, the block floating point format is adapted by increasing the common exponent of the block. That conventionally required recalculation of all previously recalculated coefficients.
    Type: Grant
    Filed: January 5, 1995
    Date of Patent: May 12, 1998
    Assignee: U.S. Philips Corporation
    Inventor: Rudolf Hofmann
  • Patent number: 5734790
    Abstract: In a speech signal encoding system comprising a maximum similarity series extracting unit (50) for producing a series of excitation pulses appearing at an equidistant time interval and an identical amplitude, the maximum similarity series extracting unit sums up, as a waveform, a series of autocorrelation coefficients to produce a series of summation result coefficients. The autocorrelation coefficients correspond to polarized pulses which are equal to one another in pulse interval and pulse amplitude. The polarized pulses form a plurality of pulse sequences which have phases different from one another.
    Type: Grant
    Filed: July 25, 1996
    Date of Patent: March 31, 1998
    Assignee: NEC Corporation
    Inventor: Tetsu Taguchi
  • Patent number: 5729657
    Abstract: The present invention relates to a method and arrangement for transforming phonemes over a shorter or longer time than an existing phoneme. The transformation takes place asymmetrically in that a basic phoneme is divided into a number of points, the said points being identified with respect to information-carrying elements in the phoneme. This provides a weighting in the phoneme between information-carrying elements and elements carrying less information. The parts of the phoneme which carry less information are transformed over a longer or, respectively, shorter time interval. Elements in the phoneme which represent information-carrying parts are transferred unchanged in time. This provides a transformation of the phoneme which retains its original character in all essentials. By the parts of the phoneme carrying less information being identified, the invention also provides an indication of where different phonemes can be fitted into one another in the creation of artificial speech.
    Type: Grant
    Filed: April 16, 1997
    Date of Patent: March 17, 1998
    Assignee: Telia AB
    Inventor: Tomas Svensson
  • Patent number: 5721807
    Abstract: A method and device for recognizing individual words of spoken speech can be used to control technical processes. The method proposed by the invention is based on feature extraction which is particularly efficient in terms of computing capacity and recognition rate, plus subsequent classification of the individual words using a neural network.
    Type: Grant
    Filed: January 21, 1994
    Date of Patent: February 24, 1998
    Assignee: Siemens Aktiengesellschaft Oesterreich
    Inventor: Wolfgang Tschirk
  • Patent number: 5717820
    Abstract: A speech recognition method and apparatus enable the use of the same recognition software for speaker dependent recognition under various running environments such as different computers. The speech recognition method operates in the form of computer software to detect at least one running environment of hardware. The hardware running environment may include processing speed of a computer. Recognition parameters are then determined which correspond to the detected running environment. Next, the speech recognition method converts an input analog speech signal into a digital speech signal. The digital speech signal, in the form of extracted feature data, is then compared with digital feature data in a dictionary. The speech is then recognized in response to a match of the data.
    Type: Grant
    Filed: March 7, 1995
    Date of Patent: February 10, 1998
    Assignee: Fujitsu Limited
    Inventors: Ryosuke Hamasaki, Shinta Kimura
  • Patent number: 5717826
    Abstract: A speech recognition method and apparatus which has a first stage to provide keyword hypotheses and a second stage to provide testing of those hypotheses by utterance verification. The utterance verification used has three separate models for each word: one keyword verification model, one misrecognition verification model, and one non-keyword verification model. Further, all three are developed independently of the recognizer keyword models. Because of this independence, the three verification models can be iteratively trained using existing speech data bases to jointly provide a minimum amount of verification errors.
    Type: Grant
    Filed: August 11, 1995
    Date of Patent: February 10, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Anand Rangaswamy Setlur, Rafid Antoon Sukkar, Joseph Lawrence LoCicero, Grzegorz Szeszko
  • Patent number: 5704003
    Abstract: An improved method of speech coding for use in conjunction with speech coding methods wherein speech is digitized into a plurality of temporally defined frames, each frame including a plurality of sub-frames, and the digitized speech is partitioned into periodic components and a residual signal. For each of a plurality of sub-frames of the residual signal, the improved method of speech coding selects and applies a time shift T to the sub-frame by applying a matching criterion to (a) the current sub-frame of the residual signal, and (b) a sample-to-sample (subframe-to-subframe) pitch delay determined by applying linear interpolation to known pitch delays occurring at or near frame-to-frame boundaries of previous frames. The matching criterion is applied by minimizing ε.
    Type: Grant
    Filed: September 19, 1995
    Date of Patent: December 30, 1997
    Assignee: Lucent Technologies Inc.
    Inventors: Willem Bastiaan Kleijn, Dror Nahumi
  • Patent number: 5701389
    Abstract: A method of encoding an audio signal is disclosed. The method comprises partitioning the audio signal into a first time block and a second time block. Next, a first time block first energy value and a first time block second energy value are calculated. Next, a second time block first energy value and a second time block second energy value are calculated. Next, the technique determines if an attack has occurred in the second time block by comparing the second time block first energy value and the second time block second energy value and also comparing the first time block and the second time block. Advantageously, the method identifies attacks such that the decoder can reproduce the attacks with little audible distortion and also affords the advantage of using long windows for portions of the audio signal that do not contain attacks.
    Type: Grant
    Filed: January 31, 1995
    Date of Patent: December 23, 1997
    Assignee: Lucent Technologies, Inc.
    Inventors: Sean Matthew Dorward, James David Johnston
  • Patent number: 5694521
    Abstract: A variable speed playback system exploits multiple-period similarities within a residual signal, and includes multiple-period template matching which may be applied to alter the excitation periodical structure, and thereby increase or decrease the rate of speech playback. Embodiments of the present invention enable accurate fast or slow speech playback for store and forward applications without changing the pitch period of the speech. A correlated multiple-period similarity measure is determined for an excitation signal within a compressor/expander. The multiple-period similarity enables overlap-and-add expansion or compression by a rational ratio. Energy variations at the onset and offset portions of the speech may be weighted by energy-based adaptive weight windows.
    Type: Grant
    Filed: January 11, 1995
    Date of Patent: December 2, 1997
    Assignee: Rockwell International Corporation
    Inventors: Eyal Shlomot, Albert Achuan Hsueh
  • Patent number: 5687280
    Abstract: A speech input device includes a speech input section for converting the input speech made by a speaker into an electric signal and outputting the electric signal, and a display section for displaying information indicating a spatial displacement of the position of the lip portion of the speaker from a predetermined position.
    Type: Grant
    Filed: October 29, 1993
    Date of Patent: November 11, 1997
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Kenji Matsui
  • Patent number: 5680508
    Abstract: A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights is used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the "noisy" vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end. The results are better spectral reproduction and significant intelligibility enhancement over prior coding approaches.
    Type: Grant
    Filed: May 12, 1993
    Date of Patent: October 21, 1997
    Assignee: ITT Corporation
    Inventor: Yu-Jih Liu
  • Patent number: 5659664
    Abstract: The invention relates to a method and an arrangement for speech synthesis and provides an automatic mechanism for simulating human speech. The method provides a number of control parameters for controlling a speech synthesis device. The invention solves the problem of coarticulation by using an interpolation mechanism. The control parameters are stored in a matrix or a sequence list for each polyphone. The behaviour of the respective parameter with time is defined around each phoneme boundary, and polyphones are joined by forming a weighted mean value of the curves which are defined by their two associated matrices or sequence lists. The invention also provides an arrangement for carrying out the method.
    Type: Grant
    Filed: June 6, 1995
    Date of Patent: August 19, 1997
    Assignee: Televerket
    Inventor: Jaan Kaja
  • Patent number: 5657425
    Abstract: A method and system for location dependent verbal command execution in a computer based control system within an installation having multiple physical locations. A specified function within each physical location, such as a lighting fixture or alarm setting, may be controlled by a selected verbal command. A microphone within each room or physical location within the installation is utilized to detect each utterance of a verbal command and the volume of each verbal command is determined for each physical location at which that command is detected. Thereafter, the physical location having the highest volume for a detected verbal command is identified and the specified function is controlled at only that location. In the event multiple speakers simultaneously utter a verbal command at different physical locations, the location of maximum volume is determined for each speaker and the specified function is controlled at only the maximum volume location associated with each speaker.
    Type: Grant
    Filed: November 15, 1993
    Date of Patent: August 12, 1997
    Assignee: International Business Machines Corporation
    Inventor: William J. Johnson
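
The selection rule described in the abstract of patent 5657425 (act only at the location where a verbal command is loudest, separately for each speaker) can be illustrated with a minimal sketch. This is not the patented implementation. The function name `select_locations`, the tuple layout, and the assumption that each detection already carries a speaker label are all hypothetical; the abstract does not say how speakers are told apart.

```python
def select_locations(detections):
    """Pick, for each speaker, the location where the command was loudest.

    `detections` is an iterable of (speaker, location, volume) tuples.
    The speaker label is an assumption made for illustration only.
    Returns a dict mapping each speaker to one location.
    """
    best = {}  # speaker -> (location, volume) of the loudest detection so far
    for speaker, location, volume in detections:
        if speaker not in best or volume > best[speaker][1]:
            best[speaker] = (location, volume)
    return {speaker: location for speaker, (location, _) in best.items()}


# Hypothetical example: two speakers, each heard by more than one microphone.
detections = [
    ("speaker_a", "kitchen", 0.4),
    ("speaker_a", "hall", 0.9),
    ("speaker_b", "garage", 0.7),
    ("speaker_b", "hall", 0.2),
]
targets = select_locations(detections)
```

In this sketch the controlled function would be triggered only at the locations in `targets`. Ties are resolved in favour of the first detection seen, which is an arbitrary choice not taken from the abstract.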