Patents Examined by Michelle Doerrler

Training module for estimating mixture Gaussian densities for speech unit models in speech recognition systems

Patent number: 5450523

Abstract: A model-training module generates mixture Gaussian density models from speech training data for continuous, or isolated word speech recognition systems. Speech feature sequences are labeled into segments of states of speech units using Viterbi-decoding based optimized segmentation algorithm. Each segment is modeled by a Gaussian density, and the parameters are estimated by sample mean and sample covariance. A mixture Gaussian density is generated for each state of each speech unit by merging the Gaussian densities of all the segments with the same corresponding label. The resulting number of mixture components is proportional to the dispersion and sample size of the training data. A single, fully merged, Gaussian density is also generated for each state of each speech unit. The covariance matrices of the mixture components are selectively smoothed by a measure of relative sharpness of the Gaussian density and the smoothing can also be done blockwise.

Type: Grant

Filed: June 1, 1993

Date of Patent: September 12, 1995

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventor: Yunxin Zhao
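
Illustrative sketch for patent 5450523 (not taken from the patent text): the abstract describes modeling each labeled segment by a Gaussian with sample mean and sample covariance, then forming a mixture per state from segments sharing a label. The Python below shows only that grouping step for scalar features. The function names, the use of scalar variance instead of a covariance matrix, and weights proportional to segment frame counts are all assumptions; the Viterbi segmentation, the merging of components, and the covariance smoothing are omitted.

```python
from collections import defaultdict
from statistics import fmean


def segment_gaussian(frames):
    """Return (count, mean, variance) for one segment of scalar features."""
    n = len(frames)
    mean = fmean(frames)
    # Sample variance (n - 1 denominator); 0.0 for a single-frame segment.
    var = sum((x - mean) ** 2 for x in frames) / (n - 1) if n > 1 else 0.0
    return n, mean, var


def build_mixtures(labeled_segments):
    """Group per-segment Gaussians by state label.

    labeled_segments: iterable of (label, [scalar features]).
    Returns {label: [(weight, mean, variance), ...]}. Each weight is the
    segment's share of the frames carrying that label (an assumption).
    """
    by_label = defaultdict(list)
    for label, frames in labeled_segments:
        by_label[label].append(segment_gaussian(frames))
    mixtures = {}
    for label, comps in by_label.items():
        total = sum(n for n, _, _ in comps)
        mixtures[label] = [(n / total, m, v) for n, m, v in comps]
    return mixtures


# Hypothetical labeled segments: two for state "a1", one for state "a2".
segments = [
    ("a1", [1.0, 2.0, 3.0]),
    ("a1", [10.0]),
    ("a2", [4.0, 6.0]),
]
mixtures = build_mixtures(segments)
```

Here every segment becomes its own mixture component, so the component count grows with the number of segments; the patent instead merges densities so that the count tracks the dispersion and sample size of the data.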
Speech encoder

Patent number: 5448683

Abstract: A speech encoder circuit of a waveform coding system. An analog speech input signal having coded data therein is received by a buffer and split into two branches. In a first branch the input signal is applied to a subtractor of an adaptive predictor to obtain a predicted residual signal subjected to an orthogonal transformation in a discrete cosine transform unit to convert the residual signal into a frequency domain of each of the frames of the individual blocks. An adaptive encoder in the first branch quantizes the frequency domain frame and branches its digital coded output to a branch for output to a transmission line and to a decoding branch. In the second branch of the split input signal a first sample delay element delays the split input signal for a given number of samples and a first attenuator in the second branch delays the output of the first sample delay element a given amount of each frame received for application to the subtractor for obtaining the predicted residual signal.

Type: Grant

Filed: August 30, 1993

Date of Patent: September 5, 1995

Assignee: Kokusai Electric Co., Ltd.

Inventor: Masashi Naitoh
Time series association learning

Patent number: 5440661

Abstract: An acoustic input is recognized from inferred articulatory movements output by a learned relationship between training acoustic waveforms and articulatory movements. The inferred movements are compared with template patterns prepared from training movements when the relationship was learned to regenerate an acoustic recognition. In a preferred embodiment, the acoustic articulatory relationships are learned by a neural network. Subsequent input acoustic patterns then generate the inferred articulatory movements for use with the templates. Articulatory movement data may be supplemented with characteristic acoustic information, e.g. relative power and high frequency data, to improve template recognition.

Type: Grant

Filed: January 31, 1990

Date of Patent: August 8, 1995

Assignee: The United States of America as represented by the United States Department of Energy

Inventor: George J. Papcun
Speech recognition device for calculating a corrected similarity partially dependent on circumstances of production of input patterns

Patent number: 5432886

Abstract: In a speech recognition device including a similarity calculator for calculating a usual similarity as a provisional similarity between an input pattern and prepared reference patterns, a calculating arrangement calculates a reference similarity between the input pattern and produced reference patterns. A correcting unit corrects the provisional similarity by the reference similarity into a corrected similarity. As usual, the similarity may be a dissimilarity. The prepared reference patterns may be memorized in the calculator or be given by concatenations of primary recognition units. Preferably, the produced reference patterns are concatenations of secondary recognition units memorized in a memory.

Type: Grant

Filed: February 7, 1992

Date of Patent: July 11, 1995

Assignee: NEC Corporation

Inventors: Satoshi Tsukada, Takao Watanabe
Apparatus and methods for training speech recognition systems and their users and otherwise improving speech recognition performance

Patent number: 5428707

Abstract: A tutorial instructs how to use a word recognition system, such as one for speech recognition. It specifies a set of allowed response words for each of a plurality of states. It sends messages on how to use the recognizer in certain states, and, in others, presents exercises in which the user is to enter signals representing expected words. It scores each such signal against word models to select which response word corresponds to it, and then advances to a state associated with that selected response. This scoring is performed against a large vocabulary even though only a small number of responses are allowed, and the signal is rejected if too many non-allowed words score better than any allowed word. The system comes with multiple sets of standard signal models; it scores each against a given user's signals, selects the set which scores best, and then performs adaptive and batch training upon that set.

Type: Grant

Filed: November 13, 1992

Date of Patent: June 27, 1995

Assignee: Dragon Systems, Inc.

Inventors: Joel M. Gould, Elizabeth E. Steele, James K. Baker
Speech signal coding using correlation valves between subframes

Patent number: 5426718

Abstract: A speech signal coding system for coding a speech signal at a bit rate of 8 to 4 kb/s wherein the amount of calculation for fractional search of delays of an adaptive codebook is reduced significantly. Before a fractional delay of the adaptive codebook is found, candidates of integer delay are found by an open-loop using correlation values. A search for a fractional delay by a closed loop is performed for a search range for fractional delays which is provided by ± several samples of each integer delay candidate thus found using the correlation values. The fractional delay search is realized by polyphase filtering of an excitation signal in the past. In the search, a plurality of candidates of fractional delay may be found for each integer delay candidate from the adaptive codebook. In this instance, a fractional delay is determined decisively from the decimal delay candidates after a search of an excitation codebook.

Type: Grant

Filed: February 26, 1992

Date of Patent: June 20, 1995

Assignee: NEC Corporation

Inventors: Keiichi Funaki, Kazunori Ozawa
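
Illustrative sketch for patent 5426718 (not taken from the patent text): the abstract says integer delay candidates are found open-loop from correlation values before the closed-loop fractional search. The Python below ranks integer lags by a normalized correlation score that is common in CELP-style coders. The function name, the exact score, and the test signal are assumptions; the closed-loop fractional search by polyphase filtering is not shown.

```python
def integer_delay_candidates(signal, min_lag, max_lag, count):
    """Return the `count` integer lags with the highest correlation score.

    Score for a lag is num**2 / den, where num is the correlation between the
    signal and its delayed copy and den is the energy of the delayed copy.
    Lags with non-positive correlation score 0.0. Ties go to the smaller lag.
    """
    scores = []
    for lag in range(min_lag, max_lag + 1):
        num = sum(signal[i] * signal[i - lag] for i in range(lag, len(signal)))
        den = sum(signal[i - lag] ** 2 for i in range(lag, len(signal)))
        score = num * num / den if den > 0 and num > 0 else 0.0
        scores.append((score, lag))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [lag for _, lag in scores[:count]]


# Hypothetical test signal: an impulse train with a period of 5 samples.
pulse_train = [1.0, 0.0, 0.0, 0.0, 0.0] * 8
candidates = integer_delay_candidates(pulse_train, min_lag=2, max_lag=12, count=2)
```

Each returned lag would then seed a narrow fractional search range of a few samples on either side, which is where the reduction in computation described in the abstract comes from.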
Automatic management system for speech recognition processes

Patent number: 5425128

Abstract: In a computer system including an input-output terminal and an application program having a vocabulary file of acceptable input words and keystroke characters associated with the input words, the vocabulary file and corresponding keystroke characters are automatically retrieved by an operator request through the action of a host application program. Prior to executing a speech recognition process, the retrieved vocabulary file and corresponding keystroke characters are automatically assembled by the same host application program into a syntax file according to syntax rules of a speech recognition program. The resulting vocabulary/syntax file then is compiled into a format useable by the speech recognition process program. The compiled vocabulary/syntax file is used to automatically prompt the operator to speak the various vocabulary words, causing an analog-to-digital conversion circuit to produce digital template codes that are assembled into a speech template file.

Type: Grant

Filed: May 29, 1992

Date of Patent: June 13, 1995

Assignee: Sunquest Information Systems, Inc.

Inventor: Robert L. Morrison
Apparatus for transforming voice using neural networks

Patent number: 5425130

Abstract: An apparatus for transforming a voice signal of a talker into a voice signal having characteristics of a different person provides apparatus for separating the talker's voice signal into a plurality of voice parameters including frequency components, a neural network for transforming at least some of the separated frequency components into those characteristic of the different person, and apparatus for combining the voice parameters for reconstituting the talker's voice signal having characteristics of the different person.

Type: Grant

Filed: April 16, 1993

Date of Patent: June 13, 1995

Assignee: Lockheed Sanders, Inc.

Inventor: David P. Morgan
Apparatus and methods for the generation of stabilised images from waveforms

Patent number: 5422977

Abstract: Peaks are detected in the waveform and in response to the detection of peaks, successive segments of the waveform are sampled. The successive segments sampled are then summed with previously summed segments to produce a stabilized image of the waveform. The generation of the stabilized image is a data-driven process and one which is sensitive and responsive to periodic characteristics of the waveform and hence is particularly useful in the analysis of sound waves and in speech recognition systems.

Type: Grant

Filed: January 25, 1993

Date of Patent: June 6, 1995

Assignee: Medical Research Council

Inventors: Roy D. Patterson, John W. Holdsworth
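
Illustrative sketch for patent 5422977 (not taken from the patent text): the abstract describes detecting peaks in a waveform, sampling a segment at each peak, and summing successive segments into a stabilized image. The Python below shows one simple reading of that loop. The function name, the local-maximum peak test, the fixed threshold, and the fixed segment width are assumptions.

```python
def stabilized_image(waveform, width, threshold):
    """Sum fixed-width segments that start at each detected peak.

    A sample counts as a peak when it exceeds `threshold`, is greater than
    its left neighbor, and is not smaller than its right neighbor. Peaks too
    close to the end to supply a full segment are skipped.
    """
    image = [0.0] * width
    for i in range(1, len(waveform) - 1):
        is_peak = (
            waveform[i] > threshold
            and waveform[i - 1] < waveform[i] >= waveform[i + 1]
        )
        if is_peak and i + width <= len(waveform):
            for k in range(width):
                image[k] += waveform[i + k]
    return image


# Hypothetical waveform with peaks at indices 1, 4 and 7.
wave = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0]
image = stabilized_image(wave, width=3, threshold=0.5)
```

Because segments are aligned on peaks before summing, periodic structure reinforces itself in the image while unaligned detail does not, which matches the data-driven behavior the abstract describes.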
High efficiency digital data encoding and decoding apparatus

Patent number: 5414795

Abstract: An apparatus for compressing a digital input signal compresses a digital input signal arranged into frames of plural samples. The digital input signal is orthogonally transformed by plural orthogonal transform circuits in blocks derived by dividing the frames by a different divisor in each circuit. The resulting spectral coefficients are quantized using an adaptive number of bits. The output of one orthogonal transform circuit is selected based on the outputs of the orthogonal transform circuits. The digital input signal is divided into plural frequency ranges, and frames are formed in each range. A block length decision circuit determines division of the frames of each range signal into blocks in response to range signal dynamics. The range signals are orthogonally transformed in blocks and the spectral coefficients are quantized. The apparatus also includes a frequency analyzing circuit that derives spectral data points from the input data, and groups them in plural bands.

Type: Grant

Filed: March 26, 1992

Date of Patent: May 9, 1995

Assignee: Sony Corporation

Inventors: Kyoya Tsutsui, Osamu Shimoyoshi
Sound outputting devices using digital displacement data for a PWM sound signal

Patent number: 5408583

Abstract: Sound information is binarized by delta modulation or adaptive delta modulation. The resulting data is subjected to pulse width modulation and the resulting data is delivered to a reproducing device, which detects a leading edge of a received signal pulse to produce a self operation timing clock to thereby reproduce a sound. When the reproducing device detects the absence of edges for a predetermined interval after the edge detection, it turns off a power source for an analog circuit to suppress power consumption.

Type: Grant

Filed: July 14, 1992

Date of Patent: April 18, 1995

Assignee: Casio Computer Co., Ltd.

Inventors: Kazuyoshi Watanabe, Ryou Ishikawa
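
Illustrative sketch for patent 5408583 (not taken from the patent text): the abstract says sound information is binarized by delta modulation. The Python below shows plain fixed-step delta modulation, in which each bit records whether the input is at or above a running estimate. The function names and the fixed step size are assumptions; the adaptive variant, the pulse width modulation stage, and the power-saving logic are not shown.

```python
def delta_encode(samples, step):
    """Encode samples as one bit each: 1 if the sample is at or above the
    running estimate, else 0. The estimate then moves by `step` toward it."""
    bits = []
    estimate = 0.0
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits


def delta_decode(bits, step):
    """Rebuild the staircase approximation by replaying the estimate updates."""
    out = []
    estimate = 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out


# Hypothetical input samples.
samples = [0.5, 1.0, 1.5, 1.0, 0.0]
bits = delta_encode(samples, step=0.5)
decoded = delta_decode(bits, step=0.5)
```

The decoder only needs the bit stream and the step size, which is why the scheme suits a simple reproducing device; the final sample shows the slope limit, since the estimate can fall by only one step per bit.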
Method and apparatus adapted for an audibly-driven, handheld, keyless and mouseless computer for performing a user-centered natural computer language

Patent number: 5408582

Abstract: The invention is a software driven methodology plus input processing circuit used to program a computer with instructions derived from user-spoken voice commands. Any "mathematical" or "logical" operation or combination of operations can be executed by a single user-spoken voice command. The hardware and software has two modes of operation, a training mode and an execute mode. In its training mode, the hardware is taught to correlate a specific operation or instruction displayed to the user, with his voice command. In its execution mode the software permits comparison of digital representations of received voice sounds with the stored digital representations of the voice commands. If the received sounds match one of the stored commands, the software creates the corresponding operations of the matched commands to be sent to the output circuitry for execution.

Type: Grant

Filed: May 5, 1993

Date of Patent: April 18, 1995

Inventor: Ronald L. Colier
Speech recognition system with neural network

Patent number: 5404422

Abstract: A voice recognition apparatus capable of recognizing any word utterance by using a neural network. The apparatus includes a unit for inputting an input utterance and for outputting compressed feature variables of the input utterance, and a unit using a neural network and connected to the input unit for receiving the compressed feature variables output from the input unit and for outputting a value corresponding to a similarity between the input utterance and words to be recognized. The neural network unit has a first unit for outputting a value which corresponds to a similarity in partial phoneme series of a specific word among vocabularies to be recognized with respect to the input utterance. The neural network also has a second unit connected to the first unit for receiving all of the values output from the first unit and for outputting a value corresponding to a similarity in the specific word with respect to the input utterance.

Type: Grant

Filed: February 26, 1993

Date of Patent: April 4, 1995

Assignee: Sharp Kabushiki Kaisha

Inventors: Kenji Sakamoto, Kouichi Yamaguchi, Toshio Akabane, Yoshiji Fujimoto
Speech coding and decoding methods using adaptive and random code books

Patent number: 5396576

Abstract: An excitation vector of the previous frame stored in an adaptive codebook is cut out with a selected pitch period. The excitation vector thus cut out is repeated until one frame is formed, by which a periodic component codevector is generated. An optimum pitch period is searched for so that distortion of a reconstructed speech obtained by exciting a linear predictive synthesis filter with the periodic component codevector is minimized. Thereafter, a random codevector selected from a random codebook is cut out with the optimum pitch period and is repeated until one frame is formed, by which a repetitious random codevector is generated. The random codebook is searched for a random codevector which minimizes the distortion of the reconstructed speech which is provided by exciting the synthesis filter with the repetitious random codevector.

Type: Grant

Filed: May 20, 1992

Date of Patent: March 7, 1995

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Satoshi Miki, Takehiro Moriya, Kazunori Mano, Hitoshi Ohmuro, Hirohito Suda
Apparatus for high-speed recording compressed digital data with increased compression

Patent number: 5388209

Abstract: An apparatus for high-speed recording compressed digital data with additional bit reduction. The apparatus makes a recording, on a recording medium, of a recording signal in response to compressed digital data representing an audio information signal. The recording, upon reproduction from the recording medium, conversion to an analog signal, and reproduction of the analog signal, is for perception by the human ear. The compressed digital data has a constant bit rate. The apparatus includes a device that receives the compressed digital data, and a device that derives the recording signal by removing redundant bits from the compressed digital data. Redundant bits are bits that make no difference to quantization noise perceived by the human ear. Finally, the apparatus includes a device for recording the recording signal on the recording medium. The recording signal has a variable bit rate.

Type: Grant

Filed: July 31, 1992

Date of Patent: February 7, 1995

Assignee: Sony Corporation

Inventor: Kenzo Akagiri
Apparatus and method for playing back audio at faster or slower rates without pitch distortion

Patent number: 5386493

Abstract: A computer implemented apparatus and method for modifying the playback rate of a previously stored audio or voice data file stored within a computer system without altering the pitch of the audio data file as originally stored. The present invention also maintains a high level of sound quality during playback. The present invention includes a double buffering system in order to perform all of the desired calculations in real time. A time stretching technique is employed upon the audio data file to decrease or increase playback rate which creates audio segments requiring joining processing. Junctions are smoothed by employing a cross-fade amplitude envelope filter and a compressor/limiter is used to maintain filter range. The system may operate on a desktop computer allowing for advantageous playback and audio data management options of stored voice and or sound data.

Type: Grant

Filed: September 25, 1992

Date of Patent: January 31, 1995

Assignee: Apple Computer, Inc.

Inventors: Leo M. W. F. Degen, Martijn Zwartjes
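
Illustrative sketch for patent 5386493 (not taken from the patent text): the abstract says junctions between time-stretched audio segments are smoothed with a cross-fade amplitude envelope. The Python below joins two segments with a linear cross-fade over a fixed overlap. The function name, the linear ramp, and the overlap length are assumptions; the double buffering and the compressor/limiter are not shown.

```python
def crossfade_join(left, right, overlap):
    """Join two sample lists with a linear cross-fade.

    The last `overlap` samples of `left` are blended with the first `overlap`
    samples of `right`; the weight of `right` rises linearly across the overlap.
    """
    if overlap <= 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter segment length")
    start = len(left) - overlap
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming segment
        blended.append((1 - w) * left[start + i] + w * right[i])
    return left[:start] + blended + right[overlap:]


# Hypothetical segments: full amplitude fading into silence.
joined = crossfade_join([1.0] * 4, [0.0] * 4, overlap=3)
```

The joined signal is shorter than the two inputs by the overlap length, which a time-stretching routine would need to account for when choosing segment positions.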
Vector quantizing apparatus and speech analysis-synthesis system using the apparatus

Patent number: 5384891

Abstract: A vector quantizing apparatus having a general vector quantization circuit, and a storage means for storing at least one frame of data as the result of comparison by a matching circuit is provided. Further provided is a speech analysis-synthesis system having a spectral envelope generator for generating a spectral envelope which is so smooth that excessive beating is avoided, a spectral envelope vector converter for sampling the spectral envelope at equal intervals on mel-scale, a vector quantizer for quantizing vectors, and a spectral envelope reconstructor for reconstructing the spectral envelope by interpolation on the basis of combined parabolas.

Type: Grant

Filed: October 15, 1991

Date of Patent: January 24, 1995

Assignee: Hitachi, Ltd.

Inventors: Yoshiaki Asakawa, Katsuya Yamasaki, Akira Ichikawa
Method and apparatus for providing multiple clients simultaneous access to a sound data stream

Patent number: 5384890

Abstract: A method and apparatus for providing multiple clients simultaneous access to a sound input/output (I/O) data stream. The present invention provides a method and apparatus for providing multiple programming data structures and multiple patch points in a list, in which each of the patch points are positioned relative to at least one of the programming data structures and is capable of receiving at least one programming data structure for insertion into the list to perform a function. The present invention also includes a method and apparatus for providing at least one buffer for inputting the data stream into and/or receiving the data stream output from each of inserted programming structures, such that each inserted structure can access and operate on the data stream. In this way, multiple clients can access and process the data stream transparently, without interfering with the operation of other clients, yet affecting the sound stream in the desired way.

Type: Grant

Filed: September 30, 1992

Date of Patent: January 24, 1995

Assignee: Apple Computer, Inc.

Inventors: Eric C. Anderson, Hugh B. Svendsen
Method and apparatus for speech synthesis based on prosodic analysis

Patent number: 5384893

Abstract: A system for synthesizing a speech signal from strings of words, which are themselves strings of characters, includes a memory in which predetermined syntax tags are stored in association with entered words and phonetic transcriptions are stored in association with the syntax tags. A parser accesses the memory and groups the syntax tags of the entered words into phrases according to a first set of predetermined grammatical rules relating the syntax tags to one another. The parser also verifies the conformance of sequences of the phrases to a second set of predetermined grammatical rules relating the phrases to one another. The system retrieves the phonetic transcriptions associated with the syntax tags that were grouped into phrases conforming to the second set of rules, and also translates predetermined strings of characters into words.

Type: Grant

Filed: September 23, 1992

Date of Patent: January 24, 1995

Assignee: Emerson & Stern Associates, Inc.

Inventor: Sandra E. Hutchins
Time series signal analyzer including neural network having path groups corresponding to states of Markov chains

Patent number: 5381513

Abstract: A neural network with a high recognition rate when applied to static patterns is made applicable to dynamic time series patterns such as voice signals. Plural units with one or more inputs and outputs are interconnected, and a unique load coefficient is assigned to each connection to weight the signals flowing through that connection. The neural network includes an input unit group to which are input the components of plural vectors included in the input feature vector series {y(t)}; an output unit which outputs the converted vectors, which are produced by passing the input vectors through each unit and the associated connections; and J paths from input unit group to the output unit group. The units are connected to form a Hidden Markov Model wherein each signal path identified as j=1, 2, . . . , J corresponds to the same state.

Type: Grant

Filed: March 16, 1994

Date of Patent: January 10, 1995

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventor: Eiichi Tsuboka
