Patents Examined by John Michael Grover

Encoding, decoding and compression of audio-type data using reference coefficients located within a band of coefficients

Patent number: 5640486

Abstract: An audio-type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. The yardstick may be the signal element having the largest magnitude in the band, the second largest, closest to the median magnitude, or having some other selected magnitude. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is also quantized. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements. Split bands may be established, such that each split band includes a yardstick signal element and each full band includes a major and a minor yardstick signal element.

Type: Grant

Filed: November 28, 1994

Date of Patent: June 17, 1997

Assignee: Massachusetts Institute of Technology

Inventor: Jae S. Lim
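The band-and-yardstick idea in the abstract of patent 5640486 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the function names, the dictionary layout, the uniform quantizer, and the `levels` parameter are all hypothetical, and the yardstick is taken to be the largest-magnitude element, which is one of the options the abstract lists.

```python
def encode_bands(coeffs, band_size, levels=4):
    """Split coefficients into bands and encode each band against its yardstick."""
    encoded = []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        # Yardstick: the element with the largest magnitude in the band.
        loc = max(range(len(band)), key=lambda i: abs(band[i]))
        yard = abs(band[loc])
        # Assumption: uniform reconstruction levels between 0 and the yardstick.
        step = yard / levels if yard > 0 else 1.0
        codes = [min(levels, round(abs(x) / step)) for x in band]
        signs = [x < 0 for x in band]
        encoded.append({"yardstick": yard, "location": loc,
                        "codes": codes, "signs": signs})
    return encoded


def decode_bands(encoded, levels=4):
    """Rebuild approximate coefficients from the per-band codes."""
    out = []
    for band in encoded:
        step = band["yardstick"] / levels
        for code, negative in zip(band["codes"], band["signs"]):
            magnitude = code * step
            out.append(-magnitude if negative else magnitude)
    return out
```

Because the reconstruction levels are scaled by the yardstick, a band with small coefficients is reconstructed with a proportionally finer step than a band with large ones.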

Method and apparatus for encoding, decoding and compression of audio-type data

Patent number: 5625746

Abstract: An audio-type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. Its magnitude is quantized using a first level of accuracy. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is quantized with less accuracy than are the yardstick signal elements. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements.

Type: Grant

Filed: February 23, 1995

Date of Patent: April 29, 1997

Assignee: Massachusetts Institute of Technology

Inventor: Jae S. Lim
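The abstract of patent 5625746 distinguishes two levels of quantization accuracy: a finer one for the yardstick magnitude and a coarser one for the other elements. A minimal sketch of that distinction, assuming a plain uniform quantizer (the function name `quantize` and the bit widths in the usage note are illustrative assumptions, not values from the patent):

```python
def quantize(value, max_value, bits):
    """Uniformly quantize a magnitude in [0, max_value] using `bits` bits.

    Returns the reconstructed magnitude, not the integer code.
    """
    if max_value <= 0:
        return 0.0
    levels = (1 << bits) - 1          # number of steps above zero
    clipped = min(abs(value), max_value)
    code = round(clipped / max_value * levels)
    return code / levels * max_value
```

For example, a yardstick of 0.8 might be quantized as `quantize(0.8, 1.0, 8)` against a full-scale value of 1.0, and a non-yardstick element of 0.3 as `quantize(0.3, 0.8, 3)` against the yardstick, so the coarser quantizer still spans only the range that the band actually uses.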

Method and system for improving speech recognition through front-end normalization of feature vectors

Patent number: 5604839

Abstract: A method and system for improving speech recognition through front-end normalization of feature vectors are provided. Speech to be recognized is spoken into a microphone, amplified by an amplifier, and converted from an analog signal to a digital signal by an analog-to-digital ("A/D") converter. The digital signal from the A/D converter is input to a feature extractor that breaks down the signal into frames of speech and then extracts a feature vector from each of the frames. The feature vector is input to an input normalizer that normalizes the vector. The input normalizer normalizes the feature vector by computing a correction vector and subtracting the correction vector from the feature vector. The correction vector is computed based on the probability of the current frame of speech being noise and based on the average noise and speech feature vectors for a current utterance and a database of utterances.

Type: Grant

Filed: July 29, 1994

Date of Patent: February 18, 1997

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Xuedong Huang
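The normalization step in the abstract of patent 5604839 (compute a correction vector, then subtract it from the feature vector) can be sketched as below. The abstract does not give the formula, so the weighting used here is an assumption: the correction is a blend of the noise-mean offset and the speech-mean offset between the current utterance and the database, weighted by the probability that the frame is noise. All names are hypothetical.

```python
def normalize(feature, p_noise, noise_utt, noise_db, speech_utt, speech_db):
    """Subtract a probability-weighted correction vector from a feature vector.

    Assumed form: correction = p * (noise_utt - noise_db)
                             + (1 - p) * (speech_utt - speech_db)
    """
    correction = [
        p_noise * (nu - nd) + (1.0 - p_noise) * (su - sd)
        for nu, nd, su, sd in zip(noise_utt, noise_db, speech_utt, speech_db)
    ]
    return [x - c for x, c in zip(feature, correction)]
```

With `p_noise` near 1 the frame is shifted by the difference in average noise vectors only; with `p_noise` near 0 it is shifted by the difference in average speech vectors.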

Method of speaker clustering for unknown speakers in conversational audio data

Patent number: 5598507

Abstract: A method for clustering speaker data from a plurality of unknown speakers. The method includes steps of providing a portion of audio data containing speech from at least all the speakers in the audio data and dividing the portion into data clusters. A pairwise distance between each pair of clusters is computed, the pairwise distance being based on a likelihood that two clusters were created by the same speaker, the likelihood measurement being biased by the prior probability of speaker changes. The two clusters with a minimum pairwise distance are combined into a new cluster and speaker models are trained for each of the remaining clusters including the new cluster. The likelihood that two clusters were created by the same speaker may be biased by a Markov duration model based on speaker changes over the length of the initial data clusters.

Type: Grant

Filed: April 12, 1994

Date of Patent: January 28, 1997

Assignee: Xerox Corporation

Inventors: Donald G. Kimber, Lynn D. Wilcox, Francine R. Chen
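The merge loop described in the abstract of patent 5598507 (compute pairwise distances, combine the closest pair, repeat) is a form of agglomerative clustering. The sketch below shows only that loop. The distance function is deliberately swapped for a simple stand-in, the gap between cluster means of one-dimensional data, rather than the likelihood-based, prior-biased distance the abstract describes; the names `agglomerate` and `mean_gap` are hypothetical.

```python
def mean_gap(a, b):
    """Stand-in distance: absolute difference between cluster means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def agglomerate(clusters, distance, target):
    """Repeatedly merge the two closest clusters until `target` remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > target:
        # Evaluate every pair and pick the one with the minimum distance.
        pairs = [
            (distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

In the method the abstract describes, speaker models would be retrained after each merge; the stand-in distance here needs no training, so that step is omitted.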

Mode-specific method and apparatus for encoding signals containing speech

Patent number: 5596676

Abstract: A method for encoding a signal that includes a speech component is described. First and second linear prediction windows of a frame are analyzed to generate sets of filter coefficients. First and second pitch analysis windows of the frame are analyzed to generate pitch estimates. The frame is classified in one of at least two modes, e.g. voiced, unvoiced and noise modes, based, for example, on pitch stationarity, short-term level gradient or zero crossing rate. Then the frame is encoded using the filter coefficients and pitch estimates in a particular manner depending upon the mode determination for the frame, preferably employing CELP based encoding algorithms.

Type: Grant

Filed: October 11, 1995

Date of Patent: January 21, 1997

Assignee: Hughes Electronics

Inventors: Kumar Swaminathan, Kalyan Ganesan, Prabhat K. Gupta
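The frame classification mentioned in the abstract of patent 5596676 can be illustrated with two of the cues it names, zero crossing rate and pitch stationarity. The decision rule, the thresholds, and the function names below are assumptions made for illustration; the abstract does not specify them, and the sketch distinguishes only two modes.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def classify(frame, pitch1, pitch2, pitch_tol=0.1, zcr_thresh=0.3):
    """Toy mode decision from two pitch estimates and the zero crossing rate."""
    # Pitch is called stationary if the two window estimates nearly agree.
    stationary = abs(pitch1 - pitch2) <= pitch_tol * max(pitch1, pitch2)
    if stationary and zero_crossing_rate(frame) < zcr_thresh:
        return "voiced"
    return "unvoiced"
```

An encoder built along these lines would then choose its coding strategy for the frame according to the returned mode.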

Method and apparatus for two-component signal compression

Patent number: 5592584

Abstract: A method and apparatus for performing a Modified Discrete Cosine Transform on an audio signal is disclosed which utilizes a Discrete Fourier Transform. Illustratively, the MDCT spectral coefficients for the signal are generated from the real FFT spectral coefficients.

Type: Grant

Filed: November 4, 1994

Date of Patent: January 7, 1997

Assignee: Lucent Technologies Inc.

Inventors: Anibal J. Ferreira, James D. Johnston
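The abstract of patent 5592584 refers to obtaining MDCT coefficients by way of a Discrete Fourier Transform. The sketch below uses a standard textbook identity (pre-twiddle, length-2N complex DFT, post-twiddle, take the real part) and should not be read as the specific construction claimed in the patent. A naive O(N²) DFT stands in for an FFT so the example needs only the standard library; all function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive complex DFT, standing in for an FFT."""
    m = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * k / m) for n in range(m))
        for k in range(m)
    ]


def mdct_via_dft(x):
    """MDCT of a length-2N block computed through a length-2N DFT."""
    m = len(x)
    half = m // 2                      # N output coefficients
    n0 = 0.5 + half / 2                # MDCT phase offset
    pre = [x[n] * cmath.exp(-1j * math.pi * n / m) for n in range(m)]
    spec = dft(pre)
    return [
        (cmath.exp(-1j * math.pi * n0 * (k + 0.5) / half) * spec[k]).real
        for k in range(half)
    ]


def mdct_direct(x):
    """MDCT straight from its cosine definition, for comparison."""
    m = len(x)
    half = m // 2
    n0 = 0.5 + half / 2
    return [
        sum(x[n] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
            for n in range(m))
        for k in range(half)
    ]
```

The two functions agree to within floating-point rounding, which can be checked by comparing their outputs on any even-length block.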

Speech signal processing device with continuous monitoring of signal-to-noise ratio

Patent number: 5572621

Abstract: A mobile radio set includes a speech processing device for processing digital samples (x(i)) of speech signals which have noise components as well as speech components. Such device includes a control unit for continuously forming estimates of the signal-to-noise ratio of the speech signals by determining and smoothing the power values of the samples thereof, and determining the minimum of each successive group of L smoothed power values. The groups uninterruptedly succeed each other and each contains a sufficient number of smoothed power values so that all the values of a single group associated with a random phoneme of the speech signal can be combined. An estimate of the present signal-to-noise ratio is formed based on the present smoothed power value and the most recently determined minimum successive smoothed power value.

Type: Grant

Filed: September 19, 1994

Date of Patent: November 5, 1996

Assignee: U.S. Philips Corporation

Inventor: Rainer Martin
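The estimation loop in the abstract of patent 5572621 (smooth the sample power, take the minimum of each successive group of L smoothed values, and combine the latest minimum with the present smoothed power) can be sketched as follows. The first-order smoother, its constant `alpha`, and the final ratio `(power - minimum) / minimum` are assumptions; the abstract states only which quantities the estimate is based on.

```python
def snr_estimates(samples, alpha=0.9, group_len=4):
    """Running SNR estimates from smoothed power and per-group minima.

    Returns None for samples seen before the first group minimum exists.
    """
    smoothed = 0.0
    group = []
    noise_floor = None
    estimates = []
    for x in samples:
        # Recursive smoothing of the instantaneous power x*x.
        smoothed = alpha * smoothed + (1.0 - alpha) * x * x
        group.append(smoothed)
        if len(group) == group_len:
            # Groups follow one another without gaps; keep the latest minimum.
            noise_floor = min(group)
            group = []
        if noise_floor is not None and noise_floor > 0:
            estimates.append(max(smoothed - noise_floor, 0.0) / noise_floor)
        else:
            estimates.append(None)
    return estimates
```

The group length corresponds to L in the abstract; it has to be long enough that each group is likely to contain a speech pause, so that the minimum reflects the noise power rather than speech.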

Method for generating audio renderings of digitized works having highly technical content

Patent number: 5572625

Abstract: The present invention provides a method for producing auditory renderings of digitized works and, in particular, digitized documents containing complex mathematical expressions. Documents are first entered into a computer system and formatted with a markup language, such as one of the TeX® or LaTeX® family of languages. The formatted documents are parsed to provide a tree-structured, high-level representation. Mathematical expressions are in quasi-prefix form. Lexical analysis and recognition processes are then undertaken. The resulting analyzed documents are provided to an audio output device (such as a voice synthesizer) operating under control of a set of predetermined rendering rules. The resultant audio signal contains not only textual content but also the analogical markings produced by the reading rules. Multichannel audio outputs may be used to allow for spatial placement capability, in addition to the other analogical markings.

Type: Grant

Filed: October 22, 1993

Date of Patent: November 5, 1996

Assignee: Cornell Research Foundation, Inc.

Inventors: T. V. Raman, David Gries
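The abstract of patent 5572625 describes a tree-structured, prefix-style representation of mathematical expressions that is read out under a set of rendering rules. A toy version of that idea, producing text for a speech synthesizer, might look like this. The tuple encoding, the `RULES` table, and the wording of each rule are hypothetical and are not taken from the patent.

```python
# Hypothetical rendering rules: operator -> spoken template.
RULES = {
    "+": "{0} plus {1}",
    "^": "{0} to the power {1}",
    "frac": "the fraction with numerator {0} and denominator {1}",
}


def render(node):
    """Turn a prefix-form expression tree into a spoken-text string."""
    if isinstance(node, str):
        return node                      # leaf: a symbol or number
    op, *args = node
    spoken_args = [render(arg) for arg in args]
    return RULES[op].format(*spoken_args)
```

For instance, `render(("frac", ("+", "a", "b"), ("^", "x", "2")))` walks the tree depth-first and applies one rule per operator. A fuller system along the lines of the abstract would also attach non-textual cues, such as pitch changes or spatial placement, to each rule.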

Method for processing speech signals as block floating point numbers in a CELP-based coder using a fixed point processor

Patent number: 5570454

Abstract: A method of encoding speech using a fixed-point processor. The method treats the signal as floating point, while operating on each sample of the signal as fixed point. The disclosed method achieves precision similar to that of conventional floating point and may be rapidly executed on a fixed point processor.

Type: Grant

Filed: June 9, 1994

Date of Patent: October 29, 1996

Assignee: Hughes Electronics

Inventor: Weimin Liu
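Block floating point, as named in the title of patent 5570454, generally means that a whole block of samples shares one exponent while each sample is stored as a fixed-point mantissa. The sketch below illustrates that general representation only, assuming 16-bit signed mantissas; it is not the coder described in the patent, and the function names are hypothetical.

```python
import math


def to_block_float(block, mantissa_bits=15):
    """Encode a block as integer mantissas plus one shared exponent."""
    peak = max(abs(x) for x in block)
    if peak == 0:
        return [0] * len(block), 0
    # frexp gives peak = m * 2**exponent with 0.5 <= m < 1.
    exponent = math.frexp(peak)[1]
    scale = 2.0 ** (mantissa_bits - exponent)
    limit = (1 << mantissa_bits) - 1     # largest signed mantissa
    mantissas = [max(-limit, min(limit, round(x * scale))) for x in block]
    return mantissas, exponent


def from_block_float(mantissas, exponent, mantissa_bits=15):
    """Decode mantissas and the shared exponent back to floats."""
    scale = 2.0 ** (exponent - mantissa_bits)
    return [m * scale for m in mantissas]
```

Arithmetic on the mantissas can then be carried out with integer operations, with the shared exponent adjusted once per block rather than once per sample.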

Speech approximation using successive sinusoidal overlap-add models and pitch-scale modifications

Patent number: 5504833

Abstract: A method and apparatus for the automatic analysis, synthesis and modification of audio signals, based on an overlap-add sinusoidal model, is disclosed. Automatic analysis of amplitude, frequency and phase parameters of the model is achieved using an analysis-by-synthesis procedure which incorporates successive approximation, yielding synthetic waveforms which are very good approximations to the original waveforms and are perceptually identical to the original sounds. A generalized overlap-add sinusoidal model is introduced which can modify audio signals without objectionable artifacts. In addition, a new approach to pitch-scale modification allows for the use of arbitrary spectral envelope estimates and addresses the problems of high-frequency loss and noise amplification encountered with prior art methods.

Type: Grant

Filed: May 4, 1994

Date of Patent: April 2, 1996

Inventors: E. Bryan George, Mark J. T. Smith
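The synthesis side of an overlap-add sinusoidal model, as named in the abstract of patent 5504833, can be sketched as follows: each frame is a sum of sinusoids with its own amplitudes, frequencies and phases, shaped by a window and added into the output at a fixed hop. The triangular window, the 50% overlap, and the parameter layout are assumptions made for illustration, and the analysis-by-synthesis parameter estimation described in the abstract is not shown.

```python
import math


def ola_synthesize(frames, hop):
    """Overlap-add synthesis from per-frame sinusoid parameters.

    `frames` is a list of frames; each frame is a list of
    (amplitude, frequency_rad_per_sample, phase) tuples.
    """
    out = [0.0] * ((len(frames) + 1) * hop)
    for index, partials in enumerate(frames):
        start = index * hop
        for n in range(2 * hop):
            # Triangular window; shifted copies sum to one at 50% overlap.
            window = 1.0 - abs(n - hop) / hop
            t = n - hop                  # time relative to the frame center
            value = sum(a * math.cos(w * t + phi) for a, w, phi in partials)
            out[start + n] += window * value
    return out
```

Because adjacent windows sum to one, two consecutive frames with identical parameters reproduce a steady sinusoid in the overlapped region, while differing parameters are cross-faded from one frame to the next.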