Patents Examined by John Michael Grover
-
Patent number: 5640486
Abstract: An audio-type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. The yardstick may be the signal element having the largest magnitude in the band, the second largest, closest to the median magnitude, or having some other selected magnitude. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is also quantized. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements. Split bands may be established, such that each split band includes a yardstick signal element and each full band includes a major and a minor yardstick signal element.
Type: Grant
Filed: November 28, 1994
Date of Patent: June 17, 1997
Assignee: Massachusetts Institute of Technology
Inventor: Jae S. Lim
-
Patent number: 5625746
Abstract: An audio-type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. Its magnitude is quantized using a first level of accuracy. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is quantized with less accuracy than are the yardstick signal elements. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements.
Type: Grant
Filed: February 23, 1995
Date of Patent: April 29, 1997
Assignee: Massachusetts Institute of Technology
Inventor: Jae S. Lim
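
The yardstick idea in the abstract above can be illustrated with a minimal sketch for a single band. It is not the patented method: the choice of the largest-magnitude element as the yardstick, the uniform quantizers, the bit widths, and the names `encode_band` and `decode_band` are all assumptions made for illustration.

```python
def encode_band(band, yard_bits=8, other_bits=3, full_scale=1.0):
    """Quantize one band: yardstick finely, every element coarsely against it."""
    # Assumption: the yardstick is the element with the largest magnitude.
    pos = max(range(len(band)), key=lambda i: abs(band[i]))
    yard_levels = (1 << yard_bits) - 1
    yard_code = round(abs(band[pos]) / full_scale * yard_levels)
    # Use the quantized yardstick, since that is all the decoder will know.
    yard_mag = yard_code / yard_levels * full_scale
    other_levels = (1 << other_bits) - 1
    codes = []
    for x in band:
        # Coarse reconstruction levels span [0, yard_mag] within this band.
        ratio = abs(x) / yard_mag if yard_mag > 0 else 0.0
        code = min(other_levels, round(ratio * other_levels))
        codes.append((code, x < 0))  # (magnitude code, sign flag)
    return pos, yard_code, codes


def decode_band(yard_code, codes, yard_bits=8, other_bits=3, full_scale=1.0):
    """Rebuild the band from the yardstick code and the coarse codes."""
    yard_mag = yard_code / ((1 << yard_bits) - 1) * full_scale
    other_levels = (1 << other_bits) - 1
    out = []
    for code, negative in codes:
        mag = code / other_levels * yard_mag
        out.append(-mag if negative else mag)
    return out
```

Because the coarse levels are scaled by the yardstick magnitude, a quiet band gets finer absolute steps than a loud band for the same number of bits.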
-
Patent number: 5604839
Abstract: A method and system for improving speech recognition through front-end normalization of feature vectors are provided. Speech to be recognized is spoken into a microphone, amplified by an amplifier, and converted from an analog signal to a digital signal by an analog-to-digital ("A/D") converter. The digital signal from the A/D converter is input to a feature extractor that breaks down the signal into frames of speech and then extracts a feature vector from each of the frames. The feature vector is input to an input normalizer that normalizes the vector. The input normalizer normalizes the feature vector by computing a correction vector and subtracting the correction vector from the feature vector. The correction vector is computed based on the probability of the current frame of speech being noise and based on the average noise and speech feature vectors for a current utterance and a database of utterances.
Type: Grant
Filed: July 29, 1994
Date of Patent: February 18, 1997
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Xuedong Huang
-
Patent number: 5598507
Abstract: A method for clustering speaker data from a plurality of unknown speakers. The method includes steps of providing a portion of audio data containing speech from at least all the speakers in the audio data and dividing the portion into data clusters. A pairwise distance between each pair of clusters is computed, the pairwise distance being based on a likelihood that two clusters were created by the same speaker, the likelihood measurement being biased by the prior probability of speaker changes. The two clusters with a minimum pairwise distance are combined into a new cluster and speaker models are trained for each of the remaining clusters including the new cluster. The likelihood that two clusters were created by the same speaker may be biased by a Markov duration model based on speaker changes over the length of the initial data clusters.
Type: Grant
Filed: April 12, 1994
Date of Patent: January 28, 1997
Assignee: Xerox Corporation
Inventors: Donald G. Kimber, Lynn D. Wilcox, Francine R. Chen
-
Patent number: 5596676
Abstract: A method for encoding a signal that includes a speech component is described. First and second linear prediction windows of a frame are analyzed to generate sets of filter coefficients. First and second pitch analysis windows of the frame are analyzed to generate pitch estimates. The frame is classified in one of at least two modes, e.g. voiced, unvoiced and noise modes, based, for example, on pitch stationarity, short-term level gradient or zero crossing rate. Then the frame is encoded using the filter coefficients and pitch estimates in a particular manner depending upon the mode determination for the frame, preferably employing CELP based encoding algorithms.
Type: Grant
Filed: October 11, 1995
Date of Patent: January 21, 1997
Assignee: Hughes Electronics
Inventors: Kumar Swaminathan, Kalyan Ganesan, Prabhat K. Gupta
-
Patent number: 5592584
Abstract: A method and apparatus for performing a Modified Discrete Cosine Transform on an audio signal is disclosed which utilizes a Discrete Fourier Transform. Illustratively, the MDCT spectral coefficients for the signal are generated from the real FFT spectral coefficients.
Type: Grant
Filed: November 4, 1994
Date of Patent: January 7, 1997
Assignee: Lucent Technologies Inc.
Inventors: Anibal J. Ferreira, James D. Johnston
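
To show in general terms how an MDCT can be obtained from a Fourier transform, the sketch below uses a textbook identity: multiply the input by a complex pre-twiddle, take a 2M-point DFT, apply a post-twiddle, and keep the real part. This is not the procedure claimed in the patent, which works from real FFT coefficients. The function names are hypothetical, and a naive DFT stands in for an FFT routine to keep the example self-contained.

```python
import cmath
import math


def mdct_direct(x):
    """Textbook MDCT definition: 2M real inputs -> M coefficients."""
    M = len(x) // 2
    n0 = 0.5 + M / 2
    return [sum(x[n] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]


def dft(z):
    """Naive O(N^2) DFT, standing in for an FFT routine."""
    N = len(z)
    return [sum(z[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
            for k in range(N)]


def mdct_via_dft(x):
    """Same coefficients via pre-twiddle, 2M-point DFT, and post-twiddle."""
    M = len(x) // 2
    N = 2 * M
    n0 = 0.5 + M / 2
    # Pre-twiddle absorbs the half-sample frequency shift (k + 0.5).
    pre = [x[n] * cmath.exp(-1j * math.pi * n / N) for n in range(N)]
    Z = dft(pre)
    # Post-twiddle absorbs the time offset n0; the real part is the cosine sum.
    return [(cmath.exp(-1j * math.pi * n0 * (k + 0.5) / M) * Z[k]).real
            for k in range(M)]
```

For any real input of even length, `mdct_via_dft` and `mdct_direct` should agree to within floating-point rounding.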
-
Patent number: 5572625
Abstract: The present invention provides a method for producing auditory renderings of digitized works and, in particular, digitized documents containing complex mathematical expressions. Documents are first entered into a computer system and formatted with a markup language, such as one of the TeX® or LaTeX® family of languages. The formatted documents are parsed to provide a tree-structured, high-level representation. Mathematical expressions are in quasi-prefix form. Lexical analysis and recognition processes are then undertaken. The resulting analyzed documents are provided to an audio output device (such as a voice synthesizer) operating under control of a set of predetermined rendering rules. The resultant audio signal contains not only textual content but also the analogical markings produced by the reading rules. Multichannel audio outputs may be used to allow for spatial placement capability, in addition to the other analogical markings.
Type: Grant
Filed: October 22, 1993
Date of Patent: November 5, 1996
Assignee: Cornell Research Foundation, Inc.
Inventors: T. V. Raman, David Gries
-
Patent number: 5572621
Abstract: A mobile radio set includes a speech processing device for processing digital samples (x(i)) of speech signals which have noise components as well as speech components. Such device includes a control unit for continuously forming estimates of the signal-to-noise ratio of the speech signals by determining and smoothing the power values of the samples thereof, and determining the minimum of each successive group of L smoothed power values. The groups uninterruptedly succeed each other and each contains a sufficient number of smoothed power values so that all the values of a single group associated with a random phoneme of the speech signal can be combined. An estimate of the present signal-to-noise ratio is formed based on the present smoothed power value and the most recently determined minimum successive smoothed power value.
Type: Grant
Filed: September 19, 1994
Date of Patent: November 5, 1996
Assignee: U.S. Philips Corporation
Inventor: Rainer Martin
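
The steps named in the abstract above (smooth the sample powers, take the minimum of each successive group of L smoothed values, and compare the present value with the latest minimum) can be sketched as follows. The first-order smoother, the constant `alpha`, the exact SNR formula, and the name `snr_track` are assumptions; the abstract does not specify them.

```python
def snr_track(samples, L=4, alpha=0.5):
    """Return one SNR estimate per sample (None until a first minimum exists)."""
    smoothed = 0.0
    group = []          # smoothed powers of the group currently being filled
    noise_min = None    # minimum of the most recently completed group
    estimates = []
    for x in samples:
        # Assumed smoother: first-order recursive average of the power x*x.
        smoothed = alpha * smoothed + (1 - alpha) * x * x
        group.append(smoothed)
        if len(group) == L:
            # Groups follow each other without gaps; keep only the latest minimum.
            noise_min = min(group)
            group = []
        if noise_min is None or noise_min <= 0:
            estimates.append(None)
        else:
            # Assumed formula: excess power over the noise floor, relative to it.
            estimates.append(max(smoothed - noise_min, 0.0) / noise_min)
    return estimates
```

The minimum serves as a noise-floor estimate on the premise that every group is long enough to contain at least one stretch without speech.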
-
Patent number: 5570454
Abstract: A method of encoding speech using a fixed-point processor. The method treats the signal as floating point, while operating on each sample of the signal as fixed point. The disclosed method achieves precision similar to that of conventional floating point and may be rapidly executed on a fixed point processor.
Type: Grant
Filed: June 9, 1994
Date of Patent: October 29, 1996
Assignee: Hughes Electronics
Inventor: Weimin Liu
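
One common way to get floating-point-like range from integer arithmetic is block floating point, where a block of samples shares a single exponent and each sample is an integer mantissa. Whether the patent uses exactly this scheme is not stated in the abstract, so the sketch below is an assumption, and the function names are hypothetical.

```python
import math


def to_block_float(block, mantissa_bits=16):
    """Return (mantissas, exponent) with block[i] ~= mantissas[i] * 2**exponent."""
    limit = (1 << (mantissa_bits - 1)) - 1  # largest signed mantissa
    peak = max(abs(v) for v in block)
    if peak == 0:
        return [0] * len(block), 0
    # frexp gives peak/limit = m * 2**e with 0.5 <= m < 1, so 2**e is a safe scale.
    exponent = math.frexp(peak / limit)[1]
    scale = 2.0 ** exponent
    return [int(round(v / scale)) for v in block], exponent


def from_block_float(mantissas, exponent):
    """Convert integer mantissas and a shared exponent back to floats."""
    return [m * 2.0 ** exponent for m in mantissas]


def bf_dot(ma, ea, mb, eb):
    """Dot product done in integer arithmetic; exponents are combined once."""
    acc = sum(a * b for a, b in zip(ma, mb))  # pure fixed-point accumulation
    return acc * 2.0 ** (ea + eb)
```

The per-sample work in `bf_dot` is integer multiply-accumulate only; the exponents are touched once per block, which is what makes the approach cheap on a fixed-point processor.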
-
Patent number: 5504833
Abstract: A method and apparatus for the automatic analysis, synthesis and modification of audio signals, based on an overlap-add sinusoidal model, is disclosed. Automatic analysis of amplitude, frequency and phase parameters of the model is achieved using an analysis-by-synthesis procedure which incorporates successive approximation, yielding synthetic waveforms which are very good approximations to the original waveforms and are perceptually identical to the original sounds. A generalized overlap-add sinusoidal model is introduced which can modify audio signals without objectionable artifacts. In addition, a new approach to pitch-scale modification allows for the use of arbitrary spectral envelope estimates and addresses the problems of high-frequency loss and noise amplification encountered with prior art methods.
Type: Grant
Filed: May 4, 1994
Date of Patent: April 2, 1996
Inventors: E. Bryan George, Mark J. T. Smith