Formant Patents (Class 704/209)

Adaptive post-filtering technique based on the Modified Yule-Walker filter

Patent number: 6233552

Abstract: An adaptive time-domain post-filtering technique is based on the modified Yule-Walker filter. This technique, which can be applied to various speech coders, eliminates the problem of spectral tilt in the speech spectrum. The new post-filter has a flat frequency response at the formant peaks of the speech spectrum. Information is gathered about the relation between poles and formants, and then the formants and their bandwidths are estimated. The information about the formants and their bandwidths is then used to design the modified Yule-Walker filter based on a least squares fit in the time domain.

Type: Grant

Filed: March 12, 1999

Date of Patent: May 15, 2001

Assignee: Comsat Corporation

Inventors: Azhar Mustapha, Suat Yeldener
Mapping of digital data symbols onto one or more formant frequencies for transmission over a coded voice channel

Patent number: 6208959

Abstract: A digital input symbol is transmitted to a receiver by determining one or more formant frequencies that correspond to the digital input symbol. In one embodiment, a pre-programmed addressable memory is used to map the set of possible digital input symbols onto a set of corresponding speech units, each comprising a superposition of one or more formant frequencies. A signal is then generated having the speech units. The signal is supplied for transmission over a voice channel. This may include supplying the signal to a voice coder prior to transmission. In another aspect of the invention, a forward error correction code (FEC) is determined for the digital input symbol, and the one or more speech units are modified as a function of the forward error correction code. In this way, the FEC may also be transmitted with the encoded input symbol. The modification may affect any of a number of attributes of the speech units, including a volume attribute and a pitch attribute.
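
The lookup-table mapping described in this abstract can be illustrated with a minimal Python sketch. The table contents, the 2-bit symbol width, the 20 ms unit duration, and the names `SYMBOL_TO_FORMANTS`, `speech_unit`, and `encode` are assumptions made for illustration; none of them come from the patent.

```python
import math

# Hypothetical lookup table: each 2-bit symbol maps to a pair of formant
# frequencies in Hz. The values are illustrative, not taken from the patent.
SYMBOL_TO_FORMANTS = {
    0b00: (300.0, 2300.0),
    0b01: (500.0, 1500.0),
    0b10: (700.0, 1100.0),
    0b11: (400.0, 2000.0),
}

def speech_unit(symbol, sample_rate=8000, duration_s=0.02, volume=1.0):
    """Return one speech unit: a superposition of the symbol's formant tones."""
    formants = SYMBOL_TO_FORMANTS[symbol]
    n_samples = int(sample_rate * duration_s)
    scale = volume / len(formants)          # keeps |sample| <= volume
    return [
        scale * sum(math.sin(2 * math.pi * f * n / sample_rate) for f in formants)
        for n in range(n_samples)
    ]

def encode(symbols, **kwargs):
    """Concatenate the speech units for a sequence of symbols."""
    signal = []
    for s in symbols:
        signal.extend(speech_unit(s, **kwargs))
    return signal
```

The `volume` argument is one place where a forward error correction code could modify a unit, as the abstract suggests, but how that modification is done is not specified here.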

Type: Grant

Filed: December 15, 1997

Date of Patent: March 27, 2001

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Björn Jonsson, Jan Swerup, Krister Törnqvist, Per-Olof Nerbrant
Method for coding speech containing noise-like speech periods and/or having background noise

Patent number: 6205423

Abstract: A method of coding speech under background noise conditions or during noise-like speech periods wherein during active voice speech segments an analysis-by-synthesis method is used. However, when a background noise segment or noise-like speech segment is detected, an adaptive code book (pitch prediction) contribution is used as a source of a pseudo-random sequence in order to provide a better representation of the background noise or the noise-like speech. An improved gain quantization scheme is also employed when a background noise segment is detected, wherein energy of the total excitation with quantized gains is matched to the energy of total excitation with unquantized gains.
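
One simple way to read the energy-matching idea in the last sentence is sketched below in Python: both quantized gains are rescaled by a common factor so that the total excitation energy equals the energy obtained with the unquantized gains. The two-codebook excitation model and all function names are assumptions; the patent may perform the matching inside the quantizer search instead.

```python
import math

def total_excitation(adaptive, fixed, g_pitch, g_code):
    """Total excitation = g_pitch * adaptive vector + g_code * fixed vector."""
    return [g_pitch * a + g_code * f for a, f in zip(adaptive, fixed)]

def energy(x):
    return sum(v * v for v in x)

def energy_matched_gains(adaptive, fixed, g_unq, g_q):
    """Rescale quantized gains (g_q) so the excitation energy matches g_unq.

    g_unq and g_q are (pitch_gain, code_gain) pairs. A common scale factor
    is an assumption made for this sketch.
    """
    e_ref = energy(total_excitation(adaptive, fixed, *g_unq))
    e_q = energy(total_excitation(adaptive, fixed, *g_q))
    if e_q == 0.0:
        return g_q                      # nothing to rescale
    k = math.sqrt(e_ref / e_q)
    return (g_q[0] * k, g_q[1] * k)
```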

Type: Grant

Filed: October 19, 1999

Date of Patent: March 20, 2001

Assignee: Conexant Systems, Inc.

Inventors: Huan-Yu Su, Eric Kwok Fung Yuen, Adil Benyassine, Jes Thyssen
System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters

Patent number: 6151571

Abstract: A method and system for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers is provided. First, a voice signal is received after which a particular feature is extracted from the voice signal. Next, an emotion associated with the voice signal is determined based on the extracted feature. The emotion is screened and feedback is provided only if the emotion is determined to be a negative emotion selected from the group of negative emotions consisting of anger, sadness, and fear. Such determined negative emotion is then outputted to a third party during the conversation.

Type: Grant

Filed: August 31, 1999

Date of Patent: November 21, 2000

Assignee: Andersen Consulting

Inventor: Valery A. Petrushin
Allophonic text-to-speech generator

Patent number: 6148285

Abstract: The allophonic text-to-speech generator (ATTG) 10 includes a CPU 100. The CPU has a random access memory 102 and a read only memory 104 for holding the operating system, application programs, and data for the CPU 100. A keyboard 110 provides a user with control over the CPU 100. A database 130 holds phonetic transcripts of words. Such databases are well-known in the field of telephone directory assistance. A second database 140 maps allophonic text to parse and pre-recorded allophones. The CPU 100 converts a phonetic transcript of a word into an allophonic text string in accordance with a rules program 120. Then the CPU 100 extracts the audio allophone files of the allophonic string and concatenates the audio files to form the new word in the same voice as the other words formed from the allophones in database 140.

Type: Grant

Filed: October 30, 1998

Date of Patent: November 14, 2000

Assignee: Nortel Networks Corporation

Inventor: Philip John Busardo
System and method for determining a first formant analysis filter and prefiltering a speech signal for improved pitch estimation

Patent number: 6047254

Abstract: The present invention comprises an improved vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, an optimal order-two inverse filter is determined. The optimal order-two inverse filter is determined by computing an order-two inverse filter at various locations within the speech frame. For each order-two inverse filter an energy value is calculated which represents the proportion of energy which would remain if the speech signal were filtered with the order-two inverse filter. The order-two inverse filter which minimizes the energy proportion is chosen to be the optimal order-two inverse filter. The optimal order-two inverse filter is then used to filter the samples of the speech frame. An autocorrelation is performed on the filtered signal for a range of time-delay values.
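
The selection of the optimal order-two inverse filter can be sketched as follows. This is a minimal Python illustration that assumes the filter at each location is obtained by the autocorrelation method of linear prediction on a sub-window, and that the "energy proportion" is the normalized prediction error of that window; the abstract does not state either detail. The window length and hop are hypothetical parameters.

```python
def autocorr(x, lag):
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def order_two_inverse_filter(window):
    """Return (a1, a2, residual_fraction) for A(z) = 1 + a1 z^-1 + a2 z^-2."""
    r0, r1, r2 = (autocorr(window, k) for k in range(3))
    det = r0 * r0 - r1 * r1
    if r0 <= 0.0 or det <= 0.0:
        return 0.0, 0.0, 1.0            # degenerate window: no energy removed
    # Solve the 2x2 normal equations [r0 r1; r1 r0][a1 a2]^T = -[r1 r2]^T.
    a1 = r1 * (r2 - r0) / det
    a2 = (r1 * r1 - r0 * r2) / det
    residual_fraction = (r0 + a1 * r1 + a2 * r2) / r0
    return a1, a2, residual_fraction

def best_inverse_filter(frame, window_len, hop):
    """Try a filter at each window location and keep the lowest residual."""
    candidates = [
        order_two_inverse_filter(frame[start:start + window_len])
        for start in range(0, len(frame) - window_len + 1, hop)
    ]
    return min(candidates, key=lambda c: c[2])

def apply_inverse_filter(frame, a1, a2):
    """Filter the frame with A(z), assuming zero samples before the frame."""
    out = []
    for n, x in enumerate(frame):
        x1 = frame[n - 1] if n >= 1 else 0.0
        x2 = frame[n - 2] if n >= 2 else 0.0
        out.append(x + a1 * x1 + a2 * x2)
    return out
```

The filtered frame returned by `apply_inverse_filter` is what a subsequent autocorrelation-based pitch search would operate on.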

Type: Grant

Filed: October 24, 1997

Date of Patent: April 4, 2000

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark A. Ireton, John G. Bartkowiak
Method of deriving characteristics values from a speech signal

Patent number: 6041296

Abstract: In a frequently used speech synthesis for voice output an excitation signal is applied to a number of resonators whose frequency and amplitude are adjusted in accordance with the sound to be produced. These parameters for adjusting the resonators may be gained from natural speech signals. Such parameters gained from natural speech signals may also be used for speech recognition, in which these parameter values are compared with comparison values. According to the invention, the parameters, particularly the formant frequencies, are determined by forming the power density spectrum via discrete frequencies from which autocorrelation coefficients are formed for consecutive frequency segments of the power density spectrum from which, in turn, error values are formed, while the sum of the error values is minimized over all segments and the optimum boundary frequencies of the segments are determined for this minimum.

Type: Grant

Filed: April 21, 1997

Date of Patent: March 21, 2000

Assignee: U.S. Philips Corporation

Inventors: Lutz Welling, Hermann Ney
First formant location determination and removal from speech correlation information for pitch detection

Patent number: 6026357

Abstract: A vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, the following processing steps are performed. First, an optimal order-two inverse filter is determined based on the samples of the speech frame. Second, a dominant formant frequency is calculated from the coefficients of the optimal order-two inverse filter. Third, an autocorrelation function is calculated on the samples of the speech frame. The autocorrelation is performed for a range of time-delay values over which the pitch period and its multiples might be expected to occur. Fourth, the peaks of the autocorrelation function are analyzed incorporating the knowledge of the dominant formant period (which is the inverse of the dominant formant frequency). Normally, the dominant formant is the first formant.
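
The second step, computing a dominant formant frequency from the coefficients of an order-two inverse filter, follows from the root locations of the filter polynomial. The Python sketch below assumes the filter is written as A(z) = 1 + a1 z^-1 + a2 z^-2; the function names are illustrative. When the roots are a complex pair r·e^{±jθ}, a1 = -2r·cos θ and a2 = r², so θ can be recovered directly.

```python
import math

def dominant_formant_hz(a1, a2, sample_rate):
    """Resonance frequency of 1/A(z) for A(z) = 1 + a1 z^-1 + a2 z^-2.

    Returns None when the roots are real (no resonance).
    """
    if a2 <= 0.0 or a1 * a1 >= 4.0 * a2:
        return None
    radius = math.sqrt(a2)                      # |root| = sqrt(a2)
    angle = math.acos(-a1 / (2.0 * radius))     # root angle in radians
    return angle * sample_rate / (2.0 * math.pi)

def formant_period_samples(a1, a2, sample_rate):
    """Dominant formant period (inverse of the frequency), in samples."""
    f = dominant_formant_hz(a1, a2, sample_rate)
    return None if not f else sample_rate / f
```

How the formant period is then used when analyzing the autocorrelation peaks is not detailed in the abstract, so it is left out of the sketch.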

Type: Grant

Filed: October 24, 1997

Date of Patent: February 15, 2000

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark A. Ireton, John G. Bartkowiak
Variable bitrate speech transmission system

Patent number: 6012026

Abstract: A transmission system with a transmitter and a receiver. The transmitter has a speech encoder with analysis means, has calculation means, and has control means. The receiver has a speech decoder. Through a transmission medium, the transmitter transmits frames of data to the receiver. The analysis means determine analysis coefficients from a speech signal. From a bitrate setting, the calculation means calculate a fraction of the frames of data to carry more information about the analysis coefficients than a remaining number of the frames of data. The control means control the transmitter to transmit the fraction of the frames of data and to transmit the remaining number of the frames of data. The receiver receives the frames of data. The receiver derives a reconstructed speech signal from the received frames of data.
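
As a rough arithmetic illustration of deriving a frame fraction from a bitrate setting, the Python sketch below assumes two frame types with fixed sizes: "full" frames that carry more coefficient information and "reduced" frames that carry less. The frame sizes, the frame rate, and the even-spreading schedule are all assumptions for the example and are not taken from the patent. With fraction p of full frames, the average bits per frame is p·bits_full + (1 − p)·bits_reduced, which is solved for p.

```python
def full_frame_fraction(bitrate, frames_per_s, bits_full, bits_reduced):
    """Fraction of frames that can be 'full' at the given bitrate.

    Assumes bits_full > bits_reduced; the result is clipped to [0, 1].
    """
    bits_per_frame = bitrate / frames_per_s
    p = (bits_per_frame - bits_reduced) / (bits_full - bits_reduced)
    return min(1.0, max(0.0, p))

def frame_schedule(fraction, n_frames):
    """Spread full frames evenly over n_frames; True marks a full frame."""
    schedule, acc = [], 0.0
    for _ in range(n_frames):
        acc += fraction
        if acc >= 1.0 - 1e-12:          # small tolerance for float rounding
            schedule.append(True)
            acc -= 1.0
        else:
            schedule.append(False)
    return schedule
```

For example, at 2400 bit/s with 50 frames per second, 60-bit full frames and 40-bit reduced frames, the average budget is 48 bits per frame and the fraction of full frames is 0.4.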

Type: Grant

Filed: March 31, 1998

Date of Patent: January 4, 2000

Assignee: U.S. Philips Corporation

Inventors: Rakesh Taori, Andreas J. Gerrits
Method and system for speech processing with greatly reduced harmonic and intermodulation distortion

Patent number: 6003000

Abstract: A method and system for representing speech with greatly reduced harmonic and intermodulation distortion using a fixed interval scale, known as Tru-Scale. Speech is reproduced in accordance with a frequency matrix which reduces intermodulation interference and harmonic distortion (overtone collision). Enhanced speech quality and reduced noise result from increasing the signal-to-noise ratio in the processed speech signal. The method and system use an Auto-Regressive (AR) modeling technique, using, among other approaches, Linear Predictive Coding (LPC) analysis. In accordance with another aspect of the invention, a Fourier transform-based modeling technique also is used. The application of the system to speech coders also is contemplated.

Type: Grant

Filed: April 29, 1997

Date of Patent: December 14, 1999

Assignee: Meta-C Corporation

Inventors: Michele L. Ozzimo, Matthew C. Cobb, James A. Dinnan
Audio signal coding/decoding method

Patent number: 5956686

Abstract: An adaptive transform coding and decoding arrangement is provided to effectively exploit different redundancies between the bands of a spectrum envelope to effect coding at a low bit rate for an audio signal. In the adaptive transform coding method, the spectrum envelope is divided into bands so that different coding methods may be applied to the spectrum envelopes of the individual bands. By applying the present invention to the adaptive transform coding of an audio signal, the spectrum envelope can be adjusted to the coding and transmission method which is suitable for the time fluctuation in each frequency band, so that the different redundancies for the individual bands can be effectively exploited to realize a highly efficient audio signal coding and decoding method which has its bits reduced as required for coding the spectrum envelope.

Type: Grant

Filed: June 30, 1995

Date of Patent: September 21, 1999

Assignee: Hitachi, Ltd.

Inventors: Makoto Takashima, Yoshiaki Asakawa, Hidetoshi Sekine
Detecting transients to emphasize formant peaks

Patent number: 5953696

Abstract: Nasalized sound effects during reproduction of low-pitch sounds are suppressed to produce playback sounds of high clarity. Amplitude data is processed with high range formant emphasis of crests and valleys of the envelope of the frequency spectrum on the high frequency range and with deepening of the valley of the frequency spectrum over the entire frequency range, above all, over the low to mid frequency range. Next, the amplitude data is processed for emphasizing the peak values of the formant of the voiced frame in the portion of the speech signal which is rising in magnitude and for unconditionally emphasizing the spectral envelope on the high frequency range. The voiced speech spectrum is generated by synthesizing the cosine wave based upon the emphasized amplitude data.

Type: Grant

Filed: September 23, 1997

Date of Patent: September 14, 1999

Assignee: Sony Corporation

Inventors: Masayuki Nishiguchi, Jun Matsumoto
Efficient pitch estimation method

Patent number: 5946650

Abstract: A method and means to estimate the pitch of a speech or acoustic signal within a vocoder begins with the center clipping and low-pass filtering of the speech or acoustic signal to eliminate the formants from the speech or acoustic signal. An error function is calculated for each candidate pitch within the speech or acoustic signal. A fast tracking method is used to select the estimated pitch for the speech or acoustic signal. A final check for the doubling of the pitch will minimize any incorrect estimation of the pitch.
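
The center-clipping front end and a per-lag error function can be sketched in Python as below. The abstract does not say which error function is used; the average magnitude difference function (AMDF) is swapped in here as a common choice, and the clipping ratio of 0.3 is a hypothetical value. The low-pass filter, the fast tracking step, and the pitch-doubling check are omitted.

```python
def center_clip(x, ratio=0.3):
    """Zero samples inside +/- threshold and shift the rest toward zero."""
    threshold = ratio * max(abs(v) for v in x)
    out = []
    for v in x:
        if v > threshold:
            out.append(v - threshold)
        elif v < -threshold:
            out.append(v + threshold)
        else:
            out.append(0.0)
    return out

def amdf(x, lag):
    """Average magnitude difference at one lag (assumed error function)."""
    n = len(x) - lag
    return sum(abs(x[i] - x[i + lag]) for i in range(n)) / n

def estimate_pitch_lag(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest error."""
    clipped = center_clip(x)
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(clipped, lag))
```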

Type: Grant

Filed: June 19, 1997

Date of Patent: August 31, 1999

Assignee: Tritech Microelectronics, Ltd.

Inventor: Ma Wei
System and method for improved pitch estimation which performs first formant energy removal for a frame using coefficients from a prior frame

Patent number: 5937374

Abstract: An improved vocoder system and method for estimating pitch in a speech waveform which pre-filters speech data with improved efficiency and reduced computational requirements. The vocoder system is preferably a low bit rate speech coder which analyzes a plurality of frames of speech data in parallel. Once the LPC filter coefficients and the pitch for a first frame have been calculated, the vocoder then looks ahead to the next frame to estimate its pitch. In the preferred embodiment of the invention, the vocoder filters speech data in a second frame using a plurality of the coefficients from a first frame as a multi-pole analysis filter. These coefficients are used as a "crude" two-pole analysis filter.

Type: Grant

Filed: May 15, 1996

Date of Patent: August 10, 1999

Assignee: Advanced Micro Devices, Inc.

Inventors: John G. Bartkowiak, Mark A. Ireton
Retaining prosody during speech analysis for later playback

Patent number: 5933805

Abstract: A speech system includes a speech encoding system and a speech decoding system. The speech encoding system includes a speech analyzer for identifying each of the speech segments (i.e., phonemes) in the received digitized speech signal. A pitch detector, a duration detector, and an amplitude detector are each coupled to the memory and the analyzer and detect various prosodic parameters of each received speech segment. A speech encoder generates a data signal that includes the speech segment IDs and the values of the corresponding prosodic parameters. The speech decoding system includes a digital data decoder and a speech synthesizer for generating a speech signal based on the segment IDs and prosodic parameter values.

Type: Grant

Filed: December 13, 1996

Date of Patent: August 3, 1999

Assignee: Intel Corporation

Inventors: Dale Boss, Sridhar Iyengar, T. Don Dennis
Method and apparatus for sibilant classification in a speech recognition system

Patent number: 5897614

Abstract: When a speech signal that may include a sibilant consisting of one or more formants is received, frequencies and selectivity factors are determined for each sibilant formant in the speech signal. Then, the frequencies and selectivity factors are compared to a set of empirically derived criteria to classify the sibilant sound.

Type: Grant

Filed: December 20, 1996

Date of Patent: April 27, 1999

Assignee: International Business Machines Corporation

Inventor: Frank Albert McKiel, Jr.
System and method for multiresolution scalable audio signal encoding

Patent number: 5886276

Abstract: An audio signal analyzer and encoder is based on a model that considers audio signals to be composed of deterministic or sinusoidal components, transient components representing the onset of notes or other events in an audio signal, and stochastic components. Deterministic components are represented as a series of overlapping sinusoidal waveforms. To generate the deterministic components, the input signal is divided into a set of frequency bands by a multi-complementary filter bank. The frequency band signals are oversampled so as to suppress cross-band aliasing energy in each band. Each frequency band is analyzed and encoded as a set of spectral components using a windowing time frame whose length is inversely proportional to the frequency range in that band. Low frequency bands are encoded using longer time frames than higher frequency bands.

Type: Grant

Filed: January 16, 1998

Date of Patent: March 23, 1999

Assignee: The Board of Trustees of the Leland Stanford Junior University

Inventors: Scott N. Levine, Tony S. Verma
Coding apparatus having adaptive coding at different bit rates and pitch emphasis

Patent number: 5878387

Abstract: The coding apparatus comprises an adaptive codebook storing excitation signals as vectors, a synthesis filter for forming a synthesis signal, referring to the vectors stored in the adaptive codebook, a similarity computation circuit for computing a similarity between the synthesis signal obtained by the synthesis filter and a target signal, and a coding scheme determining circuit for deciding one coding scheme from a plurality of coding schemes respectively having coding bit rates different from each other, on the basis of the similarity obtained by the similarity computation circuit.

Type: Grant

Filed: September 29, 1995

Date of Patent: March 2, 1999

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masahiro Oshikiri, Kimio Miseki, Masami Akamine, Tadashi Amada
Karaoke apparatus detecting register of live vocal to tune harmony vocal

Patent number: 5876213

Abstract: A karaoke apparatus is constructed to perform a karaoke accompaniment part and a karaoke harmony part for accompanying a live vocal part. A pickup device collects a singing voice of the live vocal part. A detector device analyzes the collected singing voice to detect a musical register thereof at which the live vocal part is actually performed. A harmony generator device generates a harmony voice of the karaoke harmony part according to the detected musical register so that the karaoke harmony part is made consonant with the live vocal part. A tone generator device generates an instrumental tone of the karaoke accompaniment part in parallel to the karaoke harmony part.

Type: Grant

Filed: July 30, 1996

Date of Patent: March 2, 1999

Assignee: Yamaha Corporation

Inventor: Shuichi Matsumoto
Frequency-domain spectral envelope estimation for monophonic and polyphonic signals

Patent number: 5870704

Abstract: Estimating the time-varying spectrum envelope of a time-varying signal facilitates pitch modification and other shifting of signal content in the frequency domain. Local maxima of a spectrum of the signal are identified, and a masking curve is applied to each one. The masking curve has a peak at the particular maximum and descends away from it. Local maxima falling below the masking curve are eliminated. The slope of the masking curve is varied in accordance with measured parameters of the spectrum to decrease or eliminate spurious peaks. Thereafter, a smoothing procedure may be applied to smooth the spectrum in frequency.
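
The peak-pruning step can be illustrated with a short Python sketch. It assumes a spectrum given in dB per bin and a masking curve that falls off linearly (in dB) on both sides of each local maximum; the actual curve shape and the rule for adapting its slope are not given in the abstract, so `slope_db_per_bin` is simply a caller-supplied, hypothetical parameter.

```python
def local_maxima(spectrum_db):
    """Indices of interior bins that are higher than their neighbours."""
    return [k for k in range(1, len(spectrum_db) - 1)
            if spectrum_db[k - 1] < spectrum_db[k] >= spectrum_db[k + 1]]

def prune_masked_peaks(spectrum_db, slope_db_per_bin):
    """Keep only local maxima that are not under another peak's masking curve.

    The masking curve of peak j at bin k is assumed to be
    spectrum_db[j] - slope_db_per_bin * |k - j|.
    """
    peaks = local_maxima(spectrum_db)
    kept = []
    for k in peaks:
        masked = any(
            spectrum_db[k] < spectrum_db[j] - slope_db_per_bin * abs(k - j)
            for j in peaks if j != k
        )
        if not masked:
            kept.append(k)
    return kept
```

A shallow slope removes more of the small peaks sitting near a large one, while a steep slope keeps them; an envelope would then be drawn through the surviving peaks.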

Type: Grant

Filed: November 7, 1996

Date of Patent: February 9, 1999

Assignee: Creative Technology Ltd.

Inventor: Jean Laroche
Estimation of excitation parameters

Patent number: 5826222

Abstract: A method of encoding speech by analyzing a digitized speech signal to determine excitation parameters for the digitized speech signal is disclosed. The method includes dividing the digitized speech signal into at least two frequency bands, determining a first preliminary excitation parameter by performing a nonlinear operation on at least one of the frequency band signals to produce a modified frequency band signal and determining the first preliminary excitation parameter using the modified frequency band signal, determining a second preliminary excitation parameter using a method different from the first method, and using the first and second preliminary excitation parameters to determine an excitation parameter for the digitized speech signal. The method is useful in encoding speech. Speech synthesized using the parameters estimated based on the invention generates high quality speech at various bit rates useful for applications such as satellite voice communication.

Type: Grant

Filed: April 14, 1997

Date of Patent: October 20, 1998

Assignee: Digital Voice Systems, Inc.

Inventor: Daniel Wayne Griffin
Filter for speech modification or enhancement, and various apparatus, systems and method using same

Patent number: 5822732

Abstract: A speech modification or enhancement filter, and apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. From spectral information represented as a multi-dimensional vector, a filter coefficient is determined so as to ensure that formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signal and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. A degree of freedom of design of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals is thus heightened leading to the improvement of intelligibility of said synthesized speech signals. A good formant enhancement effect can be obtained without allowing any perceptible level of distortions to occur within a range of permissible spectral gradients.

Type: Grant

Filed: May 2, 1996

Date of Patent: October 13, 1998

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventor: Hirohisa Tasaki
Pitch searching time reducing method for code excited linear prediction vocoder using line spectral pair

Patent number: 5812966

Abstract: An improved pitch searching time reducing method for a CELP vocoder using a Line Spectral Pair (LSP) frequency which is capable of significantly reducing the pitch search time by separating the speech signal using a first formant frequency of the line spectral pair of the digital type personal communication system, which includes the steps of computing a decimation interval of a pitch search interval using an LSP frequency of a first formant computed by a formant filter so as to compute a preparatory pitch of a given speech; determining a preparatory pitch to be used when searching a pitch by detecting a peak and a valley within each decimation interval; and computing a preparatory pitch by adapting a first formant frequency of an LSP computed by a formant filter with a decimation rate and performing a pitch search with respect to the obtained preparatory pitch.

Type: Grant

Filed: September 19, 1996

Date of Patent: September 22, 1998

Assignee: Electronics and Telecommunications Research Institute

Inventors: Kyung-Jin Byun, Hah-Yong Yoo, Ki-Chun Han, Jong-Jae Kim, Myung-Jin Bae
Voice synthesis system utilizing a transfer function

Patent number: 5806037

Abstract: A voice synthesis system is fundamentally configured by a sound-source model, which simulates human voices and the like, and a voice-path model which simulates properties of voice paths between vocal cords and lips. The sound-source model is embodied by a code book which stores a plurality of code words, representative of waveform patterns, with respect to each of the voices. Each of the code words is selected by an information index. The voice-path model is embodied by a full-pole synthesis filter whose characteristic curve provides multiple poles, each of which is represented by polar coordinates. There is further provided a pitch filter and an all-pass filter. Data representative of the code word selected is supplied to the pitch filter, in which a first delay time, set by a number of delay-time units, is imparted to the data. Then, the all-pass filter imparts a second delay time, which is smaller than the delay-time unit, to the data in response to pitch-variation information.
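
The split of the pitch delay into a whole number of delay units plus a sub-unit delay from an all-pass filter can be sketched in Python as follows. The first-order all-pass structure used here is an assumption (the abstract does not give the filter order); for H(z) = (a + z^-1)/(1 + a z^-1), the low-frequency delay is (1 − a)/(1 + a) samples, so a = (1 − d)/(1 + d) gives a delay of roughly d.

```python
def integer_delay(x, n_units):
    """Delay x by a whole number of samples, keeping the original length."""
    return ([0.0] * n_units + list(x))[:len(x)]

def allpass_fractional_delay(x, frac):
    """First-order all-pass; low-frequency delay is approximately `frac` samples.

    Difference equation: y[n] = a*x[n] + x[n-1] - a*y[n-1].
    """
    a = (1.0 - frac) / (1.0 + frac)
    y, x_prev, y_prev = [], 0.0, 0.0
    for v in x:
        out = a * v + x_prev - a * y_prev
        y.append(out)
        x_prev, y_prev = v, out
    return y

def pitch_delay(x, lag):
    """Apply a non-integer lag: integer part first, then the fractional part."""
    whole = int(lag)
    return allpass_fractional_delay(integer_delay(x, whole), lag - whole)
```

Because the all-pass has unit gain at every frequency, changing `frac` in response to pitch-variation information shifts timing without changing the spectrum magnitude.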

Type: Grant

Filed: March 29, 1995

Date of Patent: September 8, 1998

Assignee: Yamaha Corporation

Inventor: Akira Sogo
Method for reducing pitch search time for vocoder

Patent number: 5799271

Abstract: The present invention relates to the method to receive a speech signal, to perform a recognition weighting process on it, to synthesize a synthetic speech signal, to calculate an autocorrelation of the synthetic speech signal whose delay is a predetermined value and an autocorrelation whose delay is 0, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak with skipping over the part of a negative peak by using the results from the dividing operation, and to calculate and output the pitch lag and the pitch filter coefficient by repeating the above process. Thus, real-time implementation of CELP vocoder can be achieved.
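
The search criterion in this abstract (squared correlation at a candidate delay divided by a zero-delay term, evaluated only where the correlation is positive) can be sketched in Python. The sketch assumes the zero-delay term is the energy of the delayed segment, as in a conventional CELP pitch search, and it works directly on a signal buffer rather than on a weighted synthetic signal; both are simplifications not confirmed by the abstract.

```python
def pitch_search(signal, frame_start, frame_len, min_lag, max_lag):
    """Return (best_lag, gain) maximizing corr**2 / energy.

    Lags with non-positive correlation are skipped before the division,
    which is where the time saving comes from. Requires frame_start >= max_lag.
    """
    target = signal[frame_start:frame_start + frame_len]
    best_lag, best_score, best_gain = None, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        past = signal[frame_start - lag:frame_start - lag + frame_len]
        corr = sum(t * p for t, p in zip(target, past))
        if corr <= 0.0:
            continue                    # negative peak: no division needed
        energy = sum(p * p for p in past)
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_score, best_gain = lag, score, corr / energy
    return best_lag, best_gain
```

The returned gain, `corr / energy`, plays the role of the pitch filter coefficient for the selected lag.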

Type: Grant

Filed: June 24, 1996

Date of Patent: August 25, 1998

Assignee: Electronics and Telecommunications Research Institute

Inventors: Kyung-Jin Byun, Ha-Young Yoo, Jong-Jae Kim, Ki-Chun Han, Jae-Suk Kim, Myung-Jin Bae
System for automatically morphing audio information

Patent number: 5749073

Abstract: In the first step of a sound morphing process, each sound which forms the basis for the morph is converted into one or more quantitative representations, such as spectrograms. After the representations have been obtained, the temporal axes of the two sounds are matched, so that similar components of the two sounds, such as onsets, harmonic regions and inharmonic regions, are aligned with one another. Other characteristics of the sounds, such as pitch, formant frequencies, or the like, are then matched. Once the energy in each of the sounds has been accounted for and matched to that of the other sound, the two sounds are cross-faded, to produce a representation of a new sound. This representation is then inverted, to generate the morphed sound.

Type: Grant

Filed: March 15, 1996

Date of Patent: May 5, 1998

Assignee: Interval Research Corporation

Inventor: Malcolm Slaney
Noise reduction apparatus using spectral subtraction or scaling and signal attenuation between formant regions

Patent number: 5742927

Abstract: A noise reduction apparatus and method for enhancing a noisy speech signal which applies to the spectral component signals of a time-varying input signal either a spectral subtraction process or a spectral scaling process followed by signal attenuation in regions of the frequency spectrum lying between identified formant regions.
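
A minimal Python sketch of the two stages, magnitude spectral subtraction followed by attenuation of bins lying between formant regions, is given below. It assumes the per-bin noise estimate and the formant regions are already available; the spectral floor and the attenuation factor are hypothetical values, and formant identification itself is not shown.

```python
def enhance_magnitudes(noisy_mag, noise_mag, formant_bins,
                       floor=0.05, gap_atten=0.5):
    """Spectral subtraction, then attenuation outside formant regions.

    noisy_mag, noise_mag: per-bin magnitudes of the input and the noise estimate.
    formant_bins: list of (low, high) inclusive bin ranges, assumed given.
    """
    out = []
    for k, (y, n) in enumerate(zip(noisy_mag, noise_mag)):
        s = max(y - n, floor * y)       # subtract noise, keep a small floor
        in_formant = any(lo <= k <= hi for lo, hi in formant_bins)
        out.append(s if in_formant else s * gap_atten)
    return out
```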

Type: Grant

Filed: October 11, 1995

Date of Patent: April 21, 1998

Assignee: British Telecommunications Public Limited Company

Inventors: Philip Mark Crozier, Barry Michael George Cheetham
Method and apparatus for enhancement of telephonic speech signals

Patent number: 5737719

Abstract: A method and apparatus for enhancing the intelligibility of a telephonic speech signal within the available bandwidth and intensity limits of a telephone communication network. The method combines enhancement of both the formant ratio and the consonant/vowel energy ratio to realize a speech signal more intelligible to a hearing impaired user. The invention uses an auditory model of the human ear. A speech signal is put through a filter bank designed to simulate the cochlear filter shapes and filter spacing of a healthy cochlea. The energy output from each of a plurality of filters is computed and used to form an auditory spectrum. The peaks associated with strong first and second formants are identified, and the second formant is enhanced relative to the first formant by attenuating the first formant. Also, consonants in the speech signal are identified as having an energy level below a threshold associated with vowels, but above the threshold associated with silent regions. Consonant regions are amplified.
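
The consonant rule at the end of this abstract (energy below the vowel threshold but above the silence threshold) lends itself to a short Python sketch. Frame length, threshold values, and gain are hypothetical parameters chosen by the caller; the auditory filter bank and the formant-ratio enhancement are not modeled here.

```python
def frame_energies(x, frame_len):
    """Mean energy of each complete, non-overlapping frame."""
    return [sum(v * v for v in x[i:i + frame_len]) / frame_len
            for i in range(0, len(x) - frame_len + 1, frame_len)]

def amplify_consonants(x, frame_len, silence_thr, vowel_thr, gain=2.0):
    """Amplify frames whose energy lies between the two thresholds."""
    out = list(x)
    for idx, e in enumerate(frame_energies(x, frame_len)):
        if silence_thr < e < vowel_thr:         # treated as a consonant region
            start = idx * frame_len
            for i in range(start, start + frame_len):
                out[i] *= gain
    return out
```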

Type: Grant

Filed: December 19, 1995

Date of Patent: April 7, 1998

Assignee: U S West, Inc.

Inventor: Alvin Mark Terry
Method, apparatus and recording medium for a coder with a spectral-shape-adaptive subband configuration

Patent number: 5737718

Abstract: A method for encoding the information is provided which realizes a high encoding efficiency especially for tonal acoustic signals without lowering the sound quality. An acoustic signal from a terminal is transformed by a transform circuit into spectral signals which are then normalized and quantized by a signal component encoding circuit for encoding from one encoding unit to another. The encoding unit configuration is selected by an encoding unit configuration decision circuit from plural encoding unit configurations depending upon the shape of distribution of the spectral components. An encoding unit of narrow low bandwidth is selected for a tonal signal.

Type: Grant

Filed: June 8, 1995

Date of Patent: April 7, 1998

Assignee: Sony Corporation

Inventor: Kyoya Tsutsui
