Formant Patents (Class 704/209)
  • Patent number: 6233552
    Abstract: An adaptive time-domain post-filtering technique is based on the modified Yule-Walker filter. This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders. The new post-filter has a flat frequency response at the formant peaks of speech spectrum. Information is gathered about the relation between poles and formants and then the formants and their bandwidths are estimated. The information about the formants and their bandwidths is then used to design the modified Yule-Walker filter based on a least squares fit in time domain.
    Type: Grant
    Filed: March 12, 1999
    Date of Patent: May 15, 2001
    Assignee: Comsat Corporation
    Inventors: Azhar Mustapha, Suat Yeldener
  • Patent number: 6208959
    Abstract: A digital input symbol is transmitted to a receiver by determining one or more formant frequencies that correspond to the digital input symbol. In one embodiment, a pre-programmed addressable memory is used to map the set of possible digital input symbols onto a set of corresponding speech units, each comprising a superposition of one or more formant frequencies. A signal is then generated having the speech units. The signal is supplied for transmission over a voice channel. This may include supplying the signal to a voice coder prior to transmission. In another aspect of the invention, a forward error correction code (FEC) is determined for the digital input symbol, and the one or more speech units are modified as a function of the forward error correction code. In this way, the FEC may also be transmitted with the encoded input symbol. The modification may affect any of a number of attributes of the speech units, including a volume attribute and a pitch attribute.
    Type: Grant
    Filed: December 15, 1997
    Date of Patent: March 27, 2001
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Björn Jonsson, Jan Swerup, Krister Törnqvist, Per-Olof Nerbrant
  • Patent number: 6205423
    Abstract: A method of coding speech under background noise conditions or during noise-like speech periods wherein during active voice speech segments an analysis-by-synthesis method is used. However, when a background noise segment or noise-like speech segment is detected, an adaptive code book (pitch prediction) contribution is used as a source of a pseudo-random sequence in order to provide a better representation of the background noise or the noise-like speech. An improved gain quantization scheme is also employed when a background noise segment is detected, wherein energy of the total excitation with quantized gains is matched to the energy of total excitation with unquantized gains.
    Type: Grant
    Filed: October 19, 1999
    Date of Patent: March 20, 2001
    Assignee: Conexant Systems, Inc.
    Inventors: Huan-Yu Su, Eric Kwok Fung Yuen, Adil Benyassine, Jes Thyssen
  • Patent number: 6151571
    Abstract: A method and system for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers is provided. First, a voice signal is received after which a particular feature is extracted from the voice signal. Next, an emotion associated with the voice signal is determined based on the extracted feature. The emotion is screened and feedback is provided only if the emotion is determined to be a negative emotion selected from the group of negative emotions consisting of anger, sadness, and fear. Such determined negative emotion is then outputted to a third party during the conversation.
    Type: Grant
    Filed: August 31, 1999
    Date of Patent: November 21, 2000
    Assignee: Andersen Consulting
    Inventor: Valery A. Pertrushin
  • Patent number: 6148285
    Abstract: The allophonic text-to-speech generator (ATTG) 10 includes a CPU 100. The CPU has a random access memory 102 and a read only memory 104 for holding the operating system, application programs, and data for the CPU 100. A keyboard 110 provides a user with control over the CPU 100. A database 130 holds phonetic transcripts of words. Such databases are well-known in the field of telephone directory assistance. A second database 140 maps allophonic text to parse and pre-recorded allophones. The CPU 100 converts a phonetic transcript of a word into an allophonic text string in accordance with a rules program 120. Then the CPU 100 extracts the audio allophone files of the allophonic string and concatenates the audio files to form the new word in the same voice as the other words formed from the allophones in database 140.
    Type: Grant
    Filed: October 30, 1998
    Date of Patent: November 14, 2000
    Assignee: Nortel Networks Corporation
    Inventor: Philip John Busardo
  • Patent number: 6047254
    Abstract: The present invention comprises an improved vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, an optimal order-two inverse filter is determined. The optimal order-two inverse filter is determined by computing an order-two inverse filter at various locations within the speech frame. For each order-two inverse filter an energy value is calculated which represents the proportion of energy which would remain if the speech signal were filtered with the order-two inverse filter. The order-two inverse filter which minimizes the energy proportion is chosen to be the optimal order-two inverse filter. The optimal order-two inverse filter is then used to filter the samples of the speech frame. An autocorrelation is performed on the filtered signal for a range of time-delay values.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: April 4, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark A. Ireton, John G. Bartkowiak
  • Patent number: 6041296
    Abstract: In a frequently used speech synthesis for voice output an excitation signal is applied to a number of resonators whose frequency and amplitude are adjusted in accordance with the sound to be produced. These parameters for adjusting the resonators may be gained from natural speech signals. Such parameters gained from natural speech signals may also be used for speech recognition, in which these parameter values are compared with comparison values. According to the invention, the parameters, particularly the formant frequencies, are determined by forming the power density spectrum via discrete frequencies from which autocorrelation coefficients are formed for consecutive frequency segments of the power density spectrum from which, in turn, error values are formed, while the sum of the error values is minimized over all segments and the optimum boundary frequencies of the segments are determined for this minimum.
    Type: Grant
    Filed: April 21, 1997
    Date of Patent: March 21, 2000
    Assignee: U.S. Philips Corporation
    Inventors: Lutz Welling, Hermann Ney
  • Patent number: 6026357
    Abstract: A vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, the following processing steps are performed. First, an optimal order-two inverse filter is determined based on the samples of the speech frame. Second, a dominant formant frequency is calculated from the coefficients of the optimal order-two inverse filter. Third, an autocorrelation function is calculated on the samples of the speech frame. The autocorrelation is performed for a range of time-delay values over which the pitch period and its multiples might be expected to occur. Fourth, the peaks of the autocorrelation function are analyzed incorporating the knowledge of the dominant formant period (which is the inverse of the dominant formant frequency). Normally, the dominant formant is the first formant.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: February 15, 2000
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark A. Ireton, John G. Bartkowiak
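    Illustrative sketch (not taken from the patent): the second step of the abstract above, deriving a dominant formant frequency from an order-two inverse filter, can be pictured as follows. The autocorrelation-method fit, the function names, and the pole-angle conversion are assumptions chosen for illustration; the patent's actual procedure may differ.

```python
import math


def autocorr(x, lag):
    """Autocorrelation of x at the given lag (no normalization)."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def order_two_lpc(x):
    """Fit x[n] ~ a1*x[n-1] + a2*x[n-2] by the autocorrelation method."""
    r0, r1, r2 = autocorr(x, 0), autocorr(x, 1), autocorr(x, 2)
    det = r0 * r0 - r1 * r1
    if det <= 0.0:
        raise ValueError("frame is silent or degenerate")
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2


def dominant_formant_hz(x, sample_rate):
    """Frequency of the complex pole pair of 1/A(z), A(z) = 1 - a1/z - a2/z^2.

    Returns None when the poles are real (no resonance to report).
    """
    a1, a2 = order_two_lpc(x)
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None
    radius = math.sqrt(-a2)                 # pole magnitude
    theta = math.acos(a1 / (2.0 * radius))  # pole angle in radians
    return theta * sample_rate / (2.0 * math.pi)
```

    The inverse of the returned frequency is the "dominant formant period" that the abstract uses when analyzing the autocorrelation peaks.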
  • Patent number: 6012026
    Abstract: A transmission system with a transmitter and a receiver. The transmitter has a speech encoder with analysis means, has calculation means, and has control means. The receiver has a speech decoder. Through a transmission medium, the transmitter transmits frames of data to the receiver. The analysis means determine analysis coefficients from a speech signal. From a bitrate setting, the calculation means calculate a fraction of the frames of data to carry more information about the analysis coefficients than a remaining number of the frames of data. The control means control the transmitter to transmit the fraction of the frames of data and to transmit the remaining number of the frames of data. The receiver receives the frames of data. The receiver derives a reconstructed speech signal from the received frames of data.
    Type: Grant
    Filed: March 31, 1998
    Date of Patent: January 4, 2000
    Assignee: U.S. Philips Corporation
    Inventors: Rakesh Taori, Andreas J. Gerrits
  • Patent number: 6003000
    Abstract: A method and system for representing speech with greatly reduced harmonic and intermodulation distortion using a fixed interval scale, known as Tru-Scale. Speech is reproduced in accordance with a frequency matrix which reduces intermodulation interference and harmonic distortion (overtone collision). Enhanced speech quality and reduced noise results from increasing the signal-to-noise ratio in the processed speech signal. The method and system use an Auto-Regressive (AR) modeling technique, using, among other approaches, Linear Predictive Coding (LPC) analysis. In accordance with another aspect of the invention, a Fourier transform-based modeling technique also is used. The application of the system to speech coders also is contemplated.
    Type: Grant
    Filed: April 29, 1997
    Date of Patent: December 14, 1999
    Assignee: Meta-C Corporation
    Inventors: Michele L. Ozzimo, Matthew C. Cobb, James A. Dinnan
  • Patent number: 5956686
    Abstract: An adaptive transform coding and decoding arrangement is provided to effectively exploit different redundancies between the bands of a spectrum envelope to effect coding at a low bit rate for an audio signal. In the adaptive transform coding method, the spectrum envelope is divided into bands so that different coding methods may be applied to the spectrum envelopes of the individual bands. By applying the present invention to the adaptive transform coding of an audio signal, the spectrum envelope can be adjusted to the coding and transmission method which is suitable for the time fluctuation in each frequency band, so that the different redundancies for the individual bands can be effectively exploited to realize a highly efficient audio signal coding and decoding method which has its bits reduced as required for coding the spectrum envelope.
    Type: Grant
    Filed: June 30, 1995
    Date of Patent: September 21, 1999
    Assignee: Hitachi, Ltd.
    Inventors: Makoto Takashima, Yoshiaki Asakawa, Hidetoshi Sekine
  • Patent number: 5953696
    Abstract: Nasalized sound effects during reproduction of low-pitch sounds are suppressed to produce playback sounds of high clarity. Amplitude data is processed with high range formant emphasis of crests and valleys of the envelope of the frequency spectrum on the high frequency range and with deepening of the valley of the frequency spectrum over the entire frequency range, above all, over the low to mid frequency range. Next, the amplitude data is processed for emphasizing the peak values of the formant of the voiced frame in the portion of the speech signal which is rising in magnitude and for unconditionally emphasizing the spectral envelope on the high frequency range. The voiced speech spectrum is generated by synthesizing the cosine wave based upon the emphasized amplitude data.
    Type: Grant
    Filed: September 23, 1997
    Date of Patent: September 14, 1999
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 5946650
    Abstract: A method and means to estimate the pitch of a speech or acoustic signal within a vocoder begins with the center clipping and low-pass filtering of the speech or acoustic signal to eliminate the formants from the speech or acoustic signal. An error function is calculated for each pitch within the speech or acoustic signal. A fast tracking method is used to select the estimated pitch for the speech or acoustic signal. A final check for the doubling of the pitch will minimize any incorrect estimation of the pitch.
    Type: Grant
    Filed: June 19, 1997
    Date of Patent: August 31, 1999
    Assignee: Tritech Microelectronics, Ltd.
    Inventor: Ma Wei
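    Illustrative sketch (not taken from the patent): center clipping, the first step named in the abstract above, is commonly implemented by zeroing samples below a threshold and shifting the rest toward zero. The 30% threshold ratio and the function name are assumptions for illustration only.

```python
def center_clip(samples, ratio=0.3):
    """Zero out samples whose magnitude is below ratio * peak and shift the
    remaining samples toward zero by the same threshold."""
    peak = max((abs(s) for s in samples), default=0.0)
    threshold = ratio * peak
    clipped = []
    for s in samples:
        if s > threshold:
            clipped.append(s - threshold)
        elif s < -threshold:
            clipped.append(s + threshold)
        else:
            clipped.append(0.0)
    return clipped
```

    Low-amplitude detail, which carries much of the formant structure, is removed, while the large pitch pulses survive.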
  • Patent number: 5937374
    Abstract: An improved vocoder system and method for estimating pitch in a speech waveform which pre-filters speech data with improved efficiency and reduced computational requirements. The vocoder system is preferably a low bit rate speech coder which analyzes a plurality of frames of speech data in parallel. Once the LPC filter coefficients and the pitch for a first frame have been calculated, the vocoder then looks ahead to the next frame to estimate the pitch, i.e., to estimate the pitch of the next frame. In the preferred embodiment of the invention, the vocoder filters speech data in a second frame using a plurality of the coefficients from a first frame as a multi pole analysis filter. These coefficients are used as a "crude" two pole analysis filter.
    Type: Grant
    Filed: May 15, 1996
    Date of Patent: August 10, 1999
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John G. Bartkowiak, Mark A. Ireton
  • Patent number: 5933805
    Abstract: A speech system includes a speech encoding system and a speech decoding system. The speech encoding system includes a speech analyzer for identifying each of the speech segments (i.e., phonemes) in the received digitized speech signal. A pitch detector, a duration detector, and an amplitude detector are each coupled to the memory and the analyzer and detect various prosodic parameters of each received speech segment. A speech encoder generates a data signal that includes the speech segment IDs and the values of the corresponding prosodic parameters. The speech decoding system includes a digital data decoder and a speech synthesizer for generating a speech signal based on the segment IDs and prosodic parameter values.
    Type: Grant
    Filed: December 13, 1996
    Date of Patent: August 3, 1999
    Assignee: Intel Corporation
    Inventors: Dale Boss, Sridhar Iyengar, T. Don Dennis
  • Patent number: 5897614
    Abstract: When a speech signal that may include a sibilant consisting of one or more formants is received, frequencies and selectivity factors are determined for each sibilant formant in the speech signal. Then, the frequencies and selectivity factors are compared to a set of empirically derived criteria to classify the sibilant sound.
    Type: Grant
    Filed: December 20, 1996
    Date of Patent: April 27, 1999
    Assignee: International Business Machines Corporation
    Inventor: Frank Albert McKiel, Jr.
  • Patent number: 5886276
    Abstract: An audio signal analyzer and encoder is based on a model that considers audio signals to be composed of deterministic or sinusoidal components, transient components representing the onset of notes or other events in an audio signal, and stochastic components. Deterministic components are represented as a series of overlapping sinusoidal waveforms. To generate the deterministic components, the input signal is divided into a set of frequency bands by a multi-complementary filter bank. The frequency band signals are oversampled so as to suppress cross-band aliasing energy in each band. Each frequency band is analyzed and encoded as a set of spectral components using a windowing time frame whose length is inversely proportional to the frequency range in that band. Low frequency bands are encoded using longer time frames than higher frequency bands.
    Type: Grant
    Filed: January 16, 1998
    Date of Patent: March 23, 1999
    Assignee: The Board of Trustees of the Leland Stanford Junior University
    Inventors: Scott N. Levine, Tony S. Verma
  • Patent number: 5878387
    Abstract: The coding apparatus comprises an adaptive codebook storing excitation signals as vectors, a synthesis filter for forming a synthesis signal, referring to the vectors stored in the adaptive codebook, a similarity computation circuit for computing a similarity between the synthesis signal obtained by the synthesis filter and a target signal, and a coding scheme determining circuit for deciding one coding scheme from a plurality of coding schemes respectively having coding bit rates different from each other, on the basis of the similarity obtained by the similarity computation circuit.
    Type: Grant
    Filed: September 29, 1995
    Date of Patent: March 2, 1999
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Oshikiri, Kimio Miseki, Masami Akamine, Tadashi Amada
  • Patent number: 5876213
    Abstract: A karaoke apparatus is constructed to perform a karaoke accompaniment part and a karaoke harmony part for accompanying a live vocal part. A pickup device collects a singing voice of the live vocal part. A detector device analyzes the collected singing voice to detect a musical register thereof at which the live vocal part is actually performed. A harmony generator device generates a harmony voice of the karaoke harmony part according to the detected musical register so that the karaoke harmony part is made consonant with the live vocal part. A tone generator device generates an instrumental tone of the karaoke accompaniment part in parallel to the karaoke harmony part.
    Type: Grant
    Filed: July 30, 1996
    Date of Patent: March 2, 1999
    Assignee: Yamaha Corporation
    Inventor: Shuichi Matsumoto
  • Patent number: 5870704
    Abstract: Estimating the time-varying spectrum envelope of a time-varying signal facilitates pitch modification and other shifting of signal content in the frequency domain. Local maxima of a spectrum of the signal are identified by applying a masking curve. The masking curve has a peak at the particular maximum and descends away from it. Local maxima falling below the masking curve are eliminated. The slope of the masking curve is varied in accordance with measured parameters of the spectrum to decrease or eliminate spurious peaks. Thereafter, a smoothing procedure may be applied to smooth the spectrum in frequency.
    Type: Grant
    Filed: November 7, 1996
    Date of Patent: February 9, 1999
    Assignee: Creative Technology Ltd.
    Inventor: Jean Laroche
  • Patent number: 5826222
    Abstract: A method of encoding speech by analyzing a digitized speech signal to determine excitation parameters for the digitized speech signal is disclosed. The method includes dividing the digitized speech signal into at least two frequency bands, determining a first preliminary excitation parameter by performing a nonlinear operation on at least one of the frequency band signals to produce a modified frequency band signal and determining the first preliminary excitation parameter using the modified frequency band signal, determining a second preliminary excitation parameter using a method different from the first method, and using the first and second preliminary excitation parameters to determine an excitation parameter for the digitized speech signal. The method is useful in encoding speech. Speech synthesized using the parameters estimated based on the invention generates high quality speech at various bit rates useful for applications such as satellite voice communication.
    Type: Grant
    Filed: April 14, 1997
    Date of Patent: October 20, 1998
    Assignee: Digital Voice Systems, Inc.
    Inventor: Daniel Wayne Griffin
  • Patent number: 5822732
    Abstract: A speech modification or enhancement filter, and apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. From spectral information represented as a multi-dimensional vector, a filter coefficient is determined so as to ensure that formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signal and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. A degree of freedom of design of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals is thus heightened leading to the improvement of intelligibility of said synthesized speech signals. A good formant enhancement effect can be obtained without allowing any perceptible level of distortions to occur within a range of permissible spectral gradients.
    Type: Grant
    Filed: May 2, 1996
    Date of Patent: October 13, 1998
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Hirohisa Tasaki
  • Patent number: 5812966
    Abstract: An improved pitch searching time reducing method for a CELP vocoder using a Line Spectral Pair (LSP) frequency which is capable of significantly reducing the pitch search time by separating the speech signal using a first formant frequency of the line spectral pair of the digital type personal communication system, which includes the steps of computing a decimation interval of a pitch search interval using an LSP frequency of a first formant computed by a formant filter so as to compute a preparatory pitch of a given speech; determining a preparatory pitch to be used when searching a pitch by detecting a peak and a valley within each decimation interval; and computing a preparatory pitch by adapting a first formant frequency of an LSP computed by a formant filter with a decimation rate and performing a pitch search with respect to the obtained preparatory pitch.
    Type: Grant
    Filed: September 19, 1996
    Date of Patent: September 22, 1998
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kyung-Jin Byun, Hah-Yong Yoo, Ki-Chun Han, Jong-Jae Kim, Myung-Jin Bae
  • Patent number: 5806037
    Abstract: A voice synthesis system is fundamentally configured by a sound-source model, which simulates human voices and the like, and a voice-path model which simulates properties of voice paths between vocal cords and lips. The sound-source model is embodied by a code book which stores a plurality of code words, representative of waveform patterns, with respect to each of the voices. Each of the code words is selected by an information index. The voice-path model is embodied by a full-pole synthesis filter whose characteristic curve provides multiple poles, each of which is represented by polar coordinates. There is further provided a pitch filter and an all-pass filter. Data representative of the code word selected is supplied to the pitch filter, in which a first delay time, set by a number of delay-time units, is imparted to the data. Then, the all-pass filter imparts a second delay time, which is smaller than the delay-time unit, to the data in response to pitch-variation information.
    Type: Grant
    Filed: March 29, 1995
    Date of Patent: September 8, 1998
    Assignee: Yamaha Corporation
    Inventor: Akira Sogo
  • Patent number: 5799271
    Abstract: The present invention relates to the method to receive a speech signal, to perform a recognition weighting process on it, to synthesize a synthetic speech signal, to calculate an autocorrelation of the synthetic speech signal whose delay is a predetermined value and an autocorrelation whose delay is 0, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak with skipping over the part of a negative peak by using the results from the dividing operation, and to calculate and output the pitch lag and the pitch filter coefficient by repeating the above process. Thus, real-time implementation of CELP vocoder can be achieved.
    Type: Grant
    Filed: June 24, 1996
    Date of Patent: August 25, 1998
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kyung-Jin Byun, Ha-Young Yoo, Jong-Jae Kim, Ki-Chun Han, Jae-Suk Kim, Myung-Jin Bae
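    Illustrative sketch (not taken from the patent): the search described in the abstract above scores each candidate delay by the squared autocorrelation at that delay divided by the zero-delay autocorrelation, and skips delays whose autocorrelation is negative. The function names and lag range are hypothetical, and the sketch returns only the pitch lag, not the pitch filter coefficient.

```python
def autocorr(x, lag):
    """Autocorrelation of x at the given lag (no normalization)."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def pitch_lag(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes r(lag)^2 / r(0),
    considering only lags with positive autocorrelation.

    Returns None for a silent frame or when no lag has positive correlation.
    """
    r0 = autocorr(x, 0)
    if r0 <= 0.0:
        return None
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = autocorr(x, lag)
        if r <= 0.0:
            continue  # skip the negative-peak part entirely
        score = r * r / r0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

    Skipping non-positive lags avoids the multiply and divide for roughly half of the candidates, which is in the spirit of the real-time goal the abstract states.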
  • Patent number: 5749073
    Abstract: In the first step of a sound morphing process, each sound which forms the basis for the morph is converted into one or more quantitative representations, such as spectrograms. After the representations have been obtained, the temporal axes of the two sounds are matched, so that similar components of the two sounds, such as onsets, harmonic regions and inharmonic regions, are aligned with one another. Other characteristics of the sounds, such as pitch, formant frequencies, or the like, are then matched. Once the energy in each of the sounds has been accounted for and matched to that of the other sound, the two sounds are cross-faded, to produce a representation of a new sound. This representation is then inverted, to generate the morphed sound.
    Type: Grant
    Filed: March 15, 1996
    Date of Patent: May 5, 1998
    Assignee: Interval Research Corporation
    Inventor: Malcolm Slaney
  • Patent number: 5742927
    Abstract: A noise reduction apparatus and method for enhancing a noisy speech signal which applies to the spectral component signals of a time-varying input signal either a spectral subtraction process or a spectral scaling process followed by signal attenuation in regions of the frequency spectrum lying between identified formant regions.
    Type: Grant
    Filed: October 11, 1995
    Date of Patent: April 21, 1998
    Assignee: British Telecommunications Public Limited Company
    Inventors: Philip Mark Crozier, Barry Michael George Cheetham
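    Illustrative sketch (not taken from the patent): the two stages in the abstract above, spectral subtraction followed by attenuation between formant regions, can be pictured on a single frame of magnitude-spectrum bins. The noise estimate, the formant regions given as inclusive bin ranges, and the attenuation factor are all hypothetical inputs; how the patent identifies formant regions is not shown here.

```python
def subtract_and_attenuate(magnitudes, noise, formant_regions, attenuation=0.5):
    """Apply spectral subtraction, then attenuate bins outside formant regions.

    magnitudes, noise: per-bin magnitude lists of equal length.
    formant_regions: list of (first_bin, last_bin) inclusive ranges.
    """
    if len(magnitudes) != len(noise):
        raise ValueError("magnitudes and noise must have the same length")
    output = []
    for k, (mag, n) in enumerate(zip(magnitudes, noise)):
        cleaned = max(mag - n, 0.0)  # spectral subtraction, floored at zero
        in_formant = any(lo <= k <= hi for lo, hi in formant_regions)
        output.append(cleaned if in_formant else cleaned * attenuation)
    return output
```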
  • Patent number: 5737719
    Abstract: A method and apparatus for enhancing the intelligibility of a telephonic speech signal within the available bandwidth and intensity limits of a telephone communication network. The method combines enhancement of both the formant ratio and the consonant/vowel energy ratio to realize a speech signal more intelligible to a hearing impaired user. The invention uses an auditory model of the human ear. A speech signal is put through a filter bank designed to simulate the cochlear filter shapes and filter spacing of a healthy cochlea. The energy output from each of a plurality of filters is computed and used to form an auditory spectrum. The peaks associated with strong first and second formants are identified, and the second formant is enhanced relative to the first formant by attenuating the first formant. Also, consonants in the speech signal are identified as having an energy level below a threshold associated with vowels, but above the threshold associated with silent regions. Consonant regions are amplified.
    Type: Grant
    Filed: December 19, 1995
    Date of Patent: April 7, 1998
    Assignee: U S West, Inc.
    Inventor: Alvin Mark Terry
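    Illustrative sketch (not taken from the patent): the consonant rule in the abstract above, energy below the vowel threshold but above the silence threshold, followed by amplification of consonant regions, might look like this on pre-framed samples. The threshold values, the gain, and the use of mean squared amplitude as the energy measure are assumptions for illustration.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def classify_frame(frame, silence_threshold, vowel_threshold):
    """Label a frame as 'silence', 'consonant', or 'vowel' by its energy."""
    energy = frame_energy(frame)
    if energy <= silence_threshold:
        return "silence"
    if energy < vowel_threshold:
        return "consonant"
    return "vowel"


def boost_consonants(frames, silence_threshold, vowel_threshold, gain=2.0):
    """Amplify only the frames classified as consonants."""
    boosted = []
    for frame in frames:
        label = classify_frame(frame, silence_threshold, vowel_threshold)
        scale = gain if label == "consonant" else 1.0
        boosted.append([s * scale for s in frame])
    return boosted
```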
  • Patent number: 5737718
    Abstract: A method for encoding the information is provided which realizes a high encoding efficiency especially for tonal acoustic signals without lowering the sound quality. An acoustic signal from a terminal is transformed by a transform circuit into spectral signals which are then normalized and quantized by a signal component encoding circuit for encoding from one encoding unit to another. The encoding unit configuration is selected by an encoding unit configuration decision circuit from plural encoding unit configurations depending upon the shape of distribution of the spectral components. An encoding unit of narrow low bandwidth is selected for a tonal signal.
    Type: Grant
    Filed: June 8, 1995
    Date of Patent: April 7, 1998
    Assignee: Sony Corporation
    Inventor: Kyoya Tsutsui