Formant Patents (Class 704/209)
-
Patent number: 6233552
Abstract: An adaptive time-domain post-filtering technique is based on the modified Yule-Walker filter. This technique eliminates the problem of spectral tilt in speech spectrum that can be applied to various speech coders. The new post-filter has a flat frequency response at the formant peaks of speech spectrum. Information is gathered about the relation between poles and formants and then the formants and their bandwidths are estimated. The information about the formants and their bandwidths is then used to design the modified Yule-Walker filter based on a least squares fit in time domain.
Type: Grant
Filed: March 12, 1999
Date of Patent: May 15, 2001
Assignee: Comsat Corporation
Inventors: Azhar Mustapha, Suat Yeldener
-
Patent number: 6208959
Abstract: A digital input symbol is transmitted to a receiver by determining one or more formant frequencies that correspond to the digital input symbol. In one embodiment, a pre-programmed addressable memory is used to map the set of possible digital input symbols onto a set of corresponding speech units, each comprising a superposition of one or more formant frequencies. A signal is then generated having the speech units. The signal is supplied for transmission over a voice channel. This may include supplying the signal to a voice coder prior to transmission. In another aspect of the invention, a forward error correction code (FEC) is determined for the digital input symbol, and the one or more speech units are modified as a function of the forward error correction code. In this way, the FEC may also be transmitted with the encoded input symbol. The modification may affect any of a number of attributes of the speech units, including a volume attribute and a pitch attribute.
Type: Grant
Filed: December 15, 1997
Date of Patent: March 27, 2001
Assignee: Telefonaktibolaget LM Ericsson (publ)
Inventors: Björn Jonsson, Jan Swerup, Krister Törnqvist, Per-Olof Nerbrant
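Illustrative sketch (not taken from the patent): the symbol-to-speech-unit mapping described in the abstract above could look roughly like the following. The lookup table, the formant values, the 20 ms unit duration, and the use of a single parity bit in place of a real forward error correction code are all assumptions made for illustration.

```python
import math

# Hypothetical lookup table: each 2-bit symbol maps to one speech unit,
# given here as a pair of formant frequencies in Hz (illustrative values only).
SPEECH_UNITS = {
    0b00: (730.0, 1090.0),
    0b01: (270.0, 2290.0),
    0b10: (300.0, 870.0),
    0b11: (530.0, 1840.0),
}


def synthesize_unit(symbol, sample_rate=8000, duration_s=0.02, volume=1.0):
    """Return samples for one speech unit: a sum of sinusoids at its formants."""
    formants = SPEECH_UNITS[symbol]
    n_samples = int(round(sample_rate * duration_s))
    scale = volume / len(formants)  # keeps the peak amplitude within +/- volume
    return [
        scale * sum(math.sin(2.0 * math.pi * f * n / sample_rate) for f in formants)
        for n in range(n_samples)
    ]


def encode_symbols(symbols, parity_volume=0.5):
    """Concatenate speech units; a parity bit (stand-in for an FEC) sets the volume."""
    signal = []
    for symbol in symbols:
        parity = bin(symbol).count("1") % 2
        volume = parity_volume if parity else 1.0
        signal.extend(synthesize_unit(symbol, volume=volume))
    return signal
```

Calling `encode_symbols([0, 1, 3])` yields three concatenated 160-sample units; the unit for symbol 1 is generated at the reduced volume because its parity bit is set.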
-
Patent number: 6205423
Abstract: A method of coding speech under background noise conditions or during noise-like speech periods wherein during active voice speech segments an analysis-by-synthesis method is used. However, when a background noise segment or noise-like speech segment is detected, an adaptive code book (pitch prediction) contribution is used as a source of a pseudo-random sequence in order to provide a better representation of the background noise or the noise-like speech. An improved gain quantization scheme is also employed when a background noise segment is detected, wherein energy of the total excitation with quantized gains is matched to the energy of total excitation with unquantized gains.
Type: Grant
Filed: October 19, 1999
Date of Patent: March 20, 2001
Assignee: Conexant Systems, Inc.
Inventors: Huan-Yu Su, Eric Kwok Fung Yuen, Adil Benyassine, Jes Thyssen
-
Patent number: 6151571
Abstract: A method and system for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers is provided. First, a voice signal is received after which a particular feature is extracted from the voice signal. Next, an emotion associated with the voice signal is determined based on the extracted feature. The emotion is screened and feedback is provided only if the emotion is determined to be a negative emotion selected from the group of negative emotions consisting of anger, sadness, and fear. Such determined negative emotion is then outputted to a third party during the conversation.
Type: Grant
Filed: August 31, 1999
Date of Patent: November 21, 2000
Assignee: Andersen Consulting
Inventor: Valery A. Pertrushin
-
Patent number: 6148285
Abstract: The allophonic text-to-speech generator (ATTG) 10 includes a CPU 100. The CPU has a random access memory 102 and a read only memory 104 for holding the operating system, application programs, and data for the CPU 100. A keyboard 110 provides a user with control over the CPU 100. A database 130 holds phonetic transcripts of words. Such databases are well-known in the field of telephone directory assistance. A second database 140 maps allophonic text to parse and pre-recorded allophones. The CPU 100 converts a phonetic transcript of a word into an allophonic text string in accordance with a rules program 120. Then the CPU 100 extracts the audio allophone files of the allophonic string and concatenates the audio files to form the new word in the same voice as the other words formed from the allophones in database 140.
Type: Grant
Filed: October 30, 1998
Date of Patent: November 14, 2000
Assignee: Nortel Networks Corporation
Inventor: Philip John Busardo
-
Patent number: 6047254
Abstract: The present invention comprises an improved vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, an optimal order-two inverse filter is determined. The optimal order-two inverse filter is determined by computing an order-two inverse filter at various locations within the speech frame. For each order-two inverse filter an energy value is calculated which represents the proportion of energy which would remain if the speech signal were filtered with the order-two inverse filter. The order-two inverse filter which minimizes the energy proportion is chosen to be the optimal order-two inverse filter. The optimal order-two inverse filter is then used to filter the samples of the speech frame. An autocorrelation is performed on the filtered signal for a range of time-delay values.
Type: Grant
Filed: October 24, 1997
Date of Patent: April 4, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark A. Ireton, John G. Bartkowiak
-
Patent number: 6041296
Abstract: In a frequently used speech synthesis for voice output an excitation signal is applied to a number of resonators whose frequency and amplitude are adjusted in accordance with the sound to be produced. These parameters for adjusting the resonators may be gained from natural speech signals. Such parameters gained from natural speech signals may also be used for speech recognition, in which these parameter values are compared with comparison values. According to the invention, the parameters, particularly the formant frequencies, are determined by forming the power density spectrum via discrete frequencies from which autocorrelation coefficients are formed for consecutive frequency segments of the power density spectrum from which, in turn, error values are formed, while the sum of the error values is minimized over all segments and the optimum boundary frequencies of the segments are determined for this minimum.
Type: Grant
Filed: April 21, 1997
Date of Patent: March 21, 2000
Assignee: U.S. Philips Corporation
Inventors: Lutz Welling, Hermann Ney
-
Patent number: 6026357
Abstract: A vocoder system and method for estimating the pitch of a speech signal. The speech signal comprises a stream of digitized speech samples. The speech samples are partitioned into frames. For each frame of the speech signal, the following processing steps are performed. First, an optimal order-two inverse filter is determined based on the samples of the speech frame. Second, a dominant formant frequency is calculated from the coefficients of the optimal order-two inverse filter. Third, an autocorrelation function is calculated on the samples of the speech frame. The autocorrelation is performed for a range of time-delay values over which the pitch period and its multiples might be expected to occur. Fourth, the peaks of the autocorrelation function are analyzed incorporating the knowledge of the dominant formant period (which is the inverse of the dominant formant frequency). Normally, the dominant formant is the first formant.
Type: Grant
Filed: October 24, 1997
Date of Patent: February 15, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark A. Ireton, John G. Bartkowiak
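Illustrative sketch (not taken from the patent): the first two steps of the abstract above, an order-two inverse filter and the dominant formant frequency derived from its coefficients, can be approximated with the textbook autocorrelation method of linear prediction. The choice of the autocorrelation method, the single filter per frame, and the function names are assumptions; the patent may determine the "optimal" filter differently.

```python
import math


def autocorr(frame, lag):
    """Autocorrelation of one frame at a non-negative lag."""
    return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))


def dominant_formant_hz(frame, sample_rate):
    """Estimate the dominant formant from an order-two LPC inverse filter.

    Solves the 2x2 normal equations for the predictor
    x[n] ~ a1*x[n-1] + a2*x[n-2], so the inverse filter is
    A(z) = 1 - a1*z^-1 - a2*z^-2. Returns None when the filter has
    no complex pole pair, i.e. there is no resonance to report.
    """
    r0, r1, r2 = (autocorr(frame, k) for k in range(3))
    det = r0 * r0 - r1 * r1
    if det <= 0.0:
        return None  # silent or degenerate frame
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None  # both poles are real
    radius = math.sqrt(-a2)  # pole magnitude
    cos_theta = max(-1.0, min(1.0, a1 / (2.0 * radius)))
    return math.acos(cos_theta) * sample_rate / (2.0 * math.pi)
```

The dominant formant period in samples is then `sample_rate / dominant_formant_hz(frame, sample_rate)`, which a pitch estimator could use to discount autocorrelation peaks that sit at multiples of the formant period rather than the pitch period. On short frames the estimate is biased slightly upward because the autocorrelation method tapers the higher lags.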
-
Patent number: 6012026
Abstract: A transmission system with a transmitter and a receiver. The transmitter has a speech encoder with analysis means, has calculation means, and has control means. The receiver has a speech decoder. Through a transmission medium, the transmitter transmits frames of data to the receiver. The analysis means determine analysis coefficients from a speech signal. From a bitrate setting, the calculation means calculate a fraction of the frames of data to carry more information about the analysis coefficients than a remaining number of the frames of data. The control means control the transmitter to transmit the fraction of the frames of data and to transmit the remaining number of the frames of data. The receiver receives the frames of data. The receiver derives a reconstructed speech signal from the received frames of data.
Type: Grant
Filed: March 31, 1998
Date of Patent: January 4, 2000
Assignee: U.S. Philips Corporation
Inventors: Rakesh Taori, Andreas J. Gerrits
-
Method and system for speech processing with greatly reduced harmonic and intermodulation distortion
Patent number: 6003000
Abstract: A method and system for representing speech with greatly reduced harmonic and intermodulation distortion using a fixed interval scale, known as Tru-Scale. Speech is reproduced in accordance with a frequency matrix which reduces intermodulation interference and harmonic distortion (overtone collision). Enhanced speech quality and reduced noise results from increasing the signal-to-noise ratio in the processed speech signal. The method and system use an Auto-Regressive (AR) modeling technique, using, among other approaches, Linear Predictive Coding (LPC) analysis. In accordance with another aspect of the invention, a Fourier transform-based modeling technique also is used. The application of the system to speech coders also is contemplated.
Type: Grant
Filed: April 29, 1997
Date of Patent: December 14, 1999
Assignee: Meta-C Corporation
Inventors: Michele L. Ozzimo, Matthew C. Cobb, James A. Dinnan
-
Patent number: 5956686
Abstract: An adaptive transform coding/and decoding arrangement is provided to effectively exploit different redundancies between the bands of a spectrum envelope to effect coding at a low bit rate for an audio signal. In the adaptive transform coding method, the spectrum envelope is divided into bands so that different coding methods may be applied to the spectrum envelopes of the individual bands. By applying the present invention to the adaptive transform coding of an audio signal, the spectrum envelope can be adjusted to the coding/and transmission method which is suitable for the time fluctuation in each frequency band, so that the different redundancies for the individual bands can be effectively exploited to realize a highly efficient audio signal coding/and decoding method which has its bits reduced as required for coding the spectrum envelope.
Type: Grant
Filed: June 30, 1995
Date of Patent: September 21, 1999
Assignee: Hitachi, Ltd.
Inventors: Makoto Takashima, Yoshiaki Asakawa, Hidetoshi Sekine
-
Patent number: 5953696
Abstract: Nasalized sound effects during reproduction of low-pitch sounds are suppressed to produce playback sounds of high clarity. Amplitude data is processed with high range formant emphasis of crests and valleys of the envelope of the frequency spectrum on the high frequency range and with deepening of the valley of the frequency spectrum over the entire frequency range, above all, over the low to mid frequency range. Next, the amplitude data is processed for emphasizing the peak values of the formant of the voiced frame in the portion of the speech signal which is rising in magnitude and for unconditionally emphasizing the spectral envelope on the high frequency range. The voiced speech spectrum is generated by synthesizing the cosine wave based upon the emphasized amplitude data.
Type: Grant
Filed: September 23, 1997
Date of Patent: September 14, 1999
Assignee: Sony Corporation
Inventors: Masayuki Nishiguchi, Jun Matsumoto
-
Patent number: 5946650
Abstract: A method and means to estimate the pitch of a speech or acoustic signal within a vocoder begins with the center clipping and low-pass filtering of the speech or acoustic signal to eliminate the formants from the speech or acoustic signal. An error function for each pitch is calculated for each pitch within the speech or acoustic signal. A fast tracking method is used to select the estimated pitch for the pitch or acoustic signal. A final check for the doubling of the pitch will minimize any incorrect estimation of the pitch.
Type: Grant
Filed: June 19, 1997
Date of Patent: August 31, 1999
Assignee: Tritech Microelectronics, Ltd.
Inventor: Ma Wei
-
Patent number: 5937374
Abstract: An improved vocoder system and method for estimating pitch in a speech waveform which pre-filters speech data with improved efficiency and reduced computational requirements. The vocoder system is preferably a low bit rate speech coder which analyzes a plurality of frames of speech data in parallel. Once the LPC filter coefficients and the pitch for a first frame have been calculated, the vocoder then looks ahead to the next frame to estimate the pitch, i.e., to estimate the pitch of the next frame. In the preferred embodiment of the invention, the vocoder filters speech data in a second frame using a plurality of the coefficients from a first frame as a multi pole analysis filter. These coefficients are used as a "crude" two pole analysis filter.
Type: Grant
Filed: May 15, 1996
Date of Patent: August 10, 1999
Assignee: Advanced Micro Devices, Inc.
Inventors: John G. Bartkowiak, Mark A. Ireton
-
Patent number: 5933805
Abstract: A speech system includes a speech encoding system and a speech decoding system. The speech encoding system includes a speech analyzer for identifying each of the speech segments (i.e., phonemes) in the received digitized speech signal. A pitch detector, a duration detector, and an amplitude detector are each coupled to the memory and the analyzer and detect various prosodic parameters of each received speech segment. A speech encoder generates a data signal that includes the speech segment IDs and the values of the corresponding prosodic parameters. The speech decoding system includes a digital data decoder and a speech synthesizer for generating a speech signal based on the segment IDs and prosodic parameter values.
Type: Grant
Filed: December 13, 1996
Date of Patent: August 3, 1999
Assignee: Intel Corporation
Inventors: Dale Boss, Sridhar Iyengar, T. Don Dennis
-
Patent number: 5897614
Abstract: When a speech signal that may include a sibilant consisting of one or more formants is received, frequencies and selectivity factors are determined for each sibilant formant in the speech signal. Then, the frequencies and selectivity factors are compared to a set of empirically derived criteria to classify the sibilant sound.
Type: Grant
Filed: December 20, 1996
Date of Patent: April 27, 1999
Assignee: International Business Machines Corporation
Inventor: Frank Albert McKiel, Jr.
-
Patent number: 5886276
Abstract: An audio signal analyzer and encoder is based on a model that considers audio signals to be composed of deterministic or sinusoidal components, transient components representing the onset of notes or other events in an audio signal, and stochastic components. Deterministic components are represented as a series of overlapping sinusoidal waveforms. To generate the deterministic components, the input signal is divided into a set of frequency bands by a multi-complementary filter bank. The frequency band signals are oversampled so as to suppress cross-band aliasing energy in each band. Each frequency band is analyzed and encoded as a set of spectral components using a windowing time frame whose length is inversely proportional to the frequency range in that band. Low frequency bands are encoded using longer time frames than higher frequency bands.
Type: Grant
Filed: January 16, 1998
Date of Patent: March 23, 1999
Assignee: The Board of Trustees of the Leland Stanford Junior University
Inventors: Scott N. Levine, Tony S. Verma
-
Patent number: 5878387
Abstract: The coding apparatus comprises an adaptive codebook storing excitation signals as vectors, a synthesis filter for forming a synthesis signal, referring to the vectors stored in the adaptive codebook, a similarity computation circuit for computing a similarity between the synthesis signal obtained by the synthesis filter and a target signal, and a coding scheme determining circuit for deciding one coding scheme from a plurality of coding schemes respectively having coding bit rates different from each other, on the basis of the similarity obtained by the similarity computation circuit.
Type: Grant
Filed: September 29, 1995
Date of Patent: March 2, 1999
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masahiro Oshikiri, Kimio Miseki, Masami Akamine, Tadashi Amada
-
Patent number: 5876213
Abstract: A karaoke apparatus is constructed to perform a karaoke accompaniment part and a karaoke harmony part for accompanying a live vocal part. A pickup device collects a singing voice of the live vocal part. A detector device analyzes the collected singing voice to detect a musical register thereof at which the live vocal part is actually performed. A harmony generator device generates a harmony voice of the karaoke harmony part according to the detected musical register so that the karaoke harmony part is made consonant with the live vocal part. A tone generator device generates an instrumental tone of the karaoke accompaniment part in parallel to the karaoke harmony part.
Type: Grant
Filed: July 30, 1996
Date of Patent: March 2, 1999
Assignee: Yamaha Corporation
Inventor: Shuichi Matsumoto
-
Patent number: 5870704
Abstract: Estimating the time-varying spectrum envelope of a time-varying signal facilitates pitch modification and other shifting of signal content in the frequency domain. Local maxima of a spectrum of the signal are identified by applying a masking curve. The masking curve has a peak at the particular maximum and descends away therefrom the local maximum. Local maxima falling below the local maximum are eliminated. The slope of the masking curve is varied in accordance with measured parameters of the spectrum to decrease or eliminate spurious peaks. Thereafter, a smoothing procedure may be applied to smooth the spectrum in frequency.
Type: Grant
Filed: November 7, 1996
Date of Patent: February 9, 1999
Assignee: Creative Technology Ltd.
Inventor: Jean Laroche
-
Patent number: 5826222
Abstract: A method of encoding speech by analyzing a digitized speech signal to determine excitation parameters for the digitized speech signal is disclosed. The method includes dividing the digitized speech signal into at least two frequency bands, determining a first preliminary excitation parameter by performing a nonlinear operation on at least one of the frequency band signals to produce a modified frequency band signal and determining the first preliminary excitation parameter using the modified frequency band signal, determining a second preliminary excitation parameter using a method different from the first method, and using the first and second preliminary excitation parameters to determine an excitation parameter for the digitized speech signal. The method is useful in encoding speech. Speech synthesized using the parameters estimated based on the invention generates high quality speech at various bit rates useful for applications such as satellite voice communication.
Type: Grant
Filed: April 14, 1997
Date of Patent: October 20, 1998
Assignee: Digital Voice Systems, Inc.
Inventor: Daniel Wayne Griffin
-
Patent number: 5822732
Abstract: A speech modification or enhancement filter, and apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. From spectral information represented as a multi-dimensional vector, a filter coefficient is determined so as to ensure that formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signal and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. A degree of freedom of design of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals is thus heightened leading to the improvement of intelligibility of said synthesized speech signals. A good formant enhancement effect can be obtained without allowing any perceptible level of distortions to occur within a range of permissible spectral gradients.
Type: Grant
Filed: May 2, 1996
Date of Patent: October 13, 1998
Assignee: Mitsubishi Denki Kabushiki Kaisha
Inventor: Hirohisa Tasaki
-
Patent number: 5812966
Abstract: An improved pitch searching time reducing method for a CELP vocoder using a Line Spectral Pair (LSP) frequency which is capable of significantly reducing the pitch search time by separating the speech signal using a first formant frequency of the line spectral pair of the digital type personal communication system, which includes the steps of computing a decimation interval of a pitch search interval using an LSP frequency of a first formant computed by a formant filter so as to compute a preparatory pitch of a given speech; determining a preparatory pitch to be used when searching a pitch by detecting a peak and a valley within each decimation interval; and computing a preparatory pitch by adapting a first formant frequency of an LSP computed by a formant filter with a decimation rate and performing a pitch search with respect to the obtained preparatory pitch.
Type: Grant
Filed: September 19, 1996
Date of Patent: September 22, 1998
Assignee: Electronics and Telecommunications Research Institute
Inventors: Kyung-Jin Byun, Hah-Yong Yoo, Ki-Chun Han, Jong-Jae Kim, Myung-Jin Bae
-
Patent number: 5806037
Abstract: A voice synthesis system is fundamentally configured by a sound-source model, which simulates human voices and the like, and a voice-path model which simulates properties of voice paths between vocal cords and lips. The sound-source model is embodied by a code book which stores a plurality of code words, representative of waveform patterns, with respect to each of the voices. Each of the code words is selected by an information index. The voice-path model is embodied by a full-pole synthesis filter whose characteristic curve provides multiple poles, each of which is represented by polar coordinates. There is further provided a pitch filter and an all-pass filter. Data representative of the code word selected is supplied to the pitch filter, in which a first delay time, set by a number of delay-time units, is imparted to the data. Then, the all-pass filter imparts a second delay time, which is smaller than the delay-time unit, to the data in response to pitch-variation information.
Type: Grant
Filed: March 29, 1995
Date of Patent: September 8, 1998
Assignee: Yamaha Corporation
Inventor: Akira Sogo
-
Patent number: 5799271
Abstract: The present invention relates to the method to receive a speech signal, to perform a recognition weighting process on it, to synthesize a synthetic speech signal, to calculate an autocorrelation of the synthetic speech signal whose delay is a predetermined value and an autocorrelation whose delay is 0, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak with skipping over the part of a negative peak by using the results from the dividing operation, and to calculate and output the pitch lag and the pitch filter coefficient by repeating the above process. Thus, real-time implementation of CELP vocoder can be achieved.
Type: Grant
Filed: June 24, 1996
Date of Patent: August 25, 1998
Assignee: Electronics and Telecommunications Research Institute
Inventors: Kyung-Jin Byun, Ha-Young Yoo, Jong-Jae Kim, Ki-Chun Han, Jae-Suk Kim, Myung-Jin Bae
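Illustrative sketch (not taken from the patent): the idea of dividing a squared correlation by an energy term while skipping negative correlation peaks resembles the conventional CELP pitch search, sketched below. Using the plain delayed signal in place of the synthetic speech signal, omitting the weighting step, and all function and parameter names are assumptions made for illustration.

```python
def pitch_search(signal, start, frame_len, min_lag, max_lag):
    """Return (best_lag, gain) for the frame signal[start:start + frame_len].

    For each candidate lag, the frame is correlated with the same signal
    delayed by that lag. Lags with non-positive correlation are skipped,
    because squaring the correlation would otherwise hide its sign.
    """
    if start < max_lag:
        raise ValueError("need at least max_lag samples of history before start")
    frame = signal[start:start + frame_len]
    best_lag, best_score, best_gain = None, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        delayed = signal[start - lag:start - lag + frame_len]
        corr = sum(x * y for x, y in zip(frame, delayed))
        if corr <= 0.0:
            continue  # negative peak: skip before doing any further work
        energy = sum(y * y for y in delayed)
        if energy == 0.0:
            continue
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_score, best_gain = lag, score, corr / energy
    return best_lag, best_gain
```

Skipping the non-positive correlations early avoids the energy computation and division for roughly half of the candidate lags on voiced speech, which is one plausible source of the search-time saving the abstract refers to. The returned gain plays the role of a single-tap pitch filter coefficient.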
-
Patent number: 5749073
Abstract: In the first step of a sound morphing process, each sound which forms the basis for the morph is converted into one or more quantitative representations, such as spectrograms. After the representations have been obtained, the temporal axes of the two sounds are matched, so that similar components of the two sounds, such as onsets, harmonic regions and inharmonic regions, are aligned with one another. Other characteristics of the sounds, such as pitch, formant frequencies, or the like, are then matched. Once the energy in each of the sounds has been accounted for and matched to that of the other sound, the two sounds are cross-faded, to produce a representation of a new sound. This representation is then inverted, to generate the morphed sound.
Type: Grant
Filed: March 15, 1996
Date of Patent: May 5, 1998
Assignee: Interval Research Corporation
Inventor: Malcolm Slaney
-
Patent number: 5742927
Abstract: A noise reduction apparatus and method for enhancing noisy speech signal which applies to the spectral component signals of a time-varying input signal either a spectral subtraction process or a spectral scaling process followed by signal attenuation in regions of the frequency spectrum lying between identified formant regions.
Type: Grant
Filed: October 11, 1995
Date of Patent: April 21, 1998
Assignee: British Telecommunications Public Limited Company
Inventors: Philip Mark Crozier, Barry Michael George Cheetham
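Illustrative sketch (not taken from the patent): magnitude-domain spectral subtraction followed by attenuation of the bins that lie between formant regions might be written as below. The spectral floor, the fixed valley gain, and the assumption that formant regions arrive as precomputed bin ranges are all illustrative choices; the abstract does not say how the regions are identified.

```python
def enhance_magnitudes(noisy_mag, noise_mag, formant_regions,
                       floor=0.05, valley_gain=0.5):
    """Spectral subtraction followed by attenuation between formant regions.

    noisy_mag and noise_mag are per-bin magnitude spectra of equal length.
    formant_regions is a list of (first_bin, last_bin) pairs, inclusive.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    out = []
    for k, (y, n) in enumerate(zip(noisy_mag, noise_mag)):
        # Subtract the noise estimate, but never drop below a small floor.
        cleaned = max(y - n, floor * y)
        in_formant = any(lo <= k <= hi for lo, hi in formant_regions)
        out.append(cleaned if in_formant else valley_gain * cleaned)
    return out
```

The enhanced magnitudes would normally be recombined with the noisy phase and passed through an inverse transform; that step is omitted here.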
-
Patent number: 5737719
Abstract: A method and apparatus for enhancing the intelligibility of a telephonic speech signal within the available bandwidth and intensity limits of a telephone communication network. The method combines enhancement of both the formant ratio and the consonant/vowel energy ratio to realize a speech signal more intelligible to a hearing impaired user. The invention uses an auditory model of the human ear. A speech signal is put through a filter bank designed to simulate the cochlear filter shapes and filter spacing of a healthy cochlea. The energy output from each of a plurality of filters is computed and used to form an auditory spectrum. The peaks associated with strong first and second formants are identified, and the second formant is enhanced relative to the first formant by attenuating the first formant. Also, consonants in the speech signal are identified as having an energy level below a threshold associated with vowels, but above the threshold associated with silent regions. Consonant regions are amplified.
Type: Grant
Filed: December 19, 1995
Date of Patent: April 7, 1998
Assignee: U S West, Inc.
Inventor: Alvin Mark Terry
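Illustrative sketch (not taken from the patent): the energy-based consonant rule at the end of the abstract above can be expressed as a two-threshold frame classifier. The threshold values, the fixed gain, the frame-wise processing, and the function names are assumptions; the auditory filter bank and formant-ratio steps are not modeled.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def boost_consonants(frames, silence_threshold, vowel_threshold, gain=2.0):
    """Label each frame by energy and amplify the frames labelled as consonants.

    A frame counts as a consonant when its energy lies above the silence
    threshold but below the vowel threshold.
    """
    labels, output = [], []
    for frame in frames:
        energy = frame_energy(frame)
        if energy <= silence_threshold:
            label = "silence"
        elif energy < vowel_threshold:
            label = "consonant"
        else:
            label = "vowel"
        labels.append(label)
        if label == "consonant":
            output.append([gain * s for s in frame])
        else:
            output.append(list(frame))
    return labels, output
```

Raising only the consonant frames increases the consonant/vowel energy ratio without raising the peak level set by the vowels, which matters when the channel's intensity limit is fixed.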
-
Patent number: 5737718
Abstract: A method for encoding the information is provided which realizes a high encoding efficiency especially for tonal acoustic signals without lowering the sound quality. An acoustic signal from a terminal is transformed by a transform circuit into spectral signals which are then normalized and quantized by a signal component encoding circuit for encoding from one encoding unit to another. The encoding unit configuration is selected by an encoding unit configuration decision circuit from plural encoding unit configurations depending upon the shape of distribution of the spectral components. An encoding unit of narrow low bandwidth is selected for a tonal signal.
Type: Grant
Filed: June 8, 1995
Date of Patent: April 7, 1998
Assignee: Sony Corporation
Inventor: Kyoya Tsutsui