Frequency Element Patents (Class 704/268)
  • Patent number: 6317713
    Abstract: Sound generating parameters are used for outputting fundamental frequency and a command regarding prosody, and a sound source generator. The sound generation device further includes use of an accent command and a descent command for calculating fundamental frequency and incorporates a rhythm command, which is representable by a sine wave. The device also uses character string analysis for analyzing a character string and generating a command concerning phoneme and prosody, a calculating element for outputting fundamental frequency as sound generation parameters, which depends on prosody, a sound source generator, and an articulator that depends on a phoneme command.
    Type: Grant
    Filed: January 6, 1999
    Date of Patent: November 13, 2001
    Assignee: Arcadia, Inc.
    Inventor: Seiichi Tenpaku
  • Patent number: 6311158
    Abstract: Techniques for synthesizing a time-domain signal. The time-domain signal is partitioned into a number of time-domain frames and a waveform is generated for each time-domain frame. Each waveform includes one or more sinusoids. The waveform is generated by selecting a sinusoid for synthesis and computing a set of parameter values (e.g. the start and end amplitude, frequency, and phase values) for the selected sinusoid. A template is determined for the selected sinusoid based on the computed parameter values and a selected window function. The frequency-domain template is such that the amplitude of the selected sinusoid in the time domain matches, at a time-domain frame boundary, the amplitude of a corresponding sinusoid in an adjacent time-domain frame. The template is added to a frequency-domain frame. The process is repeated for each sinusoid in the waveform. After all sinusoids have been processed, the frequency-domain frame is transformed to a time-domain frame.
    Type: Grant
    Filed: March 16, 1999
    Date of Patent: October 30, 2001
    Assignee: Creative Technology Ltd.
    Inventor: Jean Laroche
  • Patent number: 6308156
    Abstract: A digital speech synthesis process in which utterances in a language are recorded, and the recorded utterances are divided into speech segments which are stored so as to allow their allocation to specific phonemes. A text which is to be output as speech is converted to a phoneme chain and the stored segments are output in a sequence defined by the phoneme chain. An analysis of the text to be output as speech is carried out and thus provides information which completes the phoneme chain and modifies the timing sequence signal for the speech segments which are to be strung together for output as speech.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: October 23, 2001
    Assignee: G Data Software GmbH
    Inventors: William Barry, Ralf Benzmüller, Andreas Luning
  • Patent number: 6301556
    Abstract: An apparatus and method for reducing sparseness in a coded speech signal. Sparse codebook values are generated from a codebook. An anti-sparseness operation is performed on the sparse codebook values to produce output codebook values having a greater density of non-zero values than the sparse codebook values. The output codebook values are processed by a speech processor to generate an encoded speech signal during an encoding operation or a decoded speech signal during a decoding operation.
    Type: Grant
    Filed: December 22, 1999
    Date of Patent: October 9, 2001
    Assignee: Telefonaktiebolaget L M. Ericsson (publ)
    Inventors: Roar Hagen, Björn Stig Erik Johansson, Erik Ekudden, Willem Baastian Kleijn
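
The anti-sparseness idea in the abstract above can be illustrated with a minimal sketch. It assumes the operation is a circular convolution of the sparse codebook vector with a short dispersing kernel; the abstract does not state which operation the patent uses, and the kernel values and function names here are hypothetical.

```python
def circular_convolve(sparse, kernel):
    """Circularly convolve a sparse vector with a short dispersing kernel."""
    n = len(sparse)
    out = [0.0] * n
    for i, s in enumerate(sparse):
        if s == 0.0:
            continue  # zero samples contribute nothing to the output
        for j, h in enumerate(kernel):
            out[(i + j) % n] += s * h
    return out


def density(values):
    """Fraction of samples that are non-zero."""
    return sum(1 for v in values if v != 0.0) / len(values)


# A sparse codebook vector with two pulses (illustrative values only).
sparse = [0.0] * 16
sparse[2] = 1.0
sparse[9] = -1.0

kernel = [0.5, 0.4, -0.3, 0.2]  # hypothetical dispersion kernel
dense = circular_convolve(sparse, kernel)
```

Each pulse is spread over as many samples as the kernel is long, so the output has a greater density of non-zero values than the input, which is the property the abstract describes.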
  • Patent number: 6292775
    Abstract: A speech processing system (10) incorporates an analogue to digital converter (16) to digitize input speech signals for Fourier transformation to produce short-term spectral cross-sections. These cross-sections are compared with one hundred and fifty reference patterns in a store (34), the patterns having respective stored sets of formant frequencies assigned thereto by a human expert. Six stored patterns most closely matching each input cross-section are selected for further processing by dynamic programming, which indicates the pattern which is a best match to the input cross-section by using frequency-scale warping to achieve alignment. The stored formant frequencies of the best matching pattern are modified by the frequency warping, and the results are used as formant frequency estimates for the input cross-section. The frequencies are further refined on the basis of the shape of the input cross-section near to the chosen formants.
    Type: Grant
    Filed: February 18, 1999
    Date of Patent: September 18, 2001
    Assignee: The Secretary of State for Defence in Her Britannic Majesty's Government of the United Kingdom of Great Britain and Northern Ireland
    Inventor: John N Holmes
  • Patent number: 6289310
    Abstract: An apparatus and method for screening an individual's ability to process acoustic events is provided. The invention provides sequences (or trials) of acoustically processed target and distractor phonemes to a subject for identification. The acoustic processing includes amplitude emphasis of selected frequency envelopes, stretching (in the time domain) of selected portions of phonemes, and phase adjustment of selected portions of phonemes relative to a base frequency. After a number of trials, the method of the present invention develops a profile for an individual that indicates whether the individual's ability to process acoustic events is within a normal range, and if not, what processing can provide the individual with optimal hearing. The individual's profile can then be used by a listening or processing device to particularly emphasize, stretch, or otherwise manipulate an audio stream to provide the individual with an optimal chance of distinguishing between similar acoustic events.
    Type: Grant
    Filed: October 7, 1998
    Date of Patent: September 11, 2001
    Assignee: Scientific Learning Corp.
    Inventors: Steven L. Miller, Bret E. Peterson, Athanassios Protopapas
  • Patent number: 6289311
    Abstract: A method and apparatus for sound synthesizing and sound band expanding of a narrow band input signal uses wide-band voiced and unvoiced sound code books and also uses narrow-band voiced and unvoiced sound code books. Coded input sound parameters are decoded and quantized using the narrow-band voiced and unvoiced sound code books and are then de-quantized using the wide-band voiced and unvoiced sound code books. The sound is synthesized based on the de-quantized data and a so-called innovation-related parameter formed by a zero-filling circuit filling zeros between samples of the framed input signal, so that the result is an upsampled aliased wide-band signal used with the de-quantized data to synthesize the sound.
    Type: Grant
    Filed: October 20, 1998
    Date of Patent: September 11, 2001
    Assignee: Sony Corporation
    Inventors: Shiro Omori, Masayuki Nishiguchi
  • Patent number: 6289309
    Abstract: A spectrum-based speech enhancement system estimates and tracks the noise spectrum of a mixed speech and noise signal. The system frames and windows a digitized signal and applies the frames to a fast Fourier transform processor to generate discrete Fourier transformed (DFT) signals representing the speech plus noise signal. The system calculates the power spectrum of each frame. The speech enhancement system employs a leaky integrator that is responsive to identified noise-only components of the signal. The leaky integrator has an adaptive time-constant which compensates for non-stationary environmental noise. In addition, the speech enhancement system identifies noise-only intervals by using a technique that monitors the Teager energy of the signal. The transition between noise-only signals and speech plus noise signals is softened by being made non-binary. Once the noise spectrum has been estimated, it is used to generate gain factors that multiply the DFT signals to produce noise-reduced DFT signals.
    Type: Grant
    Filed: December 15, 1999
    Date of Patent: September 11, 2001
    Assignee: Sarnoff Corporation
    Inventor: Albert deVries
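
Three building blocks named in the abstract above can be sketched briefly: the discrete Teager energy operator, a leaky-integrator noise update, and per-bin gain factors. This is an illustration under stated assumptions, not the patented method: the smoothing factor `alpha` is fixed per call rather than adaptive, the gain rule is a simple subtraction-style choice that the abstract does not specify, and all function names are hypothetical.

```python
def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1] (interior samples)."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def update_noise(noise_psd, frame_psd, alpha):
    """Leaky integrator: blend the previous noise estimate with a noise-only frame.

    alpha close to 1.0 means slow tracking; an adaptive scheme would vary it.
    """
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def gains(frame_psd, noise_psd, floor=0.1):
    """Per-bin gain 1 - noise/power, clamped to a floor (assumed rule)."""
    out = []
    for p, n in zip(frame_psd, noise_psd):
        g = 1.0 - n / p if p > 0.0 else floor
        out.append(max(g, floor))
    return out


psi = teager_energy([1.0, 2.0, 3.0, 4.0])
noise = update_noise([1.0, 2.0], [3.0, 2.0], alpha=0.5)
g = gains([4.0, 1.0], [1.0, 2.0])
```

In a full system the Teager energy would drive the noise-only decision, and the resulting gains would multiply the DFT bins of each frame before the inverse transform.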
  • Publication number: 20010018655
    Abstract: A voicing probability determination method is provided for estimating a percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum. Initially, a synthetic speech spectrum is generated based on the assumption that speech is purely voiced. The original and synthetic speech spectra are then divided into a plurality of bands. The synthetic and original speech spectra are compared harmonic by harmonic, and a voicing determination is made based on this comparison. In one embodiment, each harmonic of the original speech spectrum is assigned a voicing decision as either completely voiced or unvoiced by comparing the difference with an adaptive threshold. If the difference for each harmonic is less than the adaptive threshold, the corresponding harmonic is declared as voiced; otherwise the harmonic is declared as unvoiced. The voicing probability for each band is then computed based on the amount of energy in the voiced harmonics in that decision band.
    Type: Application
    Filed: February 28, 2001
    Publication date: August 30, 2001
    Inventor: Suat Yeldener
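
The per-band computation described in the abstract above can be sketched as follows. The sketch assumes a normalised squared error as the "difference" between original and synthetic harmonics and a fixed threshold in place of the adaptive one; both are illustrative choices, and the function name is hypothetical.

```python
def band_voicing_probability(original, synthetic, threshold):
    """Ratio of energy in voiced harmonics to total energy within one band.

    original, synthetic: harmonic magnitudes of the original spectrum and of
    the all-voiced synthetic spectrum for the same band.
    """
    total = 0.0
    voiced = 0.0
    for o, s in zip(original, synthetic):
        energy = o * o
        total += energy
        # Normalised squared error between the two harmonics (assumed measure).
        error = (o - s) ** 2 / energy if energy > 0.0 else float("inf")
        if error < threshold:
            voiced += energy  # harmonic declared voiced
    return voiced / total if total > 0.0 else 0.0


# First harmonic matches the synthetic spectrum, second does not.
pv = band_voicing_probability([2.0, 1.0], [2.0, 0.0], threshold=0.2)
```

Here the first harmonic carries 4 of the band's 5 energy units and is declared voiced, so the band's voicing probability is 0.8.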
  • Patent number: 6278974
    Abstract: The present invention is related to a speech synthesizer which includes a sampled signal storing device storing therein a sampled signal and outputting the sampled signal in response to an input signal, and a speech signal synthesizing circuit electrically connected to the sampled signal storing device, receiving an operation signal, having the sampled signal outputted by the sampled signal storing device be repeatedly operated in response to the operation signal, and then outputting a speech synthesized signal, wherein a frequency of the operation signal is higher than that of the input signal to allow the sampled signal to be repeatedly operated during a single cycle of the input signal. The present invention performs the operation a plurality of times for each entry of data in the storing device so that the synthesizing performance of the present synthesizer can be improved without increasing the storage amount of the sampled signals.
    Type: Grant
    Filed: November 21, 1997
    Date of Patent: August 21, 2001
    Assignee: Winbond Electronics Corporation
    Inventor: James J. Y. Lin
  • Patent number: 6253182
    Abstract: The present invention provides a method for synthesizing speech by modifying the prosody of individual components of a training speech signal and then combining the modified speech segments. The method includes selecting an input speech segment and identifying an output prosody. The prosody of the input speech segment is then changed by independently changing the prosody of a voiced component and an unvoiced component of the input speech signal. These changes produce an output voiced component and an output unvoiced component that are combined to produce an output speech segment. The output speech segment is then combined with other speech segments to form synthesized speech.
    Type: Grant
    Filed: November 24, 1998
    Date of Patent: June 26, 2001
    Assignee: Microsoft Corporation
    Inventor: Alejandro Acero
  • Patent number: 6240388
    Abstract: An audio data coding device includes a frequency/time converter circuit for decoding audio data in the form of a decoded frequency-region signal made by time/frequency conversion, and an adjusting circuit for adjusting the frequency-region signal prior to frequency/time conversion by the frequency/time converter circuit to enhance specific frequency components contained in the signal. Since adjustment is made in the frequency region, the processing is easily performed. An audio data coding and decoding system includes a coding device for converting an audio signal into a frequency-region signal by time/frequency conversion and for coding the signal by quantization, and a decoding device for decoding the audio data coded by the coding device.
    Type: Grant
    Filed: July 8, 1997
    Date of Patent: May 29, 2001
    Inventor: Hiroyuki Fukuchi
  • Patent number: 6233558
    Abstract: A method and apparatus are provided for use in tracing transmission lines from a first location to a second location. The apparatus comprises a multi-channel speech synthesizer, a digital to analog convertor and a demultiplexer. The digital synthesized speech which can take any form is applied to a digital to analog convertor where it is transformed into an analog stream of distinct speech segments. In accordance with the corresponding method, the distinct speech segments are selectively connected by the multiplexer to one or more of the transmission lines to be traced. At the second location, the distinct speech segments are detected. As each distinct speech segment is associated with a particular transmission line, it is possible to distinguish between the many transmission lines which are being simultaneously traced.
    Type: Grant
    Filed: February 11, 1998
    Date of Patent: May 15, 2001
    Assignee: Tempo Research Corporation
    Inventor: Paul L. Whalley
  • Patent number: 6208958
    Abstract: A pitch determination apparatus and method using spectro-temporal autocorrelation to prevent pitch determination errors are provided.
    Type: Grant
    Filed: January 7, 1999
    Date of Patent: March 27, 2001
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong-duk Cho, Moo-Young Kim
  • Patent number: 6199039
    Abstract: An MPEG-II audio decoder with a synthesis subband filter includes a fast IMDCT (Inverse Modified Discrete Cosine Transform) module and an IPQMF (Inverse Pseudo Quadrature Mirror Filter) module. The fast IMDCT module involves a butterfly stage of input subband samples which requires only about ¼ the amount of multiplier-accumulate computation of the ISO suggested method. The IPQMF module involves an efficient memory configuration which requires only half the size of the standard synthesis subband filter bank.
    Type: Grant
    Filed: August 3, 1998
    Date of Patent: March 6, 2001
    Assignee: National Science Council
    Inventors: Liang-Gee Chen, Tsung-Han Tsai, Yuan-Chen Liu
  • Patent number: 6182042
    Abstract: A system and method for modifying a subportion of information contained in an audio signal, such as magnitude information, without substantially affecting the remaining information contained therein, such as phase information. An incoming audio signal is segmented into a sequence of overlapping windowed DFT representations, during an analysis step, and during a synthesis step the DFT representations are converted back to a time domain signal. Each of the DFT representations consists of a plurality of frequency components obtained during a period of time. Each of the frequency components is associated with a unique increment of the period. Subsequent to the analysis step, but before the synthesis step, the frequency components of the DFT representations are re-mapped so as to have a differing temporal relationship with respect to the increments of the period of time.
    Type: Grant
    Filed: July 7, 1998
    Date of Patent: January 30, 2001
    Assignee: Creative Technology Ltd.
    Inventor: Alan Peevers
  • Patent number: 6173263
    Abstract: A method and system are provided for performing concatenative speech synthesis using half-phonemes to allow the full utilization of both diphone synthesis and unit selection techniques in order to provide synthesis quality that can combine intelligibility achieved using diphone synthesis with a naturalness achieved using unit selection. The concatenative speech synthesis system may include a speech synthesizer that may comprise a linguistic processor, a unit selector and a speech processor. A speech training module may input trained speech off-line to the unit selector. The concatenative speech synthesis may normalize the input text in order to distinguish sentence boundaries from abbreviations. The normalized text is then grammatically analyzed to identify the syntactic structure of each constituent phrase. Orthographic characters used in normal text are mapped into appropriate strings of phonetic segments representing units of sound and speech.
    Type: Grant
    Filed: August 31, 1998
    Date of Patent: January 9, 2001
    Assignee: AT&T Corp.
    Inventor: Alistair Conkie
  • Patent number: 6163769
    Abstract: A text-to-speech system includes a storage device for storing a clustered set of context-dependent phoneme-based units of a target speaker. In one embodiment, decision trees are used wherein each decision tree based context-dependent phoneme-based unit is arranged based on context of at least one immediately preceding and succeeding phoneme. At least one of the context-dependent phoneme-based units represents other non-stored context-dependent phoneme units of similar sound due to similar contexts. A text analyzer obtains a string of phonetic symbols representative of text to be converted to speech. A concatenation module selects stored decision tree based context-dependent phoneme-based units from the set of decision tree based context-dependent phoneme-based units based on the context of the phonetic symbols and synthesizes the selected phoneme-based units to generate speech corresponding to the text.
    Type: Grant
    Filed: October 2, 1997
    Date of Patent: December 19, 2000
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Hsiao-Wuen Hon, Xuedong D. Huang
  • Patent number: 6144939
    Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain to the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
    Type: Grant
    Filed: November 25, 1998
    Date of Patent: November 7, 2000
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski
  • Patent number: 6125346
    Abstract: A speech synthesizing system using a redundancy-reduced waveform database is disclosed. Each waveform of a sample set of voice segments necessary and sufficient for speech synthesis is divided into pitch waveforms, which are classified into groups of pitch waveforms closely similar to one another. One of the pitch waveforms of each group is selected as a representative of the group and is given a pitch waveform ID. The waveform database at least comprises a pitch waveform pointer table each record of which comprises a voice segment ID of each of the voice segments and pitch waveform IDs the pitch waveforms of which, when combined in the listed order, constitute a waveform identified by the voice segment ID and a pitch waveform table of pitch waveform IDs and corresponding pitch waveforms. This enables the waveform database size to be reduced.
    Type: Grant
    Filed: December 5, 1997
    Date of Patent: September 26, 2000
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Hirofumi Nishimura, Toshimitsu Minowa, Yasuhiko Arai
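
The two-table layout described in the abstract above (a pitch waveform pointer table plus a pitch waveform table) can be sketched as a small data structure. The grouping rule here is a greedy first-fit match within a tolerance, which is an assumption made for illustration; the abstract only says that closely similar pitch waveforms share one representative. Names and sample values are hypothetical.

```python
def build_database(segments, tolerance):
    """Build a redundancy-reduced waveform database.

    segments: dict mapping voice segment ID -> list of non-empty pitch
    waveforms (lists of floats), in playback order.
    Returns (pointer_table, waveform_table).
    """
    waveform_table = []  # index = pitch waveform ID, value = representative
    pointer_table = {}   # voice segment ID -> ordered pitch waveform IDs
    for seg_id, waveforms in segments.items():
        ids = []
        for w in waveforms:
            for wid, rep in enumerate(waveform_table):
                same_len = len(rep) == len(w)
                if same_len and max(abs(a - b) for a, b in zip(rep, w)) <= tolerance:
                    ids.append(wid)  # reuse an existing representative
                    break
            else:
                waveform_table.append(w)  # first of a new group
                ids.append(len(waveform_table) - 1)
        pointer_table[seg_id] = ids
    return pointer_table, waveform_table


segments = {
    "ka": [[0.0, 1.0], [0.0, 1.01]],
    "ki": [[0.0, 1.0], [1.0, 0.0]],
}
pointers, table = build_database(segments, tolerance=0.05)
```

Four pitch waveforms are stored as two representatives, and each voice segment is reconstructed by concatenating the representatives listed under its ID.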
  • Patent number: 6125344
    Abstract: The present invention relates to an improved pitch modification method by glottal closure interval extrapolation. It is an object of the present invention to modify pitches of speech signals by the glottal closure interval extrapolation and to maintain quality of the modified speech, when concatenating original speech segments to synthesize speech. An input speech signal is converted into a digital speech signal. A glottal closure interval is detected in the digital speech signal so as to estimate vocal tract parameters by using pitch synchronous analysis. Vocal tract characteristic signals of the glottal closure interval and glottal characteristic signals of a glottal open interval are separated from each other according to the detected glottal closure interval. The separated vocal tract characteristic signals are extrapolated and reduced to a desired pitch length by the estimated vocal tract parameter.
    Type: Grant
    Filed: August 21, 1998
    Date of Patent: September 26, 2000
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Dong Gyu Kang, Jung Chul Lee, Sang Hun Kim, Jun Park
  • Patent number: 6101470
    Abstract: A method for automatically generating pitch contours in a text to speech (TtS) system, the system converting input text into an output acoustic signal simulating natural speech, the method comprising the steps of: storing a plurality of associated stress and pitch level pairs, each of the plurality of pairs including a lexical stress level and a pitch level; calculating lexical stress levels of the input text; comparing the stress levels of the input text to the stored stress levels of the plurality of associated stress and pitch level pairs to find the stored stress levels closest to the stress levels of the input text; and copying the pitch levels associated with the closest stored stress levels of the stress and pitch level pairs to generate the pitch contours of the input text.
    Type: Grant
    Filed: May 26, 1998
    Date of Patent: August 8, 2000
    Assignee: International Business Machines Corporation
    Inventors: Ellen M. Eide, Robert E. Donovan
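
The lookup step in the abstract above (find the stored stress level closest to each input stress level, then copy its pitch level) can be sketched as a nearest-neighbour search. The numeric stress scale, the pitch values, and the function name are hypothetical; the patent may compare whole stress sequences rather than individual levels.

```python
def nearest_pitch_levels(stored_pairs, stress_levels):
    """Map each input stress level to the pitch level of the closest stored pair.

    stored_pairs: list of (lexical_stress_level, pitch_level) tuples.
    stress_levels: stress levels calculated for the input text.
    """
    contour = []
    for s in stress_levels:
        closest = min(stored_pairs, key=lambda pair: abs(pair[0] - s))
        contour.append(closest[1])  # copy the associated pitch level
    return contour


stored = [(0, 90.0), (1, 110.0), (2, 140.0)]  # illustrative values only
contour = nearest_pitch_levels(stored, [2, 0, 1, 1.6])
```

The resulting list is the generated pitch contour, one pitch level per input stress level.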
  • Patent number: 6085157
    Abstract: The present invention can obtain a clear velocity converted sound in a sound signal which is recorded in recording media, without changing an interval of the sound signal. An input sound signal (1a) is transmitted from a sound signal storage memory (1) to a voiced sound/unvoiced sound deciding portion (2). In the voiced sound/unvoiced sound deciding portion (2), it is decided whether the input sound signal (1a) is a voiced sound or an unvoiced sound. A decision result is transmitted to a speech velocity converter (4) as a switching flag (1b). The speech velocity converter (4) outputs the unvoiced sound as it is. A predetermined windowing and adding process is performed on the voiced sound, and a time compression is carried out so as to output the voiced sound. An output signal (1e) from the speech velocity converter (4) is output as a frame output signal (1g) through an output sound signal frame buffer (8). In another mode, a switch and an adder may be used.
    Type: Grant
    Filed: September 12, 1997
    Date of Patent: July 4, 2000
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Hiroaki Takeda
  • Patent number: 6081781
    Abstract: Data in the same range of the fundamental frequency F_0 as speech segments are used as learning data to prepare a reference codebook CB_M for a spectrum envelope. The same learning data for a higher range than F_0 and the same learning data for a lower range are subject to a linear stretch matching with respect to the learning data for the range F_0. For each vector code in the reference codebook CB_M, the spectrum envelope is clustered to prepare a high range codebook CB_H and a low range codebook CB_L. The spectrum envelope of input speech segments is fuzzy vector quantized (S402) with the reference codebook, and depending on the synthesized F_0, a high, middle or low codebook is selected. The selected codebook is used to decode the fuzzy vector quantized code, and the decoded output is subject to the inverse FFT. Alternatively, codebooks CM_MH and CB_ML each comprising differential vectors for corresponding code vectors between CB_M and CB_H and between CB.
    Type: Grant
    Filed: September 9, 1997
    Date of Patent: June 27, 2000
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Kimihito Tanaka, Masanobu Abe
  • Patent number: 6073094
    Abstract: A communication system includes a transmitter for transmitting messages to a plurality of receiving devices of the communication system, and a processing system. The processing system is adapted to convert a caller's voice message to a sequence of phonemes whereby the caller's voice message is intended for a receiving device. To accomplish the conversion, steps of Fourier transform, spectral subdivision, envelope filtering, autocorrelation function determination of each subdivision, and voiceness determination for each subdivision are performed. The processing system is further adapted to generate a sequence of phoneme indexes and voice features corresponding to the sequence of phonemes, and to cause the transmitter to transmit the sequence of phoneme indexes to the receiving device for generating a voice signal representative of the caller's voice message. The voice features can include spectral features, average energy, duration, and pitch to improve the quality of the voice signal.
    Type: Grant
    Filed: June 2, 1998
    Date of Patent: June 6, 2000
    Assignee: Motorola
    Inventors: Lu Chang, Jian-Cheng Huang, Robert J. Schwendeman
  • Patent number: 6067519
    Abstract: Portions of speech waveform are joined by forming extrapolations at the end of one and the beginning of the next portion to create an overlap region with synchronous pitchmarks, and then forming a weighted sum across the overlap to provide a smooth transition.
    Type: Grant
    Filed: November 7, 1996
    Date of Patent: May 23, 2000
    Assignee: British Telecommunications public limited company
    Inventor: Andrew Lowry
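
The final step in the abstract above, a weighted sum across the overlap region, can be sketched as a linear cross-fade. The sketch assumes the two overlap regions have already been extrapolated and aligned on their pitchmarks, and it uses linear weights; the abstract does not say which weighting the patent applies. The function name is hypothetical.

```python
def crossfade(tail, head):
    """Weighted sum of two aligned, equal-length overlap regions.

    tail: extrapolated end of the first waveform portion.
    head: extrapolated beginning of the next portion.
    The weight on `head` rises linearly while the weight on `tail` falls.
    """
    n = len(tail)
    if n != len(head):
        raise ValueError("overlap regions must have the same length")
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # strictly between 0 and 1 inside the overlap
        out.append((1.0 - w) * tail[i] + w * head[i])
    return out


joined = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

Because the weights always sum to one, a signal that is identical in both regions passes through the overlap unchanged.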
  • Patent number: 6064955
    Abstract: A MBE synthesizer (2200) for generating speech from information received by a receiver (114) includes a voiced signal generator (2280) for generating voiced signal components in the time domain using an IDFT in a pitch wave generator (2210) and a pitch wave resampler (2232) and an unvoiced signal generator (2290) for generating unvoiced signal components in the time domain. The MBE synthesizer also includes a voicing processor (2218) responsive to band voicing flags within the excitation information for controlling selection of a voiced spectral component or an unvoiced spectral component from a harmonic amplitude spectrum.
    Type: Grant
    Filed: April 13, 1998
    Date of Patent: May 16, 2000
    Assignee: Motorola
    Inventors: Jian-Cheng Huang, Kenneth D. Finlon, Floyd D. Simpson
  • Patent number: 6029125
    Abstract: Sparseness is reduced in an input digital signal which includes a first sequence of sample values. An output digital signal is produced in response to the input digital signal. The output digital signal includes a second sequence of sample values, which second sequence of sample values has a greater density of non-zero sample values than the first sequence of sample values.
    Type: Grant
    Filed: July 7, 1998
    Date of Patent: February 22, 2000
    Assignee: Telefonaktiebolaget L M Ericsson, (publ)
    Inventors: Roar Hagen, Bjorn Stig Erik Johansson, Erik Ekudden, Willem Baastian Kleijn
  • Patent number: 6029134
    Abstract: A speech synthesizing method and apparatus arranged to use a sinusoidal waveform synthesis technique provide for preventing degradation of acoustic quality caused by the shift of the phase when synthesizing a sinusoidal waveform. A decoding unit decodes the data from an encoding side. The decoded data is transformed into the voiced/unvoiced data through a bad frame mask unit. Then, an unvoiced frame detecting circuit detects an unvoiced frame from the data. If there exist two or more continuous unvoiced frames, a voiced sound synthesizing unit initializes the phases of a fundamental wave and its harmonic into a given value such as 0 or π/2. This makes it possible to initialize the phase shift between the unvoiced and the voiced frames at a start point of the voiced frame, thereby preventing degradation of acoustic quality such as distortion of a synthesized sound caused by dephasing.
    Type: Grant
    Filed: September 20, 1996
    Date of Patent: February 22, 2000
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 6021388
    Abstract: A speech synthesis apparatus for outputting synthesized speech on the basis of a parameter sequence of a speech waveform includes a parameter generation unit which generates a parameter sequence for speech synthesis on the basis of a character sequence input by a character sequence input unit, and stores the generated parameter sequence in a parameter storage unit. A waveform generation unit is also provided that generates pitch waveforms each for one pitch period on the basis of synthesis parameters and pitch scales included in the parameter sequence, and generates a speech waveform by connecting the generated pitch waveforms in accordance with frame lengths set by a frame length setting unit.
    Type: Grant
    Filed: December 19, 1997
    Date of Patent: February 1, 2000
    Assignee: Canon Kabushiki Kaisha
    Inventors: Mitsuru Otsuka, Yasunori Ohora, Takashi Aso, Yasuo Okutani
  • Patent number: 5987413
    Abstract: Method envelope-invariant for audio signal synthesis from elementary audio waveforms stored in a dictionary wherein: the waveforms are perfectly periodic, and stored as one of their period, synthesis is obtained by overlap-adding of the waveforms obtained from time-domain repetition of the periodic waveforms with a weighting window whose size is approximately two times the period of the signals to weight, and whose relative position inside of the period is fixed to any value identical for all the periods, each extracted from a reharmonized and thus periodic waveform, obtained by modifying, without changing the spectral envelope, the frequencies and amplitudes of harmonics in the spectrum of a frame of the original continuous speech waveform, whereby the time shift between two successive waveforms obtained by weighting the original signals is set to the imposed fundamental frequency of the signal to synthesize.
    Type: Grant
    Filed: June 5, 1997
    Date of Patent: November 16, 1999
    Inventors: Thierry Dutoit, Vincent Pagel, Nicolas Pierret
  • Patent number: 5950152
    Abstract: A composite pitch pattern of an artificial waveform of a composite sound indicating characters is produced according to a general pitch pattern producing model, and a pitch pattern of a VCV phoneme-chain waveform of each of VCV phoneme-chains corresponding to the characters is produced from an actual voice sample. Each VCV phoneme-chain composed of a preceding vowel, a consonant and a succeeding vowel has a pitch fine structure and a pitch fluctuation. Thereafter, an overall inclination of the pitch pattern of each VCV phoneme-chain waveform is adjusted to that of a portion of the composite pitch pattern corresponding to the same VCV phoneme-chain to overlap transitional portions of preceding and succeeding vowels in a changed pitch pattern of each VCV phoneme-chain waveform with those in the corresponding portion of the composite pitch pattern.
    Type: Grant
    Filed: September 19, 1997
    Date of Patent: September 7, 1999
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Yasuhiko Arai, Hirofumi Nishimura, Toshimitsu Minowa, Ryou Mochizuki, Takashi Honda
  • Patent number: 5905972
    Abstract: Prosodic databases hold fundamental frequency templates for use in a speech synthesis system. Prosodic database templates may hold fundamental frequency values for syllables in a given sentence. These fundamental frequency values may be applied in synthesizing a sentence of speech. The templates are indexed by tonal pattern markings. A predicted tonal marking pattern is generated for each sentence of text that is to be synthesized, and this predicted pattern of tonal markings is used to locate a best-matching template. The templates are derived by calculating fundamental frequencies on a per-syllable basis for sentences that are spoken by a human trainer for a given unlabeled corpus.
    Type: Grant
    Filed: September 30, 1996
    Date of Patent: May 18, 1999
    Assignee: Microsoft Corporation
    Inventors: Xuedong D. Huang, James L. Adcock, John A. Goldsmith
  • Patent number: 5899966
    Abstract: A signal decoding method and apparatus in which the speech signal reproducing speed is controlled without changing the phoneme or the pitch, in which the apparatus has a data number convertor for converting the number of orthogonal transform coefficients entering a transmission signal input terminal from N to M, an inverse orthogonal transform unit for inverse orthogonal-transforming the M number of the orthogonal transform coefficients obtained by the data number convertor, and a linear predictive coding synthesis filter for performing predictive synthesis based on the short-term prediction residuals obtained by the inverse orthogonal transform unit. For an input signal, short-term prediction residuals are found and are orthogonally transformed to form the orthogonal transform coefficients at a rate of N coefficients per transform unit. The frequency positions of the N transform coefficients may be rearranged to M values by M/N or by oversampling to change N to M.
    Type: Grant
    Filed: October 25, 1996
    Date of Patent: May 4, 1999
    Assignee: Sony Corporation
    Inventors: Jun Matsumoto, Masayuki Nishiguchi, Shiro Omori, Kazuyuki Iijima
  • Patent number: 5890117
    Abstract: Improved automated synthesis of human audible speech from text is disclosed. Performance enhancement of the underlying text comprehensibility is obtained through prosodic treatment of the synthesized material, improved speaking rate treatment, and improved methods of spelling words or terms for the system user. Prosodic shaping of text sequences appropriate for the discourse in large groupings of text segments, with prosodic boundaries developed to indicate conceptual units within the text groupings, is implemented in a preferred embodiment.
    Type: Grant
    Filed: March 14, 1997
    Date of Patent: March 30, 1999
    Assignee: Nynex Science & Technology, Inc.
    Inventor: Kim Ernest Alexander Silverman
  • Patent number: 5886276
    Abstract: An audio signal analyzer and encoder is based on a model that considers audio signals to be composed of deterministic or sinusoidal components, transient components representing the onset of notes or other events in an audio signal, and stochastic components. Deterministic components are represented as a series of overlapping sinusoidal waveforms. To generate the deterministic components, the input signal is divided into a set of frequency bands by a multi-complementary filter bank. The frequency band signals are oversampled so as to suppress cross-band aliasing energy in each band. Each frequency band is analyzed and encoded as a set of spectral components using a windowing time frame whose length is inversely proportional to the frequency range in that band. Low frequency bands are encoded using longer time frames than higher frequency bands.
    Type: Grant
    Filed: January 16, 1998
    Date of Patent: March 23, 1999
    Assignee: The Board of Trustees of the Leland Stanford Junior University
    Inventors: Scott N. Levine, Tony S. Verma
  • Patent number: 5864812
    Abstract: A method and apparatus for synthesizing speech. According to one variation of the method and apparatus, a plurality of speech segment data units is prepared for all desired speech waveforms. Speech is then synthesized by reading out from memory the appropriate speech segment data units, and a desired pitch is obtained by overlapping the appropriate speech segment data units according to a pitch period interval. According to a second variation of the method and apparatus, speech segment data units are prepared for only initial speech waveforms and first pitch waveforms, and differential waveforms. With this variation, subsequent pitch waveforms for speech synthesis are generated by combining the first pitch waveform with the corresponding differential waveform.
    Type: Grant
    Filed: November 30, 1995
    Date of Patent: January 26, 1999
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Takahiro Kamai, Kenji Matsui, Noriyo Hara
  • Patent number: 5864814
    Abstract: An information communication system, having host and remote terminal devices, and method for generating a voice in which one voice tone data is selected from a plurality of types of voice tone data and stored according to received voice generating information. The voice is reproduced by generating a voice waveform according to a meter pattern and selected voice tone data. The discrete voice data may be presented for either one or both of velocity and pitch of a voice correlated to a time lag between discrete voice data. The discrete data is dispensed so that each voice data is not dependent on a time lag between phonemes and at the same time is present at a level relative to a reference value. Voice tone data indicating a sound parameter for each voice element such as a phoneme for each voice tone type is stored in a voice tone data storing section in a terminal device.
    Type: Grant
    Filed: March 31, 1997
    Date of Patent: January 26, 1999
    Assignee: Justsystem Corp.
    Inventor: Nobuhide Yamazaki
  • Patent number: 5857171
    Abstract: A karaoke apparatus produces a karaoke accompaniment which accompanies a singing voice of an actual player, and concurrently creates a harmony voice originating from a virtual player. In the karaoke apparatus, a memory device stores voice information of the virtual singer. An input device collects the singing voice of the actual player. An analyzing device analyzes an audio frequency of the collected singing voice. A synthesizing device processes the stored voice information based on the analyzed audio frequency to synthesize the harmony voice having another audio frequency which is set in harmony with the analyzed audio frequency. An output device mixes the collected singing voice and the synthesized harmony voice with each other, and outputs the mixed singing and harmony voices along with the karaoke accompaniment.
    Type: Grant
    Filed: February 26, 1996
    Date of Patent: January 5, 1999
    Assignee: Yamaha Corporation
    Inventors: Yasuo Kageyama, Hiroshi Mino
  • Patent number: 5839101
    Abstract: The invention relates to a method of noise suppression, a mobile station and a noise suppressor for suppressing noise in a speech signal. The suppressor comprises means (20, 50) for dividing the speech signal into a first amount of subsignals (X, P), which subsignals represent certain first frequency ranges, and suppression means (30) for suppressing noise in a subsignal (X, P) based upon a determined suppression coefficient (G). The noise suppressor further comprises recombination means (60) for recombining a second amount of subsignals (X, P) into a calculation signal (S), which represents a certain second frequency range, which is wider than the first frequency ranges and determination means (200) for determining a suppression coefficient (G) for the calculation signal (S) based upon the noise contained by it.
    Type: Grant
    Filed: December 10, 1996
    Date of Patent: November 17, 1998
    Assignee: Nokia Mobile Phones Ltd.
    Inventors: Antti Vahatalo, Juha Hakkinen, Erkki Paajanen, Ville-Veikko Mattila
  • Patent number: 5832437
    Abstract: A method for decoding encoded speech signals uses sine wave synthesis based on harmonics of the original speech signal. The harmonics are obtained by transforming the original speech signal from a time domain to a frequency domain, and the harmonics are arranged as sequential frames with the harmonics of a given frame having a pitch period that may or may not be the same as the pitch period of another frame. According to the decoding method, data arrays respectively containing amplitude data and phase data of the harmonics are zero-padded to provide the arrays with a pre-set number of elements. Inverse orthogonal transformation of the data arrays produces time domain information used to generate a time domain waveform signal for restoring the encoded speech signals. The different pitch periods of the frames are normalized to each other either by smooth (continuous) or acute (discontinuous) interpolation depending on the degree of change in the pitch period between the frames.
    Type: Grant
    Filed: August 16, 1995
    Date of Patent: November 3, 1998
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 5822732
    Abstract: A speech modification or enhancement filter, and apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. From spectral information represented as a multi-dimensional vector, a filter coefficient is determined so as to ensure that formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signal and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. A degree of freedom of design of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals is thus heightened leading to the improvement of intelligibility of said synthesized speech signals. A good formant enhancement effect can be obtained without allowing any perceptible level of distortions to occur within a range of permissible spectral gradients.
    Type: Grant
    Filed: May 2, 1996
    Date of Patent: October 13, 1998
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Hirohisa Tasaki
  • Patent number: 5806038
    Abstract: An MBE (Multi-Band Excitation) synthesizer (116) generates excitation components from information received by a receiver (2004). The information received includes spectral information representing a segment of speech. The MBE synthesizer (116) includes an excitation generator (2241) and a nonlinear voicing processor (2211). The excitation generator (2241) generates voiced excitation components and unvoiced excitation components. The nonlinear voicing processor (2211) is responsive to the spectral information and controls a selection of the excitation components from the voiced excitation components and the unvoiced excitation components.
    Type: Grant
    Filed: February 13, 1996
    Date of Patent: September 8, 1998
    Assignee: Motorola, Inc.
    Inventors: Jian-Cheng Huang, Floyd D. Simpson, Xiaojun Li
  • Patent number: 5806039
    Abstract: A data processing apparatus for synchronized audiovisual output has synchronizing signal bits which are assigned to bits of each sound data, represented by a 16-bit PCM code. A predetermined bit of the assigned bits having the least influence upon the human auditory sense is extracted as a synchronizing signal bit for synchronization of the image data output and sound output.
    Type: Grant
    Filed: May 20, 1997
    Date of Patent: September 8, 1998
    Assignee: Canon Kabushiki Kaisha
    Inventors: Toshiaki Fukada, Yasunori Ohora, Takashi Aso, Mitsuru Otsuka
  • Patent number: 5806037
    Abstract: A voice synthesis system is fundamentally configured by a sound-source model, which simulates human voices and the like, and a voice-path model which simulates properties of voice paths between vocal cords and lips. The sound-source model is embodied by a code book which stores a plurality of code words, representative of waveform patterns, with respect to each of the voices. Each of the code words is selected by an information index. The voice-path model is embodied by a full-pole synthesis filter whose characteristic curve provides multiple poles, each of which is represented by polar coordinates. There is further provided a pitch filter and an all-pass filter. Data representative of the code word selected is supplied to the pitch filter, in which a first delay time, set by a number of delay-time units, is imparted to the data. Then, the all-pass filter imparts a second delay time, which is smaller than the delay-time unit, to the data in response to pitch-variation information.
    Type: Grant
    Filed: March 29, 1995
    Date of Patent: September 8, 1998
    Assignee: Yamaha Corporation
    Inventor: Akira Sogo
  • Patent number: 5799271
    Abstract: The present invention relates to the method to receive a speech signal, to perform a recognition weighting process on it, to synthesize a synthetic speech signal, to calculate an autocorrelation of the synthetic speech signal whose delay is a predetermined value and an autocorrelation whose delay is 0, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak with skipping over the part of a negative peak by using the results from the dividing operation, and to calculate and output the pitch lag and the pitch filter coefficient by repeating the above process. Thus, real-time implementation of CELP vocoder can be achieved.
    Type: Grant
    Filed: June 24, 1996
    Date of Patent: August 25, 1998
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kyung-Jin Byun, Ha-Young Yoo, Jong-Jae Kim, Ki-Chun Han, Jae-Suk Kim, Myung-Jin Bae
  • Patent number: 5794201
    Abstract: Making use of a digital acoustic signal processing apparatus arranged by employing memory device for storing a digital acoustic signal, acoustic frequency feature enhancing device for enhancing an acoustic frequency feature, and low-speed sound reproducing device for changing a speed of the stored voice to reproduce this voice at a low speed into a hearing aid and an appliance with an acoustic output, a hearing function difficulty due to age is aided in utilization of audio output appliances such as a hearing aid, television receiver, and a telephone receiver. After the voice has been stored in the memory device, a process for enhancing the frequency characteristic in order to fit the frequency characteristic to the individual hearing characteristic and the voice reproducing environment is carried out and thereafter presented to the user. The user can repeatedly listen to the voice stored in the memory device with employment of control device for controlling the voice reproducing operation.
    Type: Grant
    Filed: June 5, 1995
    Date of Patent: August 11, 1998
    Assignee: Hitachi, Ltd.
    Inventors: Yoshito Nejime, Hiroshi Ikeda, Masao Hotta
  • Patent number: 5787387
    Abstract: A method and system is provided for encoding and decoding of speech signals at a low bit rate. The continuous input speech is divided into voiced and unvoiced time segments of a predetermined length. The encoder of the system uses a linear predictive coding model for the unvoiced speech segments and harmonic frequencies decomposition for the voiced speech segments. Only the magnitudes of the harmonic frequencies are determined using the discrete Fourier transform of the voiced speech segments. The decoder synthesizes voiced speech segments using the magnitudes of the transmitted harmonics and estimates the phase of each harmonic from the signal in the preceding speech segments. Unvoiced speech segments are synthesized using linear prediction coding (LPC) coefficients obtained from codebook entries for the poles of the LPC coefficient polynomial. Boundary conditions between voiced and unvoiced segments are established to ensure amplitude and phase continuity for improved output speech quality.
    Type: Grant
    Filed: July 11, 1994
    Date of Patent: July 28, 1998
    Assignee: Voxware, Inc.
    Inventor: Joseph Gerard Aguilar
  • Patent number: 5787398
    Abstract: The pitch of synthesized speech signals is varied by separating the speech signals into a spectral component and an excitation component. The latter is multiplied by a series of overlapping window functions synchronous, in the case of voiced speech, with pitch timing mark information corresponding at least approximately to instants of vocal excitation, to separate it into windowed speech segments which are added together again after the application of a controllable time-shift. The spectral and excitation components are then recombined. The multiplication employs at least two windows per pitch period, each having a duration of less than one pitch period. Alternatively each window has a duration of less than twice the pitch period between timing marks and is asymmetric about the timing mark.
    Type: Grant
    Filed: August 26, 1996
    Date of Patent: July 28, 1998
    Assignee: British Telecommunications PLC
    Inventor: Andrew Lowry
  • Patent number: 5774854
    Abstract: The text to speech (TTS) system comprises two main components, a linguistic processor and an acoustic processor. The former is responsible for receiving an input text, and breaking it down into a sequence of phonemes. Each phoneme is assigned a duration and pitch. The acoustic processor is then responsible for reproducing the phonemes, and concatenating them into the desired acoustic output. The TTS system is driven from the output in that the linguistic processor does not operate until it receives a request from the acoustic processor for input. This request, and a return message that it can now be satisfied, are routed via a process dispatcher. By driving the system from the output, the system can be accurately halted in the event that the acoustic output needs to be interrupted.
    Type: Grant
    Filed: November 22, 1994
    Date of Patent: June 30, 1998
    Assignee: International Business Machines Corporation
    Inventor: Richard Anthony Sharman