Transformation Patents (Class 704/269)
  • Publication number: 20010044725
    Abstract: An information processing apparatus, an information processing method, and a medium that allow the user to have more varied voice chats unique to a three-dimensional virtual reality space than before. The user clicks one of voice tone select radio buttons to select a normal voice, a tone-changed voice, a robot voice, or an intonation-inverted voice. In addition, by operating a voice tone adjusting slider, the user finely adjusts a selected voice tone parameter. A voice signal inputted by the user is filtered with the preset voice tone parameter before being transmitted to another user.
    Type: Application
    Filed: November 12, 1997
    Publication date: November 22, 2001
    Inventors: Koichi Matsuda, Akira Inoue
  • Patent number: 6311158
    Abstract: Techniques for synthesizing a time-domain signal. The time-domain signal is partitioned into a number of time-domain frames and a waveform is generated for each time-domain frame. Each waveform includes one or more sinusoids. The waveform is generated by selecting a sinusoid for synthesis and computing a set of parameter values (e.g. the start and end amplitude, frequency, and phase values) for the selected sinusoid. A template is determined for the selected sinusoid based on the computed parameter values and a selected window function. The frequency-domain template is such that the amplitude of the selected sinusoid in the time domain matches, at a time-domain frame boundary, the amplitude of a corresponding sinusoid in an adjacent time-domain frame. The template is added to a frequency-domain frame. The process is repeated for each sinusoid in the waveform. After all sinusoids have been processed, the frequency-domain frame is transformed to a time-domain frame.
    Type: Grant
    Filed: March 16, 1999
    Date of Patent: October 30, 2001
    Assignee: Creative Technology Ltd.
    Inventor: Jean Laroche
  • Patent number: 6278971
    Abstract: An apparatus and procedure for performing phase detection in which one pitch cycle of an input signal waveform is cut out on a time axis. The cut-out pitch cycle is filled with zeroes to form 2^N samples (where N is an integer and 2^N is equal to or greater than the number of samples of the one pitch cycle), and the samples are subjected to an orthogonal conversion such as a fast Fourier transform, whereby the real and imaginary parts are used to calculate tan⁻¹ to obtain basic phase information. This basic phase is subjected to linear interpolation to obtain phases of respective higher harmonics of the input signal waveform.
    Type: Grant
    Filed: January 26, 1999
    Date of Patent: August 21, 2001
    Assignee: Sony Corporation
    Inventors: Akira Inoue, Masayuki Nishiguchi
  • Patent number: 6253165
    Abstract: The coder/decoder (codec) system of the present invention includes a coder and a decoder. The coder includes a multi-resolution transform processor, such as a modulated lapped transform (MLT) transform processor, a weighting processor, a uniform quantizer, a masking threshold spectrum processor, an entropy encoder, and a communication device, such as a multiplexor (MUX) for multiplexing (combining) signals received from the above components for transmission over a single medium. The decoder comprises inverse components of the encoder, such as an inverse multi-resolution transform processor, an inverse weighting processor, an inverse uniform quantizer, an inverse masking threshold spectrum processor, an inverse entropy encoder, and an inverse MUX. With these components, the present invention is capable of performing resolution switching, spectral weighting, digital encoding, and parametric modeling.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: June 26, 2001
    Assignee: Microsoft Corporation
    Inventor: Henrique S. Malvar
  • Patent number: 6240384
    Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those of the synthesis units, which correspond to the phonetic context clusters including phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
    Type: Grant
    Filed: December 3, 1996
    Date of Patent: May 29, 2001
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takehiko Kagoshima, Masami Akamine
  • Patent number: 6199039
    Abstract: An MPEG-II audio decoder with a synthesis subband filter includes a fast IMDCT (Inverse Modified Discrete Cosine Transform) module and an IPQMF (Inverse Pseudo Quadrature Mirror Filter) module. The fast IMDCT module involves a butterfly stage of input subband samples which requires only about ¼ the amount of multiplier-accumulate computation of the ISO suggested method. The IPQMF module involves an efficient memory configuration which requires only half the size of the standard synthesis subband filter bank.
    Type: Grant
    Filed: August 3, 1998
    Date of Patent: March 6, 2001
    Assignee: National Science Council
    Inventors: Liang-Gee Chen, Tsung-Han Tsai, Yuan-Chen Liu
  • Patent number: 6182042
    Abstract: A system and method for modifying a subportion of information contained in an audio signal, such as magnitude information, without substantially affecting the remaining information contained therein, such as phase information. An incoming audio signal is segmented into a sequence of overlapping windowed DFT representations during an analysis step, and during a synthesis step the DFT representations are converted back to a time domain signal. Each of the DFT representations consists of a plurality of frequency components obtained during a period of time. Each of the frequency components is associated with a unique increment of the period. Subsequent to the analysis step, but before the synthesis step, the frequency components of the DFT representations are re-mapped so as to have a differing temporal relationship with respect to the increments of the period of time.
    Type: Grant
    Filed: July 7, 1998
    Date of Patent: January 30, 2001
    Assignee: Creative Technology Ltd.
    Inventor: Alan Peevers
  • Patent number: 6173250
    Abstract: An apparatus and method for speech-text-transmit communication over data networks includes speech recognition devices and text to speech conversion devices that translate speech signals input to the terminal into text and text data received from a data network into speech output signals. The speech input signals are translated into text based on phonemes obtained from a spectral analysis of the speech input signals. The text data is transmitted to a receiving party over the data network as a plurality of text data packets such that a continuous stream of text data is obtained. The receiving party's terminal receives the text data and may immediately display the text data and/or translate it into speech output signals using the text to speech conversion device. The text to speech conversion device uses speech pattern data stored in a speech pattern database for synthesizing a human voice for playing of the speech output signals using a speech output device.
    Type: Grant
    Filed: June 3, 1998
    Date of Patent: January 9, 2001
    Assignee: AT&T Corporation
    Inventor: Kenneth Jong
  • Patent number: 6163769
    Abstract: A text-to-speech system includes a storage device for storing a clustered set of context-dependent phoneme-based units of a target speaker. In one embodiment, decision trees are used wherein each decision tree based context-dependent phoneme-based unit is arranged based on context of at least one immediately preceding and succeeding phoneme. At least one of the context-dependent phoneme-based units represents other non-stored context-dependent phoneme units of similar sound due to similar contexts. A text analyzer obtains a string of phonetic symbols representative of text to be converted to speech. A concatenation module selects stored decision tree based context-dependent phoneme-based units from the set of decision tree based context-dependent phoneme-based units based on the context of the phonetic symbols and synthesizes the selected phoneme-based units to generate speech corresponding to the text.
    Type: Grant
    Filed: October 2, 1997
    Date of Patent: December 19, 2000
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Hsiao-Wuen Hon, Xuedong D. Huang
  • Patent number: 6115687
    Abstract: An apparatus and method that reproduces a voice signal at different rates without a change in pitch. Neighboring voice waveforms having a same length and minimum form differences from an input voice signal are selected and overlapped. An output voice waveform is then generated that is rate converted by replacing a part of the voice waveform of the input voice signal with the overlapped voice waveforms, or, alternatively, by inserting the overlapped voice waveforms into the voice waveform of the input voice signal.
    Type: Grant
    Filed: July 1, 1998
    Date of Patent: September 5, 2000
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Naoya Tanaka, Hiroaki Takeda
  • Patent number: 6081781
    Abstract: Data in the same range of the fundamental frequency F_0 as speech segments are used as learning data to prepare a reference codebook CB_M for a spectrum envelope. The same learning data for a higher range than F_0 and the same learning data for a lower range are subject to a linear stretch matching with respect to the learning data for the range F_0. For each vector code in the reference codebook CB_M, the spectrum envelope is clustered to prepare a high range codebook CB_H and a low range codebook CB_L. The spectrum envelope of input speech segments is fuzzy vector quantized (S402) with the reference codebook, and depending on the synthesized F_0, a high, middle or low codebook is selected. The selected codebook is used to decode the fuzzy vector quantized code, and the decoded output is subject to the inverse FFT. Alternatively, codebooks CM_MH and CB_ML each comprising differential vectors for corresponding code vectors between CB_M and CB_H and between CB.
    Type: Grant
    Filed: September 9, 1997
    Date of Patent: June 27, 2000
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Kimihito Tanaka, Masanobu Abe
  • Patent number: 6070138
    Abstract: In order to provide a practical E-mail reader for reading out E-mails phonetically enabling easy grasping of their contents by a user with its vocal output even when quotation codes or header information are included in the E-mails, a phonetic E-mail reader of the invention comprises a speech synthesizer (102) for converting text data into vocal data, quotation code storing means (105) for storing quotation codes used for indicating a quotation line inserted at a top of the quotation line, and quotation code elimination means (106) for detecting and eliminating a quotation code inserted at tops of quotation lines referring to the quotation code storing means (105) before supplying the quotation lines to the speech synthesizer (102).
    Type: Grant
    Filed: December 26, 1996
    Date of Patent: May 30, 2000
    Assignee: NEC Corporation
    Inventor: Kazuhiko Iwata
  • Patent number: 6021388
    Abstract: A speech synthesis apparatus for outputting synthesized speech on the basis of a parameter sequence of a speech waveform includes a parameter generation unit which generates a parameter sequence for speech synthesis on the basis of a character sequence input by a character sequence input unit, and stores the generated parameter sequence in a parameter storage unit. A waveform generation unit is also provided that generates pitch waveforms each for one pitch period on the basis of synthesis parameters and pitch scales included in the parameter sequence, and generates a speech waveform by connecting the generated pitch waveforms in accordance with frame lengths set by a frame length setting unit.
    Type: Grant
    Filed: December 19, 1997
    Date of Patent: February 1, 2000
    Assignee: Canon Kabushiki Kaisha
    Inventors: Mitsuru Otsuka, Yasunori Ohora, Takashi Aso, Yasuo Okutani
  • Patent number: 5974376
    Abstract: The present invention relates to a method for transmitting multiresolution audio signals via wireless devices in a radio frequency communication system wherein audio signals are decomposed into levels of resolution. The audio signal is decomposed into levels including a base signal at a base transmission rate and one or more signal details and input into a code rate selector, controlled by either party to the communication. The base signal represents the coarsest resolution or quality of the signal. Each signal detail, when added to the base signal, improves the resolution of the signal by increasing the detail and the transmission rate. An audio receiving unit transmits a request for audio transmission to the audio transmitting unit. In response to the initial request, the base signal is transmitted to the audio receiving unit. If the base signal is insufficient, the sound quality can be increased incrementally by sending further requests to transmit additional signal detail from the code rate selector.
    Type: Grant
    Filed: October 10, 1996
    Date of Patent: October 26, 1999
    Assignee: Ericsson, Inc.
    Inventors: Amer Hassan, David G. Matthews
  • Patent number: 5970440
    Abstract: A method is described for short-time Fourier-converting a speech signal and for resynthesizing an output speech signal from the modulus of its short-time Fourier transform and from an initial phase. In particular, after the Fourier converting the signal is subjected to a phase-specifying operation. Subsequently speech duration is affected by systematically maintaining, periodically repeating or periodically suppressing result intervals of the successive Fourier converting and phase affecting. Finally, a resynthesizing operation is executed. Speech pitch can likewise be affected through systematically excising or inserting signal intervals. Finally, the two strategies can be combined, so that ultimately, pitch and duration can be affected independently from each other.
    Type: Grant
    Filed: November 22, 1996
    Date of Patent: October 19, 1999
    Assignee: U.S. Philips Corporation
    Inventors: Raymond N. J. Veldhuis, Haiyan He
  • Patent number: 5970454
    Abstract: Synthetic speech is generated by production of a digital waveform from a text in phonemes. A linked database is used which comprises an extended text in phonemes and its equivalent in the form of a digital waveform. The two portions of the database are linked by a parameter which establishes equivalent points in both the phoneme text and the digital waveform. The input text (in phonemes) is analyzed to locate a matching portion in the phoneme portion of the database. This matching utilizes exact equivalence of phonemes where this is possible; otherwise relation between phonemes is utilized. The selection process identifies input phonemes in context whereby improved conversions are obtained. Having analyzed the input text into matching strings in the input form of the database, beginning and ending parameters for the sections are established. The output text is produced by abutting sections of the digital waveform and defined by the beginning and ending parameters.
    Type: Grant
    Filed: April 23, 1997
    Date of Patent: October 19, 1999
    Assignee: British Telecommunications public limited company
    Inventor: Andrew Paul Breen
  • Patent number: 5832437
    Abstract: A method for decoding encoded speech signals uses sine wave synthesis based on harmonics of the original speech signal. The harmonics are obtained by transforming the original speech signal from a time domain to a frequency domain, and the harmonics are arranged as sequential frames with the harmonics of a given frame having a pitch period that may or may not be the same as the pitch period of another frame. According to the decoding method, data arrays respectively containing amplitude data and phase data of the harmonics are zero-padded to provide the arrays with a pre-set number of elements. Inverse orthogonal transformation of the data arrays produces time domain information used to generate a time domain waveform signal for restoring the encoded speech signals. The different pitch periods of the frames are normalized to each other either by smooth (continuous) or acute (discontinuous) interpolation depending on the degree of change in the pitch period between the frames.
    Type: Grant
    Filed: August 16, 1995
    Date of Patent: November 3, 1998
    Assignee: Sony Corporation
    Inventors: Masayuki Nishiguchi, Jun Matsumoto
  • Patent number: 5809468
    Abstract: A voice recording/reproducing apparatus comprises a coding parameter extracting section for extracting a coding parameter by use of either past voice data or past parameter. A coding section codes voice data by use of the coding parameter extracted by the coding parameter extracting section. A predicting section predicts a decoding signal by use of either past decoded voice data corresponding to coded voice data from the voice coding means or the past parameter. A voice decoding section decodes the voice data by use of the predicted decoding signal. A voice synthesizing section outputs voice data synthesized based on an output signal from the predicting section and an output signal from the voice decoding section. An initializing section initializes at least one of either a content of the predicting section or a content of the voice synthesizing section in accordance with a reproducing position of recorded voice data.
    Type: Grant
    Filed: October 18, 1995
    Date of Patent: September 15, 1998
    Assignee: Olympus Optical Co., Ltd.
    Inventors: Hidetaka Takahashi, Hideo Okano
  • Patent number: 5774835
    Abstract: A second spectrum parameter whose degree is lower than that of a first spectrum parameter is calculated based on the first spectrum parameter that is output from an encoder. A spectrum postfilter generates a transfer function having a denominator and a numerator wherein said first spectrum parameter is included in said denominator and said second spectrum parameter is included in said numerator, and filters the reduced signal with this transfer function. In addition, it adaptively generates a compensation coefficient based on the first and second parameters. A compensation filter generates a transfer function based on the compensation coefficient and filters an output of the spectrum postfilter with this transfer function.
    Type: Grant
    Filed: August 21, 1995
    Date of Patent: June 30, 1998
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 5765126
    Abstract: A signal encoding apparatus for encoding an acoustic signal. This signal encoding apparatus includes a transform circuit for transforming an inputted acoustic signal into frequency components, a signal component separating circuit for separating an output of the transform circuit into tone characteristic components and noise characteristic components, a tone characteristic encoding circuit for encoding a signal of tone characteristic components, and a noise characteristic component encoding circuit for encoding a signal of noise characteristic components, wherein the tone characteristic component encoding circuit encodes respective signal components of the signal of tone characteristic components so that they respectively have different code lengths to thereby improve efficiency of encoding without degrading sound quality with respect to acoustic signal of tone characteristic.
    Type: Grant
    Filed: April 17, 1995
    Date of Patent: June 9, 1998
    Assignee: Sony Corporation
    Inventors: Kyoya Tsutsui, Mito Sonohara
  • Patent number: 5758320
    Abstract: A text-to-voice audio output unit includes a storage section for storing analyzed information pertaining to words, boundaries between articulations, and accents obtained by analyzing an input character list, a voice synthesis rule section for changing a reduction or damping characteristic of a phrase component of a fundamental frequency of an output voice, and a voice synthesizing section for generating a composite tone based on the analyzed information from the storage section. The reduction or damping characteristic, calculated for each phrase component, is overdamped, critically damped, or underdamped and is based on speech rate, syntactic information, number of articulations, and positional information. When a prosodic phrase is short, the reduction or damping characteristic causes a decrease in the fundamental frequency for a meaningfully-delimited portion, and when a prosodic phrase is long, the reduction or damping characteristic is controlled over the entire prosodic phrase.
    Type: Grant
    Filed: June 12, 1995
    Date of Patent: May 26, 1998
    Assignee: Sony Corporation
    Inventor: Yasuharu Asano
  • Patent number: 5737718
    Abstract: A method for encoding the information is provided which realizes a high encoding efficiency especially for tonal acoustic signals without lowering the sound quality. An acoustic signal from a terminal is transformed by a transform circuit into spectral signals which are then normalized and quantized by a signal component encoding circuit for encoding from one encoding unit to another. The encoding unit configuration is selected by an encoding unit configuration decision circuit from plural encoding unit configurations depending upon the shape of distribution of the spectral components. An encoding unit of narrow low bandwidth is selected for a tonal signal.
    Type: Grant
    Filed: June 8, 1995
    Date of Patent: April 7, 1998
    Assignee: Sony Corporation
    Inventor: Kyoya Tsutsui
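
As an illustration of the phase detection procedure summarized in the abstract of patent 6278971 above (zero-pad one pitch cycle to a power-of-two length, apply a fast Fourier transform, take the arctangent of the imaginary and real parts, then interpolate linearly to reach the harmonic positions), a minimal Python sketch follows. It is not the patented implementation. The function names (next_pow2, fft, harmonic_phases) are hypothetical, and the wrapped-phase handling during interpolation is an assumption, since the abstract does not say how phase wrapping is treated.

```python
import cmath
import math


def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def harmonic_phases(cycle, num_harmonics):
    """Estimate the phases of harmonics 1..num_harmonics of one pitch cycle."""
    period = len(cycle)
    if num_harmonics < 1 or num_harmonics > period // 2:
        raise ValueError("num_harmonics must be in 1..len(cycle)//2")
    size = next_pow2(period)
    # Zero-fill the cut-out pitch cycle up to a power-of-two length.
    padded = list(cycle) + [0.0] * (size - period)
    spectrum = fft(padded)
    # Phase of each bin from the real and imaginary parts (four-quadrant arctangent).
    bin_phase = [math.atan2(c.imag, c.real) for c in spectrum[: size // 2 + 1]]
    phases = []
    for h in range(1, num_harmonics + 1):
        # Harmonic h of the original cycle falls at a generally
        # non-integer bin position in the zero-padded spectrum.
        pos = h * size / period
        lo = int(math.floor(pos))
        hi = min(lo + 1, size // 2)
        frac = pos - lo
        # Assumption: interpolate along the shortest wrapped phase difference.
        diff = bin_phase[hi] - bin_phase[lo]
        diff = (diff + math.pi) % (2 * math.pi) - math.pi
        phases.append(bin_phase[lo] + frac * diff)
    return phases
```

For a single 8-sample cosine cycle with a phase offset of 0.5 rad, no padding is needed and harmonic_phases(cycle, 1) returns a first-harmonic phase of approximately 0.5. For cycle lengths that are not powers of two, the result depends on the interpolation between neighbouring bins and is only an approximation.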