Transformation Patents (Class 704/269)
-
Publication number: 20010044725
Abstract: An information processing apparatus, an information processing method, and a medium that allow the user to have more varied voice chats unique to a three-dimensional virtual reality space than before. The user clicks one of voice tone select radio buttons to select a normal voice, a tone-changed voice, a robot voice, or an intonation-inverted voice. In addition, by operating a voice tone adjusting slider, the user finely adjusts a selected voice tone parameter. A voice signal inputted by the user is filtered with the preset voice tone parameter before being transmitted to another user.
Type: Application
Filed: November 12, 1997
Publication date: November 22, 2001
Inventors: KOICHI MATSUDA, AKIRA INOUE
-
Patent number: 6311158
Abstract: Techniques for synthesizing a time-domain signal. The time-domain signal is partitioned into a number of time-domain frames and a waveform is generated for each time-domain frame. Each waveform includes one or more sinusoids. The waveform is generated by selecting a sinusoid for synthesis and computing a set of parameter values (e.g. the start and end amplitude, frequency, and phase values) for the selected sinusoid. A template is determined for the selected sinusoid based on the computed parameter values and a selected window function. The frequency-domain template is such that the amplitude of the selected sinusoid in the time domain matches, at a time-domain frame boundary, the amplitude of a corresponding sinusoid in an adjacent time-domain frame. The template is added to a frequency-domain frame. The process is repeated for each sinusoid in the waveform. After all sinusoids have been processed, the frequency-domain frame is transformed to a time-domain frame.
Type: Grant
Filed: March 16, 1999
Date of Patent: October 30, 2001
Assignee: Creative Technology Ltd.
Inventor: Jean Laroche
-
Patent number: 6278971
Abstract: An apparatus and procedure for performing phase detection in which one pitch cycle of an input signal waveform is cut out on a time axis. The cut-out pitch cycle is filled with zeroes to form 2^N samples (where N is an integer and 2^N is equal to or greater than the number of samples of the one pitch cycle), and the samples are subjected to an orthogonal conversion such as a fast Fourier transform, whereby the real and imaginary parts are used to calculate tan⁻¹ to obtain basic phase information. This basic phase is subjected to linear interpolation to obtain phases of respective higher harmonics of the input signal waveform.
Type: Grant
Filed: January 26, 1999
Date of Patent: August 21, 2001
Assignee: Sony Corporation
Inventors: Akira Inoue, Masayuki Nishiguchi
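The phase-detection steps described in this abstract (zero-pad one pitch cycle, transform, take the arctangent of imaginary over real, interpolate for the harmonics) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names are hypothetical, the padded length is assumed to be the next power of two, and phase wrapping between neighboring bins is ignored for simplicity.

```python
import cmath
import math


def next_pow2(n):
    """Return the smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def bin_phases(cycle):
    """Zero-pad one pitch cycle to a power-of-two length and return
    the phase (arctangent of imag/real) of each bin up to Nyquist."""
    size = next_pow2(len(cycle))
    padded = list(cycle) + [0.0] * (size - len(cycle))
    spectrum = fft(padded)
    return [math.atan2(c.imag, c.real) for c in spectrum[: size // 2 + 1]]


def harmonic_phase(phases, fft_size, cycle_len, m):
    """Linearly interpolate the bin phases at the (possibly fractional)
    bin position of the m-th harmonic. The caller must keep m below
    the Nyquist limit. Assumption: no phase unwrapping is applied."""
    pos = m * fft_size / cycle_len
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(phases) - 1)
    frac = pos - lo
    return (1 - frac) * phases[lo] + frac * phases[hi]
```

When the cycle length is already a power of two, each harmonic falls exactly on a bin and the interpolation step returns that bin's phase unchanged.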
-
Patent number: 6253165
Abstract: The coder/decoder (codec) system of the present invention includes a coder and a decoder. The coder includes a multi-resolution transform processor, such as a modulated lapped transform (MLT) transform processor, a weighting processor, a uniform quantizer, a masking threshold spectrum processor, an entropy encoder, and a communication device, such as a multiplexor (MUX) for multiplexing (combining) signals received from the above components for transmission over a single medium. The decoder comprises inverse components of the encoder, such as an inverse multi-resolution transform processor, an inverse weighting processor, an inverse uniform quantizer, an inverse masking threshold spectrum processor, an inverse entropy encoder, and an inverse MUX. With these components, the present invention is capable of performing resolution switching, spectral weighting, digital encoding, and parametric modeling.
Type: Grant
Filed: June 30, 1998
Date of Patent: June 26, 2001
Assignee: Microsoft Corporation
Inventor: Henrique S. Malvar
-
Patent number: 6240384
Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those of the synthesis units, which correspond to the phonetic context clusters including phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
Type: Grant
Filed: December 3, 1996
Date of Patent: May 29, 2001
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine
-
Patent number: 6199039
Abstract: An MPEG-II audio decoder with a synthesis subband filter includes a fast IMDCT (Inverse Modified Discrete Cosine Transform) module and an IPQMF (Inverse Pseudo Quadrature Mirror Filter) module. The fast IMDCT module involves a butterfly stage of input subband samples which requires only about ¼ the amount of multiplier-accumulate computation of the ISO-suggested method. The IPQMF module involves an efficient memory configuration which requires only half the size of the standard synthesis subband filter bank.
Type: Grant
Filed: August 3, 1998
Date of Patent: March 6, 2001
Assignee: National Science Council
Inventors: Liang-Gee Chen, Tsung-Han Tsai, Yuan-Chen Liu
-
Patent number: 6182042
Abstract: A system and method for modifying a subportion of information contained in an audio signal, such as magnitude information, without substantially affecting the remaining information contained therein, such as phase information. An incoming audio signal is segmented into a sequence of overlapping windowed DFT representations, during an analysis step, and during a synthesis step the DFT representations are converted back to a time domain signal. Each of the DFT representations consists of a plurality of frequency components obtained during a period of time. Each of the frequency components is associated with a unique increment of the period. Subsequent to the analysis step, but before the synthesis step, the frequency components of the DFT representations are re-mapped so as to have a differing temporal relationship with respect to the increments of the period of time.
Type: Grant
Filed: July 7, 1998
Date of Patent: January 30, 2001
Assignee: Creative Technology Ltd.
Inventor: Alan Peevers
-
Patent number: 6173250
Abstract: An apparatus and method for speech-text-transmit communication over data networks includes speech recognition devices and text to speech conversion devices that translate speech signals input to the terminal into text and text data received from a data network into speech output signals. The speech input signals are translated into text based on phonemes obtained from a spectral analysis of the speech input signals. The text data is transmitted to a receiving party over the data network as a plurality of text data packets such that a continuous stream of text data is obtained. The receiving party's terminal receives the text data and may immediately display the text data and/or translate it into speech output signals using the text to speech conversion device. The text to speech conversion device uses speech pattern data stored in a speech pattern database for synthesizing a human voice for playing of the speech output signals using a speech output device.
Type: Grant
Filed: June 3, 1998
Date of Patent: January 9, 2001
Assignee: AT&T Corporation
Inventor: Kenneth Jong
-
Patent number: 6163769
Abstract: A text-to-speech system includes a storage device for storing a clustered set of context-dependent phoneme-based units of a target speaker. In one embodiment, decision trees are used wherein each decision tree based context-dependent phoneme-based unit is arranged based on context of at least one immediately preceding and succeeding phoneme. At least one of the context-dependent phoneme-based units represents other non-stored context-dependent phoneme units of similar sound due to similar contexts. A text analyzer obtains a string of phonetic symbols representative of text to be converted to speech. A concatenation module selects stored decision tree based context-dependent phoneme-based units from the set of decision tree based context-dependent phoneme-based units based on the context of the phonetic symbols and synthesizes the selected phoneme-based units to generate speech corresponding to the text.
Type: Grant
Filed: October 2, 1997
Date of Patent: December 19, 2000
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Hsiao-Wuen Hon, Xuedong D. Huang
-
Patent number: 6115687
Abstract: An apparatus and method that reproduces a voice signal at different rates without a change in pitch. Neighboring voice waveforms having a same length and minimum form differences from an input voice signal are selected and overlapped. An output voice waveform is then generated that is rate converted by replacing a part of the voice waveform of the input voice signal with the overlapped voice waveforms, or, alternatively, by inserting the overlapped voice waveforms into the voice waveform of the input voice signal.
Type: Grant
Filed: July 1, 1998
Date of Patent: September 5, 2000
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Naoya Tanaka, Hiroaki Takeda
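The rate-conversion idea in this abstract (select two neighboring, equal-length waveforms that differ least, overlap them, then either replace or insert) can be sketched roughly as follows. This is an illustrative reading, not the patented method: the mean-squared-difference measure, the linear cross-fade, and all function names are assumptions.

```python
def best_length(x, start, min_len, max_len):
    """Choose the segment length L for which x[start:start+L] and the
    neighboring x[start+L:start+2L] differ least (mean squared difference).
    Returns None if no candidate length fits inside the signal."""
    best, best_err = None, float("inf")
    for length in range(min_len, max_len + 1):
        if start + 2 * length > len(x):
            break
        err = sum(
            (x[start + i] - x[start + length + i]) ** 2 for i in range(length)
        ) / length
        if err < best_err:
            best, best_err = length, err
    return best


def crossfade(a, b):
    """Overlap two equal-length segments: a fades out while b fades in."""
    n = len(a)
    return [a[i] * (1 - i / n) + b[i] * (i / n) for i in range(n)]


def shorten(x, start, min_len, max_len):
    """Speed up: replace two neighboring segments with their overlap,
    so the output is one segment length shorter."""
    x = list(x)
    length = best_length(x, start, min_len, max_len)
    if length is None:
        return x
    a = x[start:start + length]
    b = x[start + length:start + 2 * length]
    return x[:start] + crossfade(a, b) + x[start + 2 * length:]


def lengthen(x, start, min_len, max_len):
    """Slow down: insert the overlap between the two neighboring segments,
    so the output is one segment length longer."""
    x = list(x)
    length = best_length(x, start, min_len, max_len)
    if length is None:
        return x
    a = x[start:start + length]
    b = x[start + length:start + 2 * length]
    # The inserted piece starts like b and ends like a so both joins stay smooth.
    return x[:start + length] + crossfade(b, a) + x[start + length:]
```

For a periodic input the chosen length tends toward the period, which is why the pitch is preserved while the duration changes.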
-
Patent number: 6081781
Abstract: Data in the same range of the fundamental frequency F_0 as speech segments are used as learning data to prepare a reference codebook CB_M for a spectrum envelope. The same learning data for a higher range than F_0 and the same learning data for a lower range are subject to a linear stretch matching with respect to the learning data for the range F_0. For each vector code in the reference codebook CB_M, the spectrum envelope is clustered to prepare a high range codebook CB_H and a low range codebook CB_L. The spectrum envelope of input speech segments is fuzzy vector quantized (S402) with the reference codebook, and depending on the synthesized F_0, a high, middle or low codebook is selected. The selected codebook is used to decode the fuzzy vector quantized code, and the decoded output is subject to the inverse FFT. Alternatively, codebooks CM_MH and CB_ML each comprising differential vectors for corresponding code vectors between CB_M and CB_H and between CB.
Type: Grant
Filed: September 9, 1997
Date of Patent: June 27, 2000
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Kimihito Tanaka, Masanobu Abe
-
Patent number: 6070138
Abstract: In order to provide a practical E-mail reader for reading out E-mails phonetically enabling easy grasping of their contents by a user with its vocal output even when quotation codes or header information are included in the E-mails, a phonetic E-mail reader of the invention comprises a speech synthesizer (102) for converting text data into vocal data, quotation code storing means (105) for storing quotation codes used for indicating a quotation line inserted at a top of the quotation line, and quotation code elimination means (106) for detecting and eliminating a quotation code inserted at tops of quotation lines referring to the quotation code storing means (105) before supplying the quotation lines to the speech synthesizer (102).
Type: Grant
Filed: December 26, 1996
Date of Patent: May 30, 2000
Assignee: NEC Corporation
Inventor: Kazuhiko Iwata
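The quotation-code elimination step in this abstract lends itself to a small illustration. The sketch below assumes a hypothetical stored set of quotation codes (`>` and `|`) and hypothetical function names; the abstract does not say which codes are stored or how nested quotes are handled.

```python
# Hypothetical stand-in for the "quotation code storing means".
QUOTE_CODES = (">", "|")


def strip_quote_codes(line, codes=QUOTE_CODES):
    """Remove any run of stored quotation codes (and the spaces that
    follow them) from the top of a line; other lines pass through."""
    i = 0
    while i < len(line):
        matched = next((c for c in codes if line.startswith(c, i)), None)
        if matched is None:
            break
        i += len(matched)
        while i < len(line) and line[i] == " ":
            i += 1
    return line[i:]


def prepare_for_speech(body, codes=QUOTE_CODES):
    """Return the message lines with quotation codes removed, ready to
    hand to a speech synthesizer (not modeled here)."""
    return [strip_quote_codes(line, codes) for line in body.splitlines()]
```

For example, `prepare_for_speech("> > see you at 5\nOK")` yields the two lines `see you at 5` and `OK`, so the synthesizer never reads the markers aloud.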
-
Patent number: 6021388
Abstract: A speech synthesis apparatus for outputting synthesized speech on the basis of a parameter sequence of a speech waveform includes a parameter generation unit which generates a parameter sequence for speech synthesis on the basis of a character sequence input by a character sequence input unit, and stores the generated parameter sequence in a parameter storage unit. A waveform generation unit is also provided that generates pitch waveforms each for one pitch period on the basis of synthesis parameters and pitch scales included in the parameter sequence, and generates a speech waveform by connecting the generated pitch waveforms in accordance with frame lengths set by a frame length setting unit.
Type: Grant
Filed: December 19, 1997
Date of Patent: February 1, 2000
Assignee: Canon Kabushiki Kaisha
Inventors: Mitsuru Otsuka, Yasunori Ohora, Takashi Aso, Yasuo Okutani
-
Patent number: 5974376
Abstract: The present invention relates to a method for transmitting multiresolution audio signals via wireless devices in a radio frequency communication system wherein audio signals are decomposed into levels of resolution. The audio signal is decomposed into levels including a base signal at a base transmission rate and one or more signal details and input into a code rate selector, controlled by either party to the communication. The base signal represents the coarsest resolution or quality of the signal. Each signal detail, when added to the base signal, improves the resolution of the signal by increasing the detail and the transmission rate. An audio receiving unit transmits a request for audio transmission to the audio transmitting unit. In response to the initial request, the base signal is transmitted to the audio receiving unit. If the base signal is insufficient, the sound quality can be increased incrementally by sending further requests to transmit additional signal detail from the code rate selector.
Type: Grant
Filed: October 10, 1996
Date of Patent: October 26, 1999
Assignee: Ericsson, Inc.
Inventors: Amer Hassan, David G. Matthews
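The base-plus-details decomposition in this abstract can be illustrated with a simple pairwise average/difference (Haar-style) scheme. The abstract does not name a particular transform, so this choice, along with the function names, is an assumption made only to show how each extra detail layer refines the reconstruction.

```python
def decompose(signal, levels):
    """Split a signal into a coarse base plus `levels` detail layers.
    The signal length must be divisible by 2**levels.
    Details are returned coarsest first."""
    base = list(signal)
    details = []
    for _ in range(levels):
        avg = [(base[i] + base[i + 1]) / 2 for i in range(0, len(base), 2)]
        diff = [(base[i] - base[i + 1]) / 2 for i in range(0, len(base), 2)]
        details.insert(0, diff)
        base = avg
    return base, details


def reconstruct(base, received_details, total_levels):
    """Rebuild the signal from the base and whichever detail layers have
    been received so far; missing layers are treated as all zeros."""
    out = list(base)
    for level in range(total_levels):
        if level < len(received_details):
            diff = received_details[level]
        else:
            diff = [0.0] * len(out)
        expanded = []
        for a, d in zip(out, diff):
            expanded += [a + d, a - d]
        out = expanded
    return out
```

A receiver that has only the base plays a blocky approximation; each further request adds one entry to `received_details`, and with all layers present the original samples are recovered exactly.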
-
Patent number: 5970440
Abstract: A method is described for short-time Fourier-converting a speech signal and for resynthesizing an output speech signal from the modulus of its short-time Fourier transform and from an initial phase. In particular, after the Fourier converting, the signal is subjected to a phase-specifying operation. Subsequently speech duration is affected by systematically maintaining, periodically repeating or periodically suppressing result intervals of the successive Fourier converting and phase affecting. Finally, a resynthesizing operation is executed. Speech pitch can likewise be affected through systematically excising or inserting signal intervals. Finally, the two strategies can be combined, so that ultimately, pitch and duration can be affected independently from each other.
Type: Grant
Filed: November 22, 1996
Date of Patent: October 19, 1999
Assignee: U.S. Philips Corporation
Inventors: Raymond N. J. Veldhuis, Haiyan He
-
Patent number: 5970454
Abstract: Synthetic speech is generated by production of a digital waveform from a text in phonemes. A linked database is used which comprises an extended text in phonemes and its equivalent in the form of a digital waveform. The two portions of the database are linked by a parameter which establishes equivalent points in both the phoneme text and the digital waveform. The input text (in phonemes) is analyzed to locate a matching portion in the phoneme portion of the database. This matching utilizes exact equivalence of phonemes where this is possible; otherwise relation between phonemes is utilized. The selection process identifies input phonemes in context whereby improved conversions are obtained. Having analyzed the input text into matching strings in the input form of the database, beginning and ending parameters for the sections are established. The output text is produced by abutting sections of the digital waveform and defined by the beginning and ending parameters.
Type: Grant
Filed: April 23, 1997
Date of Patent: October 19, 1999
Assignee: British Telecommunications public limited company
Inventor: Andrew Paul Breen
-
Patent number: 5832437
Abstract: A method for decoding encoded speech signals uses sine wave synthesis based on harmonics of the original speech signal. The harmonics are obtained by transforming the original speech signal from a time domain to a frequency domain, and the harmonics are arranged as sequential frames with the harmonics of a given frame having a pitch period that may or may not be the same as the pitch period of another frame. According to the decoding method, data arrays respectively containing amplitude data and phase data of the harmonics are zero-padded to provide the arrays with a pre-set number of elements. Inverse orthogonal transformation of the data arrays produces time domain information used to generate a time domain waveform signal for restoring the encoded speech signals. The different pitch periods of the frames are normalized to each other either by smooth (continuous) or acute (discontinuous) interpolation depending on the degree of change in the pitch period between the frames.
Type: Grant
Filed: August 16, 1995
Date of Patent: November 3, 1998
Assignee: Sony Corporation
Inventors: Masayuki Nishiguchi, Jun Matsumoto
-
Patent number: 5809468
Abstract: A voice recording/reproducing apparatus comprises a coding parameter extracting section for extracting a coding parameter by use of either past voice data or past parameter. A coding section codes voice data by use of the coding parameter extracted by the coding parameter extracting section. A predicting section predicts a decoding signal by use of either past decoded voice data corresponding to coded voice data from the voice coding means or the past parameter. A voice decoding section decodes the voice data by use of the predicted decoding signal. A voice synthesizing section outputs voice data synthesized based on an output signal from the predicting section and an output signal from the voice decoding section. An initializing section initializes at least one of either a content of the predicting section or a content of the voice synthesizing section in accordance with a reproducing position of recorded voice data.
Type: Grant
Filed: October 18, 1995
Date of Patent: September 15, 1998
Assignee: Olympus Optical Co., Ltd.
Inventors: Hidetaka Takahashi, Hideo Okano
-
Patent number: 5774835
Abstract: A second spectrum parameter whose degree is lower than that of a first spectrum parameter is calculated based on the first spectrum parameter that is output from an encoder. A spectrum postfilter generates a transfer function having a denominator and a numerator wherein said first spectrum parameter is included in said denominator and said second spectrum parameter is included in said numerator, and filters the reduced signal with this transfer function. In addition, it adaptively generates a compensation coefficient based on the first and second parameters. A compensation filter generates a transfer function based on the compensation coefficient and filters an output of the spectrum postfilter with this transfer function.
Type: Grant
Filed: August 21, 1995
Date of Patent: June 30, 1998
Assignee: NEC Corporation
Inventor: Kazunori Ozawa
-
Patent number: 5765126
Abstract: A signal encoding apparatus for encoding an acoustic signal. This signal encoding apparatus includes a transform circuit for transforming an inputted acoustic signal into frequency components, a signal component separating circuit for separating an output of the transform circuit into tone characteristic components and noise characteristic components, a tone characteristic encoding circuit for encoding a signal of tone characteristic components, and a noise characteristic component encoding circuit for encoding a signal of noise characteristic components, wherein the tone characteristic component encoding circuit encodes respective signal components of the signal of tone characteristic components so that they respectively have different code lengths to thereby improve efficiency of encoding without degrading sound quality with respect to acoustic signal of tone characteristic.
Type: Grant
Filed: April 17, 1995
Date of Patent: June 9, 1998
Assignee: Sony Corporation
Inventors: Kyoya Tsutsui, Mito Sonohara
-
Patent number: 5758320
Abstract: A text-to-voice audio output unit includes a storage section for storing analyzed information pertaining to words, boundaries between articulations, and accents obtained by analyzing an input character list, a voice synthesis rule section for changing a reduction or damping characteristic of a phrase component of a fundamental frequency of an output voice, and a voice synthesizing section for generating a composite tone based on the analyzed information from the storage section. The reduction or damping characteristic, calculated for each phrase component, is overdamped, critically damped, or underdamped and is based on speech rate, syntactic information, number of articulations, and positional information. When a prosodic phrase is short, the reduction or damping characteristic causes a decrease in the fundamental frequency for a meaningfully-delimited portion, and when a prosodic phrase is long, the reduction or damping characteristic is controlled over the entire prosodic phrase.
Type: Grant
Filed: June 12, 1995
Date of Patent: May 26, 1998
Assignee: Sony Corporation
Inventor: Yasuharu Asano
-
Patent number: 5737718
Abstract: A method for encoding the information is provided which realizes a high encoding efficiency especially for tonal acoustic signals without lowering the sound quality. An acoustic signal from a terminal is transformed by a transform circuit into spectral signals which are then normalized and quantized by a signal component encoding circuit for encoding from one encoding unit to another. The encoding unit configuration is selected by an encoding unit configuration decision circuit from plural encoding unit configurations depending upon the shape of distribution of the spectral components. An encoding unit of narrow low bandwidth is selected for a tonal signal.
Type: Grant
Filed: June 8, 1995
Date of Patent: April 7, 1998
Assignee: Sony Corporation
Inventor: Kyoya Tsutsui