Interpolation Patents (Class 704/265)
  • Publication number: 20080140409
    Abstract: A method for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder receives encoded frames of compressed speech information transmitted from an encoder. The method determines whether an encoded frame has been lost, corrupted in transmission, or erased, synthesizes properly received frames, and decides on an overlap-add window to use in combining a portion of the synthesized speech signal with a subsequent speech signal resulting from a received and decoded packet, where the size of the overlap-add window is based on the unavailability of packets. If it is determined that an encoded frame has been lost, corrupted in transmission, or erased, the method performs an overlap-add operation on the portion of the synthesized speech signal and the subsequent speech signal, using the decided-on overlap-add window.
    Type: Application
    Filed: September 12, 2006
    Publication date: June 12, 2008
    Inventor: David A. Kapilow
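The overlap-add step described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the linear window shape, the function names, and the rule mapping the number of lost frames to a window length (`base`, `step`, `max_len`) are all assumptions made for the example.

```python
def ola_window_length(lost_frames, base=40, step=40, max_len=120):
    """Hypothetical rule: the longer the erasure, the longer the window (in samples)."""
    return min(base + step * max(lost_frames - 1, 0), max_len)


def overlap_add(concealed_tail, received_head):
    """Cross-fade two equal-length sample lists with a linear window.

    concealed_tail: end of the synthesized (concealment) signal, faded out.
    received_head:  start of the signal decoded from the next good packet, faded in.
    """
    n = len(concealed_tail)
    if n != len(received_head):
        raise ValueError("segments must have equal length")
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # fade-in weight for the received signal
        out.append((1.0 - w) * concealed_tail[i] + w * received_head[i])
    return out
```

A caller would pick the window length from the erasure duration, slice that many samples from each signal, and replace the overlap region with the cross-faded result.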
  • Publication number: 20080126096
    Abstract: An error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on a predetermined criterion when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on a predetermined criterion when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
    Type: Application
    Filed: October 31, 2007
    Publication date: May 29, 2008
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Eun-mi OH, Ki-hyun Choo, Ho-sang Sung, Chang-yong Son, Jung-hoe Kim, Kang-eun Lee
  • Publication number: 20080071541
    Abstract: An audio signal interpolation device comprises a spectral movement calculation unit which determines a spectral movement which is indicative of a difference in each of spectral components between a frequency spectrum of a current frame of an input audio signal and a frequency spectrum of a previous frame of the input audio signal stored in a spectrum storing unit. An interpolation band determination unit determines a frequency band to be interpolated by using the frequency spectrum of the current frame and the spectral movement. A spectrum interpolation unit performs interpolation of spectral components in the frequency band for the current frame by using either the frequency spectrum of the current frame or the frequency spectrum of the previous frame.
    Type: Application
    Filed: July 25, 2007
    Publication date: March 20, 2008
    Applicant: Fujitsu Limited
    Inventors: Masakiyo Tanaka, Masanao Suzuki, Miyuki Shirakawa, Takashi Makiuchi
  • Patent number: 7318034
    Abstract: A voice signal interpolation apparatus is provided which can restore original human voices from human voices in a compressed state while maintaining a high sound quality. When a voice signal representative of a voice to be interpolated is acquired by a voice data input unit 1, a pitch deriving unit 2 filters this voice signal to identify a pitch length from the filtering result. A pitch length fixing unit 3 makes the voice signal have a constant time length of a section corresponding to a unit pitch, and generates pitch waveform data. A sub-band dividing unit 4 converts the pitch waveform data into sub-band data representative of a spectrum. A plurality of sub-band data pieces are averaged by an averaging unit 5, and thereafter a sub-band synthesizing unit 6 converts the sub-band data pieces into a signal representative of a waveform of the voice.
    Type: Grant
    Filed: May 28, 2003
    Date of Patent: January 8, 2008
    Assignee: Kabushiki Kaisha Kenwood
    Inventor: Yasushi Sato
  • Patent number: 7307981
    Abstract: An apparatus for converting voice packets transmitted/received through a network includes a first transcoder for performing at least one of bit-unpacking and unquantization on an encoded packet at a first encoder, namely transmitting party, to obtain an LSP (Line Spectrum Pair) parameter of the first encoder, and converting and unquantizing the LSP parameter to an LSP parameter of a second encoder, namely receiving party, to do bit-packing. A second transcoder performs at least one of bit-unpacking and unquantization on an encoded packet at the second encoder, namely transmitting party, to obtain an LSP parameter of the second encoder, and converts and unquantizes the LSP parameter to an LSP parameter of the first encoder, namely receiving party, to do bit-packing.
    Type: Grant
    Filed: September 19, 2002
    Date of Patent: December 11, 2007
    Assignee: LG Electronics Inc.
    Inventors: Yong Soo Choi, Dae Hee Youn, Kyung Tae Kim
  • Patent number: 7283961
    Abstract: There is disclosed a speech processing device in which prediction taps for finding prediction values of the speech of high sound quality are extracted from the synthesized sound obtained on affording linear prediction coefficients and residual signals, generated from a preset code, to a speech synthesis filter, speech of high sound quality being higher in sound quality than the synthesized sound, and in which the prediction taps are used along with preset tap coefficients to perform preset predictive calculations to find the prediction values of the speech of high sound quality.
    Type: Grant
    Filed: August 3, 2001
    Date of Patent: October 16, 2007
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Tsutomu Watanabe, Masaaki Hattori, Hiroto Kimura, Yasuhiro Fujimori
  • Patent number: 7280968
    Abstract: A method for digitally generating speech with improved prosodic characteristics can include receiving a speech input, determining at least one prosodic characteristic contained within the speech input, and generating a speech output including the prosodic characteristic within the speech output.
    Type: Grant
    Filed: March 25, 2003
    Date of Patent: October 9, 2007
    Assignee: International Business Machines Corporation
    Inventor: Oscar J. Blass
  • Patent number: 7263080
    Abstract: An integrated circuit for streaming media over wireless networks is disclosed. The integrated circuit includes a media module that is designed to process media data. When non-media data comes in, switching means is provided to prevent the non-media data from being processed in the media module. One of the important features of the integrated circuit is the underlying designs that are capable of facilitating wireless communication in different wireless networks. In one embodiment, a baseband processor is provided to facilitate wireless communications in more than one standard. The baseband processor is uniquely designed to facilitate wireless communications in a Wi-Fi network as well as a WiMAX network. As a result, the same chips may be used to stream media data across different wireless networks.
    Type: Grant
    Filed: April 15, 2006
    Date of Patent: August 28, 2007
    Assignee: RDW, Inc.
    Inventors: Robin Yubin Zhu, Chung-Hsing Chang, Ted Hsiung
  • Patent number: 7249020
    Abstract: A method and a system of producing a synthesized voice are provided. A voice sound waveform is provided at a voice sampling frequency based on pronunciation information. A voice-less sound waveform is produced at a voice-less sampling frequency based on the pronunciation information. The voice sampling frequency is converted into an output sampling frequency to produce a frequency-converted voice sound waveform with the output sampling frequency, wherein each of the voice sampling frequency and the voice-less sampling frequency is independent from the output sampling frequency. The voice-less sampling frequency is converted into the output sampling frequency to produce a frequency-converted voice-less sound waveform with the output sampling frequency.
    Type: Grant
    Filed: April 18, 2002
    Date of Patent: July 24, 2007
    Assignee: NEC Corporation
    Inventor: Reishi Kondo
  • Patent number: 7224853
    Abstract: A set of known data samples are identified and an approximation of an original function from which the known data samples were obtained is created. The approximation function is then resampled to obtain desired values that are not contained in the set of known data samples.
    Type: Grant
    Filed: May 29, 2002
    Date of Patent: May 29, 2007
    Assignee: Microsoft Corporation
    Inventor: Shankar Moni
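The approximate-then-resample idea in the abstract above can be illustrated with a short sketch. The abstract does not say which approximation is used; piecewise-linear interpolation is assumed here purely for simplicity, and the function name `make_linear_approximation` is hypothetical.

```python
from bisect import bisect_right


def make_linear_approximation(xs, ys):
    """Return a function approximating the original signal from known samples.

    xs must be strictly increasing. Linear interpolation is an assumption of
    this sketch; the abstract leaves the approximation method unspecified.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two samples with matching lengths")

    def approx(x):
        # Clamp outside the known range instead of extrapolating.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        j = bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
        t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
        return ys[j - 1] + t * (ys[j] - ys[j - 1])

    return approx


# Resample at positions that are not in the known set.
approx = make_linear_approximation([0, 1, 2], [0.0, 10.0, 20.0])
resampled = [approx(x) for x in (0.5, 1.5)]
```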
  • Patent number: 7177812
    Abstract: A method for conversion of input audio frequency data, at an input sample frequency, to output audio frequency data, at an output sample frequency. The input data is subjected to expansion to produce expanded data at an output sample frequency. The expanded data is interpolated to produce output data. In one embodiment of the invention the interpolation is effected by a process that also filters the output data. In another embodiment, the input data is sampled by an integer factor to produce expanded data, and the expanded data is then interpolated to produce the output data. Also disclosed is a method of transition of a signal output, at one frequency, to a signal output at another frequency. The signal output at said one frequency is faded out over a period, and the signal output at said other frequency is faded in over that period. Both signal outputs are combined to produce the signal output over said period. Apparatus for effecting the methods is also disclosed.
    Type: Grant
    Filed: June 23, 2000
    Date of Patent: February 13, 2007
    Assignee: STMicroelectronics Asia Pacific PTE Ltd
    Inventors: Mohammed Javed Absar, Sapna George, Antonio Mario Alvarez-Tinoco
  • Patent number: 7143032
    Abstract: A method and system are provided for removing discontinuities associated with synthesizing a corrupted frame output from a decoder including one or more predictive filters. The corrupted frame is representative of one segment of a decoded signal. The method comprises copying a first number of stored samples of the decoded signal in accordance with a time lag and a scaling factor, and calculating a first number of ringing samples output from at least one of the filters.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: November 28, 2006
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Patent number: 7136876
    Abstract: A method and apparatus for building an abbreviation dictionary involves searching through a set of source documents. The abbreviations having likely definitions are identified and the definitions extracted from the document. The definitions having identical associated abbreviations are grouped together. The definition groups are each arranged into clusters based on an n-gram or other combinatorial method to determine similar definitions. Further disambiguation is provided by looking at similarity between clusters using an annotation associated with the source documents from which the definitions were extracted.
    Type: Grant
    Filed: March 3, 2003
    Date of Patent: November 14, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Eytan Adar, Lada A. Adamic
  • Patent number: 7085724
    Abstract: The invention relates to a linking unit 100, a parametric encoder 400 and a method for generating linking information L indicating components of consecutive extended segments sp and sc which may be linked together in order to form a sinusoidal track. The segments sp and sc approximate consecutive segments of a sinusoidal audio or speech signal s. The linking unit comprises a calculating unit 120 for generating a similarity matrix S(m,n) in response to received sinusoidal code data and an evaluating unit 140 for receiving and evaluating said similarity matrix S in order to generate said linking information by selecting those pairs of components m,n the similarity of which is maximal. According to the invention the calculating unit 120 is adapted to calculate the similarity matrix S by additionally considering information about the phase consistency between the components of the extended previous segment sp and the extended current segment sc.
    Type: Grant
    Filed: January 14, 2002
    Date of Patent: August 1, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Albertus Cornelis Den Brinker, Arnoldus Werner Johannes Oomen, Fransiscus Marinus Jozephus De Bont, Erik Gosuinus Petrus Schuijers
  • Patent number: 7031912
    Abstract: A speech coding apparatus includes a frequency parameter generating unit that generates LSP coefficients of an input signal. When the input signal is a non-speech signal, it generates the LSP coefficients of the non-speech signal in such a manner that they approach the LSP coefficients of the speech signal. Thus, even when the input signal is the non-speech signal, its LSP coefficients are quantized by referring to the LSP quantization codebook which is specifically prepared for the speech signal. Although a conventional speech coding apparatus has a problem in that even when it transmits the non-speech signal in a good condition, a conventional speech decoding apparatus cannot always decode the non-speech signal correctly, the present speech coding apparatus can solve the problem even when the receiving side uses the conventional speech decoding apparatus.
    Type: Grant
    Filed: July 30, 2001
    Date of Patent: April 18, 2006
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hisashi Yajima, Shigeaki Suzuki, Hideaki Ebisawa
  • Patent number: 7010488
    Abstract: A system and method are used to compress concatenative acoustic inventories for speech. Instead of using general purpose signal compression methods such as vector quantization, the method of the invention uses multiple properties of acoustic inventories to reduce the size of the acoustic inventories, such as the close acoustic match property and acoustic units that are labeled with sufficiently fine distinctions such that between any two phones no events occur that are substantially distinct from these two phones. The close acoustic match property is where acoustic units that share the same phone are acoustically similar at the points where these units may be concatenated. By utilizing multiple properties of acoustic units, the number of parameters per unit that are stored as LPC parameters is minimized. As a result, smaller storage devices may be used due to the reduction of the size of the storage requirements.
    Type: Grant
    Filed: May 9, 2002
    Date of Patent: March 7, 2006
    Assignee: Oregon Health & Science University
    Inventors: Jan P. H. van Santen, Alexander Kain
  • Patent number: 6967277
    Abstract: Embodiments of the invention comprise a new device and technique to realize a utilization for providing a system, method, and apparatus for providing an improved audio tone control and generation. More specifically, embodiments of the present invention relate to systems, methods, and apparatuses for an electronically improved audio tone control and generation that is adaptable for utilization in cooperation with a Musical Instrument Digital Interface (“MIDI”). In a business method embodiment, the user may pay a monthly fee or a licensing fee for an audio tone control and generation service, or alternatively may pay a per-session fee or a fee based upon data size and/or amount of data manipulation.
    Type: Grant
    Filed: August 12, 2003
    Date of Patent: November 22, 2005
    Inventor: William Querfurth
  • Patent number: 6934684
    Abstract: The invention provides a system, method, and business model for an information system and service having business self-promotion, promotion and promotion tracking, loyalty or frequent participant rewards and redemption, audio coupon, ratings, and other features. A business or organization in which consumers call into a service using ordinary telephone, PC, PDA, or other information appliance, and make requests in plain speech for information on goods and/or services, and the service provides responses to the request in plain speech in real-time.
    Type: Grant
    Filed: January 17, 2003
    Date of Patent: August 23, 2005
    Assignee: Dialsurf, Inc.
    Inventors: Ahmet Alpdemir, Arthur James
  • Patent number: 6934649
    Abstract: The present invention provides a waveform detection system and a state-monitoring system. The waveform detection system features a signal-processing function that characterizes and detects non-cyclic transient variations and performs 1/f fluctuation conversion for input waveforms to derive output waveforms. The waveform detection system characterizes signs of state variation, incorporates multiple digital filters in the digital filter calculator of the computer, uses coefficient patterns derived from non-integer n-time integration as elemental patterns for multiplication coefficient patterns, and incorporates a manner of changing the phase of at least one of the elemental patterns, input signal data, and digital filter output so that the outputs of digital filters that use the elemental patterns are synthesized in a state where a portion of the phases of the characteristic extracting and processing function is changed.
    Type: Grant
    Filed: May 15, 2001
    Date of Patent: August 23, 2005
    Assignee: Synchro Kabushiki Kaisha
    Inventors: Youichi Ageishi, Tetsuyuki Wada
  • Patent number: 6915261
    Abstract: A system and method for matching voice characteristics of a synthetic disc jockey are presented. A first segment of audio signal and a second segment of audio signal are received by a sound characteristic estimator. Corresponding first and second sets of sound characteristics are determined by the sound characteristic estimator. A voice characteristic transition for the disc jockey is interpolated from the first and second set of sound characteristics between a starting and an ending time.
    Type: Grant
    Filed: March 16, 2001
    Date of Patent: July 5, 2005
    Assignee: Intel Corporation
    Inventor: Steven E. Barile
  • Patent number: 6907398
    Abstract: A method is described for compressing the storage space required by HMM prototypes in an electronic memory. For this purpose prescribed HMM prototypes are mapped onto compressed HMM prototypes with the aid of a neural network (encoder). These can be stored with a smaller storage space than the uncompressed HMM prototypes. A second neural network (decoder) serves to reconstruct the HMM prototypes.
    Type: Grant
    Filed: September 6, 2001
    Date of Patent: June 14, 2005
    Assignee: Siemens Aktiengesellschaft
    Inventor: Harald Hoege
  • Patent number: 6901069
    Abstract: A method of compensating within a receiving endpoint for lost audio packets transmitted across an IP network, comprising the steps of storing a packet buffer of samples as a plurality of sub packets within a jitter buffer, inserting at least one interpolated sub packet between successive sub packets in the buffer, and playing out the sub packets from the buffer, such that only small portions of the jitter buffer are replayed at specific times to minimize the negative effects on voice quality. The inventive method inserts the replayed portions to compensate for packet loss in a way that results in only a relatively low processing burden.
    Type: Grant
    Filed: March 6, 2001
    Date of Patent: May 31, 2005
    Assignee: Mitel Networks Corporation
    Inventor: Roger Bastin
  • Patent number: 6804651
    Abstract: Initially, voice signal components (4) are extracted from the audio signal (1) in a procedure for determining a measure of quality (2) of an audio signal (1). Based on this signal, a reference signal (6) is then generated by means of noise suppression (7) and interruption interpolation (8). This signal is compared with the voice signal (4) and an intrusive quality value (10) is determined in this way. A further quality value (15) is determined by establishing and evaluating (12, 14) codec-related signal distortions in the voice signal (4). Another quality value (17) is generated from the information relating to the detected signal interruptions (8). The measure of quality (2) is finally determined as a linear combination (16) of the various quality values (10, 15, 17, 18).
    Type: Grant
    Filed: March 19, 2002
    Date of Patent: October 12, 2004
    Assignee: Swissqual AG
    Inventors: Pero Juric, Bendicht Thomet
  • Patent number: 6760703
    Abstract: A speech synthesis method that generates a speech pitch wave from a reference speech signal by subjecting the reference speech signal to one of Fourier transform and Fourier series expansion to produce a discrete spectrum, that interpolates the discrete spectrum to generate a consecutive spectrum, and that subjects the consecutive spectrum to inverse Fourier transform. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The speech pitch wave is subjected to inverse-filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored as information of a speech synthesis unit in a voice period. A speech is then synthesized using the information of the speech synthesis unit.
    Type: Grant
    Filed: October 7, 2002
    Date of Patent: July 6, 2004
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takehiko Kagoshima, Masami Akamine
  • Patent number: 6691092
    Abstract: A system determines a voicing measure as a measure of the degree of signal periodicity and uses the determined voicing measure to quantize the spectral magnitude of the slowly evolving waveform (SEW) and the modeling of the SEW and rapidly evolving waveform (REW) phase spectra.
    Type: Grant
    Filed: April 4, 2000
    Date of Patent: February 10, 2004
    Assignee: Hughes Electronics Corporation
    Inventors: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria
  • Patent number: 6678657
    Abstract: The present invention relates to a method and an apparatus for a robust feature extraction for speech recognition in a noisy environment, wherein the speech signal is segmented and is characterized by spectral components. The speech signal is split into a number of short-term spectral components in L subbands, with L=1, 2, . . . and a noise spectrum from segments that only contain noise is estimated. Then a spectral subtraction of the estimated noise spectrum from the corresponding short-term spectrum is performed and a probability for each short-term spectrum component to contain noise is calculated. Finally, those spectral components of each short-term spectrum that have a low probability of containing speech are interpolated in order to smooth those short-term spectra that only contain noise. With the interpolation, the spectral components containing noise are interpolated by reliable spectral speech components that could be found in the neighborhood.
    Type: Grant
    Filed: October 23, 2000
    Date of Patent: January 13, 2004
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventors: Raymond Brückner, Hans-Günter Hirsch, Rainer Klisch, Volker Springer
  • Patent number: 6668029
    Abstract: Methods and apparatus for implementing digital resampling circuits which create one or more bitstreams which include samples at desired rates, from an input bitstream having a fixed sample rate, are described. The resampling circuits of the present invention achieve the desired sample rates by performing digital interpolation on samples included in the input signal. The interpolation is performed using a filter, e.g., an all-pass infinite impulse response filter which produces an output as a function of a controllable signal delay.
    Type: Grant
    Filed: October 15, 1999
    Date of Patent: December 23, 2003
    Assignee: Hitachi America, Ltd.
    Inventors: Joshua L. Koslov, Frank Anton Lane
  • Publication number: 20030177011
    Abstract: An interpolation device for judging a state of sounds of a frame at which an error or a loss has occurred in the audio data and carrying out the interpolation according to that state is constructed by an input unit for entering the audio data, a detection unit for detecting the error or the loss of each frame of the audio data, an estimation unit for estimating the interpolation information of the frame at which the error or the loss is detected, and an interpolation unit for interpolating the frame at which the error or the loss is detected, by using the interpolation information estimated for that frame by the estimation unit.
    Type: Application
    Filed: December 16, 2002
    Publication date: September 18, 2003
    Inventors: Yasuyo Yasuda, Tomoyuki Ohya, Sanae Hotani
  • Patent number: 6594626
    Abstract: Disclosed is a voice encoding method having a synthesis filter implemented using linear prediction coefficients obtained by dividing an input signal into frames each of a fixed length, and subjecting the input signal to linear prediction analysis in the frame units, generating a reconstructed signal by driving said synthesis filter by a periodicity signal output from an adaptive codebook and a pulsed signal output from an algebraic codebook, and performing encoding in such a manner that an error between the input signal and said reproduced signal is minimized, wherein there are provided an encoding mode 1 that uses pitch lag obtained from an input signal of a present frame and an encoding mode 2 that uses pitch lag obtained from an input signal of a past frame. Encoding is performed in encoding mode 1 and encoding mode 2, the mode in which the input signal can be encoded more precisely is decided frame by frame and encoding is carried out on the basis of the mode decided.
    Type: Grant
    Filed: January 8, 2002
    Date of Patent: July 15, 2003
    Assignee: Fujitsu Limited
    Inventors: Masanao Suzuki, Yasuji Ota, Yoshiteru Tsuchinaga
  • Patent number: 6584438
    Abstract: A frame erasure compensation method in a variable-rate speech coder includes quantizing, with a first encoder, a pitch lag value for a current frame and a first delta pitch lag value equal to the difference between the pitch lag value for the current frame and the pitch lag value for the previous frame. A second, predictive encoder quantizes only a second delta pitch lag value for the previous frame (equal to the difference between the pitch lag value for the previous frame and the pitch lag value for the frame prior to that frame). If the frame prior to the previous frame is processed as a frame erasure, the pitch lag value for the previous frame is obtained by subtracting the first delta pitch lag value from the pitch lag value for the current frame. The pitch lag value for the erasure frame is then obtained by subtracting the second delta pitch lag value from the pitch lag value for the previous frame.
    Type: Grant
    Filed: April 24, 2000
    Date of Patent: June 24, 2003
    Assignee: Qualcomm Incorporated
    Inventors: Sharath Manjunath, Pengjun Huang, Eddie-Lun Tik Choy
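The pitch lag recovery arithmetic stated in the abstract above reduces to two subtractions. The sketch below only restates that arithmetic; the function and parameter names are hypothetical, and the quantization and decoding of the values are not modeled.

```python
def recover_pitch_lags(current_lag, first_delta, second_delta):
    """Recover pitch lags after a frame erasure, following the abstract.

    first_delta:  lag(current) - lag(previous), sent with the current frame.
    second_delta: lag(previous) - lag(frame before previous), sent with the
                  previous (predictively coded) frame.
    Returns (previous_lag, erased_lag).
    """
    previous_lag = current_lag - first_delta
    erased_lag = previous_lag - second_delta
    return previous_lag, erased_lag
```

For example, with a current lag of 60, a first delta of 4, and a second delta of -2, the previous frame's lag is 56 and the erased frame's lag is 58.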
  • Publication number: 20030093278
    Abstract: A system and method are disclosed for extending the bandwidth of a narrowband signal such as a speech signal. The method applies a parametric approach to bandwidth extension but does not require training. The parametric representation relates to a discrete acoustic tube model (DATM). The method comprises computing narrowband linear predictive coefficients (LPCs) from a received narrowband speech signal, computing narrowband partial correlation coefficients (parcors) using recursion, computing Mnb area coefficients from the partial correlation coefficient, and extracting Mwb area coefficients using interpolation. Wideband parcors are computed from the Mwb area coefficients and wideband LPCs are computed from the wideband parcors.
    Type: Application
    Filed: October 4, 2001
    Publication date: May 15, 2003
    Inventor: David Malah
  • Publication number: 20030093279
    Abstract: A system and method are disclosed for extending the bandwidth of a narrowband signal such as a speech signal. The method applies a parametric approach to bandwidth extension but does not require training. The parametric representation relates to a discrete acoustic tube model (DATM). The method comprises computing narrowband linear predictive coefficients (LPCs) from a received narrowband speech signal, computing narrowband partial correlation coefficients (parcors) using recursion, computing Mnb area coefficients from the partial correlation coefficient, and extracting Mwb area coefficients using interpolation. Wideband parcors are computed from the Mwb area coefficients and wideband LPCs are computed from the wideband parcors.
    Type: Application
    Filed: October 4, 2001
    Publication date: May 15, 2003
    Inventors: David Malah, Richard Vandervoort Cox
  • Patent number: 6553343
    Abstract: A speech synthesis method subjects a reference speech signal to windowing to extract an aperiodic speech pitch wave from the reference speech signal. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The aperiodic speech pitch wave is subjected to inverse-filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored as information of a speech synthesis unit and a voiced period in the storage. The speech is then synthesized using the information of the speech synthesis unit.
    Type: Grant
    Filed: October 29, 2001
    Date of Patent: April 22, 2003
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takehiko Kagoshima, Masami Akamine
  • Patent number: 6513007
    Abstract: There is provided a synthesized sound generating apparatus and method which can achieve responsive and high-quality speech synthesis based on a real-time convolution operation. Coefficients are generated by using dynamic cutting to extract characteristic information from a first signal. A convolution operation is performed on a second signal using the generated coefficients to generate a synthesized signal. As the convolution operation, an interpolation process is performed on the coefficients to prevent a rapid change in level of the generated synthesized signal upon switching of the coefficients.
    Type: Grant
    Filed: July 20, 2000
    Date of Patent: January 28, 2003
    Assignee: Yamaha Corporation
    Inventor: Akio Takahashi
  • Publication number: 20020188450
    Abstract: The invention relates to a method for defining a sequence of sound modules for synthesis of a speech signal in a tonal language corresponding to a sequence of speech modules. The method according to the invention differs from known methods in that the speech modules represent triphones, which each comprise one phoneme with the respective context, and with syllables in the tonal language being composed of one or more triphones. This results in a high level of flexibility for the synthesis of tonal languages.
    Type: Application
    Filed: April 26, 2002
    Publication date: December 12, 2002
    Applicant: Siemens Aktiengesellschaft
    Inventors: Martin Holzapfel, Jianhua Tao
  • Patent number: 6493664
    Abstract: Encoding of prototype waveform components applicable to telecommunication systems provides improved voice quality enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates a codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions.
    Type: Grant
    Filed: April 4, 2000
    Date of Patent: December 10, 2002
    Assignee: Hughes Electronics Corporation
    Inventors: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria
  • Patent number: 6490562
    Abstract: It is to assign proper pitch marks to voice waveforms, thereby to obtain smoothly synthesized voices and to control pitches of voices very accurately according to pitch marks of recorded messages. Any one of the fixed low-pass filters 3002-a to 3002-d is set so as to pass only fundamental component of voices and each of peak detectors 3003-a to 3003-d detects peaks and the channel selector 3004 is selected, thereby to keep taking out of peak information for fundamental waves. The channel selector 3004 decides a channel to be a correct channel if intervals of peaks detected by the peak detectors 3003-a to d are changed smoothly in the channel. According to this peak information, pitches of voices are analyzed, so that the adaptive filter 3005 passes only fundamental component of voices and the peak detector 3006 detects peaks of fundamental waves, thereby to assign pitch marks to voice waveforms.
    Type: Grant
    Filed: April 9, 1998
    Date of Patent: December 3, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Takahiro Kamai, Kenji Matsui
  • Publication number: 20020147590
    Abstract: A digital audio receiver stores received frames temporarily for decoding and error concealment. A reconstructing block (14) in the decoder reads stored frames using a read window (43) wherein the latest received frame (+cnnxt) is undecoded. Decoding is carried out in stages so that the correctness of the current frame (0) is examined and possible errors are concealed using corresponding data of other frames in the window. Detection of errors is based on checksums (19, 26) and allowed values of bit combinations in certain parts of the frame. In addition, the receiver maintains an estimate (60) for the signal's bit error ratio and uses it to control the operation of the error concealment algorithm.
    Type: Application
    Filed: June 16, 1999
    Publication date: October 10, 2002
    Inventors: Matti Sydanmaa, Mauri Vaananen, Aki Makivirta
  • Publication number: 20020133349
    Abstract: A system and method for matching voice characteristics of a synthetic disc jockey are presented. A first segment of audio signal and a second segment of audio signal are received by a sound characteristic estimator. Corresponding first and second sets of sound characteristics are determined by the sound characteristic estimator. A voice characteristic transition for the disc jockey is interpolated from the first and second set of sound characteristics between a starting and an ending time.
    Type: Application
    Filed: March 16, 2001
    Publication date: September 19, 2002
    Inventor: Steven E. Barile
  • Patent number: 6453287
    Abstract: A system and method for enhancing the speech quality of the mixed excitation linear predictive (MELP) coder and other low bit-rate speech coders. The system and method employ a plosive analysis/synthesis method, which detects the frame containing a plosive signal, applies a simple model to synthesize the plosive signal, and adds the synthesized plosive to the coded speech. The system and method remains compatible with the existing MELP coder bit stream.
    Type: Grant
    Filed: September 29, 1999
    Date of Patent: September 17, 2002
    Assignee: Georgia-Tech Research Corporation
    Inventors: Takahiro Unno, Thomas P. Barnwell, III, Kwan K. Truong
  • Patent number: 6449590
    Abstract: A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. To support lower bit rate encoding modes, a variety of techniques are applied, many of which involve the classification of the input signal. The speech encoder continuously warps a weighted speech signal in long term preprocessing. The continuous warping is applied to a linear pitch lag contour that enables fast searching through linear time weighting. Optimal searching is performed within a limited range that is defined at least in part on sharpness and speech classification. The speech encoder generates the linear pitch lag contour from previous and current pitch lag values.
    Type: Grant
    Filed: September 18, 1998
    Date of Patent: September 10, 2002
    Assignee: Conexant Systems, Inc.
    Inventor: Yang Gao
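The last sentence of this abstract describes building a linear pitch lag contour from the previous and current pitch lag values. A minimal sketch of one way to do that follows. It assumes the contour reaches the current lag exactly at the last sample of the frame. The function name and the frame length are hypothetical, and the patent's actual contour definition may differ.

```python
def linear_pitch_lag_contour(prev_lag, curr_lag, frame_len):
    """Per-sample pitch lag, moving linearly from `prev_lag` (the value at
    the end of the previous frame) to `curr_lag` at the last sample."""
    return [prev_lag + (curr_lag - prev_lag) * (n + 1) / frame_len
            for n in range(frame_len)]


contour = linear_pitch_lag_contour(40.0, 48.0, 4)
print(contour)  # [42.0, 44.0, 46.0, 48.0]
```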
  • Patent number: 6377916
    Abstract: A speech signal is encoded into a set of encoded bits by digitizing the speech signal to produce a sequence of digital speech samples that are divided into a sequence of frames, each of which spans multiple digital speech samples. A set of speech model parameters are estimated for a frame. The speech model parameters include voicing parameters dividing the frame into voiced and unvoiced regions, at least one pitch parameter representing pitch for at least the voiced regions of the frame, and spectral parameters representing spectral information for at least the voiced regions of the frame. The speech model parameters are quantized to produce parameter bits. The frame is also divided into one or more subframes for which transform coefficients are computed. The transform coefficients for unvoiced regions of the frame are quantized to produce transform bits. The parameter bits and the transform bits are included in the set of encoded bits.
    Type: Grant
    Filed: November 29, 1999
    Date of Patent: April 23, 2002
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Patent number: 6360198
    Abstract: A reproduction part reproduces at a changeable speed ratio r. An A/D conversion part A/D converts, based on sampling frequency fi, an audio signal reproduced at a speed different from that upon recording. A block data division part divides audio data based on an attribute possessed by the audio data. An audio data connection part successively interpolates or thins out the divided audio data based on a ratio of 1/r. A D/A conversion part D/A converts the interpolated or thinned-out audio data based on sampling frequency fo. If a relation of fi/fo=r/c is satisfied, the audio signal is outputted as a sound of high quality constantly synchronized with an image signal and having a pitch which does not change irrespective of the changeable speed ratio r at which the image signal is reproduced.
    Type: Grant
    Filed: May 6, 1999
    Date of Patent: March 19, 2002
    Assignee: Nippon Hoso Kyokai
    Inventors: Atsushi Imai, Nobumasa Seiyama, Tohru Takagi
  • Patent number: 6347295
    Abstract: A computer method and apparatus provide automatic generation of grapheme-to-phoneme rules, used in text-to-speech synthesis systems. The invention method and apparatus are based on a statistical analysis of a subject dictionary. The dictionary preferably contains words and their corresponding phonemic data representations, and is analyzed for subgraph patterns. The phoneme strings for words containing the subgraph patterns are then analyzed for common phoneme substrings (subphones) associated with each subgraph. The subphones associated with each subgraph are then checked for conditions such as the highest occurrence count, the proper length, and for compatibility with both ends of the subgraph to which they are associated. A subphone matching these conditions becomes paired with the subgraph to create a rule for text-to-speech processing. Separate prefix, infix, and suffix rule sets may be generated from the invention dictionary analysis.
    Type: Grant
    Filed: October 26, 1998
    Date of Patent: February 12, 2002
    Assignee: Compaq Computer Corporation
    Inventors: Anthony J. Vitale, Ginger Chun-Che Lin, Thomas Kopec
  • Patent number: 6278971
    Abstract: An apparatus and procedure for performing phase detection in which one-pitch cycle of an input signal waveform is cut out on a time axis. The cut-out one pitch cycle is filled with zeroes to form 2^N samples (where N is an integer, 2^N is equal to or greater than the number of samples of the one-pitch cycle), and the samples are subjected to an orthogonal conversion such as fast Fourier transform, whereby a real and imaginary part are used to calculate tan^-1 to obtain a basic phase information. This basic phase is subjected to linear interpolation to obtain phases of respective higher harmonics of the input signal waveform.
    Type: Grant
    Filed: January 26, 1999
    Date of Patent: August 21, 2001
    Assignee: Sony Corporation
    Inventors: Akira Inoue, Masayuki Nishiguchi
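The steps in this abstract (zero-pad one pitch cycle to a power-of-two length, transform, take the arctangent of the imaginary over the real part, then interpolate linearly) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure. It uses a naive DFT in place of an FFT, places harmonic m at bin position m * size / cycle_len, and ignores phase wrapping between neighboring bins. The names `phase_spectrum` and `harmonic_phase` are hypothetical.

```python
import cmath
import math


def phase_spectrum(cycle):
    """Zero-pad one pitch cycle to a power-of-two length and return
    (phase of each bin from 0 to size/2, padded length)."""
    size = 1
    while size < len(cycle):
        size *= 2
    padded = list(cycle) + [0.0] * (size - len(cycle))
    phases = []
    for k in range(size // 2 + 1):
        # Naive DFT of bin k; an FFT gives the same values faster.
        c = sum(x * cmath.exp(-2j * math.pi * k * n / size)
                for n, x in enumerate(padded))
        # atan2 is the four-quadrant arctangent of imag / real.
        phases.append(math.atan2(c.imag, c.real))
    return phases, size


def harmonic_phase(phases, size, cycle_len, m):
    """Phase at harmonic m, linearly interpolated between the two bins
    around its (assumed) position m * size / cycle_len."""
    pos = m * size / cycle_len
    lo = int(pos)
    if lo >= len(phases) - 1:
        return phases[-1]
    frac = pos - lo
    return (1.0 - frac) * phases[lo] + frac * phases[lo + 1]


# One cycle of a cosine with a known phase offset of 0.5 rad.
cycle = [math.cos(2 * math.pi * n / 8 + 0.5) for n in range(8)]
phases, size = phase_spectrum(cycle)
estimate = harmonic_phase(phases, size, len(cycle), 1)
```

Here the cycle length is already a power of two, so the first harmonic lands exactly on bin 1 and `estimate` comes out close to 0.5. For other cycle lengths the harmonic falls between bins, which is where the interpolation step matters.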
  • Patent number: 6260017
    Abstract: A multipulse interpolative coder for transition speech frames includes an extractor configured to represent a first frame of transitional speech samples by a subset of the samples of the frame. The coder also includes an interpolator configured to interpolate the subset of samples and a subset of samples extracted from an earlier-received frame to synthesize other samples of the first frame that are not included in the subset. The subset of samples is further simplified by selecting a set of pulses from the subset and assigning zero values to unselected pulses. In the alternative, a portion of the unselected pulses may be quantized. The set of pulses may be the pulses having the greatest absolute amplitudes in the subset. In the alternative, the set of pulses may be the most perceptually significant pulses of the subset.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: July 10, 2001
    Assignee: Qualcomm Inc.
    Inventors: Amitava Das, Sharath Manjunath
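One of the simplifications named in this abstract is keeping only the pulses with the greatest absolute amplitudes and assigning zero to the rest. A minimal sketch, assuming the number of pulses to keep is given; the function name and parameter `k` are hypothetical.

```python
def keep_largest_pulses(samples, k):
    """Keep the k samples with the greatest absolute amplitude and set
    every other sample to zero."""
    order = sorted(range(len(samples)),
                   key=lambda i: abs(samples[i]), reverse=True)
    keep = set(order[:k])
    return [s if i in keep else 0.0 for i, s in enumerate(samples)]


sparse = keep_largest_pulses([0.1, -0.9, 0.3, 0.7, -0.2], 2)
print(sparse)  # [0.0, -0.9, 0.0, 0.7, 0.0]
```

The perceptually weighted selection that the abstract offers as an alternative would need a perceptual model and is not attempted here.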
  • Patent number: 6240299
    Abstract: A method for speech encoding and decoding usable in a Digital Telephone Answering Machine/Voice Memo for a cellular radiotelephone is provided. The apparatus uses parameter-based speech compression and decompression modules. These modules perform decimation of standard-type speech parameter frames before storing the message, and interpolation before playing the message, in order to substantially reduce the number of parameter bits in parameter frames of the stored speech signal. The result is a decreased demand for storage space and increased speed of speech compression and decompression.
    Type: Grant
    Filed: February 20, 1998
    Date of Patent: May 29, 2001
    Assignee: Conexant Systems, Inc.
    Inventor: Wei-jei Song
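The decimate-before-storing, interpolate-before-playing idea in this abstract can be sketched as below. The sketch assumes each parameter frame is a list of numbers, that every second frame is kept, and that dropped frames are rebuilt by linear interpolation. The abstract states none of these details, and the function names are hypothetical.

```python
def decimate_frames(frames, factor=2):
    """Keep every `factor`-th parameter frame for storage."""
    return frames[::factor]


def interpolate_frames(kept, factor=2):
    """Rebuild the dropped frames by linear interpolation between each
    pair of stored neighbors. `kept` must not be empty."""
    out = []
    for a, b in zip(kept, kept[1:]):
        for j in range(factor):
            t = j / factor
            out.append([(1.0 - t) * x + t * y for x, y in zip(a, b)])
    out.append(list(kept[-1]))
    return out


frames = [[0.0, 10.0], [1.0, 12.0], [2.0, 14.0], [3.0, 16.0], [4.0, 18.0]]
stored = decimate_frames(frames)        # 3 of the 5 frames are stored
restored = interpolate_frames(stored)   # back to 5 frames
```

In this example the parameters vary linearly, so `restored` equals `frames` exactly. Real speech parameters would be recovered only approximately.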
  • Patent number: 6208969
    Abstract: A method and an electronic data processing apparatus for wave synthesis that retains the true qualities of naturally occurring sounds, such as those of musical instruments, speech, or other sounds. Transfer functions representative of recorded sound samples are pre-calculated and stored for use in an interpolative process to generate a transfer function representative of the sound to be synthesized. The preferred transfer functions are Chebyshev polynomial-based transfer functions, which assure a highly predictable harmonic content of synthesized sound. Output sound generation is driven by time domain signals produced by reconversion of a sequence of interpolated transfer functions. Non-harmonic sounds are synthesized using multiple frequency inputs to the reconverting (waveshaping) stage, or by parallel waveshaping stages. Speech sibilants and noise envelopes of instruments are synthesized by the input of noise into the waveshaping stage by modulation of a sinusoid with band-limited noise.
    Type: Grant
    Filed: July 24, 1998
    Date of Patent: March 27, 2001
    Assignee: Lucent Technologies Inc.
    Inventor: Steven DeArmond Curtin
  • Patent number: 6175821
    Abstract: A voice message is generated having an invariable portion and a variable portion. Most of the invariable portion is provided in the form of recorded speech whereas the variable portion is provided in the form of synthesized speech. The synthesized speech also extends by half a phoneme into the invariable portion of the message. The synthesized speech and the recorded speech are then concatenated, with a transition signal being formed on the basis of a boundary portion of each of the recorded and synthesized signals about any join. In forming the transition signal, a set of transition signal pitchmarks is created and an overlap-add technique is used to copy the waveform within the boundary portions of the speech signals around the transition signal pitchmarks.
    Type: Grant
    Filed: April 26, 1999
    Date of Patent: January 16, 2001
    Assignee: British Telecommunications public limited company
    Inventors: Julian H. Page, Paul Murrin
  • Patent number: RE39336
    Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
    Type: Grant
    Filed: November 5, 2002
    Date of Patent: October 10, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski