Linear Prediction Patents (Class 704/262)
  • Patent number: 8315863
    Abstract: A post filter and a decoder are disclosed that improve the sound quality of a decoded signal even when that quality differs from band to band. A frequency converting section determines a decoded spectrum. A power spectrum computing section computes the power spectrum from the decoded spectrum. A correction band determining section determines the band in which the power spectrum is corrected according to layer information. A power spectrum correcting section corrects the power spectrum in the correction band in such a way that the variation along the frequency axis is suppressed. An inverse converting section subjects the corrected power spectrum to inverse conversion to determine an autocorrelation function. An LPC analyzing section determines an LPC coefficient from the determined autocorrelation function.
    Type: Grant
    Filed: June 15, 2006
    Date of Patent: November 20, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
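    The chain described in the abstract above (power spectrum, band correction, inverse conversion to an autocorrelation function, LPC analysis) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names are hypothetical, replacing the band with its mean is only one assumed way to suppress variation along the frequency axis, and Levinson-Durbin is a standard way to get LPC coefficients from an autocorrelation that the abstract does not name.

```python
import math

def flatten_band(power, lo, hi):
    """Assumed correction: replace bins lo..hi (and their mirror bins)
    with the band mean, which removes variation inside the band."""
    n = len(power)
    mean = sum(power[lo:hi + 1]) / (hi - lo + 1)
    out = list(power)
    for k in range(lo, hi + 1):
        out[k] = mean
        out[(n - k) % n] = mean  # keep the spectrum symmetric
    return out

def autocorr_from_power_spectrum(power, order):
    """Inverse DFT of a symmetric, full-length power spectrum.
    The result is the (circular) autocorrelation for lags 0..order."""
    n = len(power)
    return [
        sum(power[k] * math.cos(2 * math.pi * k * lag / n) for k in range(n)) / n
        for lag in range(order + 1)
    ]

def levinson_durbin(r, order):
    """Solve the normal equations for a[1..order], using the convention
    x[n] ~ sum(a[i] * x[n - i]). Returns (coefficients, residual energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # degenerate input; stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this stage
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

    A caller would apply `flatten_band` to the chosen band, pass the result to `autocorr_from_power_spectrum`, and feed the returned lags to `levinson_durbin`.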
  • Patent number: 8301447
    Abstract: The present invention relates to creating a phonetic index of phonemes from an audio segment that includes speech content from multiple sources. The phonemes in the phonetic index are directly or indirectly associated with the corresponding source of the speech from which the phonemes were derived. By associating the phonemes with a corresponding source, the phonetic index of speech content from multiple sources may be searched based on phonetic content as well as the corresponding source.
    Type: Grant
    Filed: October 10, 2008
    Date of Patent: October 30, 2012
    Assignee: Avaya Inc.
    Inventors: John H. Yoakum, Stephen Whynot
  • Patent number: 8280728
    Abstract: Systems and methods are described for performing packet loss concealment using an extrapolation of an excitation waveform in a sub-band predictive speech coder, such as an ITU-T Recommendation G.722 wideband speech coder. The systems and methods are useful for concealing the quality-degrading effects of packet loss in a sub-band predictive coder and address some sub-band architectural issues when applying excitation extrapolation techniques to such sub-band predictive coders.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: October 2, 2012
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Robert W. Zopf
  • Patent number: 8271291
    Abstract: A method for identifying a frame type is disclosed. The method includes receiving current frame type information, obtaining previously received previous frame type information, generating frame identification information of a current frame using the current frame type information and the previous frame type information, and identifying the current frame using the frame identification information. A further method for identifying a frame type is also disclosed. It includes receiving a backward type bit corresponding to current frame type information, obtaining a forward type bit corresponding to previous frame type information, and generating frame identification information of a current frame by placing the backward type bit at a first position and placing the forward type bit at a second position.
    Type: Grant
    Filed: May 8, 2009
    Date of Patent: September 18, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sang Bae Chon, Lae Hoon Kim, Koeng Mo Sung
  • Patent number: 8239193
    Abstract: Provided are a method, apparatus, and medium for encoding/decoding a high frequency band signal by using a low frequency band signal corresponding to an audio signal or a speech signal. Accordingly, since the high frequency band signal is encoded and decoded by using the low frequency band signal, encoding and decoding can be carried out with a small data size while avoiding deterioration of sound quality.
    Type: Grant
    Filed: September 17, 2009
    Date of Patent: August 7, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Eun-mi Oh, Ki-hyun Choo, Jung-hoe Kim
  • Patent number: 8229749
    Abstract: There is provided a wide-band LSP prediction device and others capable of predicting a wide-band LSP from a narrow-band LSP with a high quantization efficiency and a high accuracy while suppressing the size of a conversion table correlating the narrow-band LSP to the wide-band LSP. In this device, a non-linear prediction unit (102) performs non-linear prediction by using a converted wide-band LSP inputted from a narrow-band/wide-band conversion unit (101) and inputs the non-linear prediction result to an amplifier (103). The converted wide-band LSP is inputted to an amplifier (104). An adder (122) adds multiplication results (vectors) inputted from the amplifiers (103, 104).
    Type: Grant
    Filed: December 9, 2005
    Date of Patent: July 24, 2012
    Assignee: Panasonic Corporation
    Inventors: Hiroyuki Ehara, Koji Yoshida, Toshiyuki Morii
  • Publication number: 20120150544
    Abstract: A system for reconstructing speech from an input signal comprising whispers is disclosed. The system comprises an analysis unit configured to analyse the input signal to form a representation of the input signal; an enhancement unit configured to modify the representation of the input signal to adjust a spectrum of the input signal, wherein the adjusting of the spectrum of the input signal comprises modifying a bandwidth of at least one formant in the spectrum to achieve a predetermined spectral energy distribution and amplitude for the at least one formant; and a synthesis unit configured to reconstruct speech from the modified representation of the input signal.
    Type: Application
    Filed: August 25, 2010
    Publication date: June 14, 2012
    Inventors: Ian Vince McLoughlin, Hamid Reza Sharifzadeh, Farzaneh Ahmadi
  • Patent number: 8185384
    Abstract: A method and apparatus for estimating the pitch period of a signal. The method includes identifying a first candidate pitch period by performing a search only over a first range of potential pitch periods. The method further includes determining a second candidate pitch period by dividing the first candidate pitch period by an integer, wherein the second candidate pitch period is outside the first range of potential pitch periods. The method further includes selecting as the estimate of the pitch period of the signal the smaller of the candidate pitch periods that is such that portions of the signal separated by that candidate pitch period are well correlated.
    Type: Grant
    Filed: April 21, 2009
    Date of Patent: May 22, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventors: Xuejing Sun, Sameer Gadre
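    The candidate-selection logic in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the normalized-correlation measure, the divisor set, the rounding of the divided lag, and the 0.8 threshold for "well correlated" are all hypothetical choices.

```python
import math

def normalized_correlation(x, lag):
    """Correlation between the signal and itself shifted by `lag` samples."""
    a = x[lag:]
    b = x[:len(x) - lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0

def estimate_pitch_period(x, min_lag, max_lag, divisors=(2, 3), threshold=0.8):
    # First candidate: best lag found by searching only [min_lag, max_lag].
    first = max(range(min_lag, max_lag + 1),
                key=lambda lag: normalized_correlation(x, lag))
    candidates = [first]
    # Further candidates: the first candidate divided by an integer,
    # kept only when the result falls below the searched range.
    for d in divisors:
        sub = round(first / d)
        if 0 < sub < min_lag:
            candidates.append(sub)
    # Pick the smallest candidate whose shifted signal is well correlated.
    good = [c for c in candidates if normalized_correlation(x, c) >= threshold]
    return min(good) if good else first
```

    Searching a restricted range and then testing sub-multiples avoids a full search over short lags while still recovering a true period that lies below the range.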
  • Publication number: 20120116769
    Abstract: A method applies a parametric approach to bandwidth extension but does not require training. The method computes narrowband linear predictive coefficients from a received narrowband speech signal, computes narrowband partial correlation coefficients using recursion, computes Mnb area coefficients from the partial correlation coefficient, and extracts Mwb area coefficients using interpolation. Wideband parcors are computed from the Mwb area coefficients and wideband LPCs are computed from the wideband parcors. The method further comprises synthesizing a wideband signal using the wideband LPCs and a wideband excitation signal, highpass filtering the synthesized wideband signal to produce a highband signal, and combining the highband signal with the original narrowband signal to generate a wideband signal.
    Type: Application
    Filed: November 7, 2011
    Publication date: May 10, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: David Malah, Richard Vandervoort Cox
  • Patent number: 8165882
    Abstract: Apparatus and method for generating high quality synthesized speech having smooth waveform concatenation. The apparatus includes a pitch frequency calculation section, a pitch synchronization position calculation section, a unit waveform storage, a unit waveform selection section, a unit waveform generation section, and a waveform synthesis section. The unit waveform generation section includes a conversion ratio calculation section, a sampling rate conversion section, and a unit waveform re-selection section. The conversion ratio calculation section calculates a sampling rate conversion ratio from the pitch information and the position of pitch synchronization, and the sampling rate conversion section converts the sampling rate of the unit waveform, delivered as input, based on the sampling rate conversion ratio.
    Type: Grant
    Filed: September 4, 2006
    Date of Patent: April 24, 2012
    Assignee: NEC Corporation
    Inventors: Masanori Kato, Satoshi Tsukada
  • Patent number: 8160874
    Abstract: An audio decoding device performs frame loss compensation capable of obtaining a decoded audio which is natural for ears with little noise. The audio decoding device includes a non-cyclic pulse waveform detection unit for detecting a non-cyclic pulse waveform section in a (n−1)-th frame, which is repeatedly used with a pitch cycle in the n-th frame upon compensation of loss of the n-th frame. The audio coding device also includes a non-cyclic pulse waveform suppression unit for suppressing a non-cyclic pulse waveform by replacing an audio source signal existing in the non-cyclic pulse waveform section in the (n−1)-th frame by a noise signal. The audio coding device further includes a synthesis filter for using a linear prediction coefficient decoded by an LPC decoding unit to perform synthesis by a synthesis filter by using the audio source signal of the (n−1)-th frame from the non-cyclic pulse waveform suppression unit as a drive audio source, thereby obtaining the decoded audio signal of the n-th frame.
    Type: Grant
    Filed: December 26, 2006
    Date of Patent: April 17, 2012
    Assignee: Panasonic Corporation
    Inventors: Takuya Kawashima, Hiroyuki Ehara
  • Patent number: 8140326
    Abstract: An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced.
    Type: Grant
    Filed: June 6, 2008
    Date of Patent: March 20, 2012
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Francine Chen, John Adcock
  • Publication number: 20120065980
    Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
    Type: Application
    Filed: September 8, 2011
    Publication date: March 15, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 8121831
    Abstract: Provided are a method, apparatus, and medium for encoding/decoding a high frequency band signal by using a low frequency band signal corresponding to an audio signal or a speech signal. Accordingly, since the high frequency band signal is encoded and decoded by using the low frequency band signal, encoding and decoding can be carried out with a small data size while avoiding deterioration of sound quality.
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: February 21, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Eun-mi Oh, Ki-hyun Choo, Jung-hoe Kim
  • Patent number: 8121832
    Abstract: Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.
    Type: Grant
    Filed: November 15, 2007
    Date of Patent: February 21, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-hyun Choo, Lei Miao, Eun-mi Oh
  • Patent number: 8121833
    Abstract: The exemplary embodiments of the invention provide at least a method and an apparatus to perform operations including dividing a sound signal into a series of successive frames, dividing each frame into a number of subframes, producing a residual signal by filtering the sound signal through a linear prediction analysis filter, locating a last pitch pulse of the sound signal of a previous frame from the residual signal, extracting a pitch pulse prototype of given length around a position of the last pitch pulse of the previous frame using the residual signal, and locating pitch pulses in a current frame using the pitch pulse prototype.
    Type: Grant
    Filed: October 21, 2008
    Date of Patent: February 21, 2012
    Assignee: Nokia Corporation
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Publication number: 20110270614
    Abstract: A method and an apparatus for switching speech or audio signals, wherein the method includes, when a speech or audio signal is switched, weighting a first high frequency band signal of a current frame of the speech or audio signal and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a processed first high frequency band signal, where M is greater than or equal to 1, and synthesizing the processed first high frequency band signal and a first low frequency band signal of the current frame of the speech or audio signal into a wide frequency band signal. In this way, speech or audio signals with different bandwidths can be smoothly switched, thus improving the quality of audio signals received by a user.
    Type: Application
    Filed: June 16, 2011
    Publication date: November 3, 2011
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zexin Liu, Lei Miao, Chen Hu, Wenhai Wu, Yue Lang, Qing Zhang
  • Patent number: 8019087
    Abstract: A stereo signal generating apparatus capable of obtaining stereo signals that exhibit a low bit rate and an excellent reproducibility. In this stereo signal generating apparatus (90), an FT part (901) converts a monaural signal (M?t) of time domain to a monaural signal (M?) of frequency domain. A power spectrum calculating part (902) determines a power spectrum (PM?). A scaling ratio calculating part (904a) determines a scaling ratio (SL) for a left channel, while a scaling ratio calculating part (904b) determines a scaling ratio (SR) for a right channel. A multiplying part (905a) multiplies the monaural signal (M?) of frequency domain by the scaling ratio (SL) to produce a left channel signal (L?) of a stereo signal, while a multiplying part (905b) multiplies the monaural signal (M?) of frequency domain by the scaling ratio (SR) to produce a right channel signal (R?) of the stereo signal.
    Type: Grant
    Filed: August 29, 2005
    Date of Patent: September 13, 2011
    Assignee: Panasonic Corporation
    Inventors: Michiyo Goto, Chun Woei Teo, Sua Hong Neo, Koji Yoshida
  • Patent number: 7977562
    Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and its melody file into its corresponding sub-phonemic units and musical score respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
    Type: Grant
    Filed: June 20, 2008
    Date of Patent: July 12, 2011
    Assignee: Microsoft Corporation
    Inventors: Yao Qian, Frank Soong
  • Publication number: 20110099014
    Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
    Type: Application
    Filed: September 21, 2010
    Publication date: April 28, 2011
    Applicant: BROADCOM CORPORATION
    Inventor: Robert W. Zopf
  • Patent number: 7921009
    Abstract: A method and device for updating statuses of synthesis filters are provided. The method includes: exciting a synthesis filter corresponding to a first encoding rate by using an excitation signal of the first encoding rate, outputting reconstructed signal information, and updating status information of the synthesis filter and a synthesis filter corresponding to a second encoding rate. In the present disclosure, the status of the synthesis filter corresponding to the current rate and the statuses of the synthesis filters at other rates are updated. Thus, synchronization between the statuses of the synthesis filters corresponding to different rates at the encoding terminal may be realized, thereby facilitating the consistency of the reconstructed signals of the encoding and decoding terminals when the encoding rate is switched, and improving the quality of the reconstructed signal of the decoding terminal.
    Type: Grant
    Filed: September 16, 2010
    Date of Patent: April 5, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Jinliang Dai
  • Publication number: 20110077945
    Abstract: This invention relates to a method, a computer program product, apparatuses and a system for extracting coded parameter set from an encoded audio/speech stream, said audio/speech stream being distributed to a sequence of packets, and generating a time scaled encoded audio/speech stream in the parameter coded domain using said extracted coded parameter set.
    Type: Application
    Filed: June 6, 2007
    Publication date: March 31, 2011
    Applicant: NOKIA CORPORATION
    Inventors: Pasi Sakari Ojala, Ari Kalevi Lakaniemi
  • Publication number: 20110010179
    Abstract: A method and an apparatus for voice synthesis and processing are presented. In one exemplary method, a first audio recording of human speech in a natural language is received. A speech analysis-synthesis algorithm is then applied to the first audio recording to synthesize a second audio recording from the first audio recording such that the second audio recording sounds humanistic and consistent, but unintelligible.
    Type: Application
    Filed: July 13, 2009
    Publication date: January 13, 2011
    Inventor: Devang K. Naik
  • Publication number: 20100332232
    Abstract: A method and device for updating statuses of synthesis filters are provided. The method includes: exciting a synthesis filter corresponding to a first encoding rate by using an excitation signal of the first encoding rate, outputting reconstructed signal information, and updating status information of the synthesis filter and a synthesis filter corresponding to a second encoding rate. In the present disclosure, the status of the synthesis filter corresponding to the current rate and the statuses of the synthesis filters at other rates are updated. Thus, synchronization between the statuses of the synthesis filters corresponding to different rates at the encoding terminal may be realized, thereby facilitating the consistency of the reconstructed signals of the encoding and decoding terminals when the encoding rate is switched, and improving the quality of the reconstructed signal of the decoding terminal.
    Type: Application
    Filed: September 16, 2010
    Publication date: December 30, 2010
    Inventor: Jinliang DAI
  • Patent number: 7856353
    Abstract: Method for processing speech signal data. A speech signal is divided into frames. Each frame is characterized by a frame number T representing a unique interval of time. Each speech signal is characterized by a power spectrum with respect to frame T and frequency band ?. A speech segment and a reverberation segment of the speech signal is determined. L filter coefficients W(k) (k=1, 2, . . . , L) respectively corresponding to L frames immediately preceding frame T are computed such that the L filter coefficients minimize a function ? that is a linear combination of sum of squares of a residual speech power in the reverberation segment and a sum of squares of a subtracted speech power in the speech segment. The computed L filter coefficients are stored within storage media of the computing apparatus.
    Type: Grant
    Filed: August 7, 2007
    Date of Patent: December 21, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
  • Patent number: 7848922
    Abstract: An apparatus and method for encoding and decoding a voice signal. The apparatus includes an encoder configured to generate an output bitstream signal from an input voice signal. The output bitstream signal is associated with at least a first standard of a first plurality of CELP voice compression standards. Additionally, the apparatus includes a decoder configured to generate an output voice signal from an input bitstream signal. The input bitstream signal is associated with at least a first standard of a second plurality of CELP voice compression standards. The CELP encoder includes a plurality of codec-specific encoder modules. Additionally, the CELP encoder includes a plurality of generic encoder modules. The CELP decoder includes a plurality of codec-specific decoder modules. Additionally, the CELP decoder includes a plurality of generic decoder modules.
    Type: Grant
    Filed: August 2, 2007
    Date of Patent: December 7, 2010
    Inventors: Marwan A. Jabri, Nicola Chong-White, Jianwei Wang
  • Patent number: 7835912
    Abstract: The present invention discloses a signal processing method adapted to process a synthesized signal in packet loss concealment. The method includes the following steps: receiving a good frame following a lost frame; obtaining the ratio of the energy of the good frame's signal to the energy of the synthesized signal corresponding to the same time as the good frame; and adjusting the synthesized signal in accordance with the energy ratio. The present invention also discloses a signal processing apparatus and a voice decoder.
    Type: Grant
    Filed: August 11, 2009
    Date of Patent: November 16, 2010
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Wuzhou Zhan, Dongqi Wang, Yongfeng Tu, Jing Wang, Qing Zhang, Lei Miao, Jianfeng Xu, Chen Hu, Yi Yang, Zhengzhong Du, Fengyan Qi
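    The energy-ratio adjustment in the abstract above can be sketched as follows. The abstract does not say how the ratio is applied, so everything beyond computing the ratio is an assumption made for illustration: the function names are hypothetical, the gain ramps linearly across the frame, and the synthesized signal is only ever attenuated, never amplified.

```python
import math

def energy(frame):
    """Sum of squared samples."""
    return sum(s * s for s in frame)

def adjust_synthesized(synth, good):
    """Scale the synthesized (concealment) signal toward the energy of the
    good frame covering the same time span."""
    e_synth = energy(synth)
    if e_synth == 0:
        return list(synth)
    ratio = energy(good) / e_synth
    # Assumption: attenuate only; cap the target amplitude gain at 1.0.
    gain_end = min(1.0, math.sqrt(ratio))
    n = len(synth)
    # Assumption: ramp linearly from gain 1.0 to gain_end to avoid a jump.
    return [s * (1.0 + (gain_end - 1.0) * i / max(n - 1, 1))
            for i, s in enumerate(synth)]
```

    The square root converts the energy ratio into an amplitude gain, so a ratio of 0.25 ends the ramp at half amplitude.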
  • Publication number: 20100191658
    Abstract: A customer service issue prediction engine uses one or more models of issue probability. A method of multi-phase customer issue prediction includes a modeling phase, an application phase, and a learning phase. A telephonic interactive voice response (IVR) system predicts customer issues.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Inventors: Pallipuram V. Kannan, Mohit Jain, Ravi Vijayaraghavan
  • Publication number: 20100191534
    Abstract: The subject matter disclosed herein relates generally to a system and method for linear prediction of sample values.
    Type: Application
    Filed: January 20, 2010
    Publication date: July 29, 2010
    Applicant: QUALCOMM Incorporated
    Inventors: Sang-Uk Ryu, Yuriy Reznik
  • Patent number: 7747441
    Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding that performs compression coding and decoding of a speech signal to a digital signal. In a speech coding method based on code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period concerned is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used based on the evaluation result.
    Type: Grant
    Filed: January 16, 2007
    Date of Patent: June 29, 2010
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventor: Tadashi Yamaura
  • Publication number: 20100131276
    Abstract: A device (2) for changing the pitch of an audio signal (r), such as a speech signal, comprises a sinusoidal analysis unit (21) for determining sinusoidal parameters of the audio signal (r), a parameter production unit (22) for predicting the phase of a sinusoidal component, and a sinusoidal synthesis unit (23) for synthesizing the parameters to produce a reconstructed signal (r?). The parameter production unit (22) receives, for each time segment of the audio signal, the phase of the previous time segment to predict the phase of the current time segment.
    Type: Application
    Filed: July 6, 2006
    Publication date: May 27, 2010
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventors: Albertus Cornelis Den Brinker, Robert Johannes Sluijter
  • Patent number: 7711563
    Abstract: A method and system are provided for synthesizing a corrupted frame output from a decoder including one or more predictive filters. The corrupted frame is representative of one segment of a decoded signal output from the decoder. The method comprises extrapolating a replacement frame based upon another segment of the decoded signal and substituting the replacement frame for the corrupted frame. Finally, the internal states of the filters are updated based upon the substituting.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: May 4, 2010
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Patent number: 7707034
    Abstract: Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: April 27, 2010
    Assignee: Microsoft Corporation
    Inventors: Xiaoqin Sun, Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen
  • Patent number: 7684978
    Abstract: The present invention overcomes problems of the tandem coding method such as degradation of speech quality, increased system latency and increased computation. An apparatus for trans-coding between code excited linear prediction (CELP) type codecs with different bandwidths includes: a formant parameter translating unit for generating output formant parameters by translating formant parameters from input CELP format to output CELP format; a formant parameter quantizing unit for receiving the output format formant parameters and quantizing the output format formant filter coefficients; an excitation parameter translating unit for generating output excitation parameters by translating excitation parameters from input CELP format to output CELP format; and an excitation quantizing unit for receiving the output format excitation parameters and quantizing the output format excitation parameters.
    Type: Grant
    Filed: October 30, 2003
    Date of Patent: March 23, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jongmo Sung, Sang Taick Park, Do Young Kim, Bong Tae Kim
  • Patent number: 7680651
    Abstract: In accordance with the exemplary embodiments of the invention there is disclosed at least a method and apparatus for determining a long-term-prediction delay parameter characterizing a long term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. Each divided frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame.
    Type: Grant
    Filed: December 13, 2002
    Date of Patent: March 16, 2010
    Assignee: Nokia Corporation
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Publication number: 20090292542
    Abstract: The present invention discloses a signal processing method adapted to process a synthesized signal in packet loss concealment. The method includes the following steps: receiving a good frame following a lost frame; obtaining the ratio of the energy of the good frame's signal to the energy of the synthesized signal corresponding to the same time as the good frame; and adjusting the synthesized signal in accordance with the energy ratio. The present invention also discloses a signal processing apparatus and a voice decoder.
    Type: Application
    Filed: August 11, 2009
    Publication date: November 26, 2009
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Wuzhou ZHAN, Dongqi WANG, Yongfeng TU, Jing WANG, Qing ZHANG, Lei MIAO, Jianfeng XU, Chen HU, Yi YANG, Zhengzhong DU, Fengyan QI
  • Publication number: 20090187409
    Abstract: Techniques for efficiently encoding an input signal are described. In one design, a generalized encoder encodes the input signal (e.g., an audio signal) based on at least one detector and multiple encoders. The at least one detector may include a signal activity detector, a noise-like signal detector, a sparseness detector, some other detector, or a combination thereof. The multiple encoders may include a silence encoder, a noise-like signal encoder, a time-domain encoder, a transform-domain encoder, some other encoder, or a combination thereof. The characteristics of the input signal may be determined based on the at least one detector. An encoder may be selected from among the multiple encoders based on the characteristics of the input signal. The input signal may be encoded based on the selected encoder. The input signal may include a sequence of frames, and detection and encoding may be performed for each frame.
    Type: Application
    Filed: October 8, 2007
    Publication date: July 23, 2009
    Applicant: Qualcomm Incorporated
    Inventors: Venkatesh Krishnan, Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
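The per-frame select-then-encode flow described in the abstract above might look like the following sketch. The detectors used here (mean energy, zero-crossing rate, energy concentration in the largest samples), the thresholds, and the mapping from detector results to encoders are all illustrative assumptions; the abstract does not specify them.

```python
def frame_energy(frame):
    """Mean squared amplitude, used here as a stand-in signal activity detector."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign (assumed noise-likeness cue)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def time_sparseness(frame, top_fraction=0.1):
    """Share of frame energy held by the largest samples (assumed sparseness measure)."""
    energies = sorted((x * x for x in frame), reverse=True)
    total = sum(energies)
    if total == 0:
        return 0.0
    k = max(1, int(len(frame) * top_fraction))
    return sum(energies[:k]) / total


def select_encoder(frame, silence_threshold=1e-4, noise_zcr=0.6, sparse_threshold=0.8):
    """Pick an encoder name for one frame; thresholds are hypothetical."""
    if frame_energy(frame) < silence_threshold:
        return "silence"
    if zero_crossing_rate(frame) > noise_zcr:
        return "noise-like"
    if time_sparseness(frame) > sparse_threshold:
        return "time-domain"
    return "transform-domain"
```

Calling `select_encoder` once per frame gives a frame-by-frame encoder choice; a real system would hand the frame to the selected encoder rather than return a label.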
  • Publication number: 20090177474
    Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, which respectively fuse the periodic components and the aperiodic components of a plurality of speech units selected for each segment by a unit selector. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.
    Type: Application
    Filed: September 18, 2008
    Publication date: July 9, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masahiro Morita, Takehiko Kagoshima
  • Publication number: 20090157397
    Abstract: A voice rule-synthesizer synthesizes a voice waveform based on the voice data stored in a database, which stores a large number of compressed voice data sections in a data stream. Each voice data section is stored as a plurality of frames compressed in a fixed-length frame format. The storage capacity of the database is reduced because the compressed voice data sections are stored as the data stream.
    Type: Application
    Filed: February 19, 2009
    Publication date: June 18, 2009
    Inventor: Reishi Kondo
  • Publication number: 20090157409
    Abstract: A method includes generating, for each parameter of the prosody vector, an initial parameter prediction model with a plurality of attributes related to difference prosody prediction and at least part of attribute combinations of the plurality of attributes, in which each of the plurality of attributes and the attribute combinations is included as an item, calculating importance of each item in the parameter prediction model, deleting the item having the lowest importance calculated, re-generating a parameter prediction model with the remaining items, determining whether the re-generated parameter prediction model is an optimal model, and repeating the step of calculating importance and the steps following the step of calculating importance with the re-generated parameter prediction model, if the re-generated parameter prediction model is determined as not an optimal model, wherein the difference prosody vector and all parameter prediction models of the difference prosody vector constitute the difference pros
    Type: Application
    Filed: December 4, 2008
    Publication date: June 18, 2009
    Inventors: Yi Lifu, Li Jian, Lou Xiaoyan, Hao Jie
  • Patent number: 7546241
    Abstract: In a speech synthesis process, micro-segments are cut from acquired waveform data using a window function. The obtained micro-segments are re-arranged to implement a desired prosody, and superposed data is generated by superposing the re-arranged micro-segments, so as to obtain synthetic speech waveform data. A spectrum correction filter is formed based on the acquired waveform data. At least one of the waveform data, micro-segments, and superposed data is corrected using the spectrum correction filter. In this way, “blur” of a speech spectrum due to the window function applied to obtain micro-segments is reduced, and speech synthesis with high sound quality is realized.
    Type: Grant
    Filed: June 2, 2003
    Date of Patent: June 9, 2009
    Assignee: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Yasuhiro Komori, Toshiaki Fukada
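The cut, re-arrange, and superpose steps in the abstract above can be sketched as a windowed overlap-add. This sketch assumes a Hann window and caller-supplied segment centers (for example, pitch marks), and it omits the spectrum correction filter entirely; all function names are hypothetical.

```python
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def cut_micro_segments(waveform, centers, half_width):
    """Cut one windowed micro-segment around each center index."""
    window = hann(2 * half_width + 1)
    segments = []
    for c in centers:
        seg = []
        for k in range(-half_width, half_width + 1):
            idx = c + k
            # Samples outside the waveform are treated as silence.
            sample = waveform[idx] if 0 <= idx < len(waveform) else 0.0
            seg.append(sample * window[k + half_width])
        segments.append(seg)
    return segments


def overlap_add(segments, new_centers, half_width, out_len):
    """Superpose micro-segments at new center positions."""
    out = [0.0] * out_len
    for seg, c in zip(segments, new_centers):
        for k, value in enumerate(seg):
            idx = c - half_width + k
            if 0 <= idx < out_len:
                out[idx] += value
    return out
```

Placing the segments back at their original centers reproduces the input wherever adjacent windows overlap by half their length; moving the centers closer together or further apart changes the spacing of the segments, which is how the prosody is altered in this kind of scheme.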
  • Publication number: 20090106027
    Abstract: An object of the invention is to conveniently increase standard patterns registered in a voice recognition device to efficiently extend the number of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs a pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, replacing the voice data of the modifying part of the usage-source standard pattern with other voice data, or combining the voice data of the modifying part of the usage-source standard pattern with other voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.
    Type: Application
    Filed: May 25, 2006
    Publication date: April 23, 2009
    Applicant: Matsushita Electric Industrial Co., Ltd.
    Inventors: Toshiyuki Teranishi, Kouji Hatano
  • Patent number: 7498959
    Abstract: Encoding and/or decoding a wideband signal produces high frequency band spectra from low frequency band spectral information. Linear prediction filter coefficients are determined for the entire wideband spectrum of an input signal. An energy value in each of a plurality of sub-bands in the high frequency band is determined and encoded. The short-term correlation removed input signal is then down-sampled to form a low frequency band signal. At a decoder, the high frequency band signal is generated using the encoded low frequency band signal. The energy in each sub-band of the high frequency band is adjusted using the encoded energy value. Thus, the spectral envelope for the entire wideband signal is synthesized and decoded using linear predictive synthesis.
    Type: Grant
    Filed: June 21, 2007
    Date of Patent: March 3, 2009
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Kang-eun Lee, Eun-mi Oh, Ho-sang Sung, Chang-yong Son, Ki-hyun Choo, Jung-hoe Kim
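The decoder-side energy adjustment described in the abstract above can be illustrated as follows. The sketch assumes equal-width sub-bands, a high-band signal already generated from the low band, and a simple per-band gain; the function name and these choices are assumptions, not details taken from the patent.

```python
import math


def adjust_subband_energies(high_band, target_energies, eps=1e-12):
    """Scale each sub-band of high_band so its energy matches the decoded target.

    high_band: generated high-band samples or coefficients (hypothetical input).
    target_energies: one decoded energy value per sub-band.
    """
    n_bands = len(target_energies)
    band_len = len(high_band) // n_bands
    out = list(high_band)
    for b, target in enumerate(target_energies):
        start = b * band_len
        # The last band absorbs any remainder when the length is not divisible.
        stop = start + band_len if b < n_bands - 1 else len(high_band)
        current = sum(x * x for x in high_band[start:stop])
        gain = math.sqrt(target / (current + eps))
        for i in range(start, stop):
            out[i] = high_band[i] * gain
    return out
```

For example, a band holding `[1, 1]` (energy 2) with a target energy of 8 is scaled by a gain of 2.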
  • Patent number: 7477614
    Abstract: A switching system and method are provided to facilitate use of videoconference facilities over a plurality of security levels. The system includes a switch coupled to a plurality of codecs and communication networks. Audio/Visual peripheral components are connected to the switch. The switch couples control and data signals between the Audio/Visual peripheral components and one but not both of the plurality of codecs. The switch additionally couples communication networks of the appropriate security level to each of the codecs. In this manner, a videoconferencing facility is provided for use on both secure and non-secure networks.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: January 13, 2009
    Assignee: Sandia Corporation
    Inventor: Michael E. Hansen
  • Patent number: 7467083
    Abstract: The present invention relates to a data processing apparatus capable of obtaining high-quality sound data. A tap generation section 121 generates a prediction tap used for a process in a prediction section 125 by extracting decoded speech data in a predetermined positional relationship with subject data of interest within the decoded speech data such that coded data is decoded by a CELP method and by extracting an I code located in a subframe according to a position of the subject data in the subject subframe. Similarly to the tap generation section 121, a tap generation section 122 generates a class tap used for a process in a classification section 123. The classification section 123 performs classification on the basis of the class tap, and a coefficient memory 124 outputs a tap coefficient corresponding to the classification result. The prediction section 125 performs a linear prediction computation by using the prediction tap and the tap coefficient and outputs high-quality decoded speech data.
    Type: Grant
    Filed: January 24, 2002
    Date of Patent: December 16, 2008
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Tsutomu Watanabe, Hiroto Kimura
  • Publication number: 20080133242
    Abstract: Provided are a frame error concealment method and apparatus and an error concealment scheme construction method and apparatus. The frame error concealment method includes generating a new signal by synthesizing a plurality of previous signals that are similar to a signal of an error frame and reconstructing the signal of the error frame using the generated signal.
    Type: Application
    Filed: August 22, 2007
    Publication date: June 5, 2008
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Ho-sang SUNG, Ki-hyun Choo, Jung-hoe Kim, Eun-mi Oh, Chang-yong Son, Kang-eun Lee
  • Patent number: 7353177
    Abstract: A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data and controlling the virtual agent movement according to the prosodic analysis.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: April 1, 2008
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Volker Franz Storm
  • Patent number: 7349852
    Abstract: A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data and controlling the virtual agent movement according to the prosodic analysis.
    Type: Grant
    Filed: September 28, 2005
    Date of Patent: March 25, 2008
    Assignee: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Thomas M. Isaacson, Volker Franz Storm
  • Publication number: 20080046249
    Abstract: A technique is described herein for updating a state of a decoder in a predictive coding system after synthesizing an audio output signal corresponding to a lost frame in a series of frames representing an encoded audio signal. In accordance with the technique, an audio signal associated with the synthesized output audio signal is re-encoded in an encoder to generate an encoder state, wherein the encoder is simplified with respect to an encoder used to generate the encoded audio signal. The state of the decoder is then updated based on the generated encoder state.
    Type: Application
    Filed: August 15, 2007
    Publication date: February 21, 2008
    Applicant: BROADCOM CORPORATION
    Inventors: Jes Thyssen, Robert W. Zopf, Juin-Hwey Chen
  • Publication number: 20080046248
    Abstract: A technique is described for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system. In accordance with the technique, a first synthesized sub-band audio signal is synthesized, wherein synthesizing the first synthesized sub-band audio signal comprises performing waveform extrapolation based on a stored first sub-band decoded audio signal. A second synthesized sub-band audio signal is also synthesized, wherein synthesizing the second synthesized sub-band audio signal comprises performing waveform extrapolation based on the stored second sub-band decoded audio signal. The first synthesized sub-band audio signal and the second synthesized sub-band audio signal are combined to generate a synthesized full-band output audio signal corresponding to a lost frame.
    Type: Application
    Filed: August 15, 2007
    Publication date: February 21, 2008
    Applicant: BROADCOM CORPORATION
    Inventors: Juin-Hwey Chen, Robert W. Zopf, Jes Thyssen