Pitch Patents (Class 704/207)
  • Patent number: 8520536
    Abstract: An apparatus and method for recovering lost voice packets are provided, in which a packet loss detector determines whether a received packet has been lost, packet information storage stores voice information of previous voice packets and voice information of the received voice packet, a packet error corrector measures the voice information of the received voice packet, stores the measured voice information in the packet information storage, corrects the voice information when necessary, and generates a corrected voice packet, if the received voice packet is normal, and a packet loss recoverer recovers the voice information of the received voice packet using the voice information of previous voice packets stored in the packet information storage and generates a recovered voice packet, if the received voice packet has been lost.
    Type: Grant
    Filed: April 25, 2007
    Date of Patent: August 27, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-Woo Ku, Austin Kim, Ho-Chong Park, Jae-Bum Kim, Chul-Yong Ahn, Pavel Martynovich
  • Patent number: 8521519
    Abstract: An adaptive sound source vector quantization device includes a first pitch cycle instructor, a search range calculator, and a second pitch cycle instructor. The first pitch cycle instructor successively instructs pitch cycle search candidates in a predetermined search range having a search resolution which transits over a predetermined pitch cycle candidate for the first sub-frame. The search range calculator calculates a predetermined range before and after the pitch cycle of the first sub-frame as the pitch cycle search range for the second sub-frame, if the predetermined range includes the predetermined pitch cycle search candidate. In the predetermined range, the search resolution transits over a boundary defined by the predetermined pitch cycle. The second pitch cycle instructor successively instructs the pitch cycle search candidates in the search range for the second sub-frame.
    Type: Grant
    Filed: February 29, 2008
    Date of Patent: August 27, 2013
    Assignee: Panasonic Corporation
    Inventors: Kaoru Sato, Toshiyuki Morii
  • Patent number: 8515744
    Abstract: Method, apparatus, and system for encoding and decoding signals are disclosed. The encoding method includes: converting a first-domain signal into a second-domain signal; performing Linear Prediction (LP) processing and Long-Term Prediction (LTP) processing for the second-domain signal; obtaining a long-term flag according to a decision criterion; obtaining a second-domain predictive signal according to the LP processing result and the LTP processing result when the long-term flag is a first flag; or obtaining a second-domain predictive signal according to the LP processing result when the long-term flag is a second flag; converting the second-domain predictive signal into a first-domain predictive signal, calculating a first-domain predictive residual signal; and outputting a bit stream that includes the first-domain predictive residual signal.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: August 20, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Dejun Zhang, Lei Miao, Jianfeng Xu, Fengyan Qi, Qing Zhang, Lixiong Li, Fuwei Ma, Yang Gao
  • Patent number: 8515742
    Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal with a scalable codec having a first enhancement layer is encoded, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.
    Type: Grant
    Filed: September 15, 2009
    Date of Patent: August 20, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Patent number: 8515747
    Abstract: A transmitted data that includes audio data and a transmitted spectral sharpness parameter representing a spectral harmonic/noise sharpness of a plurality of subbands are received. A measured spectral sharpness parameter is estimated from received audio data. The transmitted spectral sharpness parameter is compared with the measured spectral sharpness parameter. A main sharpness control parameter is formed for each of the decoded subbands. The main sharpness control parameter for each of the decoded subbands is analyzed. Ones of the decoded subbands are sharpened if the corresponding main sharpness control indicates that a corresponding subband is not sharp enough, wherein sharpened subbands are formed. Likewise, ones of the decoded subbands are flattened if the corresponding main sharpness control indicates that a corresponding subband is not flat enough, wherein flattened subbands are formed.
    Type: Grant
    Filed: September 4, 2009
    Date of Patent: August 20, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Publication number: 20130211827
    Abstract: The subject disclosure is directed towards dynamically computing anti-aliasing filter coefficients for sample rate conversion in digital audio. In one aspect, for each input-to-output sampling rate ratio (pitch) obtained, anti-aliasing filter coefficients are interpolated based upon the pitch (e.g., using the fractional part of the ratio) from two filters (coefficient sets) selected based upon the pitch (e.g., using the integer part of the ratio). The interpolation provides for fine-grained cutoff frequencies, and by re-computation for each pitch, smooth anti-aliasing with dynamically changing ratios.
    Type: Application
    Filed: February 12, 2013
    Publication date: August 15, 2013
    Applicant: MICROSOFT CORPORATION
    Inventor: Microsoft Corporation
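The interpolation idea in this abstract can be illustrated with a minimal sketch. It assumes a precomputed bank of coefficient sets and plain linear blending; `filter_bank` and `interpolate_coefficients` are hypothetical names, and the abstract does not say how the bank is designed or which interpolation is used.

```python
import math


def interpolate_coefficients(filter_bank, pitch):
    """Blend two adjacent coefficient sets for a given pitch (rate ratio).

    filter_bank is an assumed list of equal-length coefficient lists,
    ordered so that entry i is designed for an integer ratio of i.
    """
    # The integer part of the ratio selects the pair of filters;
    # clamp it so that index + 1 stays inside the bank.
    index = int(math.floor(pitch))
    index = max(0, min(index, len(filter_bank) - 2))
    # The fractional part is the blend weight, clamped to [0, 1].
    frac = min(max(pitch - index, 0.0), 1.0)
    low, high = filter_bank[index], filter_bank[index + 1]
    return [(1.0 - frac) * a + frac * b for a, b in zip(low, high)]


if __name__ == "__main__":
    bank = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(interpolate_coefficients(bank, 0.25))
    print(interpolate_coefficients(bank, 1.5))
```

Recomputing the blend whenever the ratio changes is what would give the smooth behavior the abstract describes; a real design would also have to choose the cutoff frequency of each stored set.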
  • Patent number: 8494844
    Abstract: A computerized method and system is provided for automatically selecting from a digitized sound sample a segment of the sample that is optimal for the purpose of measuring clinical metrics for voice and speech assessment. A quality measure based on quality parameters of segments of the sound sample is applied to candidate segments to identify the highest quality segment within the sound sample. The invention can optionally provide feedback to the speaker to help the speaker increase the quality of the sound sample provided. The invention also can optionally perform sound pressure level calibration and noise calibration. The invention may optionally compute clinical metrics on the selected segment and may further include a normative database method or system for storing and analyzing clinical measurements.
    Type: Grant
    Filed: November 19, 2009
    Date of Patent: July 23, 2013
    Assignee: Human Centered Technologies, Inc.
    Inventor: David N. Fernandes
  • Patent number: 8494842
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: July 23, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 8494863
    Abstract: The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; a quantization unit for quantizing a transform domain signal; a long term prediction unit for determining an estimation of the frame of the filtered input signal based on a reconstruction of a previous segment of the filtered input signal; and a transform domain signal combination unit for combining, in the transform domain, the long term prediction estimation and the transformed input signal to generate the transform domain signal.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: July 23, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Arijit Biswas, Heiko Purnhagen, Kristofer Kjoerling, Barbara Resch, Lars Villemoes, Per Hedelin
  • Patent number: 8489403
    Abstract: The APPARATUSES, METHODS AND SYSTEMS FOR SPARSE SINUSOIDAL AUDIO PROCESSING AND TRANSMISSION (hereinafter “SS-Audio”) provides a platform for encoding and decoding audio signals based on a sparse sinusoidal structure. In one embodiment, the SS-Audio encoder may encode received audio inputs based on their sparse representation in the frequency domain and transmit the encoded and quantized bit streams. In one embodiment, the SS-Audio decoder may decode received quantized bit streams based on sparse reconstruction and recover the original audio input by reconstructing the sinusoidal parameters in the frequency domain.
    Type: Grant
    Filed: August 25, 2010
    Date of Patent: July 16, 2013
    Assignee: Foundation For Research and Technology—Institute of Computer Science ‘FORTH-ICS’
    Inventors: Anthony Griffin, Athanasios Mouchtaris, Panagiotis Tsakalides
  • Patent number: 8484018
    Abstract: An input frame data producing unit produces, from data stored in an input buffer, input frames each including a predetermined number of sub-frames of a first hopsize determined based on the first frame size and the overlapping rate. A frame processing unit applies a window function to the input frames, shifts the windowed input frames by the first hopsize, and overlaps the shifted input frames, storing the overlapped frames in an output frame. An output buffer data producing unit stores data from the output frame to an output buffer including a predetermined number of sub-frames of a second hopsize. A CPU sets the first hopsize and overlapping rate differently for slow-speed reproduction, when the reproducing speed ratio is set lower than 1, than for high-speed reproduction, when the reproducing speed ratio is set larger than 1.
    Type: Grant
    Filed: July 15, 2010
    Date of Patent: July 9, 2013
    Assignee: Casio Computer Co., Ltd
    Inventor: Masaru Setoguchi
  • Patent number: 8477050
    Abstract: A system and method for redundant transmission is provided. In one embodiment, an input signal S is encoded as a list of fragments. Each fragment includes an index value and a projection value. The index points to an entry in a dictionary of signal elements. A repetition factor is assigned to each fragment based on its importance. After a fragment is added, a reconstructed signal is generated by decoding the list of fragments. Encoding terminates once the reconstructed signal is sufficiently close to the original signal S.
    Type: Grant
    Filed: September 15, 2011
    Date of Patent: July 2, 2013
    Assignee: Google Inc.
    Inventor: Pascal Massimino
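The fragment encoding described in this abstract resembles greedy matching pursuit, so a minimal sketch of that technique is given here as an illustration. It assumes unit-norm dictionary entries and a squared-error stopping test; the function names, `tolerance`, and `max_fragments` are hypothetical, and the repetition factor assigned by importance is left out.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def encode_fragments(signal, dictionary, tolerance=1e-6, max_fragments=32):
    """Greedy matching pursuit returning (index, projection) fragments.

    Assumes every dictionary entry has unit norm, so the projection value
    is simply the inner product with the current residual.
    """
    residual = list(signal)
    fragments = []
    for _ in range(max_fragments):
        # Stop once the reconstruction is close enough to the original.
        if dot(residual, residual) <= tolerance:
            break
        # Pick the entry that explains most of the remaining residual.
        index = max(range(len(dictionary)),
                    key=lambda i: abs(dot(residual, dictionary[i])))
        projection = dot(residual, dictionary[index])
        if projection == 0.0:
            break
        fragments.append((index, projection))
        residual = [r - projection * d
                    for r, d in zip(residual, dictionary[index])]
    return fragments


def decode_fragments(fragments, dictionary, length):
    """Rebuild the signal by summing the scaled dictionary entries."""
    out = [0.0] * length
    for index, projection in fragments:
        for n in range(length):
            out[n] += projection * dictionary[index][n]
    return out


if __name__ == "__main__":
    atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    frags = encode_fragments([3.0, -1.0, 0.5], atoms)
    print(frags)
    print(decode_fragments(frags, atoms, 3))
```

In a redundant-transmission setting, fragments with larger projection magnitudes would be natural candidates for a higher repetition factor, though the abstract does not state how importance is measured.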
  • Publication number: 20130166287
    Abstract: System and method embodiments for dual modes pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 27, 2013
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Huawei Technologies Co., Ltd.
  • Publication number: 20130166288
    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 27, 2013
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Huawei Technologies Co., Ltd.
  • Patent number: 8473283
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: June 25, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Patent number: 8468014
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Grant
    Filed: November 3, 2008
    Date of Patent: June 18, 2013
    Assignee: Soundhound, Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Publication number: 20130151245
    Abstract: The invention relates to a method for establishing fundamental frequency curves of a plurality of signal sources from a single-channel audio recording of a mix signal, said method including the following steps: a) establishing the spectrogram properties of the pitch states of individual signal sources with use of training data; b) establishing the probabilities of the fundamental frequency combinations of the signal sources contained in the mix signal by a combination of the properties established in a) by means of an interaction model; and c) tracking the fundamental frequency curves of the individual signal sources.
    Type: Application
    Filed: February 22, 2011
    Publication date: June 13, 2013
    Applicant: TECHNISCHE UNIVERSITAT GRAZ
    Inventors: Michael Stark, Michael Wohlmayr, Franz Pernkopf
  • Patent number: 8463600
    Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Grant
    Filed: February 27, 2012
    Date of Patent: June 11, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
  • Patent number: 8463599
    Abstract: A method includes defining a transition band for a signal having a spectrum within a first frequency band, where the transition band is defined as a portion of the first frequency band, and is located near an adjacent frequency band that is adjacent to the first frequency band. The method analyzes the transition band to obtain a transition band spectral envelope and a transition band excitation spectrum; estimates an adjacent frequency band spectral envelope; generates an adjacent frequency band excitation spectrum by periodic repetition of at least a part of the transition band excitation spectrum with a repetition period determined by a pitch frequency of the signal; and combines the adjacent frequency band spectral envelope and the adjacent frequency band excitation spectrum to obtain an adjacent frequency band signal spectrum. A signal processing logic for performing the method is also disclosed.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: June 11, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Tenkasi Ramabadran, Mark Jasiuk
  • Publication number: 20130144611
    Abstract: A coding device includes: a pitch contour detection unit which detects a pitch contour of an input audio signal; a dynamic time warping unit which determines the number of pitch nodes based on the pitch contour and generates a first time warping parameter including information indicating the determined number of pitch nodes, a pitch change position, and a pitch change ratio; a first encoder which codes the first time warping parameter; a time warping unit which corrects pitch, using the information obtained from the first time warping parameter, to approximate the pitches of the number of pitch nodes to a predetermined reference value; a second encoder which codes the input audio signal at the corrected pitch; and a multiplexer which multiplexes the coded time warping parameter and the coded audio signal to generate a bitstream.
    Type: Application
    Filed: October 5, 2011
    Publication date: June 6, 2013
    Inventors: Tomokazu Ishikawa, Takeshi Norimatsu, Haishan Zhong, Dan Zhao, Kok Seng Chong
  • Patent number: 8457953
    Abstract: In a method of smoothing background noise in a telecommunication speech session, a signal representative of a speech session is received and decoded (S10), the signal comprising both a speech component and a background noise component. Subsequently, LPC parameters (S20) and an excitation signal (S30) are determined for the received signal. Thereafter, an output signal is synthesized and output (S40) based on the determined LPC parameters and excitation signal. In addition, the determined excitation signal is modified (S35) by reducing power and spectral fluctuations of the excitation signal to provide a smoothed output signal.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: June 4, 2013
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventor: Stefan Bruhn
  • Patent number: 8457951
    Abstract: Methods, apparatus, and articles of manufacture are disclosed in which auxiliary information is added to or removed from an audio signal. In one example, the information may be added to the audio signal using at least two frequencies that are dictated by two different frequency transformation block sizes, such that the two frequencies are not fully visible when an incorrect block size is used to perform a frequency transformation. Additionally, in another example, a decoder may compensate for time and frequency effects caused by removing old samples and adding new samples, which, in one example, alleviates the need to perform repeated frequency transformation using different frequency transformation block sizes. Other examples are described.
    Type: Grant
    Filed: January 29, 2009
    Date of Patent: June 4, 2013
    Assignee: The Nielsen Company (US), LLC
    Inventors: Venugopal Srinivasan, Alexander Topchy
  • Publication number: 20130136276
    Abstract: A method and apparatus for receiving and playing a signal in a radio receiver to suppress microphonic feedback are provided by alternately pitch shifting a received audio signal. The pitch of the received audio signal is alternately shifted up and then down, repeatedly over successive intervals of the audio signal, to produce a pitch swing signal which is then played over a speaker. The alternating pitch shifting prevents the buildup of regenerative feedback normally caused by acoustic vibrations coupling into the radio receiver.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: MOTOROLA SOLUTIONS, INC.
    Inventors: V. C. PRAKASH VK CHACKO, THEAN HAI OOI, KAR BOON OUNG, CHEAH HENG TAN, HUOY THYNG YOW
  • Patent number: 8452588
    Abstract: It is possible to improve quality of a decoding signal in a band spread for estimating a high band from a low band of a decoding signal. A first layer encoder encodes a lower band portion below a predetermined frequency of an input signal so as to generate first layer encoded information. A first layer decoder decodes the first layer encoded information so as to generate a first layer demodulated signal. A second layer encoder divides a high band portion of the input signal, above the predetermined frequency, into a plurality of sub-bands and estimates each of the sub-bands from the input signal or the first layer decoded signal by using the estimation result of the sub-band adjacent to the lower band side so as to generate second encoded information including the estimation results of the sub-bands.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: May 28, 2013
    Assignee: Panasonic Corporation
    Inventors: Tomofumi Yamanashi, Masahiro Oshikiri
  • Publication number: 20130132075
    Abstract: The present invention relates to a postfilter and a postfilter control to be associated with a postfilter for improving perceived quality of speech reconstructed at a speech decoder. The postfilter control comprises means for measuring stationarity of a speech signal reconstructed at a decoder, means for determining a coefficient to a postfilter control parameter based on the measured stationarity, and means for transmitting the determined coefficient to a postfilter, such that the postfilter can process the reconstructed speech signal by applying the determined coefficient to the postfilter control parameter to obtain an enhanced speech signal.
    Type: Application
    Filed: January 21, 2013
    Publication date: May 23, 2013
    Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
    Inventor: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
  • Patent number: 8447617
    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
    Type: Grant
    Filed: March 15, 2010
    Date of Patent: May 21, 2013
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Norbert Rossello, Fabien Klein
  • Patent number: 8447592
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: May 21, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8442817
    Abstract: A voice activity decision apparatus is provided that can accurately decide whether an input signal belongs to a sound interval or a silence interval, even when the input signal has many aperiodic components and/or plural mixed different periodic components. The apparatus 1 comprises: an autocorrelation calculating unit 11 for calculating autocorrelation values of an input signal; a delay calculating unit 12 for calculating plural delays at which autocorrelation values calculated by the autocorrelation calculating unit 11 become maximums; a noise deciding unit 13 for deciding whether the input signal is a noise or not based on the plurality of delays calculated by the delay calculating unit 12; and an activity decision unit 14 for performing the activity decision in terms of the input signal based on results of decision by the noise deciding unit 13 and the input signal.
    Type: Grant
    Filed: December 23, 2004
    Date of Patent: May 14, 2013
    Assignee: NTT DoCoMo, Inc.
    Inventors: Nobuhiko Naka, Tomoyuki Ohya
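The first two units in this abstract (autocorrelation, then the delays at which it peaks) can be sketched in a few lines. This is an illustration under stated assumptions, not the patented decision rule: the abstract does not say how the noise deciding unit uses the delays, so `delays_are_harmonic` is a hypothetical heuristic that treats a signal as periodic when its peak delays are near multiples of the smallest one.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def peak_delays(acf):
    """Non-zero lags at which the autocorrelation has a local maximum."""
    return [lag for lag in range(1, len(acf) - 1)
            if acf[lag - 1] < acf[lag] >= acf[lag + 1]]


def delays_are_harmonic(delays, tolerance=1):
    """Hypothetical check: every delay lies near a multiple of the first."""
    if not delays:
        return False
    base = delays[0]
    return all(abs(d - base * round(d / base)) <= tolerance for d in delays)


if __name__ == "__main__":
    periodic = [1.0, 0.0, -1.0, 0.0] * 8   # period of 4 samples
    delays = peak_delays(autocorrelation(periodic, 9))
    print(delays, delays_are_harmonic(delays))
```

An input whose peak delays do not line up this way would be a candidate for the noise class; the thresholds and the final activity decision would depend on details the abstract does not give.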
  • Publication number: 20130117015
    Abstract: An audio signal decoder includes a context-based spectral value decoder configured to decode a codeword describing one or more spectral values or at least a portion of a number representation thereof in dependence on a context state. The audio signal decoder also includes a context state determinator configured to determine a current context state in dependence on one or more previously decoded spectral values and a time warping frequency-domain-to-time-domain converter configured to provide a time-warped time-domain representation of a given audio frame on the basis of a set of decoded spectral values provided by the context-based spectral value decoder and in dependence on the time warp information. The context-state determinator is configured to adapt the determination of the context state to a change of a fundamental frequency between subsequent audio frames. An audio signal encoder applies a comparable concept.
    Type: Application
    Filed: September 10, 2012
    Publication date: May 9, 2013
    Inventors: Stefan BAYER, Tom BAECKSTROEM, Ralf GEIGER, Bernd EDLER, Sascha DISCH, Lars VILLEMOES
  • Publication number: 20130117014
    Abstract: Disclosed are various embodiments of multiple microphone based pitch detection. In one embodiment, a method includes obtaining a primary signal and a secondary signal associated with multiple microphones. A pitch value is determined based at least in part upon a level difference between the primary and secondary signals. In another embodiment, a system includes a plurality of microphones configured to provide a primary signal and a secondary signal. A level difference detector is configured to determine a level difference between the primary and secondary signals and a pitch identifier is configured to clip the primary and secondary signals based at least in part upon the level difference. In another embodiment, a method determines the presence of voice activity based upon a pitch prediction gain variation that is determined based at least in part upon a pitch lag.
    Type: Application
    Filed: November 7, 2011
    Publication date: May 9, 2013
    Applicant: BROADCOM CORPORATION
    Inventors: Xianxian Zhang, Alfonsus Lunardhi
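The level difference between the primary and secondary microphone signals can be pictured as a per-frame energy ratio in decibels. The sketch below assumes that definition; the abstract does not specify how the level difference is measured or how it drives clipping, and `level_db` and `level_difference_db` are hypothetical names.

```python
import math


def level_db(frame, floor=1e-12):
    """Mean-square level of a frame in dB, floored to avoid log(0)."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(energy, floor))


def level_difference_db(primary, secondary):
    """Primary level minus secondary level, in dB, for one frame pair."""
    return level_db(primary) - level_db(secondary)


if __name__ == "__main__":
    near = [1.0, -1.0, 1.0, -1.0]    # stronger signal at the primary microphone
    far = [0.1, -0.1, 0.1, -0.1]     # same waveform, ten times weaker
    print(level_difference_db(near, far))
```

A large positive difference suggests a source close to the primary microphone, which is the kind of cue a pitch detector could use to weight or clip the two signals.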
  • Patent number: 8438014
    Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: May 7, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
  • Patent number: 8433073
    Abstract: In a sound effect applying apparatus, an input part frequency-analyzes an input signal of sound or voice for detecting a plurality of local peaks of harmonics contained in the input signal. A subharmonics provision part adds a spectrum component of subharmonics between the detected local peaks so as to provide the input signal with a sound effect. An output part converts the input signal of a frequency domain containing the added spectrum component into an output signal of a time domain for generating the sound or voice provided with the sound effect.
    Type: Grant
    Filed: June 22, 2005
    Date of Patent: April 30, 2013
    Assignee: Yamaha Corporation
    Inventors: Yasuo Yoshioka, Alex Loscos
  • Publication number: 20130096912
    Abstract: In one aspect, the invention provides an audio encoding method characterized by a decision being made as to whether the device which will decode the resulting bit stream should apply post filtering including attenuation of interharmonic noise. Hence, the decision whether to use the post filter, which is encoded in the bit stream, is taken separately from the decision as to the most suitable coding mode. In another aspect, there is provided an audio decoding method with a decoding step followed by a post-filtering step, including interharmonic noise attenuation, and being characterized in a step of disabling the post filter in accordance with post filtering information encoded in the bit stream signal. Such a method is well suited for mixed-origin audio signals by virtue of its capability to deactivate the post filter in dependence of the post filtering information only, hence independently of factors such as the current coding mode.
    Type: Application
    Filed: June 23, 2011
    Publication date: April 18, 2013
    Applicant: Dolby International AB
    Inventors: Barbara Resch, Kristofer Kjörling, Lars Villemoes
  • Patent number: 8423371
    Abstract: An encoder is provided that is capable of reducing the degradation of the quality of the decoded signal in the case of band expansion in which the high band of the spectrum of an input signal is estimated from the low band. In this encoder, a first layer encoder encodes an input signal and generates first encoded information, a first layer decoder decodes the first encoded information and generates a first decoded signal, a characteristic judger analyzes the intensity of the harmonic structure of the input signal and generates harmonic characteristic information representing the analysis result, and a second layer encoder changes, on the basis of the harmonic characteristic information, the numbers of bits allocated to parameters included in second encoded information created by encoding the difference between the input signal and the first decoded signal before creating the second encoded information.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: April 16, 2013
    Assignee: Panasonic Corporation
    Inventors: Tomofumi Yamanashi, Masahiro Oshikiri
  • Patent number: 8423357
    Abstract: Embodiments of the invention provide a communication device and methods for generating enhanced audio signals. An audio signal comprising a speech signal and a noise signal is acquired at the communication device. A noise processor of the communication device detects a pitch estimation of the audio signal. Thereafter, the audio signal is processed based on the pitch estimation and processing parameters of the audio signal to remove the noise signal and generate an enhanced audio signal.
    Type: Grant
    Filed: June 16, 2011
    Date of Patent: April 16, 2013
    Inventors: Sandeep Kulakcherla, Alon Konchitsky, Alberto D Berstein
  • Patent number: 8417516
    Abstract: Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.
    Type: Grant
    Filed: January 20, 2012
    Date of Patent: April 9, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-hyun Choo, Lei Miao, Eun-mi Oh
  • Patent number: 8417524
    Abstract: Analyzing an audio interaction is provided. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.
    Type: Grant
    Filed: February 11, 2010
    Date of Patent: April 9, 2013
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
  • Patent number: 8412518
    Abstract: A representation of an audio signal having a first frame, a second frame following the first frame, and a third frame following the second frame, is derived by estimating first warp information for the first and the second frame and second warp information for the second frame and the third frame, the warp information describing a pitch information of the audio signal. First spectral coefficients for the first and the second frame are derived using the first warp information and a first weighted representation of the first and the second frame, the first weighted representation derived by applying a first window function to the first and the second frames, wherein the first window function depends on the first warp information.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: April 2, 2013
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8396703
    Abstract: A band-limited voice signal is processed to reduce its spectral envelope or harmonic structure, or both. The resulting reduced signal is moved into a frequency band above the upper limit frequency of the band-limited voice signal, and then combined with the band-limited voice signal to form a band expanded signal with improved quality and comprehensibility, free of unnatural high-frequency resonances and unnaturally strong high-frequency harmonics.
    Type: Grant
    Filed: March 5, 2009
    Date of Patent: March 12, 2013
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Hiromi Aoyagi
  • Patent number: 8392178
    Abstract: A method of encoding speech, the method comprising: receiving a signal representative of speech to be encoded; at each of a plurality of intervals during the encoding, determining a pitch lag between portions of the signal having a degree of repetition; selecting for a set of said intervals a pitch lag vector from a pitch lag codebook of such vectors, each pitch lag vector comprising a set of offsets corresponding to the offset between the pitch lag determined for each said interval and an average pitch lag for said set of intervals, and transmitting an indication of the selected vector and said average over a transmission medium as part of the encoded signal representative of said speech.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: March 5, 2013
    Assignee: Skype
    Inventor: Koen Bernard Vos
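The offset-vector idea in the abstract above can be illustrated with a minimal sketch. The codebook contents, the squared-error selection rule, and the names `encode_pitch_lags`, `decode_pitch_lags`, and `TOY_CODEBOOK` are assumptions made for illustration only; the abstract does not specify how the vector is chosen or what the codebook holds.

```python
def encode_pitch_lags(lags, codebook):
    """Return (codebook index, average lag) for one set of intervals.

    Assumption: the best vector is the one closest to the true offsets
    in the squared-error sense.
    """
    average = round(sum(lags) / len(lags))
    # Offsets between each interval's pitch lag and the average lag.
    offsets = [lag - average for lag in lags]

    def distance(vector):
        return sum((o - v) ** 2 for o, v in zip(offsets, vector))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    return index, average


def decode_pitch_lags(index, average, codebook):
    """Rebuild the per-interval lags from the transmitted index and average."""
    return [average + offset for offset in codebook[index]]


# Hypothetical four-entry codebook of offset vectors for four intervals.
TOY_CODEBOOK = [
    [0, 0, 0, 0],
    [-2, -1, 1, 2],
    [2, 1, -1, -2],
    [-1, 0, 0, 1],
]

index, average = encode_pitch_lags([78, 79, 81, 82], TOY_CODEBOOK)
decoded = decode_pitch_lags(index, average, TOY_CODEBOOK)
```

Only the index and the average need to be sent, which is the point of the scheme: a short code stands in for the whole set of per-interval lags.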
  • Patent number: 8391212
    Abstract: In an embodiment, a method of frequency domain post-processing is disclosed. The method includes applying an adaptive modification gain factor to each frequency coefficient, and determining gain factors based on Local Masking Magnitude and Local Masked Magnitude.
    Type: Grant
    Filed: May 4, 2010
    Date of Patent: March 5, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Patent number: 8392199
    Abstract: A clipping detection device calculates an amplitude distribution of an input signal for each predetermined period, calculates a deflection degree of the distribution on the basis of the calculated amplitude distribution, and then detects clipping of a communication signal on the basis of the calculated deflection degree of the distribution.
    Type: Grant
    Filed: May 21, 2009
    Date of Patent: March 5, 2013
    Assignee: Fujitsu Limited
    Inventors: Takeshi Otani, Masakiyo Tanaka, Yasuji Ota, Shusaku Ito
  • Patent number: 8386246
    Abstract: A system is described that performs frame erasure concealment to generate frames of an output speech signal corresponding to a series of erased frames of encoded bit-stream in a manner that conceals the quality-degrading effects of such erased frames. In one embodiment, responsive to the detection of a first erased frame in the series, a number of steps are performed. These steps include deriving long-term and short-term synthesis filters based on previously-generated portions of the output speech signal, calculating a ringing signal segment based on the long-term and short-term synthesis filters, and generating a frame of the output speech signal corresponding to the first erased frame by overlap adding the ringing signal segment to an extrapolated waveform. Deriving the long-term filter includes estimating a pitch period based on a previously-generated portion of the output speech signal by finding a lag that minimizes a sum of magnitude difference function.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: February 26, 2013
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
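The pitch-period step named in the abstract above, finding the lag that minimizes a sum of magnitude difference function, can be sketched as follows. This is a generic illustration rather than the patented procedure: the function name `estimate_pitch_period`, the window placement at the end of the signal history, and the lag range are all assumptions.

```python
import math


def estimate_pitch_period(signal, min_lag, max_lag, window):
    """Return the lag in [min_lag, max_lag] with the smallest sum of
    magnitude differences between the most recent `window` samples and
    the same-length segment `lag` samples earlier."""
    if len(signal) < window + max_lag:
        raise ValueError("signal too short for the requested window and lag range")
    recent = signal[-window:]
    best_lag, best_score = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(signal) - window - lag
        delayed = signal[start:start + window]
        # Sum of magnitude differences for this candidate lag.
        score = sum(abs(a - b) for a, b in zip(recent, delayed))
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic "previously generated output": a sine with a 40-sample period.
history = [math.sin(2 * math.pi * n / 40) for n in range(400)]
period = estimate_pitch_period(history, min_lag=20, max_lag=60, window=80)
```

Restricting the lag range matters in practice: a periodic signal also matches itself at multiples of the true period, so an unbounded search could return a multiple instead.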
  • Patent number: 8386243
    Abstract: A method and system for regenerating wideband speech from narrowband speech. The method comprises: receiving samples of a narrowband speech signal in a first range of frequencies; modulating received samples of the narrowband speech signal with a modulation signal having a modulating frequency adapted to upshift each frequency in the first range of frequencies by an amount determined by the modulating frequency wherein the modulating frequency is selected to translate into a target band a selected frequency band within the first range of signals; filtering the modulated samples using a high pass filter to form a regenerated speech signal in the target band, wherein the lower limit of the high pass filter defines the lowermost frequency in the target band; and combining the narrow band speech signal with the regenerated speech signal in the target band to regenerate a wideband speech signal.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: February 26, 2013
    Assignee: Skype
    Inventors: Mattias Nilsson, Soren Vang Andersen, Koen Bernard Vos
  • Patent number: 8386245
    Abstract: There is provided a speech encoder for performing an algorithm that comprises obtaining (205) a plurality of open-loop pitch candidates from a current frame of a speech signal, the plurality of open-loop pitch candidates including a first open-loop pitch candidate and a second open-loop pitch candidate; obtaining (205) a voicing information from one or more previous frames; and selecting (280) one of the plurality of open-loop pitch candidates as a final pitch of the current frame using the voicing information from the one or more previous frames. In one aspect, the voicing information from the one or more previous frames includes a previous pitch of the one or more previous frames. In a further aspect, selecting the final pitch of the current frame includes selecting (210) an initial open-loop pitch that has the maximum long-term correlation value.
    Type: Grant
    Filed: October 27, 2006
    Date of Patent: February 26, 2013
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
  • Publication number: 20130046533
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
    Type: Application
    Filed: October 19, 2012
    Publication date: February 21, 2013
    Applicant: RED SHIFT COMPANY, LLC
    Inventor: RED SHIFT COMPANY, LLC
  • Patent number: 8380496
    Abstract: A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, are provided to a decoder for reconstructing the audio signal. The contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited by a maximum value.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: February 19, 2013
    Assignee: Nokia Corporation
    Inventors: Anssi Rämö, Jani Nurminen, Sakari Himanen, Ari Heikkinen
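The contour simplification described in the abstract above can be illustrated with a greedy sketch for the linear case. The greedy strategy and the names `segment_pitch_contour` and `reconstruct_contour` are assumptions; the abstract only states that segments are chosen by pre-selected criteria, with a bounded deviation and an optional maximum length.

```python
def segment_pitch_contour(pitch, max_deviation, max_length=None):
    """Return (index, value) end points of linear segments such that no
    pitch value lies more than `max_deviation` from its segment."""
    if not pitch:
        return []

    def fits(start, end):
        # Check every interior point against the straight line start..end.
        span = end - start
        for i in range(start + 1, end):
            line = pitch[start] + (pitch[end] - pitch[start]) * (i - start) / span
            if abs(line - pitch[i]) > max_deviation:
                return False
        return True

    breakpoints = [0]
    start, last = 0, len(pitch) - 1
    while start < last:
        end = start + 1
        # Extend the segment while it stays within the length and deviation limits.
        while (end < last
               and (max_length is None or end + 1 - start <= max_length)
               and fits(start, end + 1)):
            end += 1
        breakpoints.append(end)
        start = end
    return [(i, pitch[i]) for i in breakpoints]


def reconstruct_contour(end_points):
    """Decoder side: rebuild the contour by linear interpolation."""
    if not end_points:
        return []
    contour = [end_points[0][1]]
    for (i0, v0), (i1, v1) in zip(end_points, end_points[1:]):
        for i in range(i0 + 1, i1 + 1):
            contour.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    return contour


pitch = [100, 102, 104, 106, 108, 105, 102, 99]
end_points = segment_pitch_contour(pitch, max_deviation=0.5)
rebuilt = reconstruct_contour(end_points)
```

In this toy contour, three end points replace eight pitch values, and the decoder recovers the contour by interpolating between them.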
  • Publication number: 20130041657
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Applicant: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Publication number: 20130041656
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and an estimated fractional chirp rate of the harmonics at the estimated pitch. The estimated pitch and the estimated fractional chirp rate may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Applicant: The Intellisis Corporation
    Inventors: David C. Bradley, Daniel S. Goldin, Rodney Gateau, Nicholas K. Fisher, Robert N. Hilton, Derrick R. Roos, Eric Wiewiora
  • Patent number: 8374856
    Abstract: A method and apparatus for concealing frame loss and an apparatus for transmitting and receiving a speech signal that are capable of reducing speech quality degradation caused by packet loss are provided. In the method, when loss of a current received frame occurs, a random excitation signal having the highest correlation with a periodic excitation signal (i.e., a pitch excitation signal) decoded from a previous frame received without loss is used as a noise excitation signal to recover an excitation signal of a current lost frame. Furthermore, a third, new attenuation constant (AS) is obtained by summing a first attenuation constant (NS) obtained based on the number of continuously lost frames and a second attenuation constant (PS) predicted in consideration of change in amplitude of previously received frames to adjust the amplitude of the recovered excitation signal for the current lost frame.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: February 12, 2013
    Assignee: Intellectual Discovery Co., Ltd.
    Inventors: Hong Kook Kim, Choong Sang Cho