Normalizing Patents (Class 704/224)
  • Patent number: 8219393
    Abstract: An error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on predetermined criteria when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on predetermined criteria when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: July 10, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Eun-mi Oh, Ki-hyun Choo, Ho-sang Sung, Chang-yong Son, Jung-hoe Kim, Kang-eun Lee
  • Patent number: 8217811
    Abstract: Methods and apparatus for iteratively encoding a portion of a signal are described in which the portion of the signal is quantised and an output bit count is estimated based on the sum of logarithms to base n of values of each sample in the plurality of quantised samples and the total number of samples. The output bit count corresponds to an estimate of the output bit count for the portion of the signal once encoded using a code, such as a Huffman code.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: July 10, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventors: David Hargreaves, Esfandiar Zavarehei
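
The estimate described in the abstract of 8217811 can be illustrated with a minimal sketch. The exact formula is not given in the abstract, so the form used here (sum of log(|q| + 1) plus a constant per-sample overhead) and the names `estimate_bit_count`, `choose_step`, and `per_sample_overhead` are assumptions made for illustration only.

```python
import math


def estimate_bit_count(quantised, per_sample_overhead=1.0, base=2):
    """Rough bit-count estimate for a block of quantised integer samples.

    Assumed form (not taken from the patent): the sum of log_base(|q| + 1)
    over all samples, plus a constant overhead for each sample.
    """
    log_sum = sum(math.log(abs(q) + 1, base) for q in quantised)
    return log_sum + per_sample_overhead * len(quantised)


def choose_step(samples, bit_budget, max_step=1024):
    """Double the quantiser step until the estimate fits the bit budget.

    Hypothetical iteration loop; a real encoder would then run the actual
    entropy coder (e.g. a Huffman code) on the chosen quantisation.
    """
    step = 1
    while step <= max_step:
        quantised = [round(s / step) for s in samples]
        if estimate_bit_count(quantised) <= bit_budget:
            return step, quantised
        step *= 2
    raise ValueError("bit budget cannot be met within max_step")
```

The point of such an estimate is that it is much cheaper than running the entropy coder on every iteration, at the cost of being only approximate.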
  • Patent number: 8214205
    Abstract: A speech enhancement apparatus and method and a computer-readable recording medium having a program recorded thereon execute a speech enhancement method. The speech enhancement apparatus includes a spectrum subtraction unit generating a subtracted spectrum by subtracting an estimated noise spectrum from a received speech spectrum, a correction function modeling unit generating a correction function to minimize a noise spectrum using variation of a noise spectrum included in training data, and a spectrum correction unit generating a corrected spectrum by correcting the subtracted spectrum using the correction function.
    Type: Grant
    Filed: February 3, 2006
    Date of Patent: July 3, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Giljin Jang, Jeongsu Kim, Kwangcheol Oh, Sungcheol Kim
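
As a rough illustration of the two stages named in the abstract of 8214205 (spectral subtraction followed by a correction step), the sketch below works on plain lists of magnitude values. The flooring at zero and the per-bin gain used as a stand-in for the trained correction function are assumptions, not details from the patent.

```python
def spectral_subtract(speech_mag, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude from each spectral bin.

    Results below `floor` are clamped, a common safeguard against
    negative magnitudes (assumed here, not stated in the abstract).
    """
    return [max(s - n, floor) for s, n in zip(speech_mag, noise_mag)]


def apply_correction(subtracted, gains):
    """Apply a hypothetical per-bin correction, modeled as a simple gain.

    In the patent the correction function is derived from training data;
    here `gains` is just a placeholder list supplied by the caller.
    """
    return [x * g for x, g in zip(subtracted, gains)]
```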
  • Patent number: 8209167
    Abstract: The mobile radio terminal includes a speech input unit which inputs a speech signal obtained from speech of a speaking person, an estimating unit which estimates a speech style of the speaking person from the speech signal, and a converting unit which converts the speech signal into a converted speech signal in accordance with the estimated speech style.
    Type: Grant
    Filed: September 17, 2008
    Date of Patent: June 26, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kazunori Imoto
  • Patent number: 8185386
    Abstract: A method for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder receives encoded frames of compressed speech information transmitted from an encoder. The method determines whether an encoded frame has been lost, corrupted in transmission, or erased, synthesizes properly received frames, and decides on an overlap-add window to use in combining a portion of the synthesized speech signal with a subsequent speech signal resulting from a received and decoded packet, where the size of the overlap-add window is based on the unavailability of packets. If it is determined that an encoded frame has been lost, corrupted in transmission, or erased, the method performs an overlap-add operation on the portion of the synthesized speech signal and the subsequent speech signal, using the decided-on overlap-add window.
    Type: Grant
    Filed: July 2, 2010
    Date of Patent: May 22, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: David A. Kapilow
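
The overlap-add step in the abstract of 8185386 can be sketched as a cross-fade whose length grows with the number of lost frames. The linear ramp, the window-length rule, and all parameter values below are illustrative assumptions; the abstract only states that the window size depends on the unavailability of packets.

```python
def ola_window_length(lost_frames, base_len=4, step=2, max_len=16):
    """Pick a longer overlap window after a longer erasure (assumed rule)."""
    return min(base_len + step * max(lost_frames - 1, 0), max_len)


def overlap_add(tail, head):
    """Cross-fade the end of the synthesized (concealment) signal into the
    start of the newly decoded signal using a linear ramp.

    `tail` fades out while `head` fades in over the shorter of the two.
    """
    n = min(len(tail), len(head))
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight on the newly decoded signal
        out.append((1 - w) * tail[i] + w * head[i])
    return out
```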
  • Patent number: 8165874
    Abstract: A system, method, and program product for processing voice data in a conversation between two persons to determine characteristic conversation patterns. The system includes: a variation calculator for calculating a variation of a speech ratio of a first speaker and a variation calculator for calculating a variation of a speech ratio of a second speaker; a difference calculator for calculating a difference data string; a smoother for generating a smoothed difference data string; and a presenter for presenting the difference between the variation of the speech ratio of the first speaker and the speech ratio of the second speaker. The method includes: calculating a variation of a speech ratio of a first speaker and a second speaker; calculating a difference data string; generating a smoothed difference data string; and grouping them according to their patterns.
    Type: Grant
    Filed: March 6, 2009
    Date of Patent: April 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masafumi Nishimura
  • Patent number: 8160866
    Abstract: The present invention can recognize both English and Chinese at the same time. The most important skill is that the features of all English words (without samples) are entirely extracted from the features of Chinese syllables. The invention normalizes the signal waveforms of variable lengths for English words (Chinese syllables) such that the same words (syllables) can have the same features at the same time position. Hence the Bayesian classifier can recognize both the fast and slow utterance of sentences. The invention can improve the feature such that the speech recognition of the unknown English (Chinese) is guaranteed to be correct. Furthermore, since the invention can create the features of English words from the features of Chinese syllables, it can also create the features of other languages from the features of Chinese syllables and hence it can also recognize other languages, such as German, French, Japanese, Korean, Russian, etc.
    Type: Grant
    Filed: October 10, 2008
    Date of Patent: April 17, 2012
    Inventors: Tze Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Publication number: 20120090028
    Abstract: The invention features systems and methods for detecting and mitigating network attacks in a Voice-Over-IP (VoIP) network. A server is configured to receive information related to a mitigation action for a call. The information can include a complexity level for administering an audio challenge-response test to the call and an identification of the call. The server also generates i) a routing label based on the identification of the call, and ii) a script defining a plurality of variables that store identifications of a plurality of altered sound files for the audio challenge-response test. Each altered sound file is randomly selected by the server subject to one or more constraints associated with the complexity level. The server is further configured to transmit the script to a guardian module and the routing label to a gateway.
    Type: Application
    Filed: October 12, 2011
    Publication date: April 12, 2012
    Inventors: David Lapsley, Wassim Matragi, Miri Mansur, Jonathan Klotzbach, Ti-yuan Dean Shu, Sri Chary, Joby Joseph, Mark Topham, Kenneth Dumble
  • Patent number: 8145478
    Abstract: An audio signal band expanding apparatus (100a) includes a harmonic generator (3) that receives an input audio signal having a predetermined band and generates, based on the input audio signal, harmonic signals, and an adder (2) that adds the harmonic signals generated by the harmonic generator (3) to the input audio signal. The harmonic generator (3) simulates the input-output characteristics of a predetermined amplifier or that of a device to generate the harmonic signals from the input audio signal.
    Type: Grant
    Filed: May 12, 2006
    Date of Patent: March 27, 2012
    Assignee: Panasonic Corporation
    Inventor: Kazuya Iwata
  • Patent number: 8126708
    Abstract: A dynamic normalization factor for a current frame of a signal is determined to reduce loss in precision for low-level signals. The normalization factor depends on an amplitude of the current frame of the signal. The normalization factor also depends on values of filter states after one or more operations were performed on a previous frame of a normalized signal and on the normalization factor for the previous frame. The current frame of the signal is normalized based on the normalization factor that is determined. The states' normalization factor may be adjusted based on the normalization factor that is determined.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: February 28, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
  • Patent number: 8121834
    Abstract: A method of modifying acoustic characteristics of an original audio signal as a function of modification instructions relating at least to the fundamental frequency and the spectral envelope of the original signal.
    Type: Grant
    Filed: March 12, 2008
    Date of Patent: February 21, 2012
    Assignee: France Telecom
    Inventors: Olivier Rosec, Didier Cadic
  • Patent number: 8103020
    Abstract: A system and method are disclosed for enhancing audio signals by nonlinear spectral operations. Successive portions of the audio signal are processed using a subband filter bank. A nonlinear modification is applied to the output of the subband filter bank for each successive portion of the audio signal to generate a modified subband filter bank output for each successive portion. The modified subband filter bank output for each successive portion is processed using an appropriate synthesis subband filter bank to construct a modified time-domain audio signal. High modulation frequency portions of the audio signal may be emphasized or de-emphasized, as desired. The modification may be applied within one or more frequency bands.
    Type: Grant
    Filed: August 15, 2007
    Date of Patent: January 24, 2012
    Assignee: Creative Technology Ltd
    Inventors: Carlos Avendano, Michael Goodwin, Ramkumar Sridharan, Martin Wolters
  • Patent number: 8054948
    Abstract: A system and associated methods provide an audio experience such that a user spatially perceives one or more audio events. One particular method set forth involves obtaining audio events and presenting the audio events so that an audio experience is provided. According to one embodiment of the method, upon obtaining audio events, the audio events are associated with one or more corresponding audio components. Thereafter, the audio experience is determined based on the audio events and associated audio components. The audio experience is then presented such that the user may spatially perceive the audio events.
    Type: Grant
    Filed: June 28, 2007
    Date of Patent: November 8, 2011
    Assignee: Sprint Communications Company L.P.
    Inventors: Michael A. Gailloux, Michael W. Kanemoto
  • Patent number: 8041561
    Abstract: Methods and apparatus, in the context of speech recognition, for compensating in the cepstral domain for the effect of an interfering signal by using a reference signal.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: October 18, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine Deligne, Ramesh A. Gopinath
  • Patent number: 7966175
    Abstract: Methods, devices, and systems for coding and decoding audio are disclosed. Digital samples of an audio signal are transformed from the time domain to the frequency domain. The resulting transform coefficients are coded with a fast lattice vector quantizer. The quantizer has a high rate quantizer and a low rate quantizer. The high rate quantizer includes a scheme to truncate the lattice. The low rate quantizer includes a table based searching method. The low rate quantizer may also include a table based indexing scheme. The high rate quantizer may further include Huffman coding for the quantization indices of transform coefficients to improve the quantizing/coding efficiency.
    Type: Grant
    Filed: October 18, 2006
    Date of Patent: June 21, 2011
    Assignee: Polycom, Inc.
    Inventor: Minjie Xie
  • Patent number: 7962334
    Abstract: A receiving device receives a transmission unit signal that is sent from a sending end and accommodates a result of dividing, the result of the dividing being obtained by quantizing a value based on relative differences between a plurality of sampling values having temporal prior-posterior relationship therebetween, and dividing data produced in a time series in accordance with a result of the quantizing, at the sending end.
    Type: Grant
    Filed: October 27, 2004
    Date of Patent: June 14, 2011
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Atsushi Tashiro
  • Publication number: 20110103603
    Abstract: A noise reduction system and a noise reduction method are provided. The noise reduction system comprises a uni-directional microphone, an omni-directional microphone and a signal processing module. The signal processing module comprises an adaptive noise control (ANC) unit, a main noise reduction unit and an optimizing unit. The uni-directional microphone senses a first audio source to output a first audio signal, and the omni-directional microphone senses a second audio source to output a second audio signal. The ANC unit executes an adaptive noise control to output an estimated signal according to the first audio signal and the second audio signal. The main noise reduction unit executes a main noise reduction process to output a de-noise speech signal according to the estimated signal and the second audio signal. The optimizing unit executes an optimizing process to output an optimized speech signal according to the de-noise speech signal.
    Type: Application
    Filed: April 30, 2010
    Publication date: May 5, 2011
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventors: Shih-Yu Pan, Min-Qiao Lu, Jiun-Bin Huang, Shyang-Jye Chang
  • Patent number: 7930178
    Abstract: A frame of a speech signal is converted into the spectral domain to identify a plurality of frequency components and an energy value for the frame is determined. The plurality of frequency components is divided by the energy value for the frame to form energy-normalized frequency components. A model is then constructed from the energy-normalized frequency components and can be used for speech recognition and speech enhancement.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: April 19, 2011
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Alejandro Acero, Amarnag Subramanya, Zicheng Liu
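
The normalization in the abstract of 7930178 divides each frequency component of a frame by the frame's energy. A minimal sketch follows; defining the energy as the sum of squared magnitudes is an assumption, since the abstract does not say how the energy value is computed.

```python
def energy_normalize(frame_spectrum):
    """Divide each frequency component by the frame energy.

    Energy is taken as the sum of squared magnitudes (assumed definition).
    A silent frame is returned unchanged to avoid division by zero.
    """
    energy = sum(abs(x) ** 2 for x in frame_spectrum)
    if energy == 0:
        return list(frame_spectrum)
    return [x / energy for x in frame_spectrum]
```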
  • Patent number: 7912228
    Abstract: In a method and equipment for operating a voice-supported system, such as a communications and/or intercom/two-way intercom device in a motor vehicle, using at least one microphone and at least one loudspeaker to reproduce a signal generated by the microphone, as well as a bandpass filter configured between the microphone and the loudspeaker, a power of the signal as a function of a frequency is determined, and the bandpass filter is adjusted as a function of at least one local maximum of the power of the signal as a function of the frequency.
    Type: Grant
    Filed: July 18, 2003
    Date of Patent: March 22, 2011
    Assignees: Volkswagen AG, Audi AG
    Inventors: Brian Michael Finn, Shawn K. Steenhagen
  • Patent number: 7831025
    Abstract: A method and system for administering a subjective listening test to remote users. A user can participate in a subjective listening test, such as an MOS listening test, over a telephone call. The telephone call is received and audio recordings are sequentially played over the telephone call. Quality ratings corresponding to the audio recordings are input by the user over the telephone call. The user can input digits corresponding to the quality ratings. This allows a user to take part in a subjective listening test without traveling to a lab.
    Type: Grant
    Filed: May 15, 2006
    Date of Patent: November 9, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: John D. Francis, Laurie F. Garrison, James H. James
  • Patent number: 7797161
    Abstract: A method for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder receives encoded frames of compressed speech information transmitted from an encoder. The method determines whether an encoded frame has been lost, corrupted in transmission, or erased, synthesizes properly received frames, and decides on an overlap-add window to use in combining a portion of the synthesized speech signal with a subsequent speech signal resulting from a received and decoded packet, where the size of the overlap-add window is based on the unavailability of packets. If it is determined that an encoded frame has been lost, corrupted in transmission, or erased, the method performs an overlap-add operation on the portion of the synthesized speech signal and the subsequent speech signal, using the decided-on overlap-add window.
    Type: Grant
    Filed: September 12, 2006
    Date of Patent: September 14, 2010
    Inventor: David A. Kapilow
  • Patent number: 7797157
    Abstract: Channel normalization for automatic speech recognition is provided. Statistics are measured from an initial portion of a speech utterance. Feature normalization parameters are estimated based on the measured statistics and a statistically derived mapping relating measured statistics and feature normalization parameters. In some examples, the measured statistics comprise measures of an energy from the initial portion of the speech utterance. In some examples, measures of the energy comprise extreme values of the energy.
    Type: Grant
    Filed: January 10, 2005
    Date of Patent: September 14, 2010
    Assignee: Voice Signal Technologies, Inc.
    Inventors: Igor Zlokarnik, Laurence S. Gillick, Jordan Cohen
  • Publication number: 20100217584
    Abstract: A speech analysis device which accurately analyzes an aperiodic component included in speech in a practical environment where there is background noise includes: a frequency band division unit which divides, into bandpass signals each associated with a corresponding one of frequency bands, an input signal representing a mixed sound of background noise and speech; a noise interval identification unit which identifies a noise interval and a speech interval of the input signal; an SNR calculation unit which calculates an SN ratio; a correlation function calculation unit which calculates an autocorrelation function of each bandpass signal; a correction amount determination unit which determines a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and an aperiodic component ratio calculation unit which calculates, for each frequency band, an aperiodic component ratio of the aperiodic component, based on the determined correction amount and the calculated autocorrelation function.
    Type: Application
    Filed: May 4, 2010
    Publication date: August 26, 2010
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Patent number: 7765221
    Abstract: Methods and apparatus, including computer systems and program products, for normalizing computer-represented collections of objects. A first minimum value can be normalized based on a second minimum value of a universal set object that corresponds to the first set object. The second minimum value is both a minimum value supported by a data type (e.g., 1-byte integer) and a minimum value defined to be in the universal set object (e.g., 0 for a universal set of all natural numbers). Similarly, a first maximum value can be normalized based on a second maximum value of the universal set object where the second maximum value is both a maximum value supported by a data type and in the universal set object. Intervals can be normalized, which can involve replacing half-open intervals with equivalent closed intervals. Also, a consecutively ordered, uninterrupted, sequence of values of a set object can be normalized.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: July 27, 2010
    Assignee: SAP AG
    Inventor: Peter K. Zimmerer
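
For integer-valued sets, one plausible reading of the interval normalization in the abstract of 7765221 is to rewrite an open bound as the adjacent closed bound and to limit the bounds to the range of the underlying data type. The `Interval` class, the clamping behavior, and the signed 1-byte default range below are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    low: int
    high: int
    low_closed: bool = True
    high_closed: bool = True


def normalize(iv, type_min=-128, type_max=127):
    """Return an equivalent closed integer interval.

    An open bound is moved one step inward, then both bounds are clamped
    to the data type's range (assumed here to be a signed 1-byte integer).
    """
    low = iv.low if iv.low_closed else iv.low + 1
    high = iv.high if iv.high_closed else iv.high - 1
    return Interval(max(low, type_min), min(high, type_max))
```

With a single canonical form, two intervals describing the same set of integers compare equal, which is the usual motivation for this kind of normalization.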
  • Publication number: 20100174540
    Abstract: Methods, media and apparatus for smoothing a time-varying level of a signal. A method includes estimating a time-varying probability density of a short-term level of the signal and smoothing a level of the signal by using the probability density. The signal may be an audio signal. The short-term level and the smoothed level may be time series, each having current and previous time indices. Here, before the smoothing, computing a probability of the smoothed level at the previous time index may occur. Before the smoothing, calculating smoothing parameters using the probability density may occur. Calculating the smoothing parameters may include calculating the smoothing parameters using the smoothed level at the previous time index, the short-term level at the current time index and the probability of the smoothed level at the previous time index. Calculating the smoothing parameters may include calculating the smoothing parameters using breadth of the estimated probability density.
    Type: Application
    Filed: July 11, 2008
    Publication date: July 8, 2010
    Applicant: Dolby Laboratories Licensing Corporation
    Inventor: Alan Jeffrey Seefeldt
  • Patent number: 7752039
    Abstract: A method for coding speech or other generic signals includes dividing a speech signal into a plurality of frames, and dividing at least one of the plurality of frames into at least two subframe units. A search for a fixed codebook contribution and an adaptive codebook contribution for subframe units is conducted. At least one subframe unit is selected to be coded without the fixed codebook contribution. The encoder may iteratively arrange and encode subframes differently for the same frame, and select for transmission that arrangement that minimizes an error measure across the frame. Various embodiments are shown, as are embodied computer programs, a decoder, and a communication system.
    Type: Grant
    Filed: November 1, 2005
    Date of Patent: July 6, 2010
    Assignee: Nokia Corporation
    Inventor: Bruno Bessette
  • Publication number: 20100145692
    Abstract: The present invention relates to a postfilter and a postfilter control to be associated with a postfilter for improving perceived quality of speech reconstructed at a speech decoder. The postfilter control comprises means for measuring stationarity of a speech signal reconstructed at a decoder, means for determining a coefficient to a postfilter control parameter based on the measured stationarity, and means for transmitting the determined coefficient to a postfilter, such that the postfilter can process the reconstructed speech signal by applying the determined coefficient to the postfilter control parameter to obtain an enhanced speech signal.
    Type: Application
    Filed: November 10, 2007
    Publication date: June 10, 2010
    Inventor: Volodya Grancharov
  • Publication number: 20100121634
    Abstract: The invention relates to audio signal processing. More specifically, the invention relates to enhancing entertainment audio, such as television audio, to improve the clarity and intelligibility of speech, such as dialog and narrative audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
    Type: Application
    Filed: February 20, 2008
    Publication date: May 13, 2010
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Hannes Muesch
  • Patent number: 7711556
    Abstract: Methods and systems for filtering synthesized or reconstructed speech are implemented. A filter based on a set of linear predictive coding (LPC) coefficients is constructed by transforming the LPC coefficients to the pseudo-cepstrum, a domain existing between LPC domain and the line spectral frequency (LSF) domain. The resulting filter can emphasize spectral frequencies associated with various formants, or spectral peaks, of an inverse transfer function relating to the LPC coefficients, and can de-emphasize spectral frequencies associated with various spectral minima, or spectral valleys, of the inverse transfer function relating to the LPC coefficients.
    Type: Grant
    Filed: August 1, 2007
    Date of Patent: May 4, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Hong-Goo Kang, Hong Kook Kim
  • Patent number: 7711552
    Abstract: A filter apparatus for filtering a time domain input signal to obtain a time domain output signal, which is a representation of the time domain input signal filtered using a filter characteristic having a non-uniform amplitude/frequency characteristic, comprises a complex analysis filter bank for generating a plurality of complex subband signals from the time domain input signal, a plurality of intermediate filters, wherein at least one of the intermediate filters of the plurality of the intermediate filters has a non-uniform amplitude/frequency characteristic, wherein the plurality of intermediate filters have a shorter impulse response compared to an impulse response of a filter having the filter characteristic, and wherein the non-uniform amplitude/frequency characteristics of the plurality of intermediate filters together represent the non-uniform filter characteristic, and a complex synthesis filter bank for synthesizing the output of the intermediate filters to obtain the time domain output signal.
    Type: Grant
    Filed: September 1, 2006
    Date of Patent: May 4, 2010
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Publication number: 20100106494
    Abstract: A coded code string from an input terminal 110 is demultiplexed by a demultiplexer circuit 101, normalization coefficient information in the code string is sent to a normalization coefficient information increasing/decreasing circuit 102, addition or subtraction of a positive value is performed, and level adjustment of a signal is performed. A normalization coefficient information cutoff amount calculating circuit 103 calculates the cutoff amount for a case where the subtraction amount of normalization coefficient information is larger than normalization coefficient information and normalization coefficient information after subtraction is cut off at the minimum possible value. A gain control function generation information modifying circuit 104 modifies gain control function generation information according to the cutoff amount.
    Type: Application
    Filed: June 26, 2008
    Publication date: April 29, 2010
    Inventor: Hiroyuki Honma
  • Publication number: 20100094622
    Abstract: Systems, methods, and apparatus for processing a speech utterance or audio record that includes receiving one or more feature vectors characterizing the speech utterance or audio record, each feature vector having a plurality of feature elements, each feature element being associated with a spectral representation of a characteristic of one of a plurality of sequential segments of the speech utterance or audio record; and processing the one or more feature vectors in a rank order filter to obtain one or more normalized feature vectors, each normalized feature vector having a plurality of normalized feature elements corresponding to the plurality of feature elements.
    Type: Application
    Filed: September 22, 2009
    Publication date: April 15, 2010
    Applicant: Nexidia Inc.
    Inventors: Peter S. Cardillo, Mark A. Clements
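
The abstract of 20100094622 does not say how the rank order filter output is turned into normalized features. The sketch below shows one possible interpretation for a single feature element tracked over time: a sliding-window rank order filter (the median by default) whose output is subtracted from the current value. The subtraction step, the window size, and the function names are all assumptions.

```python
def rank_order_filter(track, half_window=2, rank_fraction=0.5):
    """Sliding-window rank order filter over one feature element's track.

    For each position, sort the surrounding window and pick the value at
    the given rank; rank_fraction=0.5 selects the (lower) median.
    """
    n = len(track)
    out = []
    for t in range(n):
        window = sorted(track[max(0, t - half_window):min(n, t + half_window + 1)])
        out.append(window[int(rank_fraction * (len(window) - 1))])
    return out


def rank_normalize(track, half_window=2, rank_fraction=0.5):
    """Subtract the rank order filter output from each value (assumed use)."""
    baseline = rank_order_filter(track, half_window, rank_fraction)
    return [x - b for x, b in zip(track, baseline)]
```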
  • Publication number: 20100030555
    Abstract: A clipping detection device calculates an amplitude distribution of an input signal for each predetermined period, calculates a deflection degree of the distribution on the basis of the calculated amplitude distribution, and then detects clipping of a communication signal on the basis of the calculated deflection degree of the distribution.
    Type: Application
    Filed: May 21, 2009
    Publication date: February 4, 2010
    Applicant: FUJITSU LIMITED
    Inventors: Takeshi OTANI, Masakiyo TANAKA, Yasuji OTA, Shusaku ITO
  • Patent number: 7653543
    Abstract: The present invention is directed toward a method, device, and system for providing a high quality communication session. The system provides a way of determining speech characteristics of participants in the communication session and adjusting, if necessary, signals from a speaker to a listener such that the listener can more intelligibly understand what the speaker is saying.
    Type: Grant
    Filed: March 24, 2006
    Date of Patent: January 26, 2010
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Jonathan R. Yee-Hang Choy, Andrew W. Lang, David Preshan Thambiratnam, Paul Roller Michaelis
  • Patent number: 7643991
    Abstract: The present invention provides for processing voice data. The vocalic of at least one word associated with the electronic voice signal is elongated. The magnitude of at least one consonant spike of the at least one word associated with the electronic voice signal is increased. Through the emphasis of the consonants, intelligibility of speech is increased.
    Type: Grant
    Filed: August 12, 2004
    Date of Patent: January 5, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Recep Ismail Haritaoglu, Paula Kwit, Robert Bruce Mahaffey, Thomas Guthrie Zimmerman
  • Patent number: 7630892
    Abstract: A method and apparatus are provided that perform text normalization and inverse text normalization using a single grammar. During text normalization, a context free transducer identifies a second string of symbols from a first string of symbols it receives. During inverse text normalization, the context free transducer identifies the first string of symbols after receiving the second string of symbols.
    Type: Grant
    Filed: September 10, 2004
    Date of Patent: December 8, 2009
    Assignee: Microsoft Corporation
    Inventors: Qiang Wu, Rachel I. Morton, Li Jiang
  • Patent number: 7627469
    Abstract: Any disagreement of the power level before encoding an audio signal and the power level after encoding the audio signal is adjusted to improve the sound quality to the auditory sense.
    Type: Grant
    Filed: May 19, 2005
    Date of Patent: December 1, 2009
    Assignee: Sony Corporation
    Inventors: Benjamin Frederic Nettre, Keisuke Toyama, Shiro Suzuki
  • Patent number: 7627477
    Abstract: The present invention provides an innovative technique for rapidly and accurately determining whether two audio samples match, as well as being immune to various kinds of transformations, such as playback speed variation. The relationship between the two audio samples is characterized by first matching certain fingerprint objects derived from the respective samples. A set (230) of fingerprint objects (231,232), each occurring at a particular location (242), is generated for each audio sample (210). Each location (242) is determined in dependence upon the content of the respective audio sample (210) and each fingerprint object (232) characterizes one or more local features (222) at or near the respective particular location (242). A relative value is next determined for each pair of matched fingerprint objects. A histogram of the relative values is then generated. If a statistically significant peak is found, the two audio samples can be characterized as substantially matching.
    Type: Grant
    Filed: October 21, 2004
    Date of Patent: December 1, 2009
    Assignee: Landmark Digital Services, LLC
    Inventors: Avery Li-Chun Wang, Daniel Culbert
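    The histogram step in the abstract above can be illustrated with a minimal sketch. It assumes that each matched pair of fingerprint objects is reduced to its two locations in time and that the "relative value" is the time offset between them; the function names, the bin width, and the peak thresholds are hypothetical choices for illustration and are not taken from the patent.

```python
from collections import Counter

def histogram_peak(matched_pairs, bin_width=0.1):
    """Histogram the relative values (here: time offsets t2 - t1) of matched
    fingerprint pairs and return (offset_of_tallest_bin, count_in_that_bin)."""
    bins = Counter(round((t2 - t1) / bin_width) for t1, t2 in matched_pairs)
    if not bins:
        return None, 0
    offset_bin, count = bins.most_common(1)[0]
    return offset_bin * bin_width, count

def substantially_match(matched_pairs, min_count=5, min_fraction=0.5):
    """Assumed stand-in for a 'statistically significant peak': the tallest bin
    must hold at least min_count pairs and at least min_fraction of all pairs."""
    _, count = histogram_peak(matched_pairs)
    return count >= min_count and count >= min_fraction * len(matched_pairs)

# Ten pairs share an offset of 2.0 s; three spurious matches do not.
aligned = [(float(i), float(i) + 2.0) for i in range(10)]
spurious = [(0.0, 5.3), (1.0, 0.2), (3.0, 9.9)]
print(histogram_peak(aligned + spurious))
print(substantially_match(aligned + spurious))
```

    A production system would replace the fixed thresholds with a proper significance test; the sketch only shows how a shared offset produces a dominant histogram bin.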
  • Patent number: 7624008
    Abstract: Methods and devices for objectively predicting perceptual quality of speech signals degraded in a speech processing/transporting system which may have poor prediction results for degraded signals including extremely weak or silent portions. Improvement is achieved by applying a first scaling step in a pre-processing stage with a first scaling factor which is a function of a reciprocal value of power of the output signal increased by an adjustment value, and by a second scaling step with a second scaling factor which is substantially equal to the first scaling factor raised to an exponential value and with an adjustment value between zero and one. The second scaling step may be performed at various locations in the device. The adjustment values are adjusted using test signals with well-defined subjective quality scores.
    Type: Grant
    Filed: March 1, 2002
    Date of Patent: November 24, 2009
    Assignee: Koninklijke KPN N.V.
    Inventors: John Gerard Beerends, Andries Pieter Hekstra
  • Publication number: 20090281800
    Abstract: A speech intelligibility enhancement (SIE) system and method is described that improves the intelligibility of a speech signal to be played back by an audio device when the audio device is located in an environment with loud acoustic background noise. In an embodiment, the audio device comprises a near-end telephony terminal and the speech signal comprises a speech signal received over a communication network from a far-end telephony terminal for playback at the near-end telephony terminal.
    Type: Application
    Filed: May 12, 2009
    Publication date: November 12, 2009
    Applicant: BROADCOM CORPORATION
    Inventors: Wilfrid LeBlanc, Juin-Hwey Chen, Jes Thyssen
  • Patent number: 7613611
    Abstract: Provided is a method and an apparatus for vocal-cord signal recognition. A signal processing unit receives and digitalizes a vocal cord signal, and a noise removing unit removes channel noise included in the vocal cord signal. A feature extracting unit extracts a feature vector from the vocal cord signal, which has the channel noise removed therefrom, and a recognizing unit calculates a similarity between the vocal cord signal and the learned model parameter. Consequently, the apparatus is robust in a noisy environment.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: November 3, 2009
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kwan Hyun Cho, Mun Sung Han, Young Giu Jung, Hee Sook Shin, Jun Seok Park, Dong Won Han
  • Patent number: 7606702
    Abstract: A code separation/decoding unit restores a vocal tract characteristic sp1 and a vocal source signal r1. A vocal tract characteristic modification unit modifies the vocal tract characteristic sp1 and outputs the modified vocal tract characteristic sp2. In this method, an emphasized vocal tract characteristic sp2 is generated and output, for instance, by applying formant emphasis, using amplification ratios calculated based on estimated formants, directly to the vocal tract characteristic sp1. A signal synthesis unit synthesizes the modified vocal tract characteristic sp2 and the vocal source signal r1 to generate and output an output voice, s.
    Type: Grant
    Filed: April 27, 2005
    Date of Patent: October 20, 2009
    Assignee: Fujitsu Limited
    Inventors: Masakiyo Tanaka, Masanao Suzuki, Yasuji Ota, Yoshiteru Tsuchinaga
  • Patent number: 7593849
    Abstract: A normalizer (100, 300) of the accent of accented speech modifies (210, 410) the characteristics of input signals that represent the speech spoken in an individual voice with an accent to form output signals that represent the speech spoken in the same voice but with less or no accent.
    Type: Grant
    Filed: January 28, 2003
    Date of Patent: September 22, 2009
    Assignee: AVAYA, Inc.
    Inventors: Sharmistha S. Das, Richard A. Windhausen
  • Publication number: 20090228268
    Abstract: A system, method, and program product for processing voice data in a conversation between two persons to determine characteristic conversation patterns. The system includes: a variation calculator for calculating a variation of a speech ratio of a first speaker and a variation calculator for calculating a variation of a speech ratio of a second speaker; a difference calculator for calculating a difference data string; a smoother for generating a smoothed difference data string; and a presenter for presenting the difference between the variation of the speech ratio of the first speaker and the speech ratio of the second speaker. The method includes: calculating a variation of a speech ratio of a first speaker and a second speaker; calculating a difference data string; generating a smoothed difference data string; and grouping them according to their patterns.
    Type: Application
    Filed: March 6, 2009
    Publication date: September 10, 2009
    Inventors: Gakuto Kurata, Masafumi Nishimura
  • Patent number: 7587313
    Abstract: The method creates an audio stream comprising tracks of sinusoidal components linked across a plurality of sequential time segments. Segments in each track are weighted with a normal window (W1, W2, W3), and consecutive segments have a normal period of overlap (O) of their trailing edges and leading edges. Segments in which a transient component is determined are weighted with a first modified window (W1m) having a modified trailing edge, and the following segment in the track is weighted with a second modified window (W2m) having a modified leading edge, so that the modified trailing edge and the modified leading edge have a modified period of overlap (Om) that comprises the transient component and that is shorter than the normal period of overlap (O), and wherein the audio stream includes sinusoidal codes representing the frequency and the transient. According to the invention, the modified period of overlap (Om) depends on the frequency value (f).
    Type: Grant
    Filed: March 8, 2005
    Date of Patent: September 8, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Andreas Johannes Gerrits, Albertus Cornelis Den Brinker
  • Patent number: 7565289
    Abstract: A transient echo can be avoided during time stretching of a digital audio signal by detecting a transient in a frame of a digital audio signal, identifying another occurrence of the transient in a subsequent frame of the digital audio signal, rotating the transient occurring in the subsequent frame to align the transient occurring in the subsequent frame with the transient detected in the frame, and aggregating the frame with the subsequent frame. Further, another occurrence of the transient can be identified in another subsequent frame of the digital audio signal and it can be determined that the transient occurring in that subsequent frame cannot be aligned with the transient detected in the frame. The copy of the transient occurring in that other subsequent frame can then be blended across that frame, such as by performing phase accumulation on one or more frequency components.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: July 21, 2009
    Assignee: Apple Inc.
    Inventor: Kevin Christopher Rogers
  • Patent number: 7539614
    Abstract: In a sound reproduction or recording system an audio signal is multiplied by a gain factor (z) which is dependent on the input level (y). The dependence of the gain factor on input level is chosen such that unvoiced phonemes are at least 6 dB, preferably at least 12 dB more enhanced than voiced phonemes, where preferably the average gain is less than 6 dB. This improves the intelligibility.
    Type: Grant
    Filed: May 17, 2004
    Date of Patent: May 26, 2009
    Assignee: NXP B.V.
    Inventor: Christophe Marc Macours
  • Patent number: 7512534
    Abstract: Primary and alternate optimization procedures are used to improve the ITU-T G.723.1 speech coding standard (the “Standard”) by replacing the Hamming window of the Standard with an optimized window, with two windows, or with two windows and an additional performance of an autocorrelation method. When two windows replace the Hamming window, at least one of which is an optimized window, generally the first is used to determine optimized unquantized LP coefficients which are used to define an optimized perceptual weighting filter, and the second is used to determine optimized unquantized LP coefficients which are used to determine optimized synthesis coefficients. Optimized windows created using the primary and alternate optimization procedures and used in the Standard yield improvements in the objective and subjective quality of synthesized speech produced by the Standard. The improved Standard, methods, and window can all be implemented as computer readable software code.
    Type: Grant
    Filed: November 9, 2006
    Date of Patent: March 31, 2009
    Assignee: NTT DoCoMo, Inc.
    Inventor: Wai C. Chu
  • Patent number: 7493255
    Abstract: To alleviate problems of signal aliasing and to reduce complexity, Linear Predictive Coefficients (LPCs) are calculated from samples of audio signals and Line Spectral Frequency (LSF) vectors are extracted from the LPCs with a rate higher than a desired vector rate, the LSF vectors comprising values of different LSF parameters. Next, an LSF track is formed for at least one of the LSF parameters. At least one of the formed LSF tracks is then low pass filtered. Finally, decimated LSF vectors are reconstructed from the low pass filtered LSF tracks, the decimated number corresponding to the desired vector rate. The invention equally relates to a corresponding computer program, to corresponding devices and to a corresponding communication network.
    Type: Grant
    Filed: April 10, 2003
    Date of Patent: February 17, 2009
    Assignee: Nokia Corporation
    Inventors: Khaldoon Taha Al-Naimi, Stephane Villette, Ahmet Kondoz
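    The track-forming, filtering, and decimation steps in the abstract above can be sketched as follows. This is a minimal illustration only: it assumes the LSF vectors have already been extracted at the higher rate, it uses a simple moving average as a stand-in low pass filter, and the function names, tap count, and decimation factor are hypothetical rather than taken from the patent.

```python
def lowpass_track(track, taps=3):
    """Moving-average low pass filter over one LSF track.
    Near the edges the window is shortened instead of padded."""
    half = taps // 2
    out = []
    for i in range(len(track)):
        window = track[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def decimate_lsf(vectors, factor=2, taps=3):
    """vectors: list of LSF vectors sampled above the desired rate.
    Returns every factor-th vector rebuilt from the filtered tracks."""
    tracks = [list(t) for t in zip(*vectors)]   # one track per LSF parameter
    filtered = [lowpass_track(t, taps) for t in tracks]
    return [[t[i] for t in filtered] for i in range(0, len(vectors), factor)]

# Four oversampled 2-parameter LSF vectors reduced to two.
oversampled = [[0.10, 0.50], [0.12, 0.52], [0.11, 0.49], [0.13, 0.51]]
print(decimate_lsf(oversampled, factor=2))
```

    Filtering each track before dropping vectors is what limits aliasing; decimating the raw vectors directly would skip that step.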
  • Patent number: RE40691
    Abstract: An audio type signal is encoded. The signal is first divided into bands. For each band, a yardstick signal element is selected. The yardstick may be the signal element having the largest magnitude in the band, the second largest, closest to the median magnitude, or having some other selected magnitude. This magnitude is used for various purposes, including assigning bits to the different bands, and for establishing reconstruction levels within a band. The magnitude of non-yardstick signal elements is also quantized. The encoded signal is also decoded. Apparatus for both encoding and decoding are also disclosed. The location of the yardstick element within its band may also be recorded and encoded, and used for efficiently allocating bits to non-yardstick signal elements. Split bands may be established, such that each split band includes a yardstick signal element and each full band includes a major and a minor yardstick signal element.
    Type: Grant
    Filed: June 17, 1999
    Date of Patent: March 31, 2009
    Assignee: Massachusetts Institute of Technology
    Inventor: Jae S. Lim
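    Yardstick selection as described in the abstract above can be illustrated with a short sketch. It covers only the largest-magnitude variant, assumes equal-width bands, and uses a hypothetical function name and return format; bit allocation and quantization are not shown.

```python
def select_yardsticks(elements, band_size):
    """Split signal elements into consecutive bands of band_size and, for each
    band, return (index_in_signal, magnitude) of its largest-magnitude element."""
    yardsticks = []
    for start in range(0, len(elements), band_size):
        band = elements[start:start + band_size]
        loc = max(range(len(band)), key=lambda i: abs(band[i]))
        yardsticks.append((start + loc, abs(band[loc])))
    return yardsticks

# Two bands of three elements each.
print(select_yardsticks([1, -4, 2, 0, 3, -1], band_size=3))
```

    The recorded index corresponds to the yardstick location mentioned in the abstract, which an encoder could use when allocating bits to the remaining elements of the band.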