Speech Signal Processing Patents (Class 704/200)
  • Patent number: 8165871
    Abstract: Provided are an encoding method and apparatus for efficiently encoding a sinusoidal signal whose magnitude is less than a masking value according to a psychoacoustic model, a decoding method and apparatus for decoding an encoded sinusoidal signal, and a computer-readable recording medium having recorded thereon a program for executing the encoding method/the decoding method. By using a particular code indicating that the magnitude of a first sinusoidal signal is less than a masking value according to a psychoacoustic model to encode the first sinusoidal signal, difference coding for a third sinusoidal signal of a next frame, which is connected to the first sinusoidal signal, is performed using a sinusoidal signal or sinusoidal signals selected according to a method to use the particular code, and a decoding apparatus obtains a sum with a transmitted difference using the selected sinusoidal signal(s).
    Type: Grant
    Filed: June 2, 2008
    Date of Patent: April 24, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-suk Lee, Geon-hyoung Lee, Chul-woo Lee, Han-gil Moon
  • Publication number: 20120095753
    Abstract: A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates index of power level and the vertical axis indicates cumulative frequency and which is weighted by exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram.
    Type: Application
    Filed: September 14, 2011
    Publication date: April 19, 2012
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Hirofumi NAKAJIMA, Kazuhiro NAKADAI, Yuji HASEGAWA
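
The abstract above describes reading a noise estimate off a cumulative histogram that is weighted by an exponential moving average. The following is a minimal sketch of that general idea, not the patented implementation. The level grid (`l_min`, `l_step`, `n_levels`), the decay `alpha`, and the `quantile` at which the noise level is read off are all illustrative assumptions that do not come from the abstract.

```python
import math


class HistogramNoiseEstimator:
    """Per-bin noise power estimate from an exponentially weighted histogram.

    All parameter names and defaults are illustrative assumptions.
    """

    def __init__(self, n_bins, l_min=-100.0, l_step=0.5, n_levels=200,
                 alpha=0.99, quantile=0.5):
        self.l_min = l_min        # lowest power level in dB (assumed)
        self.l_step = l_step      # width of one power-level index in dB (assumed)
        self.n_levels = n_levels  # number of power-level indices (assumed)
        self.alpha = alpha        # per-frame decay of old observations (assumed)
        self.quantile = quantile  # fraction of cumulative mass taken as noise (assumed)
        self.hist = [[0.0] * n_levels for _ in range(n_bins)]

    def _level_index(self, power):
        db = 10.0 * math.log10(max(power, 1e-20))
        idx = int((db - self.l_min) / self.l_step)
        return min(max(idx, 0), self.n_levels - 1)

    def update(self, powers):
        """Feed one frame of per-bin powers; return per-bin noise power estimates."""
        estimates = []
        for hist, power in zip(self.hist, powers):
            # Decay old counts, then add the new observation (exponential moving average).
            for i in range(self.n_levels):
                hist[i] *= self.alpha
            hist[self._level_index(power)] += 1.0 - self.alpha
            # Walk the cumulative histogram until it reaches the target fraction.
            target = self.quantile * sum(hist)
            cumulative = 0.0
            level = self.n_levels - 1
            for i, count in enumerate(hist):
                cumulative += count
                if cumulative >= target:
                    level = i
                    break
            db = self.l_min + level * self.l_step
            estimates.append(10.0 ** (db / 10.0))
        return estimates
```

Because the estimate is a quantile of recent power levels rather than a mean, short loud bursts such as speech move it very little, while the decay lets it follow slow changes in the background.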
  • Patent number: 8160889
    Abstract: A bandwidth extension system extends the bandwidth of an acoustic signal. By shifting a portion of the signal by a frequency value, the system generates an upper bandwidth extension signal. An extended bandwidth acoustic signal may be generated from the acoustic signal, the upper bandwidth extension signal, and/or a lower bandwidth extension signal.
    Type: Grant
    Filed: January 17, 2008
    Date of Patent: April 17, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Bernd Iser, Gerhard Nüssle, Gerhard Uwe Schmidt
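
One textbook way to shift part of a signal by a frequency value is to multiply it by a cosine. This is only a sketch under that assumption, since the abstract does not say how the shift is performed. Cosine modulation produces two images of each component, at the sum and the difference frequency, so a real system would normally filter out the unwanted image. The function names are hypothetical.

```python
import cmath
import math


def shift_by_modulation(x, shift_hz, sample_rate):
    """Multiply by a cosine; each input frequency f moves to f + shift and |f - shift|."""
    return [s * math.cos(2.0 * math.pi * shift_hz * n / sample_rate)
            for n, s in enumerate(x)]


def dft_magnitude(x, freq_hz, sample_rate):
    """Normalized magnitude of a single DFT component, used here only for inspection."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
              for n, s in enumerate(x))
    return abs(acc) / len(x)


# Example: a 1 kHz tone sampled at 16 kHz, shifted by 3 kHz.
sample_rate = 16000
tone = [math.sin(2.0 * math.pi * 1000 * n / sample_rate) for n in range(1600)]
shifted = shift_by_modulation(tone, 3000, sample_rate)
```

After the shift, the energy of the tone sits at 4 kHz and 2 kHz instead of 1 kHz; a high-pass filter (not shown) would keep only the upper image for an upper bandwidth extension.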
  • Patent number: 8155971
    Abstract: A method for decoding a multi-audio-object signal having audio signals of first and second types encoded therein, the multi-audio-object signal having a downmix signal and side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, the method including computing a prediction coefficient matrix C based on the level information; and up-mixing the downmix signal based on the prediction coefficients to obtain a first and/or a second up-mix audio signal approximating the audio signals of the first and second types, respectively, wherein up-mixing yields the first and/or second up-mix signals S1 and S2 from the downmix signal d according to a computation representable by $\begin{pmatrix} S_1 \\ S_2 \end{pmatrix} = D^{-1} \cdot \left\{ \begin{pmatrix} 1 \\ C \end{pmatrix} \cdot d + H \right\}$, with “1” denoting, depending on the number of channels of d, a scalar or an identity matrix, and $D^{-1}$ being a matrix uniquely determined by a downmix prescription according
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: April 10, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
  • Patent number: 8150681
    Abstract: A speech enhancement system improves the perceptual quality of an aural signal. A receiver detects and receives an unvoiced signal, a fully voiced signal, or a mixed voice remote signal. A coherence processor identifies the similarities or differences between a local signal and the remote signal. A cancellation processor or controller dampens reflected signals that may be part of the local signal.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: April 3, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Patent number: 8139788
    Abstract: An audio signal separation apparatus for separating observation signals in the time domain of a mixture of a plurality of signals including audio signals into individual signals by means of independent component analysis to produce isolated signals adapted to produce isolated signals in the time-frequency domain from the observation signals in the time-frequency domain and a separation matrix substituted by initial values, compute the modified value of the separation matrix by using a score function using the isolated signals in the time-frequency domain and a multidimensional probability density function and the separation matrix, modify the separation matrix until the separation matrix substantially converges by using the modified value and produce isolated signals in the time-frequency domain by using the substantially converging separation matrix.
    Type: Grant
    Filed: January 24, 2006
    Date of Patent: March 20, 2012
    Assignee: Sony Corporation
    Inventors: Atsuo Hiroe, Keiichi Yamada, Helmut Lucke
  • Patent number: 8131550
    Abstract: An apparatus for providing improved voice conversion includes a sub-feature generator and a transformation element. The sub-feature generator may be configured to define sub-feature units with respect to a feature of source speech. The transformation element may be configured to perform voice conversion of the source speech to target speech based on the conversion of the sub-feature units to corresponding target speech sub-feature units using a conversion model trained with respect to converting training source speech sub-feature units to training target speech sub-feature units.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: March 6, 2012
    Assignee: Nokia Corporation
    Inventors: Jani Nurminen, Elina Helander
  • Patent number: 8131541
    Abstract: A two microphone noise reduction system is described. In an embodiment, input signals from each of the microphones are divided into subbands and each subband is then filtered independently to separate noise and desired signals and to suppress non-stationary and stationary noise. Filtering methods used include adaptive decorrelation filtering. A post-processing module using adaptive noise cancellation like filtering algorithms may be used to further suppress stationary and non-stationary noise in the output signals from the adaptive decorrelation filtering and a single microphone noise reduction algorithm may be used to further provide optimal stationary noise reduction performance of the system.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: March 6, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventors: Kuan-Chieh Yen, Rogerio Guedes Alves
  • Patent number: 8126709
    Abstract: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.
    Type: Grant
    Filed: February 24, 2009
    Date of Patent: February 28, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael Mead Truman, Mark Stuart Vinton
  • Publication number: 20120046940
    Abstract: A method for processing multichannel acoustic signals, whereby input signals of a plurality of channels including the voices of a plurality of speaking persons are processed. The method is characterized by comprising: calculating the first feature quantity of the input signals of the multichannels for each channel; calculating similarity of the first feature quantity of each channel between the channels; selecting channels having high similarity; separating signals using the input signals of the selected channels; inputting the input signals of the channels having low similarity and the signals after the signal separation; and detecting a voice section of each speaking person or each channel.
    Type: Application
    Filed: February 8, 2010
    Publication date: February 23, 2012
    Applicant: NEC CORPORATION
    Inventors: Masanori Tsujikawa, Tadashi Emori, Yoshifumi Onishi, Ryosuke Isotani
  • Patent number: 8121834
    Abstract: A method of modifying acoustic characteristics of an original audio signal as a function of modification instructions relating at least to the fundamental frequency and the spectral envelope of the original signal.
    Type: Grant
    Filed: March 12, 2008
    Date of Patent: February 21, 2012
    Assignee: France Telecom
    Inventors: Olivier Rosec, Didier Cadic
  • Patent number: 8118712
    Abstract: Methods and devices for executing a computerized talk test, comprising: following an exercise program; providing auditory signals from time to time; recording a user talking according to the auditory signals; analyzing the recordings to identify predefined properties; and providing an indication of the user's current heart rate training zone based on the analysis results. Other embodiments discuss a device configured to determine when a user approximately sustains a fixed level of physical effort and then execute a computerized talk test, and a device configured to identify a user's heart rate training zone based on identification of involuntary interruptions in the user's voice speech.
    Type: Grant
    Filed: June 13, 2009
    Date of Patent: February 21, 2012
    Inventors: Gil Thieberger, Ari Keren-Yaar, Tal Thieberger
  • Patent number: 8121836
    Abstract: In one embodiment, at least one channel in a frame of the audio signal is subdivided into a plurality of blocks such that at least two of the blocks have different lengths. A length of the frame is a user defined value and is determined within a predetermined value. Furthermore, information indicating the subdivision of the channel into the blocks is generated.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: February 21, 2012
    Assignee: LG Electronics Inc.
    Inventor: Tilman Liebchen
  • Patent number: 8121411
    Abstract: A linear transformation matrix calculating apparatus linearly transforms a plurality of dictionary subspaces which belong to respective categories by a linear transformation matrix respectively, selects a plurality of sets of two dictionary subspaces from the plurality of linearly transformed dictionary subspaces, calculates a loss function using similarities among the selected sets of dictionary subspaces respectively, calculates a differential parameter obtained by differentiating the loss function by the linear transformation matrix, calculates a new linear transformation matrix from the differential parameter and the linear transformation matrix by Deepest Descent Method, and updates the new linear transformation matrix as the linear transformation matrix used in the linear transformation unit.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: February 21, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Susumu Kubota, Tomokazu Kawahara
  • Patent number: 8117028
    Abstract: When performing audio communication by using different encoding/decoding methods, a code obtained by encoding audio by a certain method is converted into a code decodable by another method with a high audio quality and a small calculation amount. In a code conversion device for converting a first code string into a second code string, an audio decoding circuit acquires a first linear prediction coefficient and excitation signal information from the first code string and drives the filter having the first linear prediction coefficient by the excitation signal obtained from the excitation signal information, thereby creating a first audio signal. A fixed codebook code generation circuit uses the fixed codebook information and minimizes the distance between the second audio signal generated from the information obtained from the second code string and the first audio signal, thereby obtaining the fixed codebook information in the second code string.
    Type: Grant
    Filed: May 22, 2003
    Date of Patent: February 14, 2012
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 8116361
    Abstract: An alternative approach to coping with the ever increasing demand for faster communications hardware is to design modems that are capable of operating its speeds at a higher data rate than a speed required for a single port of the standard communication rate for that modem. Basically, by utilizing a resource manager, that directs the data in and out of the various portions of the modem in an orderly manner, keeping track of which of the ports is being operated at any given point in time, a standard single port modem can be reconfigured, for example, at an over clocked rate, to manipulate the data input and output of a modem.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: February 14, 2012
    Assignee: Aware, Inc.
    Inventors: Peter N. Heller, Edmund C. Reiter, Michael A. Tzannes
  • Patent number: 8109765
    Abstract: Methods and related computer program products, systems, and devices for providing intelligent feedback to a user based on audio input associated with a user reading a passage are disclosed. The method can include assessing a level of fluency of a user's reading of the sequence of words using speech recognition technology to compare the audio input with an expected sequence of words and providing feedback to the user related to the level of fluency for a word.
    Type: Grant
    Filed: September 10, 2004
    Date of Patent: February 7, 2012
    Assignee: Scientific Learning Corporation
    Inventors: Valerie L. Beattie, Marilyn Jager Adams, Michael Barrow
  • Publication number: 20120029916
    Abstract: A method for processing multichannel acoustic signals which is characterized by calculating the feature quantity of each channel from the input signals of a plurality of channels, calculating similarity between the channels in the feature quantity of each channel, selecting channels having high similarity, and separating signals using the input signals of the selected channels.
    Type: Application
    Filed: February 8, 2010
    Publication date: February 2, 2012
    Applicant: NEC CORPORATION
    Inventors: Masanori Tsujikawa, Tadashi Emori, Yoshifumi Onishi
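
The steps in the abstract above (compute a feature per channel, compare channels, keep the similar ones) can be sketched as follows. This is an illustration under stated assumptions: frame-wise log power as the feature, Pearson correlation as the similarity measure, and a threshold of 0.8 are all choices made here, not details given in the abstract.

```python
import math


def frame_log_power(signal, frame_len):
    """Feature (assumed): log power of each non-overlapping frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        feats.append(math.log10(power + 1e-12))
    return feats


def correlation(a, b):
    """Similarity (assumed): Pearson correlation of two feature sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_similar_channels(channels, frame_len=160, threshold=0.8):
    """Return indices of channels that are highly similar to at least one other channel."""
    feats = [frame_log_power(c, frame_len) for c in channels]
    selected = set()
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            if correlation(feats[i], feats[j]) >= threshold:
                selected.update((i, j))
    return sorted(selected)
```

Channels returned by `select_similar_channels` would then be passed to a signal separation stage, which is outside the scope of this sketch.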
  • Patent number: 8108211
    Abstract: A fast accurate multi-channel frequency dependent scheme for analyzing noise in a signal processing system is described herein. Noise is decomposed within each channel into frequency bands and sub-band noise is propagated. To avoid the computational complexity of a convolution, traditional methods either assume the noise to be white, at any point in the signal processing pipeline, or they just ignore spatial operations. By assuming the noise to be white within each frequency band, it is possible to propagate any type of noise (white, colored, Gaussian, non-Gaussian and others) across a spatial transformation in a very fast and accurate manner. To demonstrate the efficacy of this technique, noise propagation is considered across various spatial operations in an image processing pipeline. Furthermore, the computational complexity is a very small fraction of the computational cost of propagating an image through a signal processing system.
    Type: Grant
    Filed: March 29, 2007
    Date of Patent: January 31, 2012
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Farhan A. Baqai, Akira Matsui, Kenichi Nishio
  • Patent number: 8108219
    Abstract: In one embodiment, at least one channel in a frame of the audio signal is subdivided into a plurality of blocks such that at least two of the blocks have different lengths, and information indicating the subdivision of the channel into the blocks is generated.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: January 31, 2012
    Assignee: LG Electronics Inc.
    Inventor: Tilman Liebchen
  • Patent number: 8103514
    Abstract: Spatial information associated with an audio signal is encoded into a bitstream, which can be transmitted to a decoder or recorded to a storage media. The bitstream can include different syntax related to time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of slots for which parameters can be applied. The data structures can be fixed or variable. The data structure can include position information that can be used by a decoder to identify the correct slot for which a given parameter set is applied. The slot position information can be encoded with either a fixed number of bits or a variable number of bits based on the data structure type.
    Type: Grant
    Filed: October 7, 2010
    Date of Patent: January 24, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Patent number: 8103512
    Abstract: Disclosed is a method capable of adaptively aligning windows to extract features according to the types and characteristics of voice signals. To this end, window lengths based on the window update points in a corresponding order are determined by employing the concept of a higher order peak, and windows are aligned according to window lengths. When the windows are aligned according to such a manner, the start and end points of each window are known, so that it becomes possible to easily extract and analyze peak feature information.
    Type: Grant
    Filed: January 23, 2007
    Date of Patent: January 24, 2012
    Assignee: Samsung Electronics Co., Ltd
    Inventor: Hyun-Soo Kim
  • Patent number: 8103513
    Abstract: Spatial information associated with an audio signal is encoded into a bitstream, which can be transmitted to a decoder or recorded to a storage media. The bitstream can include different syntax related to time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of slots for which parameters can be applied. The data structures can be fixed or variable. The data structure can include position information that can be used by a decoder to identify the correct slot for which a given parameter set is applied. The slot position information can be encoded with either a fixed number of bits or a variable number of bits based on the data structure type.
    Type: Grant
    Filed: August 20, 2010
    Date of Patent: January 24, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Patent number: 8099275
    Abstract: A sound encoder having an improved quantization performance while suppressing an increase of the bit rate to a lowest level. In a second layer encoder, a standard deviation calculator calculates a standard deviation σc of a first layer decoding spectrum after decoding a scale factor ratio multiplication and outputs the standard deviation σc to a selector. The selector selects a linear transform function as a function for a nonlinear transform of a residual spectrum according to the standard deviation σc. A nonlinear transform function selects one of prepared nonlinear transform functions #1 to #N according to a result of the selection by the selector, and outputs the selected one to an inverse transformer. The inverse transformer subjects an inverse transform (expansion) to a residual spectrum candidate that is stored in a residual spectrum code book using the nonlinear transform function outputted from the nonlinear transform function and outputs the result to an adder.
    Type: Grant
    Filed: October 25, 2005
    Date of Patent: January 17, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
  • Patent number: 8095360
    Abstract: There is provided a method of post-processing a speech signal. The method comprises applying a time-domain post-processing to the speech signal, using LPC coefficients, for a low-band frequency range and applying a frequency-domain post-processing to the speech signal, using MDCT coefficients, for the high-band frequency range. Applying the frequency-domain post-processing includes decoding an encoded speech signal to obtain MDCT coefficients representative of the speech signal divided into a plurality of sub-bands, generating an envelope for each sub-band of the plurality of sub-bands as an average magnitude of the MDCT coefficients of the sub-band, generating an envelope modification factor for each sub-band of the plurality of sub-bands using the MDCT coefficients of the sub-band, modifying the envelope by the envelope modification factor for each sub-band of the plurality of sub-bands to provide a modified envelope, and generating the post-processed speech signal using the modified envelope.
    Type: Grant
    Filed: July 17, 2009
    Date of Patent: January 10, 2012
    Assignee: Mindspeed Technologies, Inc.
    Inventor: Yang Gao
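
The envelope step in the abstract above, an average magnitude of the MDCT coefficients in each sub-band, is simple enough to sketch directly. The abstract does not say how the envelope modification factor is derived, so in this sketch the factors are supplied by the caller; the function names and the `band_edges` layout are hypothetical.

```python
def subband_envelopes(coeffs, band_edges):
    """Envelope of each sub-band: mean absolute value of its coefficients.

    band_edges lists sub-band boundaries as indices, e.g. [0, 4, 8] for two bands.
    """
    envelopes = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = coeffs[lo:hi]
        envelopes.append(sum(abs(c) for c in band) / len(band))
    return envelopes


def apply_envelope_factors(coeffs, band_edges, factors):
    """Scale every coefficient in a sub-band by that band's factor.

    Scaling the coefficients by g scales the band's envelope by g as well.
    How the factors are computed is not specified here (assumption: given as input).
    """
    out = list(coeffs)
    bands = zip(band_edges[:-1], band_edges[1:])
    for (lo, hi), g in zip(bands, factors):
        for i in range(lo, hi):
            out[i] = coeffs[i] * g
    return out
```

For example, with coefficients `[1, -1, 2, -2, 4, 4, -4, -4]` and `band_edges = [0, 4, 8]`, the two envelopes are 1.5 and 4.0; applying factors 2.0 and 0.5 changes them to 3.0 and 2.0.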
  • Patent number: 8095357
    Abstract: The disclosed embodiments include systems, methods, apparatuses, and computer-readable mediums for compensating one or more signals and/or one or more parameters for time delays in one or more signal processing paths.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: January 10, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Patent number: 8090577
    Abstract: Methods and apparatus are presented for determining the type of acoustic signal and the type of frequency spectrum exhibited by the acoustic signal in order to selectively delete parameter information before vector quantization. The bits that would otherwise be allocated to the deleted parameters can then be re-allocated to the quantization of the remaining parameters, which results in an improvement of the perceptual quality of the synthesized acoustic signal. Alternatively, the bits that would have been allocated to the deleted parameters are dropped, resulting in an overall bit-rate reduction.
    Type: Grant
    Filed: August 8, 2002
    Date of Patent: January 3, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Khaled Helmi El-Maleh, Ananthapadmanabhan Arasanipalai Kandhadai, Sharath Manjunath
  • Patent number: 8090587
    Abstract: Methods of encoding and decoding a multi-channel audio signal and apparatuses for encoding and decoding a multi-channel audio signal are provided. The method of decoding a multi-channel audio signal includes an unpacking unit which extracts a quantized CLD between a pair of channels of a plurality of channels from a bitstream, and an inverse quantization unit which inverse-quantizes the quantized CLD using a quantization table that considers the location properties of the pair of channels. The methods of encoding and decoding a multi-channel audio signal and the apparatuses for encoding and decoding a multi-channel audio signal can enable an efficient encoding/decoding by reducing the number of quantization bits required.
    Type: Grant
    Filed: September 26, 2006
    Date of Patent: January 3, 2012
    Assignee: LG Electronics Inc.
    Inventors: Yang-Won Jung, Hee Suk Pang, Hyen-O Oh, Dong Soo Kim, Jae Hyun Lim
  • Patent number: 8086445
    Abstract: A method and apparatus for creating a signature of a sampled work in real-time is disclosed herein. Unique signatures of an unknown audio work are created by segmenting a file into segments having predetermined segment and hop sizes. The signature then may be compared against reference signatures. One aspect may be characterized in that the hop size of the sampled work signature is less than the hop size of reference signatures. A method for identifying an unknown audio work is also disclosed.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: December 27, 2011
    Assignee: Audible Magic Corporation
    Inventors: Erling H. Wold, Thomas L. Blum, Douglas F. Keislar, James A. Wheaton
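
Segmenting with a segment size and a hop size, as the abstract above describes, can be illustrated with a short helper. This is a sketch only: the function name and the sizes used below are hypothetical, and the signature computation that would follow each segment is not shown.

```python
def segment(samples, segment_size, hop_size):
    """Split samples into windows of segment_size, advancing hop_size each time.

    A hop smaller than the segment size gives overlapping segments.
    Trailing samples that do not fill a whole segment are dropped.
    """
    return [samples[start:start + segment_size]
            for start in range(0, len(samples) - segment_size + 1, hop_size)]


# Reference signatures might use a coarse hop, the sampled work a finer one.
audio = list(range(10))
reference_segments = segment(audio, segment_size=4, hop_size=2)
sampled_segments = segment(audio, segment_size=4, hop_size=1)
```

Using a smaller hop for the sampled work than for the reference means that some sampled segment will line up closely with each reference segment, wherever in the work the sample starts.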
  • Patent number: 8082158
    Abstract: Spatial information associated with an audio signal is encoded into a bitstream, which can be transmitted to a decoder or recorded to a storage media. The bitstream can include different syntax related to time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of slots for which parameters can be applied. The data structures can be fixed or variable. The data structure can include position information that can be used by a decoder to identify the correct slot for which a given parameter set is applied. The slot position information can be encoded with either a fixed number of bits or a variable number of bits based on the data structure type.
    Type: Grant
    Filed: October 14, 2010
    Date of Patent: December 20, 2011
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Publication number: 20110307257
    Abstract: A method and system for indicating in real time that an interaction is associated with a problem or issue, comprising: receiving a segment of an interaction in which a representative of the organization participates; extracting a feature from the segment; extracting a global feature associated with the interaction; aggregating the feature and the global feature; and classifying the segment or the interaction in association with the problem or issue by applying a model to the feature and the global feature. The method and system may also use features extracted from earlier segments within the interaction. The method and system can also evaluate the model based on features extracted from training interactions and manual tagging assigned to the interactions or segments thereof.
    Type: Application
    Filed: June 10, 2010
    Publication date: December 15, 2011
    Applicant: Nice Systems Ltd.
    Inventors: Oren PEREG, Moshe WASSERBLAT, Yuval LUBOWICH, Ronen LAPERDON, Dori SHAPIRA, Vladislav FEIGIN, Oz FOX-KAHANA
  • Publication number: 20110307258
    Abstract: A method and apparatus for providing real-time assistance related to an interaction associated with a contact center, comprising steps or components for: receiving at least a part of an audio signal of an interaction captured by a capturing device associated with an organization, and metadata information associated with the interaction; performing audio analysis of the at least part of the audio signal, while the interaction is still in progress to obtain audio information; categorizing at least a part of the metadata information and the audio information, to determine a category associated with the interaction, while the interaction is still in progress to obtain audio information; and taking an action associated with the category.
    Type: Application
    Filed: June 15, 2010
    Publication date: December 15, 2011
    Applicant: Nice Systems Ltd.
    Inventors: Hadas Liberman, Keren Eshkol, Oren Lewkowicz, Omer Gazit, Zohar Tzfoni, Avi Revivo, Leon Portman, Ronit Ephrat, Oren Pereg, Ronen Laperdon, Dori Shapira, Moshe Wasserblat
  • Patent number: 8065136
    Abstract: In a method of encoding input signals (CH1 to CH3; 400 to 450) in a multi-channel encoder (5; 15) to generate corresponding output data having down-mix output signals (610, 620) together with complementary parametric data (600), the method includes a first step of down-mixing input signals (CH1 to CH3; 400 to 450) to generate the corresponding down-mix output signals (610, 620), and a second step of processing the input signals (CH1 to CH3; 400 to 450) during down-mixing to generate the parametric data (600) complementary to the down-mix output signals (610, 620). Processing of the input signals (CH1 to CH3; 400 to 450) involves including information in the down-mix signals (610, 620) which is useable during subsequent decoding of the down-mix output signals (610, 620) and the parametric data (600) to determine at least some parameter data and thereby enabling representations of the input signals (CH1 to CH3; 400 to 450) to be subsequently regenerated.
    Type: Grant
    Filed: August 30, 2010
    Date of Patent: November 22, 2011
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Gerard Herman Hotho, Dirk Jeroen Breebaart, Evgeny Alexandrovitch Verbitskiy, Albertus Cornelis Den Brinker
  • Patent number: 8060374
    Abstract: Spatial information associated with an audio signal is encoded into a bitstream, which can be transmitted to a decoder or recorded to a storage media. The bitstream can include different syntax related to time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of slots for which parameters can be applied. The data structures can be fixed or variable. The data structure can include position information that can be used by a decoder to identify the correct slot for which a given parameter set is applied. The slot position information can be encoded with either a fixed number of bits or a variable number of bits based on the data structure type.
    Type: Grant
    Filed: July 26, 2010
    Date of Patent: November 15, 2011
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Patent number: 8060375
    Abstract: An improved audio coding technique encodes audio having a low frequency transient signal, using a long block, but with a set of adapted masking thresholds. Upon identifying an audio window that contains a low frequency transient signal, masking thresholds for the long block may be calculated as usual. A set of masking thresholds calculated for the 8 short blocks corresponding to the long block are calculated. The masking thresholds for low frequency critical bands are adapted based on the thresholds calculated for the short blocks, and the resulting adapted masking thresholds are used to encode the long block of audio data. The result is encoded audio with rich harmonic content and negligible coder noise resulting from the low frequency transient signal.
    Type: Grant
    Filed: January 12, 2011
    Date of Patent: November 15, 2011
    Assignee: Apple Inc.
    Inventors: Shyh-Shiaw Kuo, Frank Baumgarte
  • Patent number: 8050914
    Abstract: A system enhances speech by detecting a speaker's utterance through a first microphone positioned a first distance from a source of interference. A second microphone may detect the speaker's utterance at a different position. A monitoring device may estimate the power level of a first microphone signal. A synthesizer may synthesize part of the first microphone signal by processing the second microphone signal. The synthesis may occur when power level is below a predetermined level.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: November 1, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Uwe Schmidt, Mohamed Krini
  • Patent number: 8045571
    Abstract: An audio decoding system comprises a buffer module that receives packets including encoded audio frames that each store audio parameters. A packet loss concealment module selectively extracts the audio parameters from ones of the encoded audio frames, determines recovered audio parameters based on the extracted audio parameters, and encodes the recovered audio parameters into recovered audio frames. An audio decoding module decodes the encoded audio frames and the recovered audio frames and outputs decoded audio samples.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: October 25, 2011
    Assignee: Marvell International Ltd.
    Inventors: Hongxin Li, Beryl Xu
  • Patent number: 8046234
    Abstract: A method and apparatus for encoding/decoding audio data with scalability are provided. The method includes slicing audio data so that sliced audio data corresponds to a plurality of layers, obtaining scale band information and coding band information corresponding to each of the plurality of layers, coding additional information containing scale factor information and coding model information based on scale band information and coding band information corresponding to a first layer, obtaining quantized samples by quantizing audio data corresponding to the first layer with reference to the scale factor information, coding the obtained plurality of quantized samples in units of symbols in order from a symbol formed with most significant bits (MSB) down to a symbol formed with least significant bits (LSB) by referring to the coding model information, and repeating the steps, increasing the ordinal number of the layer by one each time, until coding for the plurality of layers is finished.
    Type: Grant
    Filed: December 16, 2003
    Date of Patent: October 25, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-hoe Kim, Sang-wook Kim, Eun-mi Oh
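
The MSB-to-LSB symbol ordering described in the abstract above can be illustrated with a small sketch. This is not the patented coder: it assumes non-negative integer quantized samples, treats each bit plane as one "symbol", and omits the layering, scale factors, and the entropy coding driven by the coding model information. The names `bit_plane_symbols` and `rebuild` are hypothetical.

```python
def bit_plane_symbols(samples, num_bits):
    """Yield one symbol (tuple of bits) per bit plane, most significant plane first.

    Assumption: `samples` are non-negative integers that fit in `num_bits` bits.
    """
    for plane in range(num_bits - 1, -1, -1):
        yield tuple((s >> plane) & 1 for s in samples)


def rebuild(symbols, num_bits):
    """Reconstruct samples from however many leading bit planes were received."""
    if not symbols:
        return []
    samples = [0] * len(symbols[0])
    # Pair each received symbol with its plane index, starting at the MSB plane.
    for plane, symbol in zip(range(num_bits - 1, -1, -1), symbols):
        for i, bit in enumerate(symbol):
            samples[i] |= bit << plane
    return samples


if __name__ == "__main__":
    quantized = [5, 2, 7, 0]
    symbols = list(bit_plane_symbols(quantized, num_bits=3))
    print(symbols)                   # [(1, 0, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0)]
    print(rebuild(symbols, 3))       # [5, 2, 7, 0]
    print(rebuild(symbols[:2], 3))   # [4, 2, 6, 0]
```

Sending the most significant planes first is what makes this kind of stream scalable: a decoder that receives only the leading symbols still recovers a coarse approximation of every sample, as the last line shows.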
  • Patent number: 8045572
    Abstract: A packet loss concealment system includes first and second buffers that store audio samples prior to and subsequent to a missing section of audio samples. A forward propagation module generates a forward propagated waveform by propagating a first waveform period that is based on the first buffer. The forward propagation module increases periodicity of the first waveform period nonlinearly when propagating the first waveform period. A backward propagation module generates a backward propagated waveform by propagating a second waveform period that is based on the second buffer. A ratio control module selectively determines a ratio between a first periodicity of the audio samples in the second buffer and a second periodicity of the audio samples in the first buffer. The forward propagation module selectively propagates the first waveform period using the ratio, and the backward propagation module selectively propagates the second waveform period using an inverse of the ratio.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: October 25, 2011
    Assignee: Marvell International Ltd.
    Inventors: Hongxin Li, Li Xu
  • Patent number: 8036249
    Abstract: A data verification method and system are provided. The data verification method includes the steps of transmitting data from a sender to a receiver over a signaling channel, transmitting a first set of bits to the receiver over a voice channel, wherein the first set of bits is generated using the data in the sender, and verifying the data through comparison between the first set of bits and a second set of bits that is generated based on the data in the receiver. The first and the second sets of bits may be a group of bits that are selected from a hash value using a selection mask in the sender and the receiver respectively, wherein the selection mask has the same length as the hash value and the hash value is calculated based on the data, and the selection mask may be pre-defined between the sender and the receiver.
    Type: Grant
    Filed: December 31, 2007
    Date of Patent: October 11, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Tymur Korkishko, Kyung-Hee Lee
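
The mask-based bit selection described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not name a hash function, so SHA-256 is assumed here, and the function names `select_bits` and `verify` as well as the example mask are hypothetical.

```python
import hashlib


def select_bits(data: bytes, mask: bytes) -> str:
    """Return the hash bits of `data` picked out by `mask` as a '0'/'1' string.

    Assumption: SHA-256 stands in for the unspecified hash function.
    """
    digest = hashlib.sha256(data).digest()
    if len(mask) != len(digest):
        raise ValueError("selection mask must have the same length as the hash value")
    h = int.from_bytes(digest, "big")
    m = int.from_bytes(mask, "big")
    nbits = len(digest) * 8
    # Walk from the most significant bit down, keeping hash bits where the mask bit is 1.
    return "".join(
        str((h >> i) & 1) for i in range(nbits - 1, -1, -1) if (m >> i) & 1
    )


def verify(received_data: bytes, received_bits: str, mask: bytes) -> bool:
    """Receiver side: recompute the selected bits locally and compare."""
    return select_bits(received_data, mask) == received_bits


if __name__ == "__main__":
    # Hypothetical pre-agreed 32-byte mask selecting 2 + 4 = 6 bits of the hash.
    mask = bytes([0x81]) + bytes(30) + bytes([0x0F])
    data = b"payload sent over the signaling channel"
    bits = select_bits(data, mask)   # sender would send these over the voice channel
    print(len(bits))                 # 6
    print(verify(data, bits, mask))  # True
```

Because only a handful of bits travel over the voice channel, a short mask keeps the overhead low at the cost of a higher chance that tampered data happens to match; the mask length is the knob for that trade-off.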
  • Patent number: 8036904
    Abstract: An audio encoder for encoding a multi-channel audio signal includes an encoder combination module (ECM) for generating a dominant signal part (m) and a residual signal part (s) being a combined representation of first and second audio signals (x1, x2), the dominant and residual signal parts (m, s) being obtained by applying a mathematical procedure to the first and second audio signals (x1, x2), wherein the mathematical procedure involves a first spatial parameter (SP1) including a description of spatial properties of the first and second audio signals (x1, x2), a parameter generator (PG) for generating a first parameter (PS1) set including a second spatial parameter (SP2), and a second parameter (PS2) set including a third spatial parameter (SP3), and an output generator for generating an encoded output signal having a first output part (OP1) including the dominant signal part (m) and the first parameter set (PS1), and a second output part (OP2) including the residual signal part (s) and the second parameter
    Type: Grant
    Filed: March 16, 2006
    Date of Patent: October 11, 2011
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Francois Philippus Myburg, Erik Gosuinus Petrus Schuijers
  • Patent number: 8036879
    Abstract: A speech enhancement system improves the perceptual quality of an aural signal. A receiver detects and receives an unvoiced signal, a fully voiced signal, or a mixed voice remote signal. A coherence processor identifies the similarities or differences between a local signal and the remote signal. A cancellation processor or controller dampens reflected signals that may be part of the local signal.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: October 11, 2011
    Assignee: QNX Software Systems Co.
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Publication number: 20110246185
    Abstract: A frame extracting means 71 extracts frames from sample data as voice data in which whether each frame is an active voice frame or a non-active voice frame is already known. A feature quantity calculating means 72 calculates multiple feature quantities of each of the frames. A feature quantity integrating means 73 calculates an integrated feature quantity of the multiple feature quantities. A judgment means 74 judges whether each of the frames is an active voice frame or a non-active voice frame. An erroneous feature quantity calculation value calculating means 75 obtains a first erroneous feature quantity calculation value and a second erroneous feature quantity calculation value by executing prescribed calculations. A weight updating means 76 updates weights used for weighting so that the rate between the first erroneous feature quantity calculation value and the second erroneous feature quantity calculation value approaches a prescribed value.
    Type: Application
    Filed: December 7, 2009
    Publication date: October 6, 2011
    Applicant: NEC CORPORATION
    Inventors: Takayuki Arakawa, Masanori Tsujikawa
  • Publication number: 20110246076
    Abstract: A method and system of conducting named entity recognition. One method comprises selecting one or more examples for human labelling, each example comprising a word sequence containing a named entity and its context; and retraining a model for the named entity recognition based on the labelled examples as training data.
    Type: Application
    Filed: May 28, 2005
    Publication date: October 6, 2011
    Inventors: Jian Su, Dan Shen, Jie Zhang, Guo Dong Zhou
  • Publication number: 20110243311
    Abstract: Methods and systems for automatic phone call tracking and analysis of the content and outcomes of a call are provided. These systems may provide businesses with the ability to track and view analytics of the number and various outcomes of calls, thereby providing up-to-date real-time analysis of the automatically-generated results of client interactions with staff answering the phones. Methods and systems in accordance with the present invention quantitatively and objectively analyze staff performance and marketing return on investment (ROI), and track patient demand across various procedures. This may automatically provide information on the number of calls with various outcomes, e.g., the customer booked an appointment, the customer hung up while on hold, the customer was connected with voicemail, the customer left a message on voicemail, the customer is an existing client, etc. Other automatically-detected aspects of phone call contents are provided.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Inventor: Grant L. Aldrich
  • Patent number: 8032365
    Abstract: A method and corresponding apparatus for coded-domain acoustic echo control is presented. An echo control problem is considered as that of perceptually matching an echo signal to a reference signal. A perceptual similarity function that is based on the coded spectral parameters produced by the speech codec is defined. Since codecs introduce a significant degree of non-linearity into the echo signal, the similarity function is designed to be robust against such effects. The similarity function is incorporated into a coded-domain echo control system that also includes spectrally-matched noise injection for replacing echo frames with comfort noise. Using actual echoes recorded over a commercial mobile network, it is shown herein that the similarity function is robust against both codec non-linearities and additive noise. Experimental results further show that the echo-control is effective at suppressing echoes compared to a Normalized Least Mean Squared (NLMS)-based echo cancellation system.
    Type: Grant
    Filed: October 19, 2007
    Date of Patent: October 4, 2011
    Assignee: Tellabs Operations, Inc.
    Inventor: Rafid A. Sukkar
  • Patent number: 8032372
    Abstract: A computer program product for computing a correction rate predictor for medical record dictations resides on a computer-readable medium and includes computer-readable instructions for causing a computer to: obtain a draft medical transcription of at least a portion of a dictation, the dictation being from medical personnel and concerning a patient; determine features of the dictation to produce a feature set comprising a combination of features of the dictation, the features being relevant to a quantity of transcription errors in the transcription; analyze the feature set to compute a predicted correction rate associated with the dictation; and use the predicted correction rate to determine whether to provide at least a portion of the transcription to a transcriptionist.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: October 4, 2011
    Assignee: eScription, Inc.
    Inventors: Roger Scott Zimmerman, George Zavaliagkos
  • Patent number: 8032360
    Abstract: A system and method for high-quality variable speed playback of audio-visual (A/V) media is provided. The system receives an encoded visual signal and an encoded audio signal. The encoded visual signal is decoded to generate a decoded visual signal and the encoded audio signal is decoded to generate a decoded audio signal. The decoded audio signal is time scale modified to generate a time scale modified audio signal. The decoded visual signal and the time scale modified audio signal are then synchronized for playback at a predefined playback speed. Only partial decoding of the encoded audio signal may be performed to conserve processing power.
    Type: Grant
    Filed: May 13, 2004
    Date of Patent: October 4, 2011
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
  • Patent number: RE43190
    Abstract: A speech coding apparatus comprises a repetition period pre-selecting unit for generating a plurality of candidates for the repetition period of a driving excitation source by multiplying the repetition period of an adaptive excitation source by a plurality of constant numbers, respectively, and for pre-selecting a predetermined number of candidates from all the candidates generated. A driving excitation source coding unit provides both excitation source location information and excitation source polarity information that minimize a coding distortion, for each of the predetermined number of candidates, and provides an evaluation value associated with the minimum coding distortion for each of the predetermined number of candidates.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: February 14, 2012
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hirohisa Tasaki, Tadashi Yamaura
  • Patent number: RE43209
    Abstract: A speech coding apparatus comprises a repetition period pre-selecting unit for generating a plurality of candidates for the repetition period of a driving excitation source by multiplying the repetition period of an adaptive excitation source by a plurality of constant numbers, respectively, and for pre-selecting a predetermined number of candidates from all the candidates generated. A driving excitation source coding unit provides both excitation source location information and excitation source polarity information that minimize a coding distortion, for each of the predetermined number of candidates, and provides an evaluation value associated with the minimum coding distortion for each of the predetermined number of candidates.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: February 21, 2012
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hirohisa Tasaki, Tadashi Yamaura