Speech Signal Processing Patents (Class 704/200)
  • Patent number: 8452583
    Abstract: A method and associated apparatus for using visual separators to indicate additional character combination choices from a disambiguation function on a handheld electronic device.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: May 28, 2013
    Assignee: Research In Motion Limited
    Inventors: Sherryl Lee Lorraine Scott, Zaheen Somani
  • Patent number: 8452604
    Abstract: Recognizable visual and/or audio artifacts, such as recognizable sounds, are introduced into visual and/or audio content in an identifying pattern to generate a signed visual and/or audio recording for distribution over a digital communications medium. A library of images and/or sounds may be provided, and the image and/or sounds from the library may be selectively inserted to generate the identifying pattern. The images and/or sounds may be inserted responsive to one or more parameters associated with creation of the visual and/or audio content. A representation of the identifying pattern may be generated and stored in a repository, e.g., an independent repository configured to maintain creative rights information. The stored pattern may be retrieved from the repository and compared to an unidentified visual and/or audio recording to determine an identity thereof.
    Type: Grant
    Filed: August 15, 2005
    Date of Patent: May 28, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Steven Tischer
  • Patent number: 8447590
    Abstract: A voice emitting and collecting device that is capable of picking up/outputting a voice emitted from a talker at a high S/N ratio by eliminating the influence of a diffracting voice despite a simple configuration is provided. A signal differencing circuit 191 outputs difference signals MS1 to MS4 between voice collecting beam signals MB11 to MB14 and voice collecting beam signals MB21 to MB24. A level comparator 195 selects the difference signal having a maximum level. A signal selecting circuit 196 selects voice collecting beam signals MB1x, MB2x of the difference signal MS that is selected/pointed by the level comparator 195. A subtracter 199 subtracts the voice collecting beam signal MB2x from the voice collecting beam signal MB1x, and outputs a resultant signal. Accordingly, main components of the diffracting voice can be removed from the voice collecting beam signal.
    Type: Grant
    Filed: June 20, 2007
    Date of Patent: May 21, 2013
    Assignee: Yamaha Corporation
    Inventors: Toshiaki Ishibashi, Ryo Tanaka, Satoshi Ukai
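    Illustration (not part of the patent record): the selection and subtraction steps described in the abstract above can be sketched as follows. This is a minimal sketch that assumes each beam signal is a plain list of samples and that "level" means RMS level; the function names `rms` and `remove_diffracted` are hypothetical.

```python
def rms(signal):
    """Root-mean-square level of a list of samples (assumed level measure)."""
    return (sum(x * x for x in signal) / len(signal)) ** 0.5


def remove_diffracted(beams_1, beams_2):
    """beams_1[i] and beams_2[i] play the role of MB1i and MB2i.

    Returns the index of the selected beam pair and MB1x - MB2x.
    """
    # Difference signals MS_i = MB1_i - MB2_i for every beam pair.
    diffs = [
        [a - b for a, b in zip(b1, b2)]
        for b1, b2 in zip(beams_1, beams_2)
    ]
    # The level comparator picks the difference signal with the largest level.
    best = max(range(len(diffs)), key=lambda i: rms(diffs[i]))
    # Subtracting MB2x from MB1x for the selected pair is that same difference.
    return best, diffs[best]


beams_1 = [[1, 1], [3, -3], [0.5, 0.5], [0, 0]]
beams_2 = [[1, 1], [1, -1], [0.5, 0.4], [0, 0]]
index, output = remove_diffracted(beams_1, beams_2)
```

    Components common to both beams of a pair cancel in the subtraction, which is why the pair with the largest difference level is taken to carry the talker's voice.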
  • Patent number: 8447620
    Abstract: An audio encoder for encoding an audio signal has a first coding branch, the first coding branch comprising a first converter for converting a signal from a time domain into a frequency domain. Furthermore, the audio encoder has a second coding branch comprising a second time/frequency converter. Additionally, a signal analyzer for analyzing the audio signal is provided. The signal analyzer, on the one hand, determines whether an audio portion is effective in the encoder output signal as a first encoded signal from the first encoding branch or as a second encoded signal from a second encoding branch. On the other hand, the signal analyzer determines a time/frequency resolution to be applied by the converters when generating the encoded signals. An output interface includes, in addition to the first encoded signal and the second encoded signal, resolution information identifying the resolution used by the first time/frequency converter and used by the second time/frequency converter.
    Type: Grant
    Filed: April 6, 2011
    Date of Patent: May 21, 2013
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V., Voiceage Corporation
    Inventors: Max Neuendorf, Stefan Bayer, Jérémie Lecomte, Guillaume Fuchs, Julien Robilliard, Nikolaus Rettelbach, Frederik Nagel, Ralf Geiger, Markus Multrus, Bernhard Grill, Philippe Gournay, Redwan Salami
  • Patent number: 8447285
    Abstract: Systems and methods that can be utilized to convert a voice communication received over a telecommunication network to text are described. In an illustrative embodiment, a call processing system coupled to a telecommunications network receives a call from a caller intended for a first party, wherein the call is associated with call signaling information. At least a portion of the call signaling information is stored in a computer readable medium. A greeting is played to the caller, and a voice communication from the caller is recorded. At least a portion of the voice communication is converted to text, which is analyzed to identify portions that are inferred to be relatively more important to communicate to the first party. A text communication is generated including at least some of the identified portions and including fewer words than the recorded voice communication. At least a portion of the text communication is made available to the first party over a data network.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: May 21, 2013
    Assignee: Callwave Communications, LLC
    Inventors: Anthony Bladon, David Giannini, David F. Hofstatter, Colin Kelley, David C. McClintock, Robert F. Smith, David S. Trandal, Leland W. Kirchhoff
  • Patent number: 8447617
    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
    Type: Grant
    Filed: March 15, 2010
    Date of Patent: May 21, 2013
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Norbert Rossello, Fabien Klein
  • Patent number: 8442836
    Abstract: Embodiments of the invention provide a method and device for assigning bitrates to a plurality of channels in a scalable audio encoding/truncation process. Different bitrates are assigned to different channels in the scalable audio encoding/truncation process.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: May 14, 2013
    Assignee: Agency for Science, Technology and Research
    Inventors: Te Li, Susanto Rahardja, Haibin Huang
  • Patent number: 8442825
    Abstract: A device for voice identification including a receiver, a segmenter, a resolver, two advancers, a buffer, and a plurality of IIR resonator digital filters where each IIR filter comprises a set of memory locations or functional equivalent to hold filter specifications, a memory location or functional equivalent to hold the arithmetic reciprocal of the filter's gain, a five cell controller array, several multipliers, an adder, a subtractor, and a logical non-shift register. Each cell of the five cell controller array has five logical states, each acting as a five-position single-pole rotating switch that operates in unison with the four others. Additionally, the device also includes an artificial neural network and a display means.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: May 14, 2013
    Assignee: The United States of America as Represented by the Director, National Security Agency
    Inventor: Michael Sinutko
  • Patent number: 8442837
    Abstract: A method for processing an audio signal including classifying an input frame as either a speech frame or a generic audio frame, producing an encoded bitstream and a corresponding processed frame based on the input frame, producing an enhancement layer encoded bitstream based on a difference between the input frame and the processed frame, and multiplexing the enhancement layer encoded bitstream, a codeword, and either a speech encoded bitstream or a generic audio encoded bitstream into a combined bitstream based on whether the codeword indicates that the input frame is classified as a speech frame or as a generic audio frame, wherein the encoded bitstream is either a speech encoded bitstream or a generic audio encoded bitstream.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: May 14, 2013
    Assignee: Motorola Mobility LLC
    Inventors: James P. Ashley, Jonathan A. Gibbs, Udar Mittal
  • Patent number: 8438036
    Abstract: In recent years, it has become commonplace for portable devices to generate analog audio signals from numerous sources, meaning that the codecs employed in these portable devices need to be able to utilize various digital bit streams at different sampling rates. To date, however, the circuitry for asynchronous sampling rate conversions for multiple bit streams has been complex, rigid, and power hungry. Here, a codec is provided which uses miniDSP cores to perform asynchronous sampling rate conversion efficiently and with reduced power consumption compared to other conventional codecs.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: May 7, 2013
    Assignee: Texas Instruments Incorporated
    Inventors: Shawn X. Yu, Terry L. Sculley
  • Patent number: 8433073
    Abstract: In a sound effect applying apparatus, an input part frequency-analyzes an input signal of sound or voice for detecting a plurality of local peaks of harmonics contained in the input signal. A subharmonics provision part adds a spectrum component of subharmonics between the detected local peaks so as to provide the input signal with a sound effect. An output part converts the input signal of a frequency domain containing the added spectrum component into an output signal of a time domain for generating the sound or voice provided with the sound effect.
    Type: Grant
    Filed: June 22, 2005
    Date of Patent: April 30, 2013
    Assignee: Yamaha Corporation
    Inventors: Yasuo Yoshioka, Alex Loscos
  • Patent number: 8428939
    Abstract: A voice mixing device for mixing a plurality of voice signals, comprises: a speaker selection unit selecting at least one voice signal among said plurality of voice signals; a full signal adder unit adding all of at least one voice signal selected by said speaker selection unit; respective subtractor unit subtracting only one of said selected voice signals from an addition result of said full signal adder unit; a common noise suppression unit suppressing noise of a common voice signal, being an addition result of said full signal adder unit; individual noise suppression unit suppressing noise of respective individual voice signals, being subtraction results of said subtractor unit; and memory switching unit copying information of noise suppression obtained in said common noise suppression unit based on a selection result of said speaker selection unit, to information of noise suppression in said individual noise suppression unit.
    Type: Grant
    Filed: July 28, 2008
    Date of Patent: April 23, 2013
    Assignee: NEC Corporation
    Inventors: Hironori Ito, Kazunori Ozawa
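    Illustration (not part of the patent record): the full-signal adder and the per-speaker subtractors described above amount to what is often called a "mix-minus". The sketch below covers only that arithmetic and leaves out speaker selection and noise suppression; the function name `mix_minus` and the dictionary layout are assumptions.

```python
def mix_minus(selected):
    """selected maps a speaker name to an equal-length list of samples.

    Returns the common mix (sum of all selected signals) and, for each
    selected speaker, the common mix with that speaker's own signal removed.
    Assumes at least one selected speaker.
    """
    length = len(next(iter(selected.values())))
    # Full signal adder: add every selected voice signal once.
    total = [sum(signal[i] for signal in selected.values()) for i in range(length)]
    # Subtractors: one subtraction per speaker instead of one full sum per speaker.
    individual = {
        name: [t - s for t, s in zip(total, signal)]
        for name, signal in selected.items()
    }
    return total, individual


total, individual = mix_minus({"a": [1, 2], "b": [10, 20], "c": [100, 200]})
```

    Computing the sum once and subtracting per speaker keeps the cost linear in the number of selected speakers.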
  • Patent number: 8428427
    Abstract: An apparatus and methods are described for receiving, storing, and recovering from storage one or more television programs while maintaining proper audio to video synchronization or lip sync when the program is displayed. Preferably the audio and video portions are in digital form and may be compressed or uncompressed. The audio and video may be recorded on a physical storage medium such as tape, an optical or magnetic computer disk, or solid state memory or other such device providing user control functions including start, stop, record, play and search, with increased reproduction capability of such stored programs being facilitated by the present invention. Timing information is recovered from the audio and video to be displayed and utilized to alter the time duration of a contiguous signal block (the alpha length) of the audio and/or video to correct for any timing errors therebetween.
    Type: Grant
    Filed: September 14, 2005
    Date of Patent: April 23, 2013
    Inventors: J. Carl Cooper, Steven J. Anderson
  • Patent number: 8428957
    Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: April 23, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
  • Patent number: 8411830
    Abstract: A system, method and computer program product for providing targeted messages to a person using telephony services by generating user profile information from telephony data and using the user profile information to retrieve targeted messages.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: April 2, 2013
    Assignee: iCall, Inc.
    Inventors: Arlo Christopher Gilbert, Andrew Muldowney
  • Patent number: 8412520
    Abstract: A noise reduction device comprises a SN ratio obtaining unit configured to obtain a SN ratio as a function of an estimated noise spectrum and an arithmetic product of an averaged power spectrum of the input signal and a noise likeliness signal, and an output signal obtaining unit configured to obtain an output signal whose noise is reduced based on the input signal and the SN ratio obtained by the SN ratio obtaining unit.
    Type: Grant
    Filed: October 29, 2007
    Date of Patent: April 2, 2013
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Satoru Furuta, Shinya Takahashi
  • Patent number: 8407046
    Abstract: A method of transmitting an input audio signal is disclosed. A current spectral magnitude of the input audio signal is quantized. A quantization error of a previous spectral magnitude is fed back to influence quantization of the current spectral magnitude. The feeding back includes adaptively modifying a quantization criterion to form a modified quantization criterion. A current quantization error is minimized by using the modified quantization criterion. A quantized spectral envelope is formed based on the minimizing and the quantized spectral envelope is transmitted.
    Type: Grant
    Filed: September 4, 2009
    Date of Patent: March 26, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
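    Illustration (not part of the patent record): one simple way to let the previous quantization error influence the current quantization is to add a weighted copy of it to the error criterion. The sketch below is only a sketch under stated assumptions: the weight `beta` is fixed here, whereas the abstract describes an adaptive modification, and the name `quantize_with_feedback` and the scalar codebook `levels` are hypothetical.

```python
def quantize_with_feedback(magnitudes, levels, beta=0.5):
    """Quantize each magnitude to one of `levels`.

    Plain criterion:    minimize |x - q|
    Modified criterion: minimize |(x - q) + beta * previous_error|
    so an overshoot on one magnitude biases the next choice the other way.
    """
    previous_error = 0.0
    quantized = []
    for x in magnitudes:
        q = min(levels, key=lambda level: abs((x - level) + beta * previous_error))
        quantized.append(q)
        previous_error = x - q  # error fed back to the next magnitude
    return quantized


# Without feedback every 1.4 would round to 1; with feedback the
# accumulated error pushes the middle value up to 2.
result = quantize_with_feedback([1.4, 1.4, 1.4], levels=[0, 1, 2, 3])
```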
  • Patent number: 8407059
    Abstract: A method of audio matrix encoding/decoding, which encodes and decodes audio signals of two or more channels into an audio signal of one or more channels while preserving the direction of a sound image, includes extracting pieces of sound image information from audio signals of multiple channels, encoding and allocating the extracted sound image information to an inaudible frequency domain except an audible frequency domain, and adding the sound image information allocated to the inaudible frequency domain and matrix-encoded stereo signals of the audible frequency domain.
    Type: Grant
    Filed: June 12, 2008
    Date of Patent: March 26, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Sung-ho Cho
  • Patent number: 8406691
    Abstract: A method of wireless communication between user equipments includes retrieving user data and a criterion via a software application on a first wireless user equipment having communication interfaces. One of the interfaces is selected based on the criterion. The interface is used to identify wireless user equipments in a proximity of the first user equipment. A total number of user equipments in the proximity of the first user equipment is determined, and, when the number is below a value, another interface is selected. The other interface is used to identify wireless user equipments in the proximity of the first user equipment. A match between the user data of the first user equipment and user data of a second user equipment causes initiation of wireless communication between the first and the second user equipments by sending to any of the user equipments a message associated with the match.
    Type: Grant
    Filed: December 17, 2008
    Date of Patent: March 26, 2013
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Kent Bogestam, Ayodele Damola
  • Patent number: 8407043
    Abstract: The present invention provides a computationally efficient technique for compression encoding of an audio signal, and further provides a technique to enhance the sound quality of the encoded audio signal. This is accomplished by including more accurate attack detection and a computationally efficient quantization technique. The improved audio coder converts the input audio signal to a digital audio signal. The audio coder then divides the digital audio signal into larger frames having a long-block frame length and partitions each of the frames into multiple short-blocks. The audio coder then computes short-block audio signal characteristics for each of the partitioned short-blocks based on changes in the input audio signal.
    Type: Grant
    Filed: March 14, 2011
    Date of Patent: March 26, 2013
    Assignee: Sasken Communication Technologies Limited
    Inventors: K. P. P. Kalyan Chakravarthy, Navaneetha K. Ruthramoorthy, Pushkar P. Patwardhan, Bishwarup Mondal
  • Patent number: 8407060
    Abstract: An audio decoder for decoding a multi-audio-object signal having an audio signal of a first type and an audio signal of a second type encoded therein is described, the multi-audio-object signal having a downmix signal and side information, the side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, and a residual signal specifying residual level values in a second predetermined time/frequency resolution, the audio decoder having a processor for computing prediction coefficients based on the level information; and an up-mixer for up-mixing the downmix signal based on the prediction coefficients and the residual signal to obtain a first up-mix audio signal approximating the audio signal of the first type and/or a second up-mix audio signal approximating the audio signal of the second type.
    Type: Grant
    Filed: April 20, 2012
    Date of Patent: March 26, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
  • Patent number: 8396706
    Abstract: A method, system and program for encoding and decoding speech according to a source-filter model whereby speech is modeled to comprise a source signal filtered by a time-varying filter. The method comprises: receiving a speech signal; and from the speech signal, deriving a spectral envelope signal representing the modeled filter and a remaining signal representing the modeled source. At intervals during the encoding, the method further comprises determining a period between portions of the remaining signal having a degree of repetition and determining a correlation between said portions based on that period, thus producing a respective vector of the correlation for each interval. Once every number of said intervals, the method further comprises selecting a codebook from a plurality of codebooks for quantizing the vectors, quantizing the vectors of that number of intervals according to the selected codebook, and transmitting the quantized vectors along with an indication of the selected codebook.
    Type: Grant
    Filed: May 29, 2009
    Date of Patent: March 12, 2013
    Assignee: Skype
    Inventor: Koen Bernard Vos
  • Publication number: 20130060564
    Abstract: A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the set of disturbances to provide the measure of quality.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 7, 2013
    Applicant: AUDIOCODES LTD.
    Inventor: AudioCodes Ltd
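    Illustration (not part of the publication record): when the only requirement is one disturbance per input frame, the sum is minimized by taking the smallest disturbance for each input frame. The sketch below assumes a mean-squared-difference disturbance, a search window of `window` output frames on either side, equal numbers of input and output frames, and an average as the final score; all of these, and the function names, are assumptions rather than the method claimed.

```python
def frame_disturbance(frame_in, frame_out):
    """Assumed disturbance: mean squared difference between two frames."""
    return sum((x - y) ** 2 for x, y in zip(frame_in, frame_out)) / len(frame_in)


def quality_score(in_frames, out_frames, window=1):
    """Average of the per-frame minimum disturbances (lower is better)."""
    chosen = []
    for i, frame in enumerate(in_frames):
        lo = max(0, i - window)
        hi = min(len(out_frames), i + window + 1)
        # One disturbance per input frame: the smallest among nearby output frames.
        chosen.append(
            min(frame_disturbance(frame, out_frames[j]) for j in range(lo, hi))
        )
    return sum(chosen) / len(chosen)


in_frames = [[0, 0], [1, 1], [2, 2]]
out_frames = [[1, 1], [2, 2], [2, 2]]  # output lags the input by about one frame
score = quality_score(in_frames, out_frames)
```

    Searching several output frames per input frame makes the score tolerant of small delays between the two signals.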
  • Patent number: 8392182
    Abstract: A method of encoding one or more parent blocks of values, the number of values being the length of each block, the method comprising for each parent block: (a) determining a first sum of values in the parent block; (b) splitting the parent block into smaller subblocks; (c) for at least one of the subblocks, determining a second sum of the values in the subblock, selecting a likelihood table from the plurality of likelihood tables based on said first sum of values in the parent block and encoding the second sum using the likelihood table; (d) designating each subblock a parent block; (e) carrying out steps (a), (b), (c) and (d) until at least one parent block reaches a predetermined condition.
    Type: Grant
    Filed: March 7, 2012
    Date of Patent: March 5, 2013
    Assignee: Skype
    Inventor: Koen Bernard Vos
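    Illustration (not part of the patent record): the recursive splitting in steps (a) to (e) can be sketched without the entropy coder. Each emitted symbol pairs the parent sum, which would select the likelihood table, with the sum of one subblock, which would be the value coded with that table. The sketch assumes a split into two halves and stops at blocks of length 1; `encode_sums` and `decode_sums` are hypothetical names.

```python
def encode_sums(block):
    """Return (total, symbols); each symbol is (parent_sum, left_subblock_sum)."""
    symbols = []

    def split(values):
        if len(values) == 1:  # assumed stopping condition
            return
        parent_sum = sum(values)                  # step (a)
        half = len(values) // 2
        left, right = values[:half], values[half:]  # step (b)
        symbols.append((parent_sum, sum(left)))     # step (c), table lookup omitted
        split(left)                                 # steps (d) and (e)
        split(right)

    split(list(block))
    return sum(block), symbols


def decode_sums(total, symbols, length):
    """Rebuild the block from the total and the symbol stream."""
    stream = iter(symbols)

    def build(parent_sum, n):
        if n == 1:
            return [parent_sum]
        _, left_sum = next(stream)
        half = n // 2
        # The right subblock sum is implied by the parent sum.
        return build(left_sum, half) + build(parent_sum - left_sum, n - half)

    return build(total, length)


total, symbols = encode_sums([3, 0, 1, 2])
restored = decode_sums(total, symbols, 4)
```

    Only one subblock sum per split needs to be transmitted, because the other follows from the parent sum.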
  • Patent number: 8392179
    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: March 5, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson
  • Patent number: 8385588
    Abstract: A method of processing audio signals recorded during display of image data from a media file on a display device to produce semantic understanding data and associating such data with the original media file, includes: separating a desired audio signal from the aggregate mixture of audio signals; analyzing the separated signal for purposes of gaining semantic understanding; and associating the semantic information obtained from the audio signals recorded during image display with the original media file.
    Type: Grant
    Filed: December 11, 2007
    Date of Patent: February 26, 2013
    Assignee: Eastman Kodak Company
    Inventors: Keith A. Jacoby, Thomas J. Murray, John V. Nelson, Kevin M. Gobeyn
  • Patent number: 8386246
    Abstract: A system is described that performs frame erasure concealment to generate frames of an output speech signal corresponding to a series of erased frames of an encoded bit-stream in a manner that conceals the quality-degrading effects of such erased frames. In one embodiment, responsive to the detection of a first erased frame in the series, a number of steps are performed. These steps include deriving long-term and short-term synthesis filters based on previously-generated portions of the output speech signal, calculating a ringing signal segment based on the long-term and short-term synthesis filters, and generating a frame of the output speech signal corresponding to the first erased frame by overlap adding the ringing signal segment to an extrapolated waveform. Deriving the long-term filter includes estimating a pitch period based on a previously-generated portion of the output speech signal by finding a lag that minimizes a sum of magnitude difference function.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: February 26, 2013
    Assignee: Broadcom Corporation
    Inventor: Juin-Hwey Chen
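    Illustration (not part of the patent record): the pitch-estimation step named at the end of the abstract, finding the lag that minimizes a sum of magnitude differences, can be sketched as below. The lag range, window length, and the name `estimate_pitch` are assumptions; the sketch requires at least `window + max_lag` samples of history.

```python
def estimate_pitch(history, min_lag, max_lag, window):
    """Return the lag in [min_lag, max_lag] minimizing the sum of |x[n] - x[n - lag]|.

    `history` is the previously generated output signal; the comparison
    uses its last `window` samples.
    """
    recent = history[-window:]
    best_lag, best_cost = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        delayed = history[-window - lag:-lag]  # same window, `lag` samples earlier
        cost = sum(abs(a - b) for a, b in zip(recent, delayed))
        if cost < best_cost:  # strict comparison keeps the shortest best lag
            best_lag, best_cost = lag, cost
    return best_lag


# A signal that repeats every 7 samples.
history = [0, 3, 1, 4, 1, 5, 9] * 10
pitch = estimate_pitch(history, min_lag=3, max_lag=20, window=20)
```

    Keeping the shortest lag among ties avoids reporting a multiple of the true period, which also yields a zero difference on a perfectly periodic signal.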
  • Patent number: 8386267
    Abstract: A technique of improving the degree of freedom of controlling the accuracy of encoding a stereo signal. In a stereo signal encoding device, a sum/difference calculation section generates a monophonic signal which is the sum of first and second channel signals constituting a stereo signal and a side signal which is the difference between the first channel signal and the second channel signal. A mode setting section generates mode information that indicates either a monophonic encoding mode or a stereo encoding mode. A core layer encoding section, a first extended layer encoding section, a second extended layer encoding section, and a third extended layer encoding section individually carry out the monophonic encoding using the monophonic signals or the stereo encoding using both the monophonic signal and the side signal depending on the mode information, and output to a multiplexing section the resultant encoded information from the core layer to the third extended layer.
    Type: Grant
    Filed: March 18, 2009
    Date of Patent: February 26, 2013
    Assignee: Panasonic Corporation
    Inventor: Toshiyuki Morii
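    Illustration (not part of the patent record): the sum/difference calculation described above is simple arithmetic. The sketch below follows the abstract in using the plain sum and difference (no scaling on the encoder side) and undoes them by halving; the layered encoding and mode switching are not modeled, and the function names are hypothetical.

```python
def to_mono_side(first, second):
    """Monophonic signal = first + second, side signal = first - second."""
    mono = [a + b for a, b in zip(first, second)]
    side = [a - b for a, b in zip(first, second)]
    return mono, side


def from_mono_side(mono, side):
    """Invert to_mono_side: first = (M + S) / 2, second = (M - S) / 2."""
    first = [(m + s) / 2 for m, s in zip(mono, side)]
    second = [(m - s) / 2 for m, s in zip(mono, side)]
    return first, second


mono, side = to_mono_side([1, 2], [3, -2])
first, second = from_mono_side(mono, side)
```

    A layer working in the monophonic mode would use only `mono`, while a layer in the stereo mode would use both `mono` and `side`.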
  • Patent number: 8386243
    Abstract: A method and system for regenerating wideband speech from narrowband speech. The method comprises: receiving samples of a narrowband speech signal in a first range of frequencies; modulating received samples of the narrowband speech signal with a modulation signal having a modulating frequency adapted to upshift each frequency in the first range of frequencies by an amount determined by the modulating frequency wherein the modulating frequency is selected to translate into a target band a selected frequency band within the first range of signals; filtering the modulated samples using a high pass filter to form a regenerated speech signal in the target band, wherein the lower limit of the high pass filter defines the lowermost frequency in the target band; and combining the narrow band speech signal with the regenerated speech signal in the target band to regenerate a wideband speech signal.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: February 26, 2013
    Assignee: Skype
    Inventors: Mattias Nilsson, Soren Vang Andersen, Koen Bernard Vos
  • Patent number: 8380525
    Abstract: According to the invention a receiving end terminal (RET) enters a delay mode based on the detecting of the quality of the link being lower than a threshold. In this delay mode, the receiving end terminal provides a reception delay indicator (RDI) for a sending end terminal (SET). The sending end terminal (SET) receives the reception delay indicator (RDI) and provides an end of speech indicator (ESI) for the receiving end terminal (RET) at an end of a speech coding interval (SC). The receiving end terminal (RET) uses the reception delay indicator (RDI) and end of speech indicator (ESI) to define a first time interval (AL1) during which a speech decoder is disabled. The speech decoder is again activated after the first time interval (AL1).
    Type: Grant
    Filed: June 25, 2007
    Date of Patent: February 19, 2013
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Magnus Almgren, Stefan Bruhn, Per Skillermark
  • Patent number: 8380504
    Abstract: Embodiments of the present invention provide systems, methods, and computer-readable media for generating a voice characteristic profile based on detected sound components. In embodiments, a call is initiated between a first caller and a second caller. Information communicated during the call is monitored to determine that sound components have been spoken by the first caller. The sound components are determined to be associated with a language dialect. Further, the sound components are stored in association with the first caller. In particular, the sound components are stored in association with the first caller in a voice characteristic profile of the first caller.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: February 19, 2013
    Assignee: Sprint Communications Company L.P.
    Inventors: Mark Douglas Peden, Simon Youngs, Gary Duane Koller, Piyush Jethwa
  • Patent number: 8379868
    Abstract: The present invention provides a frequency-domain spatial audio coding framework based on the perceived spatial audio scene rather than on the channel content. In one embodiment, time-frequency spatial direction vectors are used as cues to describe the input audio scene.
    Type: Grant
    Filed: May 17, 2007
    Date of Patent: February 19, 2013
    Assignee: Creative Technology Ltd
    Inventors: Michael Goodwin, Jean-Marc Jot
  • Patent number: 8380331
    Abstract: Methods and apparatus for relative pitch tracking of multiple arbitrary sounds. A probabilistic method for pitch tracking may be implemented as or in a pitch tracking module. A constant-Q transform of an input signal may be decomposed to estimate one or more kernel distributions and one or more impulse distributions. Each kernel distribution represents a spectrum of a particular source, and each impulse distribution represents a relative pitch track for a particular source. The decomposition of the constant-Q transform may be performed according to shift-invariant probabilistic latent component analysis, and may include applying an expectation maximization algorithm to estimate the kernel distributions and the impulse distributions. When decomposing, a prior, e.g. a sliding-Gaussian Dirichlet prior or an entropic prior, and/or a temporal continuity constraint may be imposed on each impulse distribution.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: February 19, 2013
    Assignee: Adobe Systems Incorporated
    Inventors: Paris Smaragdis, Gautham J. Mysore
  • Patent number: 8374852
    Abstract: Disclosed is a code conversion method to convert a first code sequence conforming to a first speech coding scheme into a second code sequence conforming to a second speech coding scheme. The method includes the following steps. The first step discriminates whether the first code sequence corresponds to a speech part or to a non-speech part, and generates a numerical value that indicates the discrimination result as a control flag. The second step converts the first code sequence into the second code sequence and outputs said second code sequence, when the value of the control flag corresponds to the speech part. The third step outputs the second code sequence that corresponds to the value of the control flag, when the value of the control flag corresponds to the non-speech part.
    Type: Grant
    Filed: March 16, 2006
    Date of Patent: February 12, 2013
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
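    Illustrative sketch (not part of the patent record): the three steps in the abstract above can be read as a per-frame dispatch on a control flag. The sketch below is a minimal reading under stated assumptions. The names `convert_stream`, `is_speech`, `transcode`, and `codes_by_flag`, and the flag values 1 and 0, are hypothetical and do not come from the patent. The actual discriminator and transcoder are not specified in the abstract, so they are passed in as callables.

```python
from typing import Callable, Dict, List, Sequence

SPEECH = 1      # assumed numeric flag value for a speech frame
NON_SPEECH = 0  # assumed numeric flag value for a non-speech frame


def convert_stream(
    first_codes: Sequence[bytes],
    is_speech: Callable[[bytes], bool],
    transcode: Callable[[bytes], bytes],
    codes_by_flag: Dict[int, bytes],
) -> List[bytes]:
    """Convert frames of a first coding scheme into frames of a second one."""
    second_codes: List[bytes] = []
    for frame in first_codes:
        # Step 1: discriminate speech from non-speech and keep the
        # result as a numeric control flag.
        flag = SPEECH if is_speech(frame) else NON_SPEECH
        if flag == SPEECH:
            # Step 2: speech frames are actually converted.
            second_codes.append(transcode(frame))
        else:
            # Step 3: non-speech frames are replaced by the second-scheme
            # code associated with the flag value (hypothetical lookup table).
            second_codes.append(codes_by_flag[flag])
    return second_codes
```

    One possible benefit of this structure, assuming the lookup is cheap, is that non-speech frames bypass the transcoder entirely.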
  • Patent number: 8374883
    Abstract: An encoder improves inter-channel prediction (ICP) performance in scalable stereo sound encoding using an ICP. In the encoder, ICP analysis units use, as reference signal candidates, a frequency coefficient in the low-band portion of a side residual signal, a frequency coefficient in each sub-band portion of a monaural residual signal, and a frequency coefficient in the low-band portion of the monaural residual signal, respectively, and perform an ICP analysis between these respective candidates and a frequency coefficient in each sub-band portion of the side residual signal to generate first, second, and third ICP coefficients.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: February 12, 2013
    Assignee: Panasonic Corporation
    Inventors: Haishan Zhong, Zongxian Liu, Kok Seng Chong, Koji Yoshida
  • Patent number: 8371857
    Abstract: A method of teaching pronunciation is provided which includes communicating by a voice portal server to a user a model word and detecting a response by the user to the voice portal server. The method also includes comparing the response word to the model word and determining a confidence level based on the comparison of the response word to the model word. The method further includes comparing an acceptance limit to the confidence level and confirming a correct pronunciation of the model word if the confidence level equals or exceeds the acceptance limit.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: February 12, 2013
    Assignee: Robert Bosch GmbH
    Inventors: Madhuri Raya, Karsten Funk, Sharmila Ravula, Yao Meng
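    Illustrative sketch (not part of the patent record): the acceptance test in the abstract above reduces to a threshold comparison on a confidence level. The sketch below assumes a confidence in the range 0 to 1. As a stand-in for the unspecified speech comparison, it scores text transcriptions with Python's `difflib.SequenceMatcher`; a real system would compare audio, not strings. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def confidence_level(response_word: str, model_word: str) -> float:
    """Stand-in similarity score in [0, 1]; NOT the patent's comparison."""
    return SequenceMatcher(None, response_word, model_word).ratio()


def pronunciation_confirmed(confidence: float, acceptance_limit: float) -> bool:
    """Confirm when the confidence level equals or exceeds the limit."""
    return confidence >= acceptance_limit


def check_pronunciation(response_word: str, model_word: str,
                        acceptance_limit: float = 0.8) -> bool:
    # Compare the response to the model word, then apply the acceptance limit.
    return pronunciation_confirmed(
        confidence_level(response_word, model_word), acceptance_limit
    )
```

    The default limit of 0.8 is an arbitrary value chosen for the example.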
  • Patent number: 8374845
    Abstract: A word coinciding with a key word input by speech and a word related to the word are set as retrieval candidate words based on a word dictionary in which words representing formal names and aliases of the formal names are registered in association with a family attribute indicating a familiar relation among the words. Content related to any one of retrieval words selected out of the retrieval candidate words and a word related to the retrieval word is retrieved.
    Type: Grant
    Filed: February 29, 2008
    Date of Patent: February 12, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Miwako Doi, Kaoru Suzuki, Toshiyuki Koga, Koichi Yamamoto
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8364480
    Abstract: A method and corresponding apparatus for coded-domain acoustic echo control is presented. An echo control problem is considered as that of perceptually matching an echo signal to a reference signal. A perceptual similarity function that is based on the coded spectral parameters produced by the speech codec is defined. Since codecs introduce a significant degree of non-linearity into the echo signal, the similarity function is designed to be robust against such effects. The similarity function is incorporated into a coded-domain echo control system that also includes spectrally-matched noise injection for replacing echo frames with comfort noise. Using actual echoes recorded over a commercial mobile network, it is shown herein that the similarity function is robust against both codec non-linearities and additive noise. Experimental results further show that the echo-control is effective at suppressing echoes compared to a Normalized Least Mean Squared (NLMS)-based echo cancellation system.
    Type: Grant
    Filed: September 9, 2011
    Date of Patent: January 29, 2013
    Assignee: Tellabs Operations, Inc.
    Inventor: Rafid A. Sukkar
  • Patent number: 8364476
    Abstract: The invention pertains to a method and apparatus for efficient encoding and decoding of vector quantized data. The method and system explore and implement sub-division of a quantization vector space comprising class-leader vectors and representation of the class-leader vectors by a set of class-leader root-vectors, facilitating faster encoding and decoding and reduced storage requirements.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: January 29, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Peter Vary, Hauke Kruger, Bernd Geiser
  • Patent number: 8364492
    Abstract: Disclosed is an apparatus including an unvoiced speech input device, a decision unit and an alarm unit. The unvoiced speech input device receives the unvoiced speech, and the decision unit determines whether or not a signal received from the unvoiced speech input device is an ordinary speech. The alarm unit receives a result of the decision from the decision unit to give an alarm when the result of decision indicates the ordinary speech. The alarm is given to a wearer of the apparatus if he/she has made ordinary speech.
    Type: Grant
    Filed: July 6, 2007
    Date of Patent: January 29, 2013
    Assignee: NEC Corporation
    Inventor: Reishi Kondou
  • Patent number: 8364479
    Abstract: A system that estimates the spectral noise power density of an audio signal includes a spectral noise power density estimation unit, a correction term processor, and a combination processor. The spectral noise power density estimation unit may provide a first estimate of the spectral noise power density of the audio signal. The correction term processor may provide a time dependent correction term based, at least in part, on a spectral noise power density estimation error of the actual spectral noise power density. The correction term may be determined so that the spectral noise power density estimation error is reduced. The combination processor may combine the first estimate with the correction term to obtain a second estimate of the spectral noise power density that may be used for subsequent signal processing to enhance a desired signal component of the audio signal.
    Type: Grant
    Filed: August 29, 2008
    Date of Patent: January 29, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Uwe Schmidt, Tobias Wolff, Markus Buck
  • Patent number: 8359205
    Abstract: Methods and apparatus for audio watermarking and watermark detection and extraction are described herein. An example method includes receiving a media content signal, sampling the media content signal to generate samples, storing the samples in a buffer, determining a first sequence of samples in the buffer, determining a second sequence of samples in the buffer, wherein the second sequence of samples is of substantially equal length as the first sequence of samples, calculating an average of the first sequence of samples and the second sequence of samples to generate an average sequence of samples, extracting an identifier from the average sequence of samples, and storing the identifier in a tangible memory.
    Type: Grant
    Filed: August 31, 2009
    Date of Patent: January 22, 2013
    Assignee: The Nielsen Company (US), LLC
    Inventors: Venugopal Srinivasan, Alexander Topchy
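    Illustrative sketch (not part of the patent record): the averaging step in the abstract above can be written as an element-wise mean of two equal-length slices of the sample buffer. The function name, the use of explicit start offsets, and the strict equal-length requirement are assumptions made for the example. How the identifier is extracted from the averaged sequence is not described in the abstract and is left out.

```python
from typing import List, Sequence


def average_sequences(buffer: Sequence[float], start_a: int, start_b: int,
                      length: int) -> List[float]:
    """Element-wise average of two equal-length sequences from one buffer."""
    first = buffer[start_a:start_a + length]
    second = buffer[start_b:start_b + length]
    # Slicing silently truncates at the end of the buffer, so check lengths.
    if len(first) != length or len(second) != length:
        raise ValueError("both sequences must fit inside the buffer")
    return [(a + b) / 2.0 for a, b in zip(first, second)]


samples = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
averaged = average_sequences(samples, start_a=0, start_b=3, length=3)
```

    A plausible motivation, not stated in the abstract, is that a watermark repeated in both sequences is reinforced by averaging while differing host content partly cancels.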
  • Patent number: 8355509
    Abstract: The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted plus the statistical properties of the source signals which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.
    Type: Grant
    Filed: August 10, 2007
    Date of Patent: January 15, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventor: Christof Faller
  • Patent number: 8355909
    Abstract: A system for controlling dynamic range of an audio signal comprises an automatic gain control element that receives an input signal having a varying level and outputs a control signal that varies based on the varying level of the input signal and a modified input signal having a dynamic range different than a dynamic range of the input signal. The system also comprises an inverter that inverts the control signal or a block-based control signal corresponding to the control signal in block format. The system also comprises a variable gain element that receives the modified input signal and at least some of the inverted control signal or block-based control signal. The variable gain element also outputs a remainder signal corresponding to the modified input signal as unmodified based on the at least some of the inverted control signal or block-based control signal.
    Type: Grant
    Filed: June 12, 2012
    Date of Patent: January 15, 2013
    Assignee: Audyne, Inc.
    Inventors: Timothy J Carroll, Leif Claesson
  • Patent number: 8352250
    Abstract: A method of filtering a speech signal for speech encoding in a communications network includes determining a cut off frequency for a filter, wherein a component of the speech signal in a frequency range less than the cut off frequency is to be attenuated by the filter; receiving the speech signal at the filter; determining at least one parameter of the received speech signal, the at least one parameter providing an indication of the energy of the component of the received speech signal that is to be attenuated; and adjusting the cut off frequency in dependence on the at least one parameter, thereby adjusting the frequency range to be attenuated.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: January 8, 2013
    Assignee: Skype
    Inventors: Koen Bernard Vos, Stefan Kurt Olof Strömmer
  • Patent number: 8352248
    Abstract: A system for encoding speech includes a speech encoder (106, FIG. 1), a speech recognizer (110), and a difference encoder (108). When the speech recognizer (110) recognizes a word, phoneme or feature within an input speech signal (122), the difference encoder (108) calculates the differences between speech parameters (140, 142) derived by the speech encoder (106) and speech parameters (146, 148) derived by the speech recognizer (110). The difference encoder (108) quantizes the differences (128), which replace corresponding encoder-derived parameters to be transmitted over a channel (130). In one embodiment, the difference encoder representation (128) of the speech parameters consumes fewer bits than the encoder-derived representation (124). Accordingly, the resulting bandwidth consumed by a single channel can be decreased.
    Type: Grant
    Filed: January 3, 2003
    Date of Patent: January 8, 2013
    Assignee: Marvell International Ltd.
    Inventors: Khosro Darroudi, Brian R. Mears
  • Patent number: 8352256
    Abstract: An audio input signal is filtered using an adaptive filter to generate a prediction output signal with reduced noise, wherein the filter is implemented using a plurality of coefficients to generate a plurality of prediction errors and to generate an error from the plurality of prediction errors, wherein the absolute values of the coefficients are continuously reduced by a plurality of reduction parameters.
    Type: Grant
    Filed: September 30, 2010
    Date of Patent: January 8, 2013
    Assignee: Entropic Communications, Inc.
    Inventor: Joern Fischer
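    Illustrative sketch (not part of the patent record): an adaptive filter whose coefficient magnitudes are continuously reduced resembles the standard leaky LMS algorithm, which is used here as a swapped-in technique. The sketch is not the patented method: it uses a single prediction error instead of the plurality of prediction errors named in the abstract. The per-coefficient `leaks` list stands in for the "reduction parameters", and the step size `mu` and all names are assumptions.

```python
from typing import List, Sequence


def leaky_lms_predict(signal: Sequence[float], leaks: Sequence[float],
                      mu: float = 0.05) -> List[float]:
    """One-step linear prediction with a leaky LMS coefficient update."""
    order = len(leaks)
    weights = [0.0] * order
    history = [0.0] * order  # past samples, most recent first
    predictions: List[float] = []
    for sample in signal:
        # Predict the current sample from the past samples.
        prediction = sum(w * x for w, x in zip(weights, history))
        error = sample - prediction
        # Leaky update: each coefficient is first shrunk toward zero by its
        # own leak factor, then corrected along the error gradient.
        weights = [(1.0 - leak) * w + mu * error * x
                   for w, leak, x in zip(weights, leaks, history)]
        # Shift the new sample into the history.
        history = [sample] + history[:-1]
        predictions.append(prediction)
    return predictions
```

    With each leak set to 0 this reduces to plain LMS. Larger leaks pull the coefficients harder toward zero, at the cost of a biased prediction.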
  • Patent number: 8352249
    Abstract: An encoding device improves the sound quality of a stereo signal while maintaining a low bit rate. The encoding device includes: an LP inverse filter which LP-inverse-filters a left signal L(n) by using an inverse quantization linear prediction coefficient AdM(z) of a monaural signal; a T/F conversion unit which converts the left sound source signal Le(n) from a temporal region to a frequency region; an inverse quantizer which inverse-quantizes encoded information Mqe; spectrum division units which divide a high-frequency component of the sound source signal Mde(f) and the left signal Le(f) into a plurality of bands; and scale factor calculation units which calculate scale factors ai and ssi by using a monaural sound source signal Mdeh,i(f), a left sound source signal Leh,i(f), Mdeh,i(f), and right sound source signal Reh,i(f) of each divided band.
    Type: Grant
    Filed: November 4, 2008
    Date of Patent: January 8, 2013
    Assignee: Panasonic Corporation
    Inventors: Kok Seng Chong, Koji Yoshida, Masahiro Oshikiri
  • Patent number: 8352280
    Abstract: An audio encoder adapted to encode a multi-channel audio signal. The encoder comprises an encoder combination module (ECM) for generating a dominant signal part and a residual signal part being a combined representation of first and second audio signals, the dominant and residual signal parts being obtained by applying a mathematical procedure to the first and second audio signals. The mathematical procedure involves a spatial parameter comprising a description of spatial properties of the first and second audio signals. Embodiments include a plurality of interconnected encoder combination modules, so that e.g. six independent 5.1 format audio signals can be encoded to a single or two dominant signal parts and a number of parameter sets and residual signal parts.
    Type: Grant
    Filed: September 7, 2011
    Date of Patent: January 8, 2013
    Inventors: Francois Philippus Myburg, Erik Gosuinus Petrus Schuijers