Speech Signal Processing Patents (Class 704/200)
  • Patent number: 8688441
    Abstract: One provides (101) a digital audio signal having a corresponding signal bandwidth, and then provides (102) an energy value that corresponds to at least an estimate of out-of-signal bandwidth energy as corresponds to that digital audio signal. One then uses (103) the energy value to simultaneously determine both a spectral envelope shape and a corresponding suitable energy for the spectral envelope shape for out-of-signal bandwidth content as corresponds to the digital audio signal. By one approach, if desired, one then combines (104) (on, for example, a frame by frame basis) the digital audio signal with the out-of-signal bandwidth content to provide a bandwidth extended version of the digital audio signal to be audibly rendered to thereby improve corresponding audio quality of the digital audio signal as so rendered.
    Type: Grant
    Filed: November 29, 2007
    Date of Patent: April 1, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Tenkasi V. Ramabadran, Mark A. Jasiuk
  • Patent number: 8688442
    Abstract: An audio decoding apparatus comprises: a plurality of decoding units; a band replicating unit which processes a decoded signal obtained when a corresponding decoding unit decodes a coded signal, according to a scheme specified by transmitted information; and an information transmitting unit which transmits, to a signal processing unit, information identifying the corresponding decoding unit from among the plurality of decoding units.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: April 1, 2014
    Assignee: Panasonic Corporation
    Inventors: Shuji Miyasaka, Kosuke Nishio, Takeshi Norimatsu
  • Patent number: 8688440
    Abstract: There is disclosed an encoding device capable of improving similarity between the high frequency band spectrum of the original signal and a new spectrum to be generated while realizing a low bit rate when encoding a wide-band signal spectrum. The encoding device has sub-band amplitude calculation units for calculating the amplitude of the respective sub-bands for the high frequency band spectrum obtained from the wide-band signal. A search unit and a gain codebook select some sub-bands from a plurality of sub-bands and only the gain of the selected sub-bands is subjected to encoding. An interpolation unit expresses the gain of the sub-band not selected, by mutually interpolating the selected gains.
    Type: Grant
    Filed: May 8, 2013
    Date of Patent: April 1, 2014
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
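
The abstract for 8688440 describes encoding gains for only some sub-bands and recovering the rest by interpolation. A minimal sketch of that decoder-side step, assuming linear interpolation between the nearest selected sub-bands and holding the edge value outside them (the abstract does not specify the interpolation rule; the function name and the dict layout are hypothetical):

```python
def interpolate_gains(selected, num_subbands):
    """Fill in gains for unselected sub-bands.

    selected: dict mapping sub-band index -> decoded gain (hypothetical layout).
    Linear interpolation is an assumption, not taken from the patent.
    """
    known = sorted(selected)
    if not known:
        raise ValueError("at least one sub-band gain must be selected")
    gains = []
    for b in range(num_subbands):
        if b in selected:
            gains.append(selected[b])           # gain was transmitted
        elif b < known[0]:
            gains.append(selected[known[0]])    # hold the first known gain
        elif b > known[-1]:
            gains.append(selected[known[-1]])   # hold the last known gain
        else:
            lo = max(k for k in known if k < b)
            hi = min(k for k in known if k > b)
            t = (b - lo) / (hi - lo)
            gains.append((1 - t) * selected[lo] + t * selected[hi])
    return gains
```

For example, with gains transmitted only for sub-bands 0 and 4 out of 6, the call `interpolate_gains({0: 1.0, 4: 3.0}, 6)` fills sub-bands 1 to 3 on the line between the two transmitted values and repeats the last value for sub-band 5.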
  • Patent number: 8688092
    Abstract: Systems and methods that can be utilized to convert a voice communication received over a telecommunication network to text are described. In an illustrative embodiment, a call processing system coupled to a telecommunications network receives a call from a caller intended for a first party, wherein the call is associated with call signaling information. At least a portion of the call signaling information is stored in a computer readable medium. A greeting is played to the caller, and a voice communication from the caller is recorded. At least a portion of the voice communication is converted to text, which is analyzed to identify portions that are inferred to be relatively more important to communicate to the first party. A text communication is generated including at least some of the identified portions and including fewer words than the recorded voice communication. At least a portion of the text communication is made available to the first party over a data network.
    Type: Grant
    Filed: May 20, 2013
    Date of Patent: April 1, 2014
    Assignee: Callwave Communications, LLC
    Inventors: Anthony Bladon, David Giannini, David Frank Hofstatter, Colin Kelley, David C. McClintock, Robert F. Smith, David S. Trandal, Leland W. Kirchhoff
  • Patent number: 8681978
    Abstract: Methods, devices, and computer program products enable the embedding of forensic marks in a host content that is in compressed domain. These and other features are achieved by preprocessing of a host content to provide a plurality of host content versions with different embedded watermarks that are subsequently compressed. A host content may then be efficiently marked with forensic marks in response to a request for such content. The marking process is conducted in compressed domain, thus reducing the computational burden of decompressing and re-compressing the content, and avoiding further perceptual degradation of the host content. In addition, methods, devices and computer program products are disclosed that obstruct differential analysis of such forensically marked content.
    Type: Grant
    Filed: December 17, 2012
    Date of Patent: March 25, 2014
    Assignee: Verance Corporation
    Inventors: Rade Petrovic, Dai Yang
  • Patent number: 8682651
    Abstract: The invention relates to the analysis of characteristics of audio and/or video signals for the generation of audio-visual content signatures. To determine an audio signature, a region of interest (for example, of high entropy) is identified in audio signature data. This region of interest is then provided as an audio signature with offset information. A video signature is also provided.
    Type: Grant
    Filed: February 20, 2009
    Date of Patent: March 25, 2014
    Assignee: Snell Limited
    Inventor: Jonathan Diggins
  • Patent number: 8682645
    Abstract: The present disclosure relates to a signal analyzer for processing an overlapped input signal frame comprising 2N subsequent input signal values. The signal analyzer comprises: a windower adapted to window the overlapped input signal frame to obtain a windowed signal, wherein the windower is adapted to zero M+N/2 subsequent input signal values of the overlapped input signal frame, wherein M is equal to or greater than 1 and smaller than N/2; and a transformer adapted to transform the remaining 3N/2−M subsequent windowed signal values of the windowed signal using N−M sets of transform parameters to obtain a transformed-domain signal comprising N−M transformed-domain signal values.
    Type: Grant
    Filed: April 15, 2013
    Date of Patent: March 25, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Anisse Taleb, Fengyan Qi, Chen Hu
  • Patent number: 8682650
    Abstract: Non-intrusive speech-quality assessment uses vocal-tract models, in particular for testing telecommunications systems and equipment. This process requires reduction of the speech stream under assessment into a set of parameters that are sensitive to the types of distortion to be assessed. Once parameterized, the data is used to generate a set of physiologically-based rules for error identification, using a parametric modeling of the shape of the vocal tract itself, by comparison between derived parameters and the output of models of physiologically realistic forms for the vocal tract, and the application of physical constraints on how these can change over time.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: March 25, 2014
    Assignee: Psytechnics Limited
    Inventors: Philip Gray, Michael P Hollier
  • Patent number: 8682018
    Abstract: Microphone arrays (MAs) are described that position and vent microphones so that performance of a noise suppression system coupled to the microphone array is enhanced. The MA includes at least two physical microphones to receive acoustic signals. The physical microphones make use of a common rear vent (actual or virtual) that samples a common pressure source. The MA includes a physical directional microphone configuration and a virtual directional microphone configuration. By making the input to the rear vents of the microphones (actual or virtual) as similar as possible, the real-world filter to be modeled becomes much simpler to model using an adaptive filter.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: March 25, 2014
    Assignee: AliphCom
    Inventor: Gregory C. Burnett
  • Patent number: 8676570
    Abstract: Example methods, apparatus and articles of manufacture to perform audio watermark decoding are disclosed. A disclosed example method includes receiving an audio signal including an audience measurement code embedded therein using a first plurality of frequency components, sampling the audio signal, transforming the sampled audio signal into a first frequency domain representation, determining whether the code is detectable in the first plurality of frequency components of the first frequency domain representation, and when the code is not detected in the first plurality of frequency components, examining a second plurality of frequency components of a second frequency domain representation to determine whether the code is detected, the second plurality of frequency components being offset from the first plurality of frequency components by a first offset, the first offset corresponding to a sampling frequency mismatch.
    Type: Grant
    Filed: April 26, 2010
    Date of Patent: March 18, 2014
    Assignee: The Nielsen Company (US), LLC
    Inventors: Daniel J. Nelson, Venugopal Srinivasan, John C. Peiffer
  • Patent number: 8676569
    Abstract: An error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on a predetermined criteria when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on a predetermined criteria when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: March 18, 2014
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Eun-mi Oh, Ki-hyun Choo, Ho-sang Sung, Chang-yong Son, Jung-hoe Kim, Kang-eun Lee
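
The abstract for 8676569 describes a two-level choice: frequency-domain versus time-domain concealment, and then repetition versus interpolation within the frequency domain. A minimal sketch of that decision structure follows. The two boolean flags stand in for the "predetermined criteria", which the abstract does not define, and the time-domain branch is only a placeholder that repeats the previous samples; all names are hypothetical.

```python
def conceal_frame(prev_spectrum, next_spectrum, prev_samples,
                  use_frequency_domain, use_interpolation):
    """Return (scheme name, concealed data) for a lost frame.

    The flags are assumed to come from criteria that the abstract
    leaves unspecified.
    """
    if not use_frequency_domain:
        # Placeholder time-domain concealment: repeat the previous samples.
        return "time", list(prev_samples)
    if use_interpolation and next_spectrum is not None:
        # Interpolation scheme: midpoint of the neighbouring frames' spectra.
        midpoint = [(p + n) / 2 for p, n in zip(prev_spectrum, next_spectrum)]
        return "frequency/interpolation", midpoint
    # Repetition scheme: reuse the previous frame's spectrum unchanged.
    return "frequency/repetition", list(prev_spectrum)
```

Interpolation needs the following frame, so the sketch falls back to repetition when `next_spectrum` is unavailable.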
  • Patent number: 8670554
    Abstract: A method is provided for encoding multiple microphone signals into a composite source-separable audio (SSA) signal, conducive for transmission over a voice network. The embodiments enable the processing of source separation of the target voice signal from its ambient sound to be performed at any point in the voice communication network, including the internet cloud. A multiplicity of processing is possible over the SSA signal, based on the intended voice application. The level of processing is adapted with the availability of the processing power at the chosen processing node in the network in one embodiment. An apparatus for separating out the target source voice from its ambient sound is also provided. The apparatus includes a directed source separation (DSS) unit, which processes the two virtual microphone signals in the SSA representation, to generate a new SSA signal including the enhanced target voice and the enhanced ambient noise.
    Type: Grant
    Filed: April 20, 2012
    Date of Patent: March 11, 2014
    Assignee: Aurenta Inc.
    Inventor: Shridhar K. Mukund
  • Patent number: 8660840
    Abstract: A method and apparatus for predictively quantizing voiced speech includes a parameter generator and a quantizer. The parameter generator is configured to extract parameters from frames of predictive speech such as voiced speech, and to transform the extracted information to a frequency-domain representation. The quantizer is configured to subtract a weighted sum of the parameters for previous frames from the parameter for the current frame. The quantizer is configured to quantize the difference value. A prototype extractor may be added to first extract a pitch period prototype to be processed by the parameter generator.
    Type: Grant
    Filed: August 12, 2008
    Date of Patent: February 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Arasanipalai K. Ananthapadmanabhan, Sarath Manjunath, Pengjun Huang, Eddie-Lun Tik Choy, Andrew P. Dejaco
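
The quantizer in the abstract for 8660840 subtracts a weighted sum of previous-frame parameters from the current parameter and quantizes the difference. A minimal sketch of that idea for a single scalar parameter, assuming a uniform quantizer and prediction from previously reconstructed values (both are assumptions; the abstract fixes neither, and the names are hypothetical):

```python
def predictive_quantize(params, weights, step):
    """Quantize each frame's parameter as a residual against a weighted
    sum of earlier reconstructed parameters.

    weights[0] applies to the most recent frame. Returns the quantizer
    indices and the values a decoder would reconstruct.
    """
    history = [0.0] * len(weights)      # most recent reconstruction first
    indices, recon = [], []
    for p in params:
        prediction = sum(w * h for w, h in zip(weights, history))
        index = round((p - prediction) / step)   # uniform scalar quantizer
        value = prediction + index * step        # decoder-side reconstruction
        indices.append(index)
        recon.append(value)
        history = [value] + history[:-1]
    return indices, recon
```

Predicting from reconstructed rather than original values keeps the encoder and decoder predictions identical, so quantization error does not accumulate across frames.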
  • Patent number: 8655440
    Abstract: A speech sound intelligibility assessment system includes: a biological signal measurement section for measuring an electroencephalogram signal of a user; a presented-speech sound control section for determining a speech sound to be presented by referring to a speech sound database retaining a plurality of monosyllabic speech sounds; an audio output section for presenting the speech sound determined by the presented-speech sound control section as an audio; a characteristic component detection section for utilizing the electroencephalogram signal of the user measured by the biological signal measurement section to determine presence or absence of a characteristic component of an event-related potential at 800 ms±100 ms from a point of presenting the audio; and a speech sound intelligibility assessment section for, based on a result of determination by the characteristic component detection section, determining whether the user has aurally comprehended the speech sound or not.
    Type: Grant
    Filed: March 1, 2011
    Date of Patent: February 18, 2014
    Assignee: Panasonic Corporation
    Inventors: Shinobu Adachi, Koji Morikawa
  • Patent number: 8655651
    Abstract: The invention relates to a method, computer, computer program and computer program product for speech quality estimation. The method comprises the steps of: determining a coding distortion parameter (QCOD), a bandwidth related distortion parameter (BW) and a presentation level distortion parameter (PL) of a speech signal; extracting a first coefficient (α1) and a second coefficient (α2), the first coefficient and the second coefficient being dependent on the coding distortion parameter; calculating a signal quality measure (Q), where the signal quality measure is QCOD+α1BW+α2PL; and using the signal quality measure in a quality estimation of the speech signal.
    Type: Grant
    Filed: July 26, 2010
    Date of Patent: February 18, 2014
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Volodya Grancharov, Mats Folkesson
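
The quality measure in the abstract for 8655651 is a weighted sum: the coding distortion parameter plus the bandwidth and presentation level parameters, each scaled by a coefficient that depends on the coding distortion. A minimal sketch of that calculation follows. The abstract does not say how the coefficients are derived, so `extract_coefficients` and its threshold and values are purely hypothetical placeholders.

```python
def extract_coefficients(q_cod):
    """Hypothetical mapping from coding distortion to the two coefficients.

    The threshold and the returned values are illustrative only.
    """
    if q_cod < 3.0:
        return 0.5, 0.25
    return 1.0, 0.5


def quality_measure(q_cod, bw, pl):
    """Q = QCOD + c1 * BW + c2 * PL, with c1 and c2 depending on QCOD."""
    c1, c2 = extract_coefficients(q_cod)
    return q_cod + c1 * bw + c2 * pl
```

Making the coefficients a function of the coding distortion lets the same bandwidth or level impairment weigh differently depending on how degraded the coded speech already is.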
  • Patent number: 8655439
    Abstract: A speech discriminability assessment system includes: a biological signal measurement section for measuring an electroencephalogram signal of a user; a presented-speech sound control section for determining a speech sound to be presented to the user by referring to a speech sound database retaining a plurality of monosyllabic sound data; an audio presentation section for presenting an audio associated with the determined speech sound to the user; a character presentation section for presenting a character associated with the determined speech sound to the user, subsequent to the presentation of the audio by the audio presentation section; an unexpectedness detection section for detecting presence or absence of an unexpectedness signal from the measured electroencephalogram signal of the user, the unexpectedness signal representing a positive component at 600 ms±100 ms after a time point when the character was presented to the user; and a speech sound discriminability determination section for determining a sp
    Type: Grant
    Filed: December 3, 2010
    Date of Patent: February 18, 2014
    Assignee: Panasonic Corporation
    Inventors: Shinobu Adachi, Koji Morikawa
  • Patent number: 8650029
    Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
    Type: Grant
    Filed: February 25, 2011
    Date of Patent: February 11, 2014
    Assignee: Microsoft Corporation
    Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
  • Patent number: 8650027
    Abstract: The invention provides an electrolaryngeal speech reconstruction method and a system thereof. Firstly, model parameters are extracted from the collected speech as a parameter library, then facial images of a speaker are acquired and then transmitted to an image analyzing and processing module to obtain the voice onset and offset times and the vowel classes, then a waveform of a voice source is synthesized by a voice source synthesis module, finally, the waveform of the above voice source is output by an electrolarynx vibration output module, wherein the voice source synthesis module firstly sets the model parameters of a glottal voice source so as to synthesize the waveform of the glottal voice source, and then a waveguide model is used to simulate sound transmission in a vocal tract and select shape parameters of the vocal tract according to the vowel classes.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: February 11, 2014
    Assignee: Xi'an Jiaotong University
    Inventors: Mingxi Wan, Liang Wu, Supin Wang, Zhifeng Niu, Congying Wan
  • Patent number: 8645129
    Abstract: A system and method is described that improves the intelligibility of a far-end telephone speech signal to a user of a telephony device in the presence of near-end background noise. As described herein, the system and method improves the intelligibility of the far-end telephone speech signal in a manner that does not require user input and that minimizes the distortion of the far-end telephone speech signal. The system is integrated with an acoustic echo canceller and shares information therewith.
    Type: Grant
    Filed: May 12, 2009
    Date of Patent: February 4, 2014
    Assignee: Broadcom Corporation
    Inventors: Wilfrid LeBlanc, Jes Thyssen, Juin-Hwey Chen
  • Patent number: 8645127
    Abstract: Traditional audio encoders may conserve coding bit-rate by encoding fewer than all spectral coefficients, which can produce a blurry low-pass sound in the reconstruction. An audio encoder using wide-sense perceptual similarity improves the quality by encoding a perceptually similar version of the omitted spectral coefficients, represented as a scaled version of already coded spectrum. The omitted spectral coefficients are divided into a number of sub-bands. The sub-bands are encoded as two parameters: a scale factor, which may represent the energy in the band; and a shape parameter, which may represent a shape of the band. The shape parameter may be in the form of a motion vector pointing to a portion of the already coded spectrum, an index to a spectral shape in a fixed code-book, or a random noise vector. The encoding thus efficiently represents a scaled version of a similarly shaped portion of spectrum to be copied at decoding.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: February 4, 2014
    Assignee: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen
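
The abstract for 8645127 represents each omitted sub-band by a scale factor and a shape parameter, where the shape may be a motion vector pointing into the already coded spectrum. A minimal decoder-side sketch of that case, assuming the scale factor is the target RMS level of the band (the abstract only says it "may represent the energy"; the function name and argument layout are hypothetical):

```python
import math


def reconstruct_band(coded_spectrum, offset, width, scale):
    """Copy `width` coefficients starting at `offset` (the motion vector)
    from the already coded spectrum and rescale them to RMS level `scale`.
    """
    shape = coded_spectrum[offset:offset + width]
    if len(shape) != width:
        raise ValueError("motion vector points outside the coded spectrum")
    rms = math.sqrt(sum(c * c for c in shape) / width)
    if rms == 0.0:
        return [0.0] * width            # nothing to scale in a silent region
    return [scale * c / rms for c in shape]
```

Only the offset and the scale need to be transmitted for the band, which is where the bit-rate saving described in the abstract comes from.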
  • Patent number: 8638991
    Abstract: A method is presented for imaging an object. The method comprises imaging a coherent speckle pattern propagating from an object, using an imaging system being focused on a plane displaced from the object.
    Type: Grant
    Filed: July 21, 2008
    Date of Patent: January 28, 2014
    Assignees: Bar Ilan University, Universitat de Valencia
    Inventors: Zeev Zalevsky, Javier Garcia
  • Patent number: 8638951
    Abstract: At least two microphones generate wideband electrical audio signals in response to incoming sound waves, and the wideband audio signals are filtered to generate low band signals and high band signals. From the low band signals, low band beamformed signals are generated, and the low band beamformed signals are combined with the high band signals to generate modified wideband audio signals. In one implementation, an electronic apparatus is provided that includes a microphone array, a crossover, a beamformer module, and a combiner module. The microphone array has at least two pressure microphones that generate wideband electrical audio signals in response to incoming sound waves. The crossover generates low band signals and high band signals from the wideband electrical audio signals. The beamformer module generates low band beamformed signals from the low band signals. The combiner module combines the high band signals and the low band beamformed signals to generate modified wideband audio signals.
    Type: Grant
    Filed: July 15, 2010
    Date of Patent: January 28, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Robert Zurek, Kevin Bastyr, Joel Clark, Plamen Ivanov
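
The abstract for 8638951 describes a crossover, a beamformer applied only to the low band, and a combiner that adds the high band back. A minimal two-microphone sketch of that signal flow follows. The one-pole low-pass crossover, the zero-delay averaging beamformer, and the use of the first microphone's high band are all simplifying assumptions, and the function names are hypothetical.

```python
def one_pole_lowpass(signal, alpha):
    """Simple recursive low-pass filter; alpha in (0, 1] sets the cutoff."""
    out, state = [], 0.0
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def crossover(signal, alpha=0.2):
    """Split a wideband signal into complementary low and high bands."""
    low = one_pole_lowpass(signal, alpha)
    high = [x - l for x, l in zip(signal, low)]
    return low, high


def low_band_beamform(mic_a, mic_b, alpha=0.2):
    """Beamform the low bands only, then add mic_a's high band back."""
    low_a, high_a = crossover(mic_a, alpha)
    low_b, _ = crossover(mic_b, alpha)
    # Zero-delay delay-and-sum: average the two low-band signals.
    beamformed = [(a + b) / 2 for a, b in zip(low_a, low_b)]
    return [l + h for l, h in zip(beamformed, high_a)]
```

Because the high band is defined as the input minus the low band, feeding the same signal to both microphones returns that signal unchanged, which is a convenient sanity check for the combiner.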
  • Patent number: 8639498
    Abstract: Provided are an apparatus and method for coding and decoding a multi object audio signal with multi channel. The apparatus includes: a multi channel encoding means for down-mixing an audio signal including a plurality of channels, generating a spatial cue for the audio signal including the plurality of channels, and generating first rendering information including the generated spatial cue; and a multi object encoding unit for down-mixing an audio signal including a plurality of objects, which includes the down-mixed signal from the multi channel encoding unit, generating a spatial cue for the audio signal including the plurality of objects, and generating second rendering information including the generated spatial cue, wherein the multi channel encoding unit generates a spatial cue for the audio signal including the plurality of objects regardless of a Coder-DECoder (CODEC) scheme that limits the multi channel encoding unit.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: January 28, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung-Kwon Beack, Jeong-Il Seo, Tae-Jin Lee, Dae-Young Jang, Kyeong-Ok Kang, Jin-Woo Hong, Jin-Woong Kim
  • Patent number: 8639503
    Abstract: A method for encoding speech includes processing an input speech signal using an encoder, resulting in a compressed encoder representation of the input speech signal. The method also includes, if a speech recognizer identifies, in the input speech signal, a corresponding dictionary speech element that approximates the input speech signal, determining, with an electronic device, a compressed recognizer representation of the corresponding dictionary speech element, calculating, with the electronic device, one or more differences between the compressed encoder representation and the compressed recognizer representation, and compiling, with the electronic device, compressed speech information that includes representations of the one or more differences. The encoder and the speech recognizer are implemented with the electronic device.
    Type: Grant
    Filed: January 3, 2013
    Date of Patent: January 28, 2014
    Assignee: Marvell International Ltd.
    Inventors: Khosro Darroudi, Brian R. Mears
  • Patent number: 8638945
    Abstract: An encoding method and apparatus and a decoding method and apparatus are provided. The decoding method includes skipping extension information included in an input bitstream, extracting a three-dimensional (3D) down-mix signal and spatial information from the input bitstream, removing 3D effects from the 3D down-mix signal by performing a 3D rendering operation on the 3D down-mix signal, and generating a multi-channel signal using a down-mix signal obtained by the removal and the spatial information. Accordingly, it is possible to efficiently encode multi-channel signals with 3D effects and to adaptively restore and reproduce audio signals with optimum sound quality according to the characteristics of an audio reproduction environment.
    Type: Grant
    Filed: February 7, 2007
    Date of Patent: January 28, 2014
    Assignee: LG Electronics, Inc.
    Inventors: Yang Won Jung, Hee Suk Pang, Hyen O Oh, Dong Soo Kim, Jae Hyun Lim
  • Patent number: 8626495
    Abstract: The invention relates to a method of identifying and correcting errors in a noisy binary mask. An object of the present invention is to provide a scheme for improving a binary mask representing speech. The problem is solved in that the method comprises a) providing a noisy binary mask comprising a binary representation of the power density of an acoustic signal comprising a target signal and a noise signal at a predefined number of discrete frequencies and a number of discrete time instances; b) providing a statistical model of a clean binary mask representing the power density of the target signal; and c) using the statistical model to detect and correct errors in the noisy binary mask. This has the advantage of providing an alternative and relatively simple way of improving an estimate of a binary mask representing a speech signal. The invention may e.g. be used for speech processing, e.g. in a hearing instrument.
    Type: Grant
    Filed: August 4, 2010
    Date of Patent: January 7, 2014
    Assignee: Oticon A/S
    Inventors: Jesper Bünsow Boldt, Ulrik Kjems, Michael Syskind Pedersen, Mads Graesbøll Christensen, Søren Holdt Jensen
  • Patent number: 8626493
    Abstract: Sounds are inserted into audio content according to a pattern. A library stores humanly perceptible voice sounds. Pattern control information is received that is associated with a device recording the audio content. A pattern is retrieved and washing machine sounds are inserted into the audio content according to the pattern. The humanly perceptible voice sounds are inserted into the audio content according to the pattern to generate a signed audio recording.
    Type: Grant
    Filed: April 26, 2013
    Date of Patent: January 7, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Steven N. Tischer
  • Patent number: 8626501
    Abstract: An encoding apparatus includes a time-frequency transform unit that performs a time-frequency transform on an audio signal, a normalization unit that normalizes a frequency spectral coefficient obtained by the time-frequency transform in order to generate encoded data of the audio signal, a level calculation unit that calculates a level of the audio signal, a scale factor changing unit that changes a concealment scale factor included in encoded concealment data obtained by performing, on the basis of the level of the audio signal, a time-frequency transform and normalization on a minute noise signal, the concealment scale factor being a scale factor relating to a coefficient used for the normalization, and an output unit that outputs the encoded data of the audio signal generated by the normalization unit or outputs, as encoded data of the audio signal, the encoded concealment data whose concealment scale factor has been changed.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: January 7, 2014
    Assignee: Sony Corporation
    Inventors: Yasuhiro Toguri, Jun Matsumoto, Yuuji Maeda, Shiro Suzuki, Yuuki Matsumura
  • Patent number: 8620674
    Abstract: An audio encoder and decoder use architectures and techniques that improve the efficiency of multi-channel audio coding and decoding. The described strategies include various techniques and tools, which can be used in combination or independently. For example, an audio encoder performs a pre-processing multi-channel transform on multi-channel audio data, varying the transform so as to control quality. The encoder groups multiple windows from different channels into one or more tiles and outputs tile configuration information, which allows the encoder to isolate transients that appear in a particular channel with small windows, but use large windows in other channels. Using a variety of techniques, the encoder performs flexible multi-channel transforms that effectively take advantage of inter-channel correlation. An audio decoder performs corresponding processing and decoding. In addition, the decoder performs a post-processing multi-channel transform for any of multiple different purposes.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: December 31, 2013
    Assignee: Microsoft Corporation
    Inventors: Naveen Thumpudi, Wei-Ge Chen
  • Patent number: 8620671
    Abstract: Filter banks may have different structures and different individual output signal domains. Often a translation between different filter bank domains is desirable. Usually, mapping matrices are used that, however, vary over frequency. This requires a significant amount of lookup tables. A method for transforming first data frames of a first filter bank domain to second data frames of a different second filter bank domain, comprises steps of transcoding sub-bands of the first filter bank domain into sub-bands of an intermediate domain that corresponds to said second filter bank domain but has warped phase, and transcoding the sub-bands of the intermediate domain to sub-bands of the second filter bank domain, wherein a phase correction is performed on the sub-bands of the intermediate domain.
    Type: Grant
    Filed: February 19, 2009
    Date of Patent: December 31, 2013
    Assignee: Thomson Licensing
    Inventors: Peter Jax, Sven Kordon
  • Patent number: 8620673
    Abstract: Embodiments of the present invention disclose an audio decoding method, including: determining that bitstreams to be decoded are monophony coding layer and first stereo enhancement layer bitstreams; decoding the monophony coding layer to obtain a monophony decoded frequency-domain signal; reconstructing left and right channel frequency-domain signals in a first sub-band region by utilizing the monophony decoded frequency-domain signal after an energy adjustment; and reconstructing left and right channel frequency-domain signals in a second sub-band region by utilizing the monophony decoded frequency-domain signal without the energy adjustment.
    Type: Grant
    Filed: November 14, 2011
    Date of Patent: December 31, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Qi Zhang, Libin Zhang
  • Patent number: 8619758
    Abstract: The present invention includes a network telephone having a microphone coupled to provide voice data to a network, a speaker coupled to facilitate listening to voice data from the network, a dialing device coupled to facilitate routing of voice data upon the network, a first port configured to facilitate communication with a first network device, a second port configured to facilitate communication with a second network device and a prioritization circuit coupled to apply prioritization to voice data provided by the microphone.
    Type: Grant
    Filed: March 25, 2011
    Date of Patent: December 31, 2013
    Assignee: Broadcom Corporation
    Inventors: Theodore F. Rabenko, Ian Crayford, David L. Hartman, Jr.
  • Patent number: 8620648
    Abstract: An audio encoding device which can improve encoding performance while performing division search on an algebraic codebook in an audio encoding. In a distortion minimizing unit (112) of a CELP encoding device: a maximum correlation value calculation unit (221) calculates a correlation value by using each pulse and a target signal in each candidate position for four pulses constituting the fixed codebook so as to acquire a maximum value of the correlation value for each pulse and calculates a maximum correlation value by using the maximum value of the correlation value; a sorting unit (222) divides the four pulses into two subsets each having two pulses; and a search unit (224) performs a division search on the fixed codebook and acquires a code indicating the positions and polarities of the four pulses where the encoding distortion is minimum.
    Type: Grant
    Filed: July 25, 2008
    Date of Patent: December 31, 2013
    Assignee: Panasonic Corporation
    Inventor: Toshiyuki Morii
  • Patent number: 8620646
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: December 31, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Patent number: 8620672
    Abstract: Phase-based processing of a multichannel signal, and applications including proximity detection, are disclosed.
    Type: Grant
    Filed: June 8, 2010
    Date of Patent: December 31, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Ernan Liu
  • Patent number: 8615040
    Abstract: A technique for suppressing a significant variation of a quantization step value and enabling a stable rate control to be performed. A function used for calculating a quantization step conversion factor from a bit rate ratio is a straight line with an inclination of -1, intersecting a function at a reference point. The function is a monotone decreasing exponential function. A reference bit rate ratio (R0) is expressed as R0=T/S by using a total bit rate (S) of a first stream and a total target bit rate (T) of a second stream. The function appropriately represents a relation between the bit rate ratio and the quantization step conversion factor in coding conversion but has a large rate of variation in an area where the bit rate ratio is about 0.5. The function has a small rate of variation and can suppress a significant variation of the quantization step conversion value.
    Type: Grant
    Filed: March 6, 2009
    Date of Patent: December 24, 2013
    Assignee: MegaChips Corporation
    Inventors: Nobumasa Narimatsu, Hiromu Hasegawa
  • Patent number: 8615390
    Abstract: The invention relates to transform coding/decoding of a digital audio signal represented by a succession of frames, using windows of different lengths. For the coding within the meaning of the invention, it is sought to detect (51) a particular event, such as an attack, in a current frame (Ti); and, at least if said particular event is detected at the start of the current frame (53), a short window (54) is directly applied in order to code (56) the current frame (Ti) without applying a transition window. Thus, the coding has a reduced delay in relation to the prior art. In addition, an ad hoc processing is applied during decoding in order to compensate for the direct passage from a long window to a short window during coding.
    Type: Grant
    Filed: December 18, 2007
    Date of Patent: December 24, 2013
    Assignee: France Telecom
    Inventors: Balazs Kovesi, David Virette, Pierrick Philippe
  • Patent number: 8612242
    Abstract: Methods and apparatus for coordinating audio data processing and network communication processing in a communication device are disclosed. An exemplary method begins with demodulating a series of received communication frames, using a network communication processing circuit, to produce received encoded audio frames. An event report for each of one or more of the received encoded audio frames is generated, the event report indicating a network communication circuit processing time associated with the corresponding received encoded audio frames. The received encoded audio frames are decoded, using an audio data processing circuit, and the decoded audio is output to an audio circuit. The timing of the outputting of the decoded audio is adjusted, based on the generated event reports.
    Type: Grant
    Filed: August 18, 2010
    Date of Patent: December 17, 2013
    Assignee: ST-Ericsson SA
    Inventors: Béla Rathonyi, Jan Fex
  • Patent number: 8612214
    Abstract: An apparatus for generating bandwidth extension output data for an audio signal has a noise floor measurer, a signal energy characterizer and a processor. The audio signal has components in a first frequency band and components in a second frequency band, the bandwidth extension output data are adapted to control a synthesis of the components in the second frequency band. The noise floor measurer measures noise floor data of the second frequency band for a time portion of the audio signal. The signal energy characterizer derives energy distribution data, the energy distribution data characterizing an energy distribution in a spectrum of the time portion of the audio signal. The processor combines the noise floor data and the energy distribution data to obtain the bandwidth extension output data.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: December 17, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Max Neuendorf, Bernhard Grill, Ulrich Kraemer, Markus Multrus, Harald Popp, Nikolaus Rettelbach, Frederik Nagel, Markus Lohwasser, Marc Gayer, Manuel Jander, Virgilio Bacigalupo
  • Publication number: 20130325474
    Abstract: Computationally implemented methods and systems include receiving indication of initiation of a speech-facilitated transaction between a party and a target device, and receiving adaptation data correlated to the party. The receiving is facilitated by a particular device associated with the party. The adaptation data is at least partly based on previous adaptation data derived at least in part from one or more previous speech interactions of the party. The methods and systems also include applying the received adaptation data correlated to the party to the target device, and processing speech from the party using the target device to which the received adaptation data has been applied. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, JR.
  • Publication number: 20130325459
    Abstract: Computationally implemented methods and systems include receiving indication of initiation of a speech-facilitated transaction between a party and a target device, and receiving adaptation data correlated to the party. The receiving is facilitated by a particular device associated with the party. The adaptation data is at least partly based on previous adaptation data derived at least in part from one or more previous speech interactions of the party. The methods and systems also include applying the received adaptation data correlated to the party to the target device, and processing speech from the party using the target device to which the received adaptation data has been applied. In addition to the foregoing, other aspects are described in the claims, drawings, and text.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Inventors: Royce A. Levien, Richard T. Lord, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, JR.
  • Patent number: 8595001
    Abstract: A method applies a parametric approach to bandwidth extension but does not require training. The method computes narrowband linear predictive coefficients from a received narrowband speech signal, computes narrowband partial correlation coefficients using recursion, computes Mnb area coefficients from the partial correlation coefficient, and extracts Mwb area coefficients using interpolation. Wideband parcors are computed from the Mwb area coefficients and wideband LPCs are computed from the wideband parcors. The method further comprises synthesizing a wideband signal using the wideband LPCs and a wideband excitation signal, highpass filtering the synthesized wideband signal to produce a highband signal, and combining the highband signal with the original narrowband signal to generate a wideband signal.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: November 26, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: David Malah, Richard Vandervoort Cox
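    Illustrative sketch (not taken from the patent): the chain in the abstract above, LPCs to parcors to area coefficients, interpolation, and back, might look roughly like the Python below. The sign convention A(z) = 1 + sum a_i z^-i, the area-ratio formula, the choice of plain linear interpolation, and every function name are assumptions made for illustration only. Excitation generation, synthesis, and highpass filtering are not shown.

```python
def lpc_to_parcor(a):
    """Step-down recursion: [a_1..a_p] of A(z) = 1 + sum a_i z^-i -> [k_1..k_p]."""
    cur = list(a)
    k = [0.0] * len(cur)
    for m in range(len(cur), 0, -1):
        km = cur[m - 1]
        if abs(km) >= 1.0:
            raise ValueError("unstable filter: |k| >= 1")
        k[m - 1] = km
        # Remove the order-m contribution to recover the order-(m-1) coefficients.
        cur = [(cur[i] - km * cur[m - 2 - i]) / (1.0 - km * km) for i in range(m - 1)]
    return k


def parcor_to_lpc(k):
    """Step-up recursion, the inverse of lpc_to_parcor."""
    a = []
    for m, km in enumerate(k, start=1):
        a = [a[i] + km * a[m - 2 - i] for i in range(m - 1)] + [km]
    return a


def parcor_to_area(k):
    """Tube-model areas, starting from a reference area of 1.0 (assumed convention)."""
    areas = [1.0]
    for km in k:
        areas.append(areas[-1] * (1.0 - km) / (1.0 + km))
    return areas


def area_to_parcor(areas):
    """Inverse of parcor_to_area for adjacent area pairs."""
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]


def interpolate(values, n_out):
    """Linearly resample a list (length >= 2) onto n_out >= 2 evenly spaced points."""
    n_in = len(values)
    out = []
    for j in range(n_out):
        x = j * (n_in - 1) / (n_out - 1)
        i = min(int(x), n_in - 2)
        t = x - i
        out.append((1 - t) * values[i] + t * values[i + 1])
    return out


def narrowband_to_wideband_lpc(a_nb, m_wb):
    """Hypothetical end-to-end helper: narrowband LPCs -> m_wb wideband LPCs."""
    k_nb = lpc_to_parcor(a_nb)
    areas_wb = interpolate(parcor_to_area(k_nb), m_wb + 1)
    return parcor_to_lpc(area_to_parcor(areas_wb))
```

    Because interpolated areas stay positive, every wideband parcor obtained this way has magnitude below 1, so the resulting all-pole filter remains stable.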
  • Patent number: 8588463
    Abstract: To modify a facial feature region in a video bitstream, the video bitstream is received and a feature region is extracted from the video bitstream. An audio characteristic, such as frequency, rhythm, or tempo is retrieved from an audio bitstream, and the feature region is modified according to the audio characteristic to generate a modified image. The modified image is outputted.
    Type: Grant
    Filed: April 11, 2013
    Date of Patent: November 19, 2013
    Assignee: CyberLink Corp.
    Inventors: Hao-Ping Hung, Wei-Hsin Tseng
  • Patent number: 8583443
    Abstract: Disclosed is a recording and reproducing apparatus comprising: an apparatus main body; and a remote controller to perform remote control of the apparatus main body, wherein the remote controller comprises: a key operating section to receive a key operation by a user; a sound information inputting section to input sound information; and a transmitting section to transmit sound data based on the sound information to the apparatus main body, and the apparatus main body comprises: a recording section to record input content data on a recording medium; a reproducing section to reproduce the content data; a receiving section to receive the sound data; a sound information recording section to record the sound data so as to be associated with a piece of the content data; and a sound information outputting section to reproduce the sound data to output the reproduced sound data.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: November 12, 2013
    Assignee: Funai Electric Co., Ltd.
    Inventor: Masayuki Misawa
  • Patent number: 8582804
    Abstract: To modify a facial feature region in a video bitstream, the video bitstream is received and a feature region is extracted from the video bitstream. An audio characteristic, such as frequency, rhythm, or tempo is retrieved from an audio bitstream, and the feature region is modified according to the audio characteristic to generate a modified image. The modified image is outputted.
    Type: Grant
    Filed: April 11, 2013
    Date of Patent: November 12, 2013
    Assignee: CyberLink Corp.
    Inventors: Hao-Ping Hung, Wei-Hsin Tseng
  • Patent number: 8583439
    Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.
    Type: Grant
    Filed: January 12, 2004
    Date of Patent: November 12, 2013
    Assignee: Verizon Services Corp.
    Inventor: James Mark Kondziela
  • Patent number: 8583426
    Abstract: A method for enhancing speech components of an audio signal composed of speech and noise components processes subbands of the audio signal, the processing including controlling the gain of the audio signal in ones of the subbands, wherein the gain in a subband is controlled at least by processes that convey either additive/subtractive differences in gain or multiplicative ratios of gain so as to reduce gain in a subband as the level of noise components increases with respect to the level of speech components in the subband and increase gain in a subband when speech components are present in subbands of the audio signal, the processes each responding to subbands of the audio signal and controlling gain independently of each other to provide a processed subband audio signal.
    Type: Grant
    Filed: September 10, 2008
    Date of Patent: November 12, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Rongshan Yu, Charles Phillip Brown
  • Patent number: 8577045
    Abstract: An encoding apparatus comprises a frame processor (105) which receives a multi channel audio signal comprising at least a first audio signal from a first microphone (101) and a second audio signal from a second microphone (103). An ITD processor (107) then determines an inter time difference between the first audio signal and the second audio signal and a set of delays (109, 111) generates a compensated multi channel audio signal from the multi channel audio signal by delaying at least one of the first and second audio signals in response to the inter time difference signal. A combiner (113) then generates a mono signal by combining channels of the compensated multi channel audio signal and a mono signal encoder (115) encodes the mono signal. The inter time difference may specifically be determined by an algorithm based on determining cross correlations between the first and second audio signals.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: November 5, 2013
    Assignee: Motorola Mobility LLC
    Inventor: Jonathan A. Gibbs
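    Illustrative sketch (not taken from the patent): one simple way to read the abstract above is to pick the lag that maximizes the cross-correlation of the two channels, delay the earlier channel by that lag, and average the aligned channels into a mono signal. The brute-force lag search, the zero-padded delay, the equal-weight downmix, and all function names are assumptions for illustration.

```python
def estimate_itd(first, second, max_lag):
    """Return the lag in samples that maximizes sum(first[n] * second[n + lag]).

    A positive result means `second` lags behind `first`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            first[i] * second[i + lag]
            for i in range(len(first))
            if 0 <= i + lag < len(second)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay(signal, d):
    """Delay by d >= 0 samples, zero-padding the start and keeping the length."""
    return ([0.0] * d + list(signal))[:len(signal)]


def compensate_and_downmix(first, second, max_lag):
    """Align the two channels using the estimated lag, then average them."""
    lag = estimate_itd(first, second, max_lag)
    if lag >= 0:
        first = delay(first, lag)      # first is early: hold it back
    else:
        second = delay(second, -lag)   # second is early: hold it back
    mono = [0.5 * (x + y) for x, y in zip(first, second)]
    return lag, mono
```

    For example, if `second` is a copy of `first` delayed by three samples, `compensate_and_downmix(first, second, 4)` reports a lag of 3 and returns a mono signal equal to `second`.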
  • Patent number: 8571849
    Abstract: Disclosed herein are systems, methods, and computer readable-media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step may be added of synthesizing speech in the target language based on the prosody enriched target language output text. An automatic prosody labeler can generate pitch accent labels. An automatic prosody labeler can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: October 29, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Srinivas Bangalore, Vivek Kumar Rangarajan Sridhar
  • Patent number: 8571877
    Abstract: An apparatus for providing an upmix signal representation on the basis of a downmix signal representation and an object-related parametric information, which are included in a bitstream representation of an audio content, in independence on a user-specified rendering matrix, the apparatus has a distortion limiter configured to obtain a modified rendering matrix using a linear combination of a user-specified rendering matrix in a target rendering matrix in dependence on a linear combination parameter. The apparatus also has a signal processor configured to obtain the upmix signal representation on the basis of the downmix signal representation and the object-related parametric information using the modified rendering matrix. The apparatus is also configured to evaluate a bitstream element representing the linear combination parameter in order to obtain the linear combination parameter.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: October 29, 2013
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V., Dolby International AB
    Inventors: Jonas Engdegard, Heiko Purnhagen, Juergen Herre, Cornelia Falch, Oliver Hellmuth, Leon Terentiv
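    Illustrative sketch (not taken from the patent): the abstract above says only that the modified rendering matrix is a linear combination of the user-specified and target matrices, governed by a linear combination parameter. The convex form (1 - g) * user + g * target used below, the restriction of g to [0, 1], and the function name are assumptions for illustration.

```python
def modified_rendering_matrix(user, target, g):
    """Blend two equally shaped matrices (lists of rows) with parameter g in [0, 1].

    g = 0 returns the user-specified matrix, g = 1 returns the target matrix.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    if len(user) != len(target) or any(len(u) != len(t) for u, t in zip(user, target)):
        raise ValueError("matrices must have the same shape")
    return [
        [(1.0 - g) * u + g * t for u, t in zip(user_row, target_row)]
        for user_row, target_row in zip(user, target)
    ]
```

    Under this assumed form, a larger g pulls the rendering closer to the target matrix, which is one way a distortion limiter could rein in an extreme user-specified rendering.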