Pitch Patents (Class 704/207)
  • Patent number: 9412388
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and extracting, from the first part, temporal envelope information and spectral components of the baseband portion. The method further includes decoding the first part to obtain a decoded baseband audio signal. The decoding includes filtering in a frequency domain at least some of the spectral components of the baseband portion with the reconstruction filter using the temporal envelope information to shape a temporal envelope of the baseband portion. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: August 9, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9412383
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and obtaining a decoded baseband audio signal by decoding the first part. The method also includes extracting, from the second part, a noise parameter and an estimated spectral envelope of the highband portion and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying in a circular manner a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal by adjusting, based on the estimated spectral envelope of the highband portion, a spectral envelope of the high-frequency reconstructed signal.
    Type: Grant
    Filed: April 14, 2016
    Date of Patent: August 9, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9406311
    Abstract: An encoding method executed by a computer includes: converting, by the computer, information about a transient included in a low-frequency component of an audio signal into information about a transient included in a high-frequency component of the audio signal; detecting, by the computer, the transient of the high-frequency component of the audio signal based on the high-frequency component of the audio signal and on the information about the transient of the high-frequency component obtained by the converting; and encoding, by the computer, the high-frequency component of the audio signal based on the transient detected by the detecting.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: August 2, 2016
    Assignee: FUJITSU LIMITED
    Inventors: Shusaku Ito, Yoshiteru Tsuchinaga, Katsumori Hagiwara, Sosaku Moriki
  • Patent number: 9401160
    Abstract: Voice activity detectors and related methods are provided. Methods include receiving a frame of the input signal; determining a first SNR of the received frame; comparing the determined first SNR with an adaptive threshold; and detecting whether the received frame comprises voice based on the comparison. The adaptive threshold is at least based on total noise energy of a noise level, an estimate of a second SNR and on energy variation between different frames.
    Type: Grant
    Filed: October 18, 2010
    Date of Patent: July 26, 2016
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Martin Sehlstedt
  • Patent number: 9390085
    Abstract: Method(s) and system(s) for speech processing of second language speech are described. According to the present subject matter, the system(s) implement the described method(s) for speech processing of Oriya English. The method for speech processing includes receiving a plurality of speech samples of Oriya English to form a speech corpus, where the plurality of speech samples comprise sounds of both vowels and consonants and a plurality of speech parameters are associated with each of the plurality of speech samples. The method also includes determining values of the plurality of speech parameters for each of the plurality of speech samples and identifying a difference between the values of each of the plurality of speech parameters and a corresponding value of accent neutral English. Further, the method includes articulating governing language rules based on the identifying to assess phonetic variation and mother tongue influence in sounds of vowels and consonants of Oriya English.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: July 12, 2016
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventor: Suman Bhattacharya
  • Patent number: 9384750
    Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.
    Type: Grant
    Filed: October 3, 2014
    Date of Patent: July 5, 2016
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand
  • Patent number: 9382901
    Abstract: The present invention can be included in the technical field of power control systems of electrical generation units comprising a supervisory regulation link applicable to a generation unit which calculates operating parameters or orders based on temporary averages of the power measurement.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: July 5, 2016
    Assignee: Acciona Windpower S.A.
    Inventors: Jose Miguel Garcia Sayes, Teresa Arlaban Gabeiras, Alfonso Ruiz Aldama, Alberto Garcia Barace, Ana Fernandez Garcia de Iturrospe, Diego Otamendi Claramunt, Alejandro Gonzalez Murua, Miguel Nunez Polo
  • Patent number: 9343071
    Abstract: A method for generating a reconstructed audio signal having a baseband portion and a highband portion is disclosed. The method includes deformatting an encoded audio signal into a first part and a second part and decoding the first part to obtain a decoded baseband audio signal. The method also includes extracting an estimated spectral envelope of the highband portion and a noise parameter from the second part and filtering the decoded baseband audio signal to obtain a plurality of subband signals. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and adjusting a spectral envelope of the high-frequency reconstructed signal based on the estimated spectral envelope of the highband portion to obtain an envelope adjusted high-frequency signal.
    Type: Grant
    Filed: June 10, 2015
    Date of Patent: May 17, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9338492
    Abstract: The present invention refers to a method for reproducing an audio and/or video sequence, as well as a reproducing device and reproducing apparatus that make use of the method; the method reproduces an audio and/or video sequence by means of a decoder (Dav) apt to decode said sequence and a buffer (B) connected upstream to said decoder (Dav) and able to store at least a part of said sequence; the sequence is transmitted by means of a number of data blocks; each of said blocks comprises an audio and/or video information data section and a corresponding error correction data section; such sections are transmitted in different time intervals; the method comprises a transitory operation mode and a steady state operation mode; in the steady state operation mode the correction data of the block (FEC) are applied to the corresponding information data before said information data are supplied to said decoder (Dav), while in the transitory operation mode the information data of a block are directly supplied to said dec
    Type: Grant
    Filed: September 18, 2007
    Date of Patent: May 10, 2016
    Assignees: RAI Radiotelevisione Italiana S.P.A., S.I.SV.EL. S.P.A
    Inventors: Alberto Morello, Massimo Mancin
  • Patent number: 9324328
    Abstract: A method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes decoding an encoded audio signal to obtain a decoded baseband audio signal, filtering the decoded baseband audio signal to obtain subband signals, and generating a high-frequency reconstructed signal by copying a number of consecutive subband signals. The method also includes adjusting a spectral envelope of the high-frequency reconstructed signal based on an estimated spectral envelope of the highband portion extracted from the encoded audio signal to obtain an envelope adjusted high-frequency signal, generating a noise component based on a noise parameter extracted from the encoded audio signal, and adding the noise component to the envelope adjusted high-frequency signal to obtain a noise and envelope adjusted high-frequency signal.
    Type: Grant
    Filed: May 11, 2015
    Date of Patent: April 26, 2016
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael M. Truman, Mark S. Vinton
  • Patent number: 9318127
    Abstract: An apparatus for generating a bandwidth extended audio signal from an input signal, includes a patch generator for generating one or more patch signals from the input signal, wherein the patch generator is configured for performing a time stretching of subband signals from an analysis filterbank, and wherein the patch generator further includes a phase adjuster for adjusting phases of the subband signals using a filterbank-channel dependent phase correction.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: April 19, 2016
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB
    Inventors: Sascha Disch, Frederik Nagel, Stephan Wilde, Lars Villemoes, Per Ekstrand
  • Patent number: 9313634
    Abstract: An information processing device includes a collecting unit which collects user's biological information. The information processing device includes a determining unit which determines an emotion of the user by using the biological information collected by the collecting unit. The information processing device includes an output unit which outputs in association with information representing the user and the emotion of the user determined by the determining unit to terminal devices which are used by other users.
    Type: Grant
    Filed: September 11, 2013
    Date of Patent: April 12, 2016
    Assignee: YAHOO JAPAN CORPORATION
    Inventors: Mariko Suzuki, Hiroko Ota, Chiemi Taki, Yuki Uchida, Hiroshi Machida
  • Patent number: 9305557
    Abstract: Apparatus for processing an audio signal to generate a bandwidth extended signal having a high frequency part and a low frequency part using parametric data for the high frequency part, the parametric data relating to frequency bands of the high frequency part includes a patch border calculator for calculating a patch border such that the patch border coincides with a frequency band border of the frequency bands. The apparatus further includes a patcher for generating a patched signal using the audio signal and the patch border.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: April 5, 2016
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB
    Inventors: Frederik Nagel, Sascha Disch, Stephan Wilde, Lars Villemoes, Per Ekstrand
  • Patent number: 9293149
    Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder, the window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.
    Type: Grant
    Filed: November 11, 2014
    Date of Patent: March 22, 2016
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Bayer, Sascha Disch, Ralf Geiger, Guillaume Fuchs, Max Neuendorf, Gerald Schuller, Bernd Edler
  • Patent number: 9283376
    Abstract: Aspects of the present invention are generally directed to a modulation enhancement strategy that helps improve ITD perception by explicitly modulating the electrical stimulation signal. In an embodiment, the timing of the applied modulations is based on amplitude inflections (i.e., peaks or troughs) in the received sound signal. In an embodiment, the identified inflections (i.e., peaks or troughs) represent the most energetic portions of the signal over a particular time period (e.g., the time period prior to the inflection having a length equal to the expected fundamental period of the signal). Further, in an embodiment, the cochlear implant applies a delay to the stimulation signal to help maintain interaural timing cues. The application of this delay helps account for the traveling wave delay in the acoustic path of the opposite ear in embodiments in which the opposite ear is fitted with a hearing aid or is not fitted with a hearing device.
    Type: Grant
    Filed: May 27, 2011
    Date of Patent: March 15, 2016
    Assignee: Cochlear Limited
    Inventors: Jan Wouters, Tom Francart
  • Patent number: 9271096
    Abstract: A delay unit (16) for a conference audio system (1) adapted to delay audio input signals for an adjustable time delay, thereby generating audio output signals, is proposed.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: February 23, 2016
    Assignee: Robert Bosch GmbH
    Inventors: Marc Smaak, C. P. Janse, Chen Tchang, L. C. A. van Stuivenberg
  • Patent number: 9263053
    Abstract: A method (1100) and apparatus (100) generate a candidate code-vector to code an information signal. The method can include producing (1110) a weighted target vector from an input signal. The method can include processing (1120) the weighted target vector through an inverse weighting function to create a residual domain target vector. The method can include performing (1130) a first search process on the residual domain target vector to obtain an initial fixed codebook code-vector. The method can include performing (1140) a second search process over a subset of possible codebook code-vectors for a low weighted-domain error to produce a final fixed codebook code-vector. The subset of possible codebook code-vectors can be based on the initial fixed codebook code-vector. The method can include generating (1150) a codeword representative of the final fixed codebook code-vector. The codeword can be for use by a decoder to generate an approximation of the input signal.
    Type: Grant
    Filed: November 2, 2012
    Date of Patent: February 16, 2016
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: James P Ashley, Udar Mittal
  • Patent number: 9257954
    Abstract: Two audio samples and/or sets of audio samples are identified. The pitch distributions of the audio samples and/or sets of audio samples are identified, the pitch distribution of an audio sample or set of audio samples referring to how much of each of multiple pitches of notes is present in the audio sample or set of audio samples. Based on the pitch distributions of the audio samples and/or sets of audio samples, at least one pitch of one of the audio sample and/or set of audio samples can be automatically adjusted (but need not be, depending on the pitch distributions) to increase harmonic coherence of the audio samples and/or sets of audio samples.
    Type: Grant
    Filed: September 19, 2013
    Date of Patent: February 9, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Steven J. Ball, Jorge Gabuardi Gonzalez, Tyler Brewer, Mitchell K. Rundle
  • Patent number: 9253568
    Abstract: A technique for suppressing non-stationary noise, such as wind noise, in an audio signal is described. In accordance with the technique, a series of frames of the audio signal is analyzed to detect whether the audio signal comprises non-stationary noise. If it is detected that the audio signal comprises non-stationary noise, a number of steps are performed. In accordance with these steps, a determination is made as to whether a frame of the audio signal comprises non-stationary noise or speech and non-stationary noise. If it is determined that the frame comprises non-stationary noise, a first filter is applied to the frame and if it is determined that the frame comprises speech and non-stationary noise, a second filter is applied to the frame.
    Type: Grant
    Filed: May 14, 2010
    Date of Patent: February 2, 2016
    Assignee: Broadcom Corporation
    Inventors: Elias Nemer, Wilfrid LeBlanc, Syavosh Zad-Issa, Jes Thyssen
  • Patent number: 9240193
    Abstract: Methods, systems, and devices for processing an audio signal are provided. An example method includes mapping a fundamental frequency of an audio signal to a modulation frequency. An output of the mapping is less than the fundamental frequency when the fundamental frequency is greater than an intersection frequency. The intersection frequency is a frequency at which the output of the mapping is the fundamental frequency.
    Type: Grant
    Filed: January 21, 2013
    Date of Patent: January 19, 2016
    Assignee: Cochlear Limited
    Inventor: Christopher James
  • Patent number: 9240196
    Abstract: An apparatus for processing an audio signal has an overlap-add stage for overlapping and adding blocks of a corresponding one of a plurality of subband signals using an overlap-add-advance value being different from a block extraction advance value. The apparatus further has a transient detector for detecting a transient in the audio signal or a subband signal of the plurality of subband signals. The overlap-add stage is configured for reducing an influence of a detected transient or for not using the detected transients when adding. The apparatus further has a transient adder for adding a detected transient to a subband signal generated by the overlap/add stage. A related method for processing an audio signal has, inter alia, either reducing an influence or discarding a detected transient when overlapping and adding.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: January 19, 2016
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Frederik Nagel, Stephan Wilde
  • Patent number: 9236064
    Abstract: The subject disclosure is directed towards dynamically computing anti-aliasing filter coefficients for sample rate conversion in digital audio. In one aspect, for each input-to-output sampling rate ratio (pitch) obtained, anti-aliasing filter coefficients are interpolated based upon the pitch (e.g., using the fractional part of the ratio) from two filters (coefficient sets) selected based upon the pitch (e.g., using the integer part of the ratio). The interpolation provides for fine-grained cutoff frequencies, and by re-computation for each pitch, smooth anti-aliasing with dynamically changing ratios.
    Type: Grant
    Filed: February 12, 2013
    Date of Patent: January 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Thomas Craig Savell
  • Patent number: 9208775
    Abstract: A method for determining pitch pulse period signal boundaries by an electronic device is described. The method includes obtaining a signal. The method also includes determining a first averaged curve based on the signal. The method further includes determining at least one first averaged curve peak position based on the first averaged curve and a threshold. The method additionally includes determining pitch pulse period signal boundaries based on the at least one first averaged curve peak position. The method also includes synthesizing a speech signal.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: December 8, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Subasingha Shaminda Subasingha, Venkatesh Krishnan, Vivek Rajendran, Stephane Pierre Villette
  • Patent number: 9196256
    Abstract: A data processing method for performing data processing on wireless received data and an associated data processing apparatus are provided, where the data processing method is applied to an electronic device. The data processing method includes the steps of: wirelessly receiving a plurality of packets corresponding to a same set of speech data from another electronic device; and selectively performing error correction operation on at least one of the plurality of packets to obtain the set of speech data, wherein whether to perform the error correction operation is determined according to at least one characteristic of the plurality of packets. More particularly, the error correction operation is selectively performed for at least one scenario of a timing critical scenario and a re-transmission limited scenario.
    Type: Grant
    Filed: August 8, 2013
    Date of Patent: November 24, 2015
    Assignee: MEDIATEK INC.
    Inventors: Wei-Kun Su, Hsuan-Yi Hou, Wei-Chu Lai, Chia-Wei Tao, Cheng-Lun Hu, Chieh-Cheng Cheng
  • Patent number: 9196240
    Abstract: A group of users may be presented with text and a synthesized speech recording of the text. The users can listen to the synthesized speech recording and submit feedback regarding errors or other issues with the synthesized speech. A system of one or more computing devices can analyze the feedback, modify the voice or language rules, and recursively test the modifications. The modifications may be determined through the use of machine learning algorithms or other automated processes.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: November 24, 2015
    Assignee: IVONA Software Sp. z.o.o.
    Inventors: Michal T. Kaszczuk, Lukasz M. Osowski
  • Patent number: 9172506
    Abstract: A network device includes a network interface device that receives at least a subset of a plurality of recoverable packets. The plurality of recoverable packets corresponds to a plurality of original packets generated by a source device. A processor having a packet recovery module recreates each of the original packets based on the subset of the recoverable packets received. The subset of recoverable packets excludes recoverable packets lost during transmission to the network interface device. A system includes a first network device that generates original packets and converts the original packets to recoverable packets. A second network device receives at least a subset of the recoverable packets, excluding recoverable packets lost during transmission from the first network device. The second network device includes a packet recovery module that recreates the original packets based on the subset of the recoverable packets received at the second network device.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: October 27, 2015
    Assignees: Cellco Partnership, Verizon Patent and Licensing Inc.
    Inventors: Donna L. Polehn, Deepak Kakadia, Lalit R. Kotecha
  • Patent number: 9159334
    Abstract: A voice processing device includes a voice pitch converting unit that performs a voice pitch converting process with respect to an input voice signal and converts voice pitch of the input voice signal, an error detecting unit that detects an error between the number of samples of an output voice signal, which is expected, and the number of samples of the output voice signal, which is actually output, and a time length control unit that controls adjustment of the time length in such a manner that the time length of the output voice signal is corrected by the amount of the error.
    Type: Grant
    Filed: March 9, 2012
    Date of Patent: October 13, 2015
    Assignee: Sony Corporation
    Inventors: Akihiro Mukai, Akira Inoue
  • Patent number: 9159323
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Grant
    Filed: July 29, 2013
    Date of Patent: October 13, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Slava Shechtman, Raphael Steinberg
  • Patent number: 9153245
    Abstract: A pitch detection method and apparatus are disclosed. The method includes: performing pitch detection on an input signal in a signal domain, and obtaining a candidate pitch; performing linear prediction (LP) on the input signal, and obtaining an LP residual signal; setting a candidate pitch range that includes the candidate pitch; searching the candidate pitch range for the LP residual signal, and obtaining a selected pitch.
    Type: Grant
    Filed: April 9, 2010
    Date of Patent: October 6, 2015
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Fengyan Qi, Dejun Zhang, Lei Miao, Jianfeng Xu, Qing Zhang, Yang Gao
  • Patent number: 9147392
    Abstract: A speech synthesis device includes: a mouth-opening-degree generation unit which generates, for each of phonemes generated from input text, a mouth-opening-degree corresponding to oral-cavity volume, using information generated from the text and indicating the type and position of the phoneme within the text, such that the generated mouth-opening-degree is larger for a phoneme at the beginning of a sentence in the text than for a phoneme at the end of the sentence; a segment selection unit which selects, for each of the generated phonemes, segment information corresponding to the phoneme from among pieces of segment information stored in a segment storage unit and including phoneme type, mouth-opening-degree, and speech segment data, based on the type of the phoneme and the generated mouth-opening-degree; and a synthesis unit which generates synthetic speech of the text, using the selected pieces of segment information and pieces of prosody information generated from the text.
    Type: Grant
    Filed: May 28, 2013
    Date of Patent: September 29, 2015
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Patent number: 9137051
    Abstract: A method and apparatus for reducing rendering latency in a terminal device which receives audio data from a communication network such as, for example, Voice over Internet Protocol (VoIP) communications networks. Received packets are advantageously decoded “immediately” upon receipt, and the decoded data is placed directly in the rendering buffer at a location corresponding to the time appropriate for rendering, without using any intermediate buffer. Then, in accordance with the principles of the present invention and more particularly in accordance with certain illustrative embodiments thereof, packet loss concealment (PLC) routines are advantageously applied preemptively, without first determining whether or not any subsequent packets have or have not been received by any particular time.
    Type: Grant
    Filed: December 17, 2010
    Date of Patent: September 15, 2015
    Assignee: Alcatel Lucent
    Inventor: James W. McGowan
  • Patent number: 9137059
    Abstract: In a method for removing interferential signals of a mobile device, a differential signal waveform corresponding to a signal frame of an original communication signal waveform of the mobile device is generated, and a DPPPV of the differential signal waveform is acquired. The differential signal is determined to be an interferential signal, in response to that the DPPPV is not less than a preset differential threshold value, and a DNPPV at a target time point is not less than a preset ratio of the DPPPV. A signal interference section is determined and compensation values corresponding to the signal interference section are calculated, to generate a differential compensation waveform of the signal frame. An integrated differential compensation waveform of all signal frames and the original communication signal waveform are incorporated to obtain a processed signal waveform without interferential signals.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: September 15, 2015
    Assignee: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventor: Chun-Te Wu
  • Patent number: 9135923
    Abstract: A pitch-synchronous method and system for speech coding using timbre vectors is disclosed. On the encoder side, speech signal is segmented into pitch-synchronous frames without overlap, then converted into a pitch-synchronous amplitude spectrum using FFT. Using Laguerre functions, the amplitude spectrum is transformed into a timbre vector. Using vector quantization, each timbre vector is converted to a timbre index based on a timbre codebook. The intensity and pitch are also converted into indices respectively using scalar quantization. Those indices are transmitted as encoded speech. On the decoder side, by looking up the same codebooks, pitch, intensity and the timbre vector are recovered. Using Laguerre functions, the amplitude spectrum is recovered. Using Kramers-Kronig relations, the phase spectrum is recovered. Using FFT, the elementary waves are regenerated, and superposed to become the speech signal.
    Type: Grant
    Filed: January 26, 2015
    Date of Patent: September 15, 2015
    Inventor: Chengjun Julian Chen
  • Patent number: 9123347
    Abstract: Provided are an apparatus and method for eliminating noise. The method includes: detecting a speech section from a noise speech signal including a noise signal; separating the speech section into a consonant section and a vowel section on the basis of a VOP at the speech section; calculating a transfer function of a filter for eliminating the noise signal to allow the degree of noise elimination to be different in the consonant section and the vowel section; and eliminating the noise signal from the noise speech signal on the basis of the transfer function.
    Type: Grant
    Filed: August 29, 2012
    Date of Patent: September 1, 2015
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Ji Hun Park, Woo Kyeong Seong
  • Patent number: 9076444
    Abstract: A method and apparatus for sinusoidal audio coding and decoding are provided. The method for sinusoidal audio coding includes performing sinusoidal analysis on an input signal and extracting sinusoids of a current frame; tracking and coding a continuation mode sinusoid of the current frame by using a sinusoid of a previous frame which continues to the continuation mode sinusoid; searching for a sinusoid having a closest frequency to a frequency of a birth mode sinusoid of the current frame; calculating and coding a difference between an amplitude of the sinusoid having the closest frequency and an amplitude of the birth mode sinusoid; and coding the frequency of the birth mode sinusoid.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: July 7, 2015
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Nam-suk Lee
  • Patent number: 9070360
    Abstract: Described is a calibration model for use in a speech recognition system. The calibration model adjusts the confidence scores output by a speech recognition engine to thereby provide an improved calibrated confidence score for use by an application. The calibration model is one that has been trained for a specific usage scenario, e.g., for that application, based upon a calibration training set obtained from a previous similar/corresponding usage scenario or scenarios. Different calibration models may be used with different usage scenarios, e.g., during different conditions. The calibration model may comprise a maximum entropy classifier with distribution constraints, trained with continuous raw confidence scores and multi-valued word tokens, and/or other distributions and extracted features.
    Type: Grant
    Filed: December 10, 2009
    Date of Patent: June 30, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dong Yu, Li Deng, Jinyu Li
  • Patent number: 9070356
    Abstract: A method (300) and apparatus (100) generate a candidate code-vector to code an information signal. The method can include producing (310) a target vector from a received input signal. The method can include constructing (320) a plurality of inverse weighting functions based on the target vector. The method can include evaluating (330) an error value associated with each of the plurality of inverse weighting functions to produce a fixed codebook code-vector. The method can include generating (340) a codeword representative of the fixed codebook code-vector, where the codeword can be used by a decoder to generate an approximation of the input signal.
    Type: Grant
    Filed: April 4, 2012
    Date of Patent: June 30, 2015
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: James P. Ashley, Udar Mittal
  • Patent number: 9071340
    Abstract: Provided is a method of generating an orthogonal set in a quasi-synchronous spread spectrum system, the method including selecting a two-level autocorrelation sequence at a sequence selector, selecting a first indices set and a second indices set from the two-level autocorrelation sequence, and generating two quasi-orthogonal sets using the first indices set and the second indices set, at an orthogonal code generator.
    Type: Grant
    Filed: September 2, 2014
    Date of Patent: June 30, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sujit Jos, Jinesh P Nair
  • Patent number: 9064489
    Abstract: Recorded or synthesized speech segments of text-to-speech (TTS) systems may be compressed through the use of both time domain compression and perceptual compression techniques. The twice-compressed recording may be separated into speech segments corresponding to words or subword units for use in a TTS system. The compression rate of time domain compression, and the ratio of time domain compression to perceptual compression, may be modified for any speech segment. The compression amount or ratio may be determined based on linguistic or acoustic features of the word or subword unit that the speech segment represents. Differing compression amounts and ratios may be applied to portions of a single speech segment.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: June 23, 2015
    Assignee: IVONA Software Sp. z o.o.
    Inventors: Michal T. Kaszczuk, Lukasz M. Osowski
  • Patent number: 9026435
    Abstract: The invention provides a method for estimating a fundamental frequency of a speech signal comprising the steps of receiving a signal spectrum of the speech signal, filtering the signal spectrum to obtain a refined signal spectrum, determining a cross-power spectral density using the refined signal spectrum and the signal spectrum, transforming the cross-power spectral density into the time domain to obtain a cross-correlation function, and estimating the fundamental frequency of the speech signal based on the cross-correlation function.
    Type: Grant
    Filed: May 3, 2010
    Date of Patent: May 5, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Mohamed Krini, Gerhard Schmidt
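The steps named in the abstract of 9026435 (refine the spectrum, form a cross-power spectral density, transform it to the time domain, pick the fundamental from the cross-correlation) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the abstract does not say how the spectrum is filtered, so the low-pass mask used as the "refinement", the function names `dft` and `estimate_f0`, and the parameters `fmin`, `fmax`, and `cutoff` are all assumptions.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]

def estimate_f0(signal, fs, fmin=70.0, fmax=400.0, cutoff=1000.0):
    """Estimate a fundamental frequency in Hz from one frame of samples."""
    n = len(signal)
    spectrum = dft(signal)
    # Assumed "refinement": keep only bins at or below `cutoff` Hz.
    refined = [s if min(k, n - k) * fs / n <= cutoff else 0.0
               for k, s in enumerate(spectrum)]
    # Cross-power spectral density of the refined and original spectra.
    cross_psd = [r * s.conjugate() for r, s in zip(refined, spectrum)]
    best_lag, best_val = None, -math.inf
    # Inverse DFT evaluated only at lags inside the allowed pitch range.
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        val = sum(p * cmath.exp(2j * math.pi * k * lag / n)
                  for k, p in enumerate(cross_psd)).real / n
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

# Usage: a 100 Hz sine sampled at 8 kHz, 400 samples (five full periods).
frame = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(400)]
f0 = estimate_f0(frame, 8000)
```

Because the correlation is computed through the DFT it is circular, so the sketch behaves best when the frame spans a whole number of pitch periods; a practical estimator would window or zero-pad the frame.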
  • Patent number: 9026434
    Abstract: An audio coding terminal and method is provided. The terminal includes a coding mode setting unit to set an operation mode, from plural operation modes, for input audio coding by a codec configured to code the input audio based on the set operation mode such that when the set operation mode is a high frame erasure rate (FER) mode the codec codes a current frame of the input audio according to a select frame erasure concealment (FEC) mode of one or more FEC modes. Upon the setting of the operation mode to be the High FER mode the one FEC mode is selected, from the one or more FEC modes predetermined for the High FER mode, to control the codec by incorporating of redundancy within a coding of the input audio or as separate redundancy information separate from the coded input audio according to the selected one FEC mode.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: May 5, 2015
    Assignee: Samsung Electronic Co., Ltd.
    Inventors: Steven Craig Greer, Hosang Sung
  • Patent number: 9015039
    Abstract: System and method embodiments for dual modes pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 21, 2015
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Patent number: 9002703
    Abstract: The community-based generation of audio narrations for a text-based work leverages collaboration of a community of people to provide human-voiced audio readings. During the community-based generation, a collection of audio recordings for the text-based work may be collected from multiple human readers in a community. An audio recording for each section in the text-based work may be selected from the collection of audio recordings. The selected audio recordings may be then combined to produce an audio reading of at least a portion of the text-based work.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: April 7, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Jay A. Crosley
  • Patent number: 8996364
    Abstract: Using signal processing techniques described herein, pitch detection and correction of a user's vocal performance can be performed continuously and in real-time with respect to the audible rendering of the backing track at the handheld or portable computing device. In some implementations, pitch detection builds on time-domain pitch correction techniques that employ average magnitude difference function (AMDF) or autocorrelation-based techniques together with zero-crossing and/or peak picking techniques to identify differences between pitch of a captured vocal signal and score-coded target pitches. Based on detected differences, pitch correction based on pitch synchronous overlapped add (PSOLA) and/or linear predictive coding (LPC) techniques allow captured vocals to be pitch shifted in real-time to “correct” notes in accord with pitch correction settings that code score-coded melody targets and harmonies.
    Type: Grant
    Filed: April 12, 2011
    Date of Patent: March 31, 2015
    Assignee: Smule, Inc.
    Inventors: Perry R. Cook, Ari Lazier, Tom Lieber
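The abstract of 8996364 mentions the average magnitude difference function (AMDF) as one basis for pitch detection. A minimal sketch of the generic textbook AMDF follows, assuming a single frame of samples and a fixed search range; the names `amdf` and `detect_pitch_amdf` and the `fmin`/`fmax` defaults are hypothetical, and the patent's zero-crossing and peak-picking refinements are not shown.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by `lag`."""
    count = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(count)) / count

def detect_pitch_amdf(frame, fs, fmin=70.0, fmax=400.0):
    """Return the pitch in Hz whose lag minimizes the AMDF within [fmin, fmax]."""
    min_lag = int(fs / fmax)
    max_lag = min(int(fs / fmin), len(frame) - 1)
    best_lag = min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))
    return fs / best_lag

# Usage: a 100 Hz sine sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
pitch_hz = detect_pitch_amdf(frame, 8000)
```

The AMDF dips toward zero at lags equal to the pitch period, so the search range has to exclude multiples of the period to avoid octave errors; here the `fmin` bound does that for the test tone.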
  • Patent number: 8990094
    Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 8983829
    Abstract: Despite many practical limitations imposed by mobile device platforms and application execution environments, vocal musical performances may be captured and continuously pitch-corrected for mixing and rendering with backing tracks in ways that create compelling user experiences. Based on the techniques described herein, even mere amateurs are encouraged to share with friends and family or to collaborate and contribute vocal performances as part of virtual “glee clubs.” In some implementations, these interactions are facilitated through social network- and/or eMail-mediated sharing of performances and invitations to join in a group performance. Using uploaded vocals captured at clients such as a mobile device, a content server (or service) can mediate such virtual glee clubs by manipulating and mixing the uploaded vocal performances of multiple contributing vocalists.
    Type: Grant
    Filed: April 12, 2011
    Date of Patent: March 17, 2015
    Assignee: Smule, Inc.
    Inventors: Perry R. Cook, Ari Lazier, Tom Lieber, Turner E. Kirk
  • Publication number: 20150073781
    Abstract: A method and an apparatus for detecting correctness of a pitch period. The method for detecting correctness of a pitch period includes determining, according to an initial pitch period of an input signal in a time domain, a pitch frequency bin of the input signal, where the initial pitch period is obtained by performing open-loop detection on the input signal; determining, based on an amplitude spectrum of the input signal in a frequency domain, a pitch period correctness decision parameter, associated with the pitch frequency bin, of the input signal; and determining correctness of the initial pitch period according to the pitch period correctness decision parameter. The method and apparatus for detecting correctness of a pitch period according to the embodiments of the present invention can improve, based on a relatively less complex algorithm, accuracy of detecting correctness of a pitch period.
    Type: Application
    Filed: November 17, 2014
    Publication date: March 12, 2015
    Inventors: Fengyan Qi, Lei Miao
  • Publication number: 20150066492
    Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder, the window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.
    Type: Application
    Filed: November 11, 2014
    Publication date: March 5, 2015
    Inventors: Stefan BAYER, Sascha DISCH, Ralf GEIGER, Guillaume FUCHS, Max NEUENDORF, Gerald SCHULLER, Bernd EDLER
  • Publication number: 20150066491
    Abstract: An audio encoder has a window function controller, a windower, a time warper with a final quality check functionality, a time/frequency converter, a TNS stage or a quantizer encoder, the window function controller, the time warper, the TNS stage or an additional noise filling analyzer are controlled by signal analysis results obtained by a time warp analyzer or a signal classifier. Furthermore, a decoder applies a noise filling operation using a manipulated noise filling estimate depending on a harmonic or speech characteristic of the audio signal.
    Type: Application
    Filed: November 11, 2014
    Publication date: March 5, 2015
    Inventors: Stefan BAYER, Sascha DISCH, Ralf GEIGER, Guillaume FUCHS, Max NEUENDORF, Gerald SCHULLER, Bernd EDLER
  • Publication number: 20150057998
    Abstract: An exemplary method of enhancing pitch of an audio signal presented to a cochlear implant patient includes 1) determining a frequency spectrum of an audio signal presented to a cochlear implant patient, the frequency spectrum comprising a plurality of frequency bins that each contain spectral energy, 2) generating a modified spectral envelope of the frequency spectrum of the audio signal, 3) identifying each frequency bin included in the plurality of frequency bins that contains spectral energy above the modified spectral envelope and each frequency bin included in the plurality of frequency bins that contains spectral energy below the modified spectral envelope, 4) enhancing the spectral energy contained in each frequency bin identified as containing spectral energy above the modified spectral envelope, and 5) compressing the spectral energy contained in each frequency bin identified as containing spectral energy below the modified spectral envelope. Corresponding methods and systems are also disclosed.
    Type: Application
    Filed: January 25, 2013
    Publication date: February 26, 2015
    Inventors: Adam B. Strauss, Leonid M. Litvak
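Steps 2 through 5 in the abstract of 20150057998 can be illustrated on a list of per-bin spectral energies. This is only a sketch under stated assumptions: the abstract does not define the "modified spectral envelope", so a moving average stands in for it here, and the names `smooth_envelope` and `enhance_pitch` and the `boost`, `cut`, and `half_width` parameters are hypothetical.

```python
def smooth_envelope(energies, half_width=2):
    """Moving-average envelope (an assumed stand-in for the modified spectral envelope)."""
    n = len(energies)
    env = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        env.append(sum(energies[lo:hi]) / (hi - lo))
    return env

def enhance_pitch(energies, boost=2.0, cut=0.5, half_width=2):
    """Boost bins above the envelope and attenuate bins below it."""
    env = smooth_envelope(energies, half_width)
    out = []
    for e, v in zip(energies, env):
        if e > v:
            out.append(e * boost)   # enhance energy above the envelope
        elif e < v:
            out.append(e * cut)     # compress energy below the envelope
        else:
            out.append(e)           # leave bins lying on the envelope unchanged
    return out

# Usage: one spectral peak surrounded by low-energy bins.
shaped = enhance_pitch([1, 1, 10, 1, 1], half_width=1)
```

Scaling by fixed factors is the simplest choice that satisfies "enhance above, compress below"; a real system would likely use level-dependent gains.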