Formant Patents (Class 704/209)
  • Patent number: 8706488
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: February 27, 2013
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8700389
    Abstract: The present invention includes model-based processing of linguistic user inputs. In one embodiment, the present invention includes a computer-implemented method comprising receiving linguistic inputs, parsing the linguistic inputs, mapping the linguistic inputs to a formal representation used by a model, applying the formal representation against the model, where the model comprises said formal representation, and where the model specifies relationships between the elements of the formal representation and defines process information, and accessing software resources based on the formal representation of the user input and the relationships and process information in said model.
    Type: Grant
    Filed: December 23, 2010
    Date of Patent: April 15, 2014
    Assignee: SAP AG
    Inventors: Markus Latzina, Joerg Beringer
  • Patent number: 8682671
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: April 17, 2013
    Date of Patent: March 25, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8645142
    Abstract: System and method to improve intelligibility of coded speech, the method including: receiving an encoded speech signal from a network; extracting an encoded media data stream and one or more control data packets from the encoded speech signal; decoding the encoded media data stream to produce a decoded speech signal; boosting an upper spectral portion of the decoded speech signal to produce a boosted speech signal; and outputting the boosted speech signal. In another embodiment, the method may include: receiving an uncoded speech signal; processing the uncoded speech signal, wherein the processing comprises generating an unencoded data stream from the uncoded speech signal; boosting an upper spectral portion of the unencoded data stream to produce a boosted speech signal; encoding the boosted speech signal to produce an encoded speech signal; and outputting the boosted speech signal.
    Type: Grant
    Filed: March 27, 2012
    Date of Patent: February 4, 2014
    Assignee: Avaya Inc.
    Inventors: Heinz Teutsch, John Cornelius Lynch
  • Patent number: 8639499
    Abstract: A noise cancellation device includes a plurality of first computation modules, a formant detection module, a direction of arrival module and a beamformer. The plurality of first computation modules receives raw audio data and generates a respective transformed signal as a function of formants. A first transformed signal relates to speech data and a second transformed signal relates to noise data. The formant detection module receives the first transformed signal and generates a frequency range data signal. The direction of arrival module receives the first and second transformed signals, determines a cross-correlation between the first and second transformed signals, and generates a spatial orientation data signal. The beamformer receives the first and second transformed signals, the frequency range data signal, and the spatial orientation data signal and generates modification data at selected formant ranges to eliminate a maximum amount of the noise data.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: January 28, 2014
    Assignee: Motorola Solutions, Inc.
    Inventors: Kaustubh Kale, Yong Wang
  • Patent number: 8620646
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: December 31, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Patent number: 8612238
    Abstract: An encoding method and apparatus and a decoding method and apparatus are provided. The decoding method includes extracting a three-dimensional (3D) down-mix signal from an input bitstream, generating a down-mix signal with 3D effects removed therefrom by performing a 3D rendering operation on the extracted 3D down-mix signal, and generating a 3D down-mix signal with 3D effects by performing a 3D rendering operation on the generated down-mix signal. Accordingly, it is possible to efficiently encode multi-channel signals with 3D effects and to adaptively restore and reproduce audio signals with optimum sound quality according to the characteristics of an audio reproduction environment.
    Type: Grant
    Filed: February 7, 2007
    Date of Patent: December 17, 2013
    Assignee: LG Electronics, Inc.
    Inventors: Yang Won Jung, Hee Suk Pang, Hyen O Oh, Dong Soo Kim, Jae Hyun Lim
  • Patent number: 8571870
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: August 9, 2010
    Date of Patent: October 29, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8560305
    Abstract: LOGIFOLG is a system and method for finding implicit information that is not explicitly mentioned in the sentence, not contained in the synonyms of the particular word, not present in the concept the word belongs to, not found with statistical or concordance based analysis. Nevertheless, this implicit information is present and understood, implicitly, consciously or unconsciously, by everybody who reads the text. LOGIFOLG uses a computer software process, such as computer-executable program code, to discover this implicit information. The steps in this process are: analyzing user's written input, up to five successive and non-successive words in a sequence, understanding the meaning of the written input, finding implicit information in the written input and finally, displaying the implicit information as a variant of the original sentence. The subject matter of the invention deals with Artificial Reasoning, namely inductive and deductive reasoning, based on Natural Language written sentences.
    Type: Grant
    Filed: May 16, 2012
    Date of Patent: October 15, 2013
    Inventor: Hristo Georgiev
  • Publication number: 20130262096
    Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system-marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system-outputting the aligned text or the aligned speech signal, respectively. Computerized systems and stored programs for implementing method embodiments of the invention are also disclosed.
    Type: Application
    Filed: September 21, 2012
    Publication date: October 3, 2013
    Applicant: LESSAC TECHNOLOGIES, INC.
    Inventors: Reiner WILHELMS-TRICARICO, Brian MOTTERSHEAD, Rattima NITISAROJ, Michael BAUMGARTNER, John B. REICHENBACH, Gary A. MARPLE
  • Publication number: 20130231927
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine-readable formant-based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Application
    Filed: August 20, 2012
    Publication date: September 5, 2013
    Inventors: PIERRE ZAKARAUSKAS, ALEXANDER ESCOTT, CLARENCE S.H. CHU, SHAWN E. STEVENSON
  • Patent number: 8463599
    Abstract: A method includes defining a transition band for a signal having a spectrum within a first frequency band, where the transition band is defined as a portion of the first frequency band, and is located near an adjacent frequency band that is adjacent to the first frequency band. The method analyzes the transition band to obtain a transition band spectral envelope and a transition band excitation spectrum; estimates an adjacent frequency band spectral envelope; generates an adjacent frequency band excitation spectrum by periodic repetition of at least a part of the transition band excitation spectrum with a repetition period determined by a pitch frequency of the signal; and combines the adjacent frequency band spectral envelope and the adjacent frequency band excitation spectrum to obtain an adjacent frequency band signal spectrum. A signal processing logic for performing the method is also disclosed.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: June 11, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Tenkasi Ramabadran, Mark Jasiuk
  • Patent number: 8447592
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: May 21, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8447610
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: August 9, 2010
    Date of Patent: May 21, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8407046
    Abstract: A method of transmitting an input audio signal is disclosed. A current spectral magnitude of the input audio signal is quantized. A quantization error of a previous spectral magnitude is fed back to influence quantization of the current spectral magnitude. The feeding back includes adaptively modifying a quantization criterion to form a modified quantization criterion. A current quantization error is minimized by using the modified quantization criterion. A quantized spectral envelope is formed based on the minimizing and the quantized spectral envelope is transmitted.
    Type: Grant
    Filed: September 4, 2009
    Date of Patent: March 26, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
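
Illustrative sketch (not taken from patent 8407046): the abstract describes feeding the previous quantization error back into the criterion used to quantize the current spectral magnitude. One simple way to picture this is a scalar quantizer whose target is shifted by a fraction of the previous error. The function name, the `alpha` feedback weight, and the nearest-level rule are all assumptions made for illustration.

```python
def quantize_with_feedback(magnitudes, levels, alpha=0.5):
    """Quantize each magnitude to the nearest level, biased by the previous error.

    alpha is a hypothetical feedback weight; alpha=0 gives plain
    nearest-level quantization with no feedback.
    """
    quantized = []
    prev_error = 0.0
    for x in magnitudes:
        # Modified criterion: match the input plus part of the previous error.
        target = x + alpha * prev_error
        q = min(levels, key=lambda level: abs(target - level))
        quantized.append(q)
        # Error of this magnitude, fed back when quantizing the next one.
        prev_error = x - q
    return quantized


if __name__ == "__main__":
    print(quantize_with_feedback([0.4, 0.4, 0.4, 0.4], [0.0, 1.0], alpha=1.0))
    print(quantize_with_feedback([0.4, 0.4, 0.4, 0.4], [0.0, 1.0], alpha=0.0))
```

With feedback the output alternates between levels, so the average of the quantized values tracks the input more closely than the no-feedback case, which always rounds down.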
  • Patent number: 8396703
    Abstract: A band-limited voice signal is processed to reduce its spectral envelope or harmonic structure, or both. The resulting reduced signal is moved into a frequency band above the upper limit frequency of the band-limited voice signal, and then combined with the band-limited voice signal to form a band expanded signal with improved quality and comprehensibility, free of unnatural high-frequency resonances and unnaturally strong high-frequency harmonics.
    Type: Grant
    Filed: March 5, 2009
    Date of Patent: March 12, 2013
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Hiromi Aoyagi
  • Patent number: 8386247
    Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: February 26, 2013
    Assignee: DTS LLC
    Inventors: Jun Yang, Richard J. Oliver, James Tracey, Xing He
  • Patent number: 8386242
    Abstract: Provided are a method, medium and apparatus for enhancing an acoustic signal using an auditory property. An acoustic signal is enhanced by generating a plurality of harmonic signals based on a predetermined acoustic signal frequency, selecting harmonic signals, which exist in an area masked by the predetermined harmonic signal, from among the generated plurality of harmonic signals, and outputting harmonic signals remaining after excluding the selected harmonic signals from the generated plurality of harmonic signals. The enhancement results in a bass signal of good sound quality and having a low distortion ratio, without changing the structure of a micro speaker.
    Type: Grant
    Filed: June 22, 2007
    Date of Patent: February 26, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung-ho Kim, Sang-wook Kim, Young-tae Kim, Sang-chul Ko
  • Patent number: 8380484
    Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) among an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
    Type: Grant
    Filed: August 10, 2004
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8364477
    Abstract: A method (400, 500) and apparatus (220) seek to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains, yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: January 29, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J Song, John C Johnson
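
Illustrative sketch (not taken from patent 8364477): the abstract's first steps (compare the per-band SNR to a threshold, raise a formant enhancement gain where the SNR is low, combine with high pass filter gains, clip) can be pictured as below. Working in decibels, adding the gains, and every parameter name and default value are assumptions; the later scaling, normalizing, and smoothing steps are omitted.

```python
def formant_band_gains(band_snr_db, formant_bands, highpass_gain_db,
                       snr_threshold_db=6.0, boost_step_db=3.0, max_gain_db=12.0):
    """Return one combined gain (dB) per band.

    band_snr_db      -- SNR of each perceptual band, in dB
    formant_bands    -- set of band indices that contain an identified formant
    highpass_gain_db -- high pass filter gain of each band, in dB
    """
    gains = []
    for band, (snr, hp_gain) in enumerate(zip(band_snr_db, highpass_gain_db)):
        # Boost only bands that hold a formant and fall below the SNR threshold.
        needs_boost = band in formant_bands and snr < snr_threshold_db
        formant_gain = boost_step_db if needs_boost else 0.0
        # Combine with the high pass gain, then clip to the allowed maximum.
        gains.append(min(formant_gain + hp_gain, max_gain_db))
    return gains


if __name__ == "__main__":
    print(formant_band_gains([10.0, 2.0, 3.0, 4.0], {1, 3}, [0.0, 1.0, 2.0, 11.0]))
```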
  • Patent number: 8363854
    Abstract: A device and method are provided for automatically adjusting gain, including a conversion module for converting an audio time-domain signal to an audio frequency-domain signal, an analysis module for analyzing the audio frequency-domain signal in accordance with an equal-loudness level contour of human hearing so as to generate strength weightings and generating a signal strength in accordance with the weightings, a calculation module for calculating a gain by analysis of the audio frequency-domain signal when the signal strength falls outside a default range, and a control module for generating an audio output signal in accordance with the gain and the audio time-domain signal.
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: January 29, 2013
    Assignee: Realtek Semiconductor Corp.
    Inventors: Kai-Hsiang Chou, Wen-Haw Wang, Yu-Heng Chen, Mei-Yu Fan
  • Patent number: 8355921
    Abstract: An apparatus for performing improved audio processing may include a processor. The processor may be configured to divide respective signals of each channel of a multi-channel audio input signal into one or more spectral bands corresponding to respective analysis frames, select a leading channel from among channels of the multi-channel audio input signal for at least one spectral band, determine a time shift value for at least one spectral band of at least one channel, and time align the channels based at least in part on the time shift value.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: January 15, 2013
    Assignee: Nokia Corporation
    Inventors: Mikko Tapio Tammi, Miikka Tapani Vilermo
  • Patent number: 8346548
    Abstract: The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: January 1, 2013
    Assignee: Mongoose Ventures Limited
    Inventor: Mark Owen
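
Illustrative sketch (not taken from patent 8346548): the abstract's pipeline (convert both texts to phoneme strings, make the strings equal in length, compare and score, rank the reference texts) might look like the following. The letter-to-phoneme rules are a toy stand-in, and padding the shorter string and scoring by position-wise matches are assumptions; the abstract does not say how the adjustment or the comparison is done.

```python
# Hypothetical toy rules; a real system would use a pronunciation lexicon.
TOY_RULES = [("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s"), ("y", "i")]


def to_phonemes(text):
    """Map text to a crude phoneme list using the toy rules above."""
    s = "".join(ch for ch in text.lower() if ch.isalpha())
    for src, dst in TOY_RULES:
        s = s.replace(src, dst)
    return list(s)


def equalize(a, b, pad="_"):
    """Pad the shorter phoneme list so both lists have the same length."""
    n = max(len(a), len(b))
    return a + [pad] * (n - len(a)), b + [pad] * (n - len(b))


def similarity(target, reference):
    """Fraction of positions at which the two equal-length strings agree."""
    a, b = equalize(to_phonemes(target), to_phonemes(reference))
    if not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def rank_references(target, references):
    """Return (reference, score) pairs sorted from most to least similar."""
    scored = [(ref, similarity(target, ref)) for ref in references]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for ref, score in rank_references("Zycor", ["Lumina", "Sakar", "Sikor"]):
        print(ref, round(score, 2))
```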
  • Patent number: 8321211
    Abstract: A method and system for multi-channel detection of pitch may comprise one or more of the following steps and/or means therefor: (a) sampling an audio input stream including at least a first channel and a second channel; (b) setting a search frequency for each of the first channel and the second channel; and (c) detecting a pitch of the first channel and a pitch of the second channel.
    Type: Grant
    Filed: March 2, 2009
    Date of Patent: November 27, 2012
    Assignee: University of Kansas-KU Medical Center Research Institute
    Inventor: David W. Petr
  • Patent number: 8315855
    Abstract: A character extraction section extracts character amounts, pertaining to a prosody of voice, from a voice signal sequentially in a time-serial manner. A difference value calculation determines a difference value between each of the extracted character amounts and a reference value. Processing values, corresponding to the individual character amounts, are generated in accordance with the respective difference values, and a voice processing section controls the individual character amounts of the voice signal in accordance with the processing values corresponding to the character amounts and thereby generates an output signal having a prosody changed from the prosody of the voice signal.
    Type: Grant
    Filed: July 22, 2009
    Date of Patent: November 20, 2012
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
  • Patent number: 8315856
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with the onset of a first glottal pulse and extend to a point prior to the onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: November 20, 2012
    Assignee: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Patent number: 8311812
    Abstract: A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.
    Type: Grant
    Filed: December 1, 2009
    Date of Patent: November 13, 2012
    Assignee: Eliza Corporation
    Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
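
Illustrative sketch (not taken from patent 8311812): the abstract estimates a resonance's frequency and bandwidth from a filtered complex signal and a single-lag delay of it. A textbook way to see why one lag suffices: for a single damped complex exponential, the ratio of consecutive samples has a phase proportional to the frequency and a magnitude that encodes the decay rate. The function names and the single-resonance model are assumptions, and the delayed sample must be nonzero.

```python
import cmath
import math


def estimate_resonance(current, delayed, sample_rate):
    """Estimate (frequency_hz, bandwidth_hz) from a complex sample and its one-lag delay."""
    ratio = current / delayed
    # Phase advance per sample gives the instantaneous frequency.
    frequency = sample_rate * cmath.phase(ratio) / (2 * math.pi)
    # Amplitude decay per sample gives the bandwidth.
    bandwidth = -sample_rate * math.log(abs(ratio)) / math.pi
    return frequency, bandwidth


def damped_resonance(frequency, bandwidth, sample_rate, count):
    """Synthesize a single damped complex exponential for testing."""
    pole = cmath.exp((-math.pi * bandwidth + 2j * math.pi * frequency) / sample_rate)
    return [pole ** n for n in range(count)]


if __name__ == "__main__":
    samples = damped_resonance(500.0, 80.0, 8000.0, 10)
    print(estimate_resonance(samples[5], samples[4], 8000.0))
```

For the synthetic 500 Hz resonance with an 80 Hz bandwidth, the estimate recovers both values up to floating-point error. Real speech contains several overlapping resonances, which is why the abstract filters the signal into multiple bands first.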
  • Patent number: 8311842
    Abstract: A method and apparatus for expanding a bandwidth of an input narrowband voice signal is provided. The narrowband voice signal is analyzed separately for each frame, and a Degree of Voicing (DV) and a Degree of Stationary (DS) are calculated depending on the analysis. A Degree of Difficulty of Bandwidth Expansion (DDBWE) of the narrowband voice signal is calculated based on DV and DS. Bandwidth expansion is controlled according to DDBWE.
    Type: Grant
    Filed: March 3, 2008
    Date of Patent: November 13, 2012
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Geun-Bae Song, Min-Sung Kim, Hee-Jin Oh, Austin Kim, Jae-Bum Kim
  • Patent number: 8306821
    Abstract: A signal enhancement system reinforces signal content and improves the signal-to-noise ratio of a signal. The system detects, tracks, and reinforces non-stationary periodic signal components of a signal. The periodic signal components may represent vowel sounds or other voiced sounds. The system may detect, track, and attenuate quasi-stationary signal components in the signal.
    Type: Grant
    Filed: June 4, 2007
    Date of Patent: November 6, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Rajeev Nongpiur, Phillip A. Hetherington
  • Patent number: 8280730
    Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency-dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: October 2, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J. Song, John C. Johnson
  • Patent number: 8280732
    Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.
    Type: Grant
    Filed: March 26, 2009
    Date of Patent: October 2, 2012
    Inventors: Wolfgang Richter, Roland Aubauer
  • Patent number: 8280724
    Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: October 2, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
  • Patent number: 8280727
    Abstract: A voice band expansion device includes a time-frequency converter that calculates a frequency spectrum of a voice signal having a first frequency band; a separator that extracts, from the frequency spectrum, an envelope amplitude spectrum, a periodic amplitude spectrum, and a random amplitude spectrum; an envelope amplitude spectrum band expander that expands a frequency band to a second frequency band that is different from the first frequency band; a periodic amplitude spectrum band expander that expands a frequency band to the second frequency band; a random amplitude spectrum band expander that expands a frequency band of the random amplitude spectrum to the second frequency band; a broadband spectrum calculator that calculates a broadband frequency spectrum having the first frequency band and the second frequency band; and a frequency-time converter that generates a voice signal having the first frequency band and the second frequency band.
    Type: Grant
    Filed: May 11, 2010
    Date of Patent: October 2, 2012
    Assignee: Fujitsu Limited
    Inventors: Kaori Endo, Takeshi Otani, Taro Togawa, Yasuji Ota
  • Patent number: 8234110
    Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: July 31, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
  • Patent number: 8219390
    Abstract: A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.
    Type: Grant
    Filed: September 16, 2003
    Date of Patent: July 10, 2012
    Assignee: Creative Technology Ltd
    Inventor: Jean Laroche
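The two-threshold enable/disable rule in the abstract above is a form of hysteresis. A minimal sketch of that rule alone follows; the function name `gate_states`, the threshold values, and the per-sample level list are illustrative assumptions, not details taken from the patent.

```python
def gate_states(levels, on_threshold, off_threshold, active=False):
    """Return, for each secondary-signal level, whether modification is enabled.

    Hypothetical helper: enabling needs the level to rise above on_threshold
    while modification is off; disabling needs it to drop below off_threshold
    while modification is on.
    """
    states = []
    for level in levels:
        if not active and level > on_threshold:
            active = True       # secondary signal became loud enough
        elif active and level < off_threshold:
            active = False      # secondary signal faded out
        states.append(active)
    return states


# Assumed example levels of the secondary signal (arbitrary units).
levels = [0.1, 0.6, 0.4, 0.3, 0.15, 0.5, 0.7]
states = gate_states(levels, on_threshold=0.5, off_threshold=0.2)
print(states)
```

Using a lower threshold for disabling than for enabling keeps the state from toggling rapidly when the level hovers near a single threshold.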
  • Patent number: 8204742
    Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: June 19, 2012
    Assignee: SRS Labs, Inc.
    Inventors: Jun Yang, Richard J. Oliver, James Tracey
  • Patent number: 8200477
    Abstract: A method and system for extracting opinions about a subject of interest from a text document in which each sentence is analyzed individually to identify the opinions. The most relevant feature terms related to the subject are extracted from the document based on their relevancy scores. Candidate feature terms are definite noun phrases at the beginning of the sentences. For each sentence that refers to the subject or a feature term, the invention determines whether the sentence includes an opinion polarity about the subject or the feature term. The opinion polarity is detected by identifying opinion terms in the sentence using an opinion dictionary or an opinion rule base, parsing the sentence with an English parser to identify grammatical components in the sentence and their relationships, and finding a matching entry in the dictionary or the rule base.
    Type: Grant
    Filed: October 22, 2003
    Date of Patent: June 12, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jeonghee Yi, Tetsuya Nasukawa, Razvan Constantin Bunescu
  • Patent number: 8195449
    Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: June 5, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
  • Patent number: 8185395
    Abstract: An information transmission device which analyzes a diction of a speaker and provides an utterance in accordance with the diction of the speaker, and which has a microphone detecting a sound signal of the speaker, a feature extraction unit extracting at least one feature value of the diction of the speaker based on the sound signal detected by the microphone, a voice synthesis unit synthesizing a voice signal to be uttered so that the voice signal has the same feature value as the diction of the speaker, based on the feature value extracted by the feature extraction unit, and a voice output unit performing an utterance based on the voice signal synthesized by the voice synthesis unit.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: May 22, 2012
    Assignee: Honda Motor Co., Ltd.
    Inventors: Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino
  • Patent number: 8170885
    Abstract: Disclosed is a wideband audio signal coding/decoding device and method that may code a wideband audio signal while maintaining a low bit rate. The wideband audio signal coding device includes an enhancement layer that extracts a first spectrum parameter from an inputted wideband signal having a first bandwidth, quantizes the extracted first spectrum parameter, and converts the extracted first spectrum parameter into a second spectrum parameter; and a coding unit that extracts a narrowband signal from the inputted wideband signal and codes the narrowband signal based on the second spectrum parameter provided from the enhancement layer, wherein the narrowband signal has a second bandwidth smaller than the first bandwidth. The wideband audio signal coding/decoding device and method may code a wideband audio signal while maintaining a low bit rate.
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: May 1, 2012
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Young Han Lee
  • Patent number: 8150682
    Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.
    Type: Grant
    Filed: May 11, 2011
    Date of Patent: April 3, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Rajeev Nongpiur, Phillip A. Hetherington
  • Patent number: 8145476
    Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digit
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: March 27, 2012
    Assignee: Ricoh Company, Ltd.
    Inventor: Yukihiro Imai
  • Patent number: 8121835
    Abstract: Automatic level control of speech portions of an audio signal is provided. An audio signal is received in the form of a sequence of samples and may contain speech portions and non-speech portions. The sequence of samples is divided into a sequence of sub-frames. Multiple sub-frames adjacent to a present sub-frame are examined to determine a peak value of samples in the sub-frames. A gain factor is computed for the present sub-frame based on the peak value and a desired maximum value for said speech portions, and each sample in the present sub-frame is amplified by the gain factor. In an embodiment, variations in filtered energy values of multiple sub-frames enable determination of whether a sub-frame corresponds to a speech or non-speech/noise portion.
    Type: Grant
    Filed: March 6, 2008
    Date of Patent: February 21, 2012
    Assignee: Texas Instruments Incorporated
    Inventor: Fitzgerald John Archibald
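The sub-frame gain computation in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the window of neighbouring sub-frames includes the present one, the gain is capped by a hypothetical `max_gain`, and the speech/non-speech decision mentioned in the abstract is left out entirely.

```python
def level_control(samples, subframe_len, target_peak, lookaround=1, max_gain=8.0):
    """Scale each sub-frame so the local peak approaches target_peak.

    Hypothetical sketch: lookaround is the number of neighbouring sub-frames
    examined on each side; max_gain (an assumption) limits amplification of
    very quiet passages.
    """
    subframes = [samples[i:i + subframe_len]
                 for i in range(0, len(samples), subframe_len)]
    out = []
    for idx, frame in enumerate(subframes):
        lo = max(0, idx - lookaround)
        hi = min(len(subframes), idx + lookaround + 1)
        # Peak magnitude over the present sub-frame and its neighbours.
        peak = max(abs(s) for f in subframes[lo:hi] for s in f)
        gain = min(max_gain, target_peak / peak) if peak > 0 else 1.0
        out.extend(s * gain for s in frame)
    return out


samples = [0.1, -0.2, 0.05, 0.1, 0.4, -0.1, 0.0, 0.0]
out = level_control(samples, subframe_len=2, target_peak=0.8)
print(out)
```

Taking the peak over neighbouring sub-frames rather than the present one alone keeps the gain from jumping sharply between adjacent sub-frames.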
  • Patent number: 8077893
    Abstract: The aim of the invention is to provide inter-channel level differences (ICLD) related to audio signals for hearing aids.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: December 13, 2011
    Assignee: Ecole Polytechnique Federale de Lausanne
    Inventors: Olivier Roy, Martin Vetterli
  • Patent number: 8065138
    Abstract: A speech processing apparatus includes a spectrum envelope extracting unit which extracts the spectrum envelope of an input speech signal, a spectrum envelope deforming unit which applies deformation to the spectrum envelope to generate a deformed spectrum envelope, a spectrum fine structure extracting unit which extracts the spectrum fine structure of the input speech signal, a deformed spectrum generating unit which generates a deformed spectrum by combining the deformed spectrum envelope with the spectrum fine structure, and a speech generating unit which generates an output speech signal on the basis of the deformed spectrum. This apparatus emits a disrupting sound based on the output speech signal to prevent a third party from eavesdropping on a conversation.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: November 22, 2011
    Assignees: Japan Advanced Institute of Science and Technology, Glory Ltd.
    Inventors: Masato Akagi, Rieko Futonagane, Yoshihiro Irie, Hisakazu Yanagiuchi, Yoshitane Tanaka
  • Patent number: 8019597
    Abstract: A scalable encoding apparatus is capable of reducing the bit rates of encoded parameters and of efficiently encoding audio signals in which a plurality of harmonic structures coexist. In the apparatus, an MDCT analyzer performs MDCT analysis on an audio signal for converting/encoding processes. A pitch frequency converter determines an inverse of a pitch period to calculate a pitch frequency. A selector selects spectra located at frequencies that are integral multiples of the pitch frequency, and a second layer encoder encodes the selected spectra.
    Type: Grant
    Filed: October 26, 2005
    Date of Patent: September 13, 2011
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
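The pitch-to-harmonic selection step in the abstract above can be illustrated with a short sketch. It assumes, for simplicity, that spectral bin k sits at k times a uniform bin width (real MDCT bins are centred half a bin higher); the function name `harmonic_bins` and all parameter values are hypothetical.

```python
def harmonic_bins(pitch_period_s, sample_rate_hz, num_bins):
    """Return the pitch frequency and the bins nearest its integer multiples.

    Hypothetical helper: num_bins uniform bins are assumed to span
    0 .. sample_rate_hz / 2.
    """
    pitch_hz = 1.0 / pitch_period_s          # pitch frequency = inverse of period
    bin_width_hz = (sample_rate_hz / 2.0) / num_bins
    bins = []
    k = 1
    while True:
        idx = round(k * pitch_hz / bin_width_hz)
        if idx >= num_bins:
            break                            # harmonic lies above the spectrum
        if not bins or bins[-1] != idx:
            bins.append(idx)                 # skip duplicates for low pitches
        k += 1
    return pitch_hz, bins


# Assumed values: 5 ms pitch period, 16 kHz sampling, 320 spectral bins.
pitch_hz, bins = harmonic_bins(0.005, 16000, 320)
print(pitch_hz, bins[:3], len(bins))
```

Only the selected bins would then be passed on to a second-layer encoder, which is how the scheme limits the number of parameters to encode.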
  • Patent number: 8015000
    Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: September 6, 2011
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
  • Patent number: 8000959
    Abstract: A formant extracting method capable of precisely obtaining formants as resonance frequencies of voice with less computational complexity includes searching for a maximum value by a spectral peak-picking method, judging whether the number of formants corresponding to a zero at the obtained maximum point is two, and analyzing a pertinent root by root polishing when the number of formants is judged to be two. The number of formants is judged by applying Cauchy's integral formula, wherein Cauchy's integral formula is not applied repeatedly but only once at a surrounding portion of the maximum value in a z-domain.
    Type: Grant
    Filed: October 6, 2004
    Date of Patent: August 16, 2011
    Assignee: LG Electronics Inc.
    Inventor: Chan-Woo Kim
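One way to picture the "is it one formant or two" test is to count the zeros of a prediction polynomial inside a small circle in the z-plane with a single contour integral. The sketch below uses the argument principle (a consequence of Cauchy's integral theorem) evaluated numerically; it is not taken from the patent, and the test polynomials, circle radius, and function names are assumptions chosen for illustration.

```python
import cmath


def poly_from_roots(roots):
    """Build polynomial coefficients (highest degree first) from its roots."""
    coeffs = [1 + 0j]
    for r in roots:
        # Multiply the current polynomial by (z - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0j], [0j] + coeffs)]
    return coeffs


def horner(coeffs, z):
    """Evaluate a polynomial at z using Horner's rule."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc


def count_zeros_in_circle(coeffs, center, radius, steps=2048):
    """Count zeros of the polynomial inside |z - center| < radius.

    Approximates (1 / (2*pi*i)) * contour integral of A'(z) / A(z) dz
    with a uniform sum over the circle.
    """
    n = len(coeffs) - 1
    deriv = [c * (n - i) for i, c in enumerate(coeffs[:-1])]
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * cmath.pi * k / steps)
        z = center + radius * w
        total += horner(deriv, z) / horner(coeffs, z) * radius * w
    return round((total / steps).real)


# Assumed example: two close resonances near angle 0.53 rad, plus conjugates.
close = [cmath.rect(0.9, a) for a in (0.5, 0.56, -0.5, -0.56)]
two = count_zeros_in_circle(poly_from_roots(close), cmath.rect(0.9, 0.53), 0.1)

# Assumed example: one isolated resonance near angle 0.5 rad.
apart = [cmath.rect(0.9, a) for a in (0.5, 1.5, -0.5, -1.5)]
one = count_zeros_in_circle(poly_from_roots(apart), cmath.rect(0.9, 0.5), 0.1)
print(two, one)
```

A count of two would indicate that a single spectral peak hides two merged resonances, which is the case where a further root-refinement step becomes worthwhile.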
  • Patent number: 7957959
    Abstract: A method for processing speech data includes obtaining a pitch (fundamental frequency) and at least one formant frequency for each of a plurality of pieces of first speech data; constructing a first feature space with the obtained fundamental frequencies and formant frequencies as features; and classifying the plurality of pieces of first speech data using the first feature space, whereby a plurality of speech data classes and their corresponding descriptions are obtained.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: June 7, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Zhao Bing Han, Guo Kang Fu, Da Lai Yan