Formant Patents (Class 704/209)

Methods and apparatus for formant-based voice synthesis

Patent number: 8706488

Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.

Type: Grant

Filed: February 27, 2013

Date of Patent: April 22, 2014

Assignee: Nuance Communications, Inc.

Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
Systems and methods for model-based processing of linguistic user inputs using annotations

Patent number: 8700389

Abstract: The present invention includes model-based processing of linguistic user inputs. In one embodiment, the present invention includes a computer-implemented method comprising receiving linguistic inputs, parsing the linguistic inputs, mapping the linguistic inputs to a formal representation used by a model, applying the formal representation against the model, where the model comprises said formal representation, and where the model specifies relationships between the elements of the formal representation and defines process information, and accessing software resources based on the formal representation of the user input and the relationships and process information in said model.

Type: Grant

Filed: December 23, 2010

Date of Patent: April 15, 2014

Assignee: SAP AG

Inventors: Markus Latzina, Joerg Beringer
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8682671

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: April 17, 2013

Date of Patent: March 25, 2014

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
System and method for method for improving speech intelligibility of voice calls using common speech codecs

Patent number: 8645142

Abstract: System and method to improve intelligibility of coded speech, the method including: receiving an encoded speech signal from a network; extracting an encoded media data stream and one or more control data packets from the encoded speech signal; decoding the encoded media data stream to produce a decoded speech signal; boosting an upper spectral portion of the decoded speech signal to produce a boosted speech signal; and outputting the boosted speech signal. In another embodiment, the method may include: receiving an uncoded speech signal; processing the uncoded speech signal, wherein the processing comprises generating an unencoded data stream from the uncoded speech signal; boosting an upper spectral portion of the unencoded data stream to produce a boosted speech signal; encoding the boosted speech signal to produce an encoded speech signal; and outputting the boosted speech signal.

Type: Grant

Filed: March 27, 2012

Date of Patent: February 4, 2014

Assignee: Avaya Inc.

Inventors: Heinz Teutsch, John Cornelius Lynch
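
As a rough illustration of the "boosting an upper spectral portion" step in the abstract of patent 8645142, the sketch below applies a first-order high-frequency emphasis to decoded samples. The filter form, the function name boost_highs, and the gain parameter g are assumptions made for illustration; the abstract does not say how the boost is computed.

```python
from typing import List, Sequence


def boost_highs(samples: Sequence[float], g: float = 0.5) -> List[float]:
    """Return samples with high frequencies emphasized.

    Uses y[n] = x[n] + g * (x[n] - x[n-1]), whose gain is 1 at DC and
    1 + 2*g at the Nyquist frequency. Hypothetical stand-in for the boost.
    """
    out: List[float] = []
    # Treat x[-1] as x[0] so a constant (DC) signal passes through unchanged.
    prev = samples[0] if samples else 0.0
    for x in samples:
        out.append(x + g * (x - prev))
        prev = x
    return out
```

A constant input is returned unchanged, while a signal alternating between +1 and -1 (the highest representable frequency) settles at an amplitude of 1 + 2*g.
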
Formant aided noise cancellation using multiple microphones

Patent number: 8639499

Abstract: A noise cancellation device includes a plurality of first computation modules, a formant detection module, a direction of arrival module and a beamformer. The plurality of first computation modules receives raw audio data and generates a respective transformed signal as a function of formants. A first transformed signal relates to speech data and a second transformed signal relates to noise data. The formant detection module receives the first transformed signal and generates a frequency range data signal. The direction of arrival module receives the first and second transformed signals, determines a cross-correlation between the first and second transformed signals, and generates a spatial orientation data signal. The beamformer receives the first and second transformed signals, the frequency range data signal, and the spatial orientation data signal and generates modification data at selected formant ranges to eliminate a maximum amount of the noise data.

Type: Grant

Filed: July 28, 2010

Date of Patent: January 28, 2014

Assignee: Motorola Solutions, Inc.

Inventors: Kaustubh Kale, Yong Wang
System and method for tracking sound pitch across an audio signal using harmonic envelope

Patent number: 8620646

Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.

Type: Grant

Filed: August 8, 2011

Date of Patent: December 31, 2013

Assignee: The Intellisis Corporation

Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
Apparatus and method for encoding/decoding signal

Patent number: 8612238

Abstract: An encoding method and apparatus and a decoding method and apparatus are provided. The decoding method includes extracting a three-dimensional (3D) down-mix signal from an input bitstream, generating a down-mix signal with 3D effects removed therefrom by performing a 3D rendering operation on the extracted 3D down-mix signal, and generating a 3D down-mix signal with 3D effects by performing a 3D rendering operation on the generated down-mix signal. Accordingly, it is possible to efficiently encode multi-channel signals with 3D effects and to adaptively restore and reproduce audio signals with optimum sound quality according to the characteristics of an audio reproduction environment.

Type: Grant

Filed: February 7, 2007

Date of Patent: December 17, 2013

Assignee: LG Electronics, Inc.

Inventors: Yang Won Jung, Hee Suk Pang, Hyen O Oh, Dong Soo Kim, Jae Hyun Lim
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8571870

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: August 9, 2010

Date of Patent: October 29, 2013

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
LOGIFOLG

Patent number: 8560305

Abstract: LOGIFOLG is a system and method for finding implicit information that is not explicitly mentioned in the sentence, not contained in the synonyms of the particular word, not present in the concept the word belongs to, not found with statistical or concordance based analysis. Nevertheless, this implicit information is present and understood, implicitly, consciously or unconsciously, by everybody who reads the text. LOGIFOLG uses a computer software process, such as computer-executable program code, to discover this implicit information. The steps in this process are: analyzing user's written input, up to five successive and non-successive words in a sequence, understanding the meaning of the written input, finding implicit information in the written input and finally, displaying the implicit information as a variant of the original sentence. The subject matter of the invention deals with Artificial Reasoning, namely inductive and deductive reasoning, based on Natural Language written sentences.

Type: Grant

Filed: May 16, 2012

Date of Patent: October 15, 2013

Inventor: Hristo Georgiev
METHODS FOR ALIGNING EXPRESSIVE SPEECH UTTERANCES WITH TEXT AND SYSTEMS THEREFOR

Publication number: 20130262096

Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system-marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system-outputting the aligned text or the aligned speech signal, respectively. Computerized systems and stored programs for implementing method embodiments of the invention are also disclosed.

Type: Application

Filed: September 21, 2012

Publication date: October 3, 2013

Applicant: LESSAC TECHNOLOGIES, INC.

Inventors: Reiner WILHELMS-TRICARICO, Brian MOTTERSHEAD, Rattima NITISAROJ, Michael BAUMGARTNER, John B. REICHENBACH, Gary A. MARPLE
Formant Based Speech Reconstruction from Noisy Signals

Publication number: 20130231927

Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.

Type: Application

Filed: August 20, 2012

Publication date: September 5, 2013

Inventors: PIERRE ZAKARAUSKAS, ALEXANDER ESCOTT, CLARENCE S.H. CHU, SHAWN E. STEVENSON
Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder

Patent number: 8463599

Abstract: A method includes defining a transition band for a signal having a spectrum within a first frequency band, where the transition band is defined as a portion of the first frequency band, and is located near an adjacent frequency band that is adjacent to the first frequency band. The method analyzes the transition band to obtain a transition band spectral envelope and a transition band excitation spectrum; estimates an adjacent frequency band spectral envelope; generates an adjacent frequency band excitation spectrum by periodic repetition of at least a part of the transition band excitation spectrum with a repetition period determined by a pitch frequency of the signal; and combines the adjacent frequency band spectral envelope and the adjacent frequency band excitation spectrum to obtain an adjacent frequency band signal spectrum. A signal processing logic for performing the method is also disclosed.

Type: Grant

Filed: February 4, 2009

Date of Patent: June 11, 2013

Assignee: Motorola Mobility LLC

Inventors: Tenkasi Ramabadran, Mark Jasiuk
Methods and apparatus for formant-based voice systems

Patent number: 8447592

Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.

Type: Grant

Filed: September 13, 2005

Date of Patent: May 21, 2013

Assignee: Nuance Communications, Inc.

Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
Method and apparatus for generating synthetic speech with contrastive stress

Patent number: 8447610

Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.

Type: Grant

Filed: August 9, 2010

Date of Patent: May 21, 2013

Assignee: Nuance Communications, Inc.

Inventors: Darren C. Meyer, Stephen R. Springer
Noise-feedback for spectral envelope quantization

Patent number: 8407046

Abstract: A method of transmitting an input audio signal is disclosed. A current spectral magnitude of the input audio signal is quantized. A quantization error of a previous spectral magnitude is fed back to influence quantization of the current spectral magnitude. The feeding back includes adaptively modifying a quantization criterion to form a modified quantization criterion. A current quantization error is minimized by using the modified quantization criterion. A quantized spectral envelope is formed based on the minimizing and the quantized spectral envelope is transmitted.

Type: Grant

Filed: September 4, 2009

Date of Patent: March 26, 2013

Assignee: Huawei Technologies Co., Ltd.

Inventor: Yang Gao
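
The feedback idea in the abstract of patent 8407046 can be illustrated with a minimal sketch in which the quantization error of the previous spectral magnitude shifts the target used to quantize the current one. This is not the patented criterion: the uniform quantizer, the function name quantize_with_feedback, and the parameters step and beta are all assumptions introduced for illustration.

```python
from typing import List, Sequence


def quantize_with_feedback(
    magnitudes: Sequence[float], step: float = 1.0, beta: float = 0.5
) -> List[float]:
    """Uniformly quantize magnitudes, feeding back the previous error.

    beta = 0 gives plain rounding; beta > 0 shifts each target by a
    fraction of the previous magnitude's quantization error (assumed rule).
    """
    quantized: List[float] = []
    prev_error = 0.0
    for m in magnitudes:
        target = m + beta * prev_error      # modified quantization target
        q = step * round(target / step)     # nearest quantizer level
        quantized.append(q)
        prev_error = m - q                  # error of this magnitude, used next
    return quantized
```

With step=1.0 and beta=1.0, four magnitudes of 0.4 quantize to 0, 1, 0, 1, whereas plain rounding (beta=0) yields all zeros.
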
Voice band expander and expansion method, and voice communication apparatus

Patent number: 8396703

Abstract: A band-limited voice signal is processed to reduce its spectral envelope or harmonic structure, or both. The resulting reduced signal is moved into a frequency band above the upper limit frequency of the band-limited voice signal, and then combined with the band-limited voice signal to form a band expanded signal with improved quality and comprehensibility, free of unnatural high-frequency resonances and unnaturally strong high-frequency harmonics.

Type: Grant

Filed: March 5, 2009

Date of Patent: March 12, 2013

Assignee: Oki Electric Industry Co., Ltd.

Inventor: Hiromi Aoyagi
System for processing an audio signal to enhance speech intelligibility

Patent number: 8386247

Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.

Type: Grant

Filed: June 18, 2012

Date of Patent: February 26, 2013

Assignee: DTS LLC

Inventors: Jun Yang, Richard J. Oliver, James Tracey, Xing He
Method, medium and apparatus enhancing a bass signal using an auditory property

Patent number: 8386242

Abstract: Provided are a method, medium and apparatus for enhancing an acoustic signal using an auditory property. An acoustic signal is enhanced by generating a plurality of harmonic signals based on a predetermined acoustic signal frequency, selecting harmonic signals, which exist in an area masked by the predetermined harmonic signal, from among the generated plurality of harmonic signals, and outputting harmonic signals remaining after excluding the selected harmonic signals from the generated plurality of harmonic signals. The enhancement results in a bass signal of good sound quality and having a low distortion ratio, without changing the structure of a micro speaker.

Type: Grant

Filed: June 22, 2007

Date of Patent: February 26, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jung-ho Kim, Sang-wook Kim, Young-tae Kim, Sang-chul Ko
Method and system of dynamically changing a sentence structure of a message

Patent number: 8380484

Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) among an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.

Type: Grant

Filed: August 10, 2004

Date of Patent: February 19, 2013

Assignee: International Business Machines Corporation

Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
Distributed apparatus and method for a perceptual quality measurement service

Patent number: 8370132

Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.

Type: Grant

Filed: November 21, 2005

Date of Patent: February 5, 2013

Assignee: Verizon Services Corp.

Inventor: Adrian E. Conway
Method and apparatus for increasing speech intelligibility in noisy environments

Patent number: 8364477

Abstract: A method (400, 500) and apparatus (220) seek to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.

Type: Grant

Filed: August 30, 2012

Date of Patent: January 29, 2013

Assignee: Motorola Mobility LLC

Inventors: Jianming J Song, John C Johnson
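
The per-band gain logic described in the abstract of patent 8364477 might be sketched as follows. The threshold, step size, and clipping limit are hypothetical values, gains are assumed to be in decibels so that combining means adding, and the scaling, normalization, and smoothing stages are omitted.

```python
from typing import List, Sequence


def formant_gains_db(
    band_snr_db: Sequence[float],
    formant_bands: Sequence[int],
    threshold_db: float = 6.0,   # assumed SNR threshold
    step_db: float = 3.0,        # assumed gain increment
) -> List[float]:
    """Raise the gain of each formant-bearing band whose SNR is below threshold."""
    gains = [0.0] * len(band_snr_db)
    for band in formant_bands:
        if band_snr_db[band] < threshold_db:
            gains[band] += step_db
    return gains


def combine_and_clip(
    formant_db: Sequence[float], highpass_db: Sequence[float], max_db: float = 12.0
) -> List[float]:
    """Add high pass filter gains to formant gains, then clip to max_db."""
    return [min(f + h, max_db) for f, h in zip(formant_db, highpass_db)]
```

For example, with band SNRs of 10, 2, 8 and -1 dB and formants in bands 1, 2 and 3, only bands 1 and 3 fall below the 6 dB threshold and receive the extra gain.
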
Device and method for automatically adjusting gain

Patent number: 8363854

Abstract: A device and method are provided for automatically adjusting gain, including a conversion module for converting an audio time-domain signal to an audio frequency-domain signal, an analysis module for analyzing the audio frequency-domain signal in accordance with an equal-loudness level contour of human hearing so as to generate strength weightings and generating a signal strength in accordance with the weightings, a calculation module for calculating a gain by analysis of the audio frequency-domain signal when the signal strength falls outside a default range, and a control module for generating an audio output signal in accordance with the gain and the audio time-domain signal.

Type: Grant

Filed: October 17, 2008

Date of Patent: January 29, 2013

Assignee: Realtek Semiconductor Corp.

Inventors: Kai-Hsiang Chou, Wen-Haw Wang, Yu-Heng Chen, Mei-Yu Fan
Method, apparatus and computer program product for providing improved audio processing

Patent number: 8355921

Abstract: An apparatus for performing improved audio processing may include a processor. The processor may be configured to divide respective signals of each channel of a multi-channel audio input signal into one or more spectral bands corresponding to respective analysis frames, select a leading channel from among channels of the multi-channel audio input signal for at least one spectral band, determine a time shift value for at least one spectral band of at least one channel, and time align the channels based at least in part on the time shift value.

Type: Grant

Filed: June 13, 2008

Date of Patent: January 15, 2013

Assignee: Nokia Corporation

Inventors: Mikko Tapio Tammi, Miikka Tapani Vilermo
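
The time alignment step in the abstract of patent 8355921 can be illustrated with a simplified sketch. The abstract does not state how the time shift value is determined; the version below assumes it is the lag that maximizes the cross-correlation with the leading channel, and it works on one full-band frame rather than per spectral band. The names best_lag and align are hypothetical.

```python
from typing import List, Sequence


def best_lag(lead: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Lag (in samples) by which `other` trails `lead`, found by cross-correlation."""
    def corr(lag: int) -> float:
        return sum(
            lead[i] * other[i + lag]
            for i in range(len(lead))
            if 0 <= i + lag < len(other)
        )
    return max(range(-max_lag, max_lag + 1), key=corr)


def align(other: Sequence[float], lag: int) -> List[float]:
    """Shift `other` by `lag` samples so it lines up with the leading channel."""
    n = len(other)
    # Samples shifted in from outside the frame are filled with zeros.
    return [other[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]
```

A channel that carries the same pulse two samples later than the leading channel yields a lag of 2, and shifting it by that lag reproduces the leading channel.
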
Aural similarity measuring system for text

Patent number: 8346548

Abstract: The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.

Type: Grant

Filed: March 5, 2008

Date of Patent: January 1, 2013

Assignee: Mongoose Ventures Limited

Inventor: Mark Owen
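
The steps in the abstract of patent 8346548 (convert to phonemes, equalize lengths, compare, score, rank) can be sketched as below. The text-to-phoneme conversion is skipped: inputs are assumed to be phoneme lists already. End padding with a filler symbol and a position-by-position match ratio are assumptions, since the abstract does not specify the adjustment or scoring rule.

```python
from typing import Dict, List, Tuple

PAD = "_"  # hypothetical filler symbol used to equalize lengths


def equalize(a: List[str], b: List[str]) -> Tuple[List[str], List[str]]:
    """Pad the shorter phoneme string at the end so both have equal length."""
    n = max(len(a), len(b))
    return a + [PAD] * (n - len(a)), b + [PAD] * (n - len(b))


def similarity(target: List[str], reference: List[str]) -> float:
    """Fraction of positions at which both strings carry the same phoneme."""
    a, b = equalize(target, reference)
    if not a:
        return 1.0  # two empty strings are treated as identical
    matches = sum(1 for x, y in zip(a, b) if x == y and x != PAD)
    return matches / len(a)


def rank(target: List[str], references: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Score every reference against the target, most similar first."""
    scores = [(name, similarity(target, ph)) for name, ph in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

With made-up transcriptions, a target of Z OW L ER scores 0.75 against S OW L ER and ranks it above longer or less similar references.
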
System and method for multi-channel pitch detection

Patent number: 8321211

Abstract: A method and system for multi-channel detection of pitch may comprise one or more of the following steps and/or means therefor: (a) sampling an audio input stream including at least a first channel and a second channel; (b) setting a search frequency for each of the first channel and the second channel; and (c) detecting a pitch of the first channel and a pitch of the second channel.

Type: Grant

Filed: March 2, 2009

Date of Patent: November 27, 2012

Assignee: University of Kansas-KU Medical Center Research Institute

Inventor: David W. Petr
Voice processing apparatus and method

Patent number: 8315855

Abstract: A character extraction section extracts character amounts, pertaining to a prosody of voice, from a voice signal sequentially in a time-serial manner. A difference value calculation section calculates a difference value between each of the extracted character amounts and a reference value. Processing values, corresponding to the individual character amounts, are generated in accordance with the respective difference values, and a voice processing section controls the individual character amounts of the voice signal in accordance with the processing values corresponding to the character amounts and thereby generates an output signal having a prosody changed from the prosody of the voice signal.

Type: Grant

Filed: July 22, 2009

Date of Patent: November 20, 2012

Assignee: Yamaha Corporation

Inventor: Yasuo Yoshioka
Identify features of speech based on events in a signal representing spoken sounds

Patent number: 8315856

Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.

Type: Grant

Filed: October 23, 2008

Date of Patent: November 20, 2012

Assignee: Red Shift Company, LLC

Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
Fast and accurate extraction of formants for speech recognition using a plurality of complex filters in parallel

Patent number: 8311812

Abstract: A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.

Type: Grant

Filed: December 1, 2009

Date of Patent: November 13, 2012

Assignee: Eliza Corporation

Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
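
The abstract of patent 8311812 bases its estimates on a filtered signal and a single-lag delay of that signal. A minimal sketch of that idea, assuming the filtered signal behaves like a single complex damped resonance (a standard one-pole model, not necessarily the patent's), is shown below. The complex filter bank itself is omitted, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Tuple


def single_lag_estimate(y_now: complex, y_prev: complex, fs: float) -> Tuple[float, float]:
    """Estimate frequency and bandwidth (Hz) from a sample and its one-sample delay."""
    product = y_now * y_prev.conjugate()
    # Phase advance per sample gives the instantaneous frequency.
    freq_hz = cmath.phase(product) * fs / (2 * math.pi)
    # Amplitude decay per sample gives the bandwidth under a one-pole model.
    bandwidth_hz = -math.log(abs(y_now) / abs(y_prev)) * fs / math.pi
    return freq_hz, bandwidth_hz


def damped_resonance(freq_hz: float, bandwidth_hz: float, fs: float, n: int) -> List[complex]:
    """Synthetic test signal exp((-pi*B + j*2*pi*f) * k / fs) for k = 0..n-1."""
    pole = cmath.exp(complex(-math.pi * bandwidth_hz, 2 * math.pi * freq_hz) / fs)
    return [pole ** k for k in range(n)]
```

Applied to two adjacent samples of a synthetic 500 Hz resonance with 80 Hz bandwidth sampled at 8 kHz, the estimate recovers both values to within floating-point error.
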
Method and apparatus for expanding bandwidth of voice signal

Patent number: 8311842

Abstract: A method and apparatus for expanding a bandwidth of an input narrowband voice signal is provided. The narrowband voice signal is analyzed separately for each frame, and a Degree of Voicing (DV) and a Degree of Stationary (DS) are calculated depending on the analysis. A Degree of Difficulty of Bandwidth Expansion (DDBWE) of the narrowband voice signal is calculated based on DV and DS. Bandwidth expansion is controlled according to DDBWE.

Type: Grant

Filed: March 3, 2008

Date of Patent: November 13, 2012

Assignee: Samsung Electronics Co., Ltd

Inventors: Geun-Bae Song, Min-Sung Kim, Hee-Jin Oh, Austin Kim, Jae-Bum Kim
Sub-band periodic signal enhancement system

Patent number: 8306821

Abstract: A signal enhancement system reinforces signal content and improves the signal-to-noise ratio of a signal. The system detects, tracks, and reinforces non-stationary periodic signal components of a signal. The periodic signal components may represent vowel sounds or other voiced sounds. The system may detect, track, and attenuate quasi-stationary signal components in the signal.

Type: Grant

Filed: June 4, 2007

Date of Patent: November 6, 2012

Assignee: QNX Software Systems Limited

Inventors: Rajeev Nongpiur, Phillip A. Hetherington
Method and apparatus of increasing speech intelligibility in noisy environments

Patent number: 8280730

Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants are modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.

Type: Grant

Filed: May 25, 2005

Date of Patent: October 2, 2012

Assignee: Motorola Mobility LLC

Inventors: Jianming J. Song, John C. Johnson
System and method for multidimensional gesture analysis

Patent number: 8280732

Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.

Type: Grant

Filed: March 26, 2009

Date of Patent: October 2, 2012

Inventors: Wolfgang Richter, Roland Aubauer
Speech synthesis using complex spectral modeling

Patent number: 8280724

Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.

Type: Grant

Filed: January 31, 2005

Date of Patent: October 2, 2012

Assignee: Nuance Communications, Inc.

Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
Voice band expansion device, voice band expansion method, and communication apparatus

Patent number: 8280727

Abstract: A voice band expansion device includes a time-frequency converter that calculates a frequency spectrum of a voice signal having a first frequency band; a separator that extracts, from the frequency spectrum, an envelope amplitude spectrum, a periodic amplitude spectrum, and a random amplitude spectrum; an envelope amplitude spectrum band expander that expands a frequency band to a second frequency band that is different from the first frequency band; a periodic amplitude spectrum band expander that expands a frequency band to the second frequency band; a random amplitude spectrum band expander that expands a frequency band of the random amplitude spectrum to the second frequency band; a broadband spectrum calculator that calculates a broadband frequency spectrum having the first frequency band and the second frequency band; and a frequency-time converter that generates a voice signal having the first frequency band and the second frequency band.

Type: Grant

Filed: May 11, 2010

Date of Patent: October 2, 2012

Assignee: Fujitsu Limited

Inventors: Kaori Endo, Takeshi Otani, Taro Togawa, Yasuji Ota
Voice conversion method and system

Patent number: 8234110

Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.

Type: Grant

Filed: September 29, 2008

Date of Patent: July 31, 2012

Assignee: Nuance Communications, Inc.

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
Pitch-based frequency domain voice removal

Patent number: 8219390

Abstract: A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.

Type: Grant

Filed: September 16, 2003

Date of Patent: July 10, 2012

Assignee: Creative Technology Ltd

Inventor: Jean Laroche
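
The enable/disable rule described in the abstract of patent 8219390 behaves like a two-threshold hysteresis gate. The following is a minimal sketch of that idea only, assuming the secondary signal has already been reduced to one level value per block; the function name `gate_states` and the threshold values in the usage note are illustrative and do not come from the patent.

```python
def gate_states(levels, enable_threshold, disable_threshold, modifying=False):
    """Return, for each secondary-signal level, whether modification is active.

    Modification switches on when the level rises above enable_threshold
    while inactive, and switches off when the level drops below
    disable_threshold while active. Between the thresholds the state holds.
    """
    states = []
    for level in levels:
        if not modifying and level > enable_threshold:
            modifying = True       # secondary signal became loud enough
        elif modifying and level < disable_threshold:
            modifying = False      # secondary signal faded out
        states.append(modifying)
    return states
```

For example, `gate_states([0.1, 0.6, 0.4, 0.2, 0.4], 0.5, 0.3)` returns `[False, True, True, False, False]`: the level 0.4 keeps modification on while it is active but does not turn it back on once it is off.
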
System for processing an audio signal to enhance speech intelligibility

Patent number: 8204742

Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.

Type: Grant

Filed: September 14, 2009

Date of Patent: June 19, 2012

Assignee: SRS Labs, Inc.

Inventors: Jun Yang, Richard J. Oliver, James Tracey
Method and system for extracting opinions from text documents

Patent number: 8200477

Abstract: A method and system for extracting opinions about a subject of interest from a text document in which each sentence is analyzed individually to identify the opinions. The most relevant feature terms related to the subject are extracted from the document based on their relevancy scores. Candidate feature terms are definite noun phrases at the beginning of the sentences. For each sentence that refers to the subject or a feature term, the invention determines whether the sentence includes an opinion polarity about the subject or the feature term. The opinion polarity is detected by identifying opinion terms in the sentence using an opinion dictionary or an opinion rule base, parsing the sentence with an English parser to identify grammatical components in the sentence and their relationships, and finding a matching entry in the dictionary or the rule base.

Type: Grant

Filed: October 22, 2003

Date of Patent: June 12, 2012

Assignee: International Business Machines Corporation

Inventors: Jeonghee Yi, Tetsuya Nasukawa, Razvan Constantin Bunescu
Low-complexity, non-intrusive speech quality assessment

Patent number: 8195449

Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.

Type: Grant

Filed: January 30, 2007

Date of Patent: June 5, 2012

Assignee: Telefonaktiebolaget L M Ericsson (Publ)

Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
Information transmission device

Patent number: 8185395

Abstract: An information transmission device which analyzes a diction of a speaker and provides an utterance in accordance with the diction of the speaker, and which has a microphone detecting a sound signal of the speaker, a feature extraction unit extracting at least one feature value of the diction of the speaker based on the sound signal detected by the microphone, a voice synthesis unit synthesizing a voice signal to be uttered so that the voice signal has the same feature value as the diction of the speaker, based on the feature value extracted by the feature extraction unit, and a voice output unit performing an utterance based on the voice signal synthesized by the voice synthesis unit.

Type: Grant

Filed: September 13, 2005

Date of Patent: May 22, 2012

Assignee: Honda Motor Co., Ltd.

Inventors: Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino
Wideband audio signal coding/decoding device and method

Patent number: 8170885

Abstract: Disclosed is a wideband audio signal coding/decoding device and method that may code a wideband audio signal while maintaining a low bit rate. The wideband audio signal coding device includes an enhancement layer that extracts a first spectrum parameter from an inputted wideband signal having a first bandwidth, quantizes the extracted first spectrum parameter, and converts the extracted first spectrum parameter into a second spectrum parameter; and a coding unit that extracts a narrowband signal from the inputted wideband signal and codes the narrowband signal based on the second spectrum parameter provided from the enhancement layer, wherein the narrowband signal has a second bandwidth smaller than the first bandwidth. The wideband audio signal coding/decoding device and method may code a wideband audio signal while maintaining a low bit rate.

Type: Grant

Filed: October 15, 2008

Date of Patent: May 1, 2012

Assignee: Gwangju Institute of Science and Technology

Inventors: Hong Kook Kim, Young Han Lee
Adaptive filter pitch extraction

Patent number: 8150682

Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.

Type: Grant

Filed: May 11, 2011

Date of Patent: April 3, 2012

Assignee: QNX Software Systems Limited

Inventors: Rajeev Nongpiur, Phillip A. Hetherington
Received voice playback apparatus

Patent number: 8145476

Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digit

Type: Grant

Filed: January 10, 2008

Date of Patent: March 27, 2012

Assignee: Ricoh Company, Ltd.

Inventor: Yukihiro Imai
Automatic level control of speech signals

Patent number: 8121835

Abstract: Automatic level control of speech portions of an audio signal is provided. An audio signal is received in the form of a sequence of samples and may contain speech portions and non-speech portions. The sequence of samples is divided into a sequence of sub-frames. Multiple sub-frames adjacent to a present sub-frame are examined to determine a peak value of samples in the sub-frames. A gain factor is computed for the present sub-frame based on the peak value and a desired maximum value for the speech portions, and each sample in the present sub-frame is amplified by the gain factor. In an embodiment, variations in filtered energy values of multiple sub-frames enable determination of whether a sub-frame corresponds to a speech or non-speech/noise portion.

Type: Grant

Filed: March 6, 2008

Date of Patent: February 21, 2012

Assignee: Texas Instruments Incorporated

Inventor: Fitzgerald John Archibald
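
The sub-frame gain computation in the abstract of patent 8121835 can be illustrated with a short sketch. It rests on several assumptions: "adjacent" is taken to mean a fixed number of sub-frames on each side of the present one (`lookaround`, a hypothetical parameter), the gain is taken to be the ratio of the desired maximum to the local peak, and the speech/non-speech decision based on filtered energy is left out entirely.

```python
def level_control(samples, subframe_len, desired_peak, lookaround=1):
    """Scale each sub-frame so that the peak of its neighbourhood maps to desired_peak."""
    # Split the sample sequence into consecutive sub-frames.
    subframes = [samples[i:i + subframe_len]
                 for i in range(0, len(samples), subframe_len)]
    out = []
    for idx, frame in enumerate(subframes):
        # Examine the present sub-frame together with its neighbours.
        lo = max(0, idx - lookaround)
        hi = min(len(subframes), idx + lookaround + 1)
        peak = max(abs(s) for f in subframes[lo:hi] for s in f)
        # Assumed gain rule: desired maximum divided by the local peak.
        gain = desired_peak / peak if peak > 0 else 1.0
        out.extend(s * gain for s in frame)
    return out
```

Taking the peak over neighbouring sub-frames rather than only the present one keeps the gain from jumping sharply between a quiet sub-frame and a loud one next to it.
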
Distributed audio coding for wireless hearing aids

Patent number: 8077893

Abstract: The aim of the invention is to provide inter-channel level differences (ICLD) related to audio signals for hearing aids.

Type: Grant

Filed: May 30, 2008

Date of Patent: December 13, 2011

Assignee: Ecole Polytechnique Federale de Lausanne

Inventors: Olivier Roy, Martin Vetterli
Speech processing method and apparatus, storage medium, and speech system

Patent number: 8065138

Abstract: A speech processing apparatus includes a spectrum envelope extracting unit which extracts the spectrum envelope of an input speech signal, a spectrum envelope deforming unit which applies deformation to the spectrum envelope to generate a deformed spectrum envelope, a spectrum fine structure extracting unit which extracts the spectrum fine structure of the input speech signal, a deformed spectrum generating unit which generates a deformed spectrum by combining the deformed spectrum envelope with the spectrum fine structure, and a speech generating unit which generates an output speech signal on the basis of the deformed spectrum. This apparatus emits a disrupting sound based on the output speech signal to prevent a third party from eavesdropping on a conversation.

Type: Grant

Filed: August 31, 2007

Date of Patent: November 22, 2011

Assignees: Japan Advanced Institute of Science and Technology, Glory Ltd.

Inventors: Masato Akagi, Rieko Futonagane, Yoshihiro Irie, Hisakazu Yanagiuchi, Yoshitane Tanaka
Scalable encoding apparatus, scalable decoding apparatus, and methods thereof

Patent number: 8019597

Abstract: A scalable encoding apparatus is capable of reducing the bit rates of encoded parameters and of efficiently encoding audio signals in which a plurality of harmonic structures coexist. In the apparatus, an MDCT analyzer performs MDCT analysis on an audio signal for converting/encoding processes. A pitch frequency converter determines an inverse of a pitch period to calculate a pitch frequency. A selector selects spectra located at frequencies that are integral multiples of the pitch frequency, and a second layer encoder encodes the selected spectra.

Type: Grant

Filed: October 26, 2005

Date of Patent: September 13, 2011

Assignee: Panasonic Corporation

Inventor: Masahiro Oshikiri
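
The pitch-frequency and selection steps in the abstract of patent 8019597 can be sketched as follows. This is a simplified illustration, not the patented encoder: it assumes the pitch period is given in samples, that the spectrum has `num_bins` uniformly spaced bins covering 0 Hz to half the sampling rate, and that bin `k` sits at `k` times the bin width (a real MDCT places bin centres half a bin higher). The function name `harmonic_bins` is hypothetical.

```python
def harmonic_bins(pitch_period, sample_rate, num_bins):
    """Return indices of the spectral bins nearest to multiples of the pitch frequency."""
    pitch_freq = sample_rate / pitch_period      # inverse of the pitch period, in Hz
    bin_width = (sample_rate / 2) / num_bins     # assumed uniform bin spacing, in Hz
    bins = []
    k = 1
    while True:
        index = round(k * pitch_freq / bin_width)  # bin closest to the k-th harmonic
        if index >= num_bins:
            break
        bins.append(index)
        k += 1
    return bins
```

With a pitch period of 80 samples at 16000 Hz the pitch frequency is 200 Hz, so for 320 bins (25 Hz apart) the selected indices are 8, 16, 24 and so on up to 312. Only these bins would be passed to the second-layer encoder in this sketch.
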
Classification-based frame loss concealment for audio signals

Patent number: 8015000

Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.

Type: Grant

Filed: April 13, 2007

Date of Patent: September 6, 2011

Assignee: Broadcom Corporation

Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
Formants extracting method combining spectral peak picking and roots extraction

Patent number: 8000959

Abstract: In a formants extracting method capable of precisely obtaining formants as resonance frequencies of voice with less computational complexity, the method includes searching for a maximum value by a spectral peak-picking method, judging whether the number of formants corresponding to a zero at the obtained maximum point is two, and analyzing a pertinent root by root polishing when the number of formants is judged to be two. The number of formants is judged by applying Cauchy's integral formula, wherein Cauchy's integral formula is applied not repeatedly but only once, at a surrounding portion of the maximum value in the z-domain.

Type: Grant

Filed: October 6, 2004

Date of Patent: August 16, 2011

Assignee: LG Electronics Inc.

Inventor: Chan-Woo Kim
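
The root-counting step in the abstract of patent 8000959 can be illustrated with the argument principle, a standard consequence of Cauchy's integral theorem: the contour integral of p'(z)/p(z) around a closed curve, divided by 2πi, equals the number of roots of p inside it. The sketch below is an assumption-laden illustration rather than the patented procedure: it takes the polynomial as a plain coefficient list, uses a circular contour around the candidate peak, and approximates the integral with a fixed number of steps. The function names and the `steps` parameter are hypothetical.

```python
import cmath
import math


def poly_and_derivative(coeffs, z):
    """Evaluate a polynomial (highest power first) and its derivative at z by Horner's rule."""
    p = 0j
    dp = 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def count_roots_in_circle(coeffs, center, radius, steps=1024):
    """Count polynomial roots inside a circle by numerically integrating p'/p around it."""
    total = 0j
    for k in range(steps):
        # Point on the contour, expressed as an offset from the centre.
        offset = radius * cmath.exp(2j * math.pi * k / steps)
        p, dp = poly_and_derivative(coeffs, center + offset)
        # dz = i * offset * dtheta; the factors of i and 2*pi cancel against 1/(2*pi*i).
        total += dp / p * offset
    return round((total / steps).real)
```

For instance, `count_roots_in_circle([1, -1.85, 0.855], 0.925, 0.1)` returns 2, because the polynomial (z - 0.9)(z - 0.95) has both roots inside that circle. A count of two around a single spectral peak is the situation in which the abstract calls for root polishing to separate the merged formants. The contour must not pass through a root, since p would then be zero at a sample point.
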
Method and apparatus for processing speech data with classification models

Patent number: 7957959

Abstract: A method for processing speech data includes obtaining a pitch (fundamental frequency) and at least one formant frequency for each of a plurality of pieces of first speech data; constructing a first feature space with the obtained fundamental frequencies and formant frequencies as features; and classifying the plurality of pieces of first speech data using the first feature space, thereby obtaining a plurality of speech data classes and their corresponding descriptions.

Type: Grant

Filed: July 20, 2007

Date of Patent: June 7, 2011

Assignee: Nuance Communications, Inc.

Inventors: Zhao Bing Han, Guo Kang Fu, Da Lai Yan
