Formant Patents (Class 704/209)
-
Patent number: 8706488
Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
Type: Grant
Filed: February 27, 2013
Date of Patent: April 22, 2014
Assignee: Nuance Communications, Inc.
Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
-
Patent number: 8700389
Abstract: The present invention includes model-based processing of linguistic user inputs. In one embodiment, the present invention includes a computer-implemented method comprising receiving linguistic inputs, parsing the linguistic inputs, mapping the linguistic inputs to a formal representation used by a model, applying the formal representation against the model, where the model comprises said formal representation, and where the model specifies relationships between the elements of the formal representation and defines process information, and accessing software resources based on the formal representation of the user input and the relationships and process information in said model.
Type: Grant
Filed: December 23, 2010
Date of Patent: April 15, 2014
Assignee: SAP AG
Inventors: Markus Latzina, Joerg Beringer
-
Patent number: 8682671
Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
Type: Grant
Filed: April 17, 2013
Date of Patent: March 25, 2014
Assignee: Nuance Communications, Inc.
Inventors: Darren C. Meyer, Stephen R. Springer
-
Patent number: 8645142
Abstract: System and method to improve intelligibility of coded speech, the method including: receiving an encoded speech signal from a network; extracting an encoded media data stream and one or more control data packets from the encoded speech signal; decoding the encoded media data stream to produce a decoded speech signal; boosting an upper spectral portion of the decoded speech signal to produce a boosted speech signal; and outputting the boosted speech signal. In another embodiment, the method may include: receiving an uncoded speech signal; processing the uncoded speech signal, wherein the processing comprises generating an unencoded data stream from the uncoded speech signal; boosting an upper spectral portion of the unencoded data stream to produce a boosted speech signal; encoding the boosted speech signal to produce an encoded speech signal; and outputting the boosted speech signal.
Type: Grant
Filed: March 27, 2012
Date of Patent: February 4, 2014
Assignee: Avaya Inc.
Inventors: Heinz Teutsch, John Cornelius Lynch
-
Patent number: 8639499
Abstract: A noise cancellation device includes a plurality of first computation modules, a formant detection module, a direction of arrival module and a beamformer. The plurality of first computation modules receives raw audio data and generates a respective transformed signal as a function of formants. A first transformed signal relates to speech data and a second transformed signal relates to noise data. The formant detection module receives the first transformed signal and generates a frequency range data signal. The direction of arrival module receives the first and second transformed signals, determines a cross-correlation between the first and second transformed signals, and generates a spatial orientation data signal. The beamformer receives the first and second transformed signals, the frequency range data signal, and the spatial orientation data signal and generates modification data at selected formant ranges to eliminate a maximum amount of the noise data.
Type: Grant
Filed: July 28, 2010
Date of Patent: January 28, 2014
Assignee: Motorola Solutions, Inc.
Inventors: Kaustubh Kale, Yong Wang
-
Patent number: 8620646
Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
Type: Grant
Filed: August 8, 2011
Date of Patent: December 31, 2013
Assignee: The Intellisis Corporation
Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
-
Patent number: 8612238
Abstract: An encoding method and apparatus and a decoding method and apparatus are provided. The decoding method includes extracting a three-dimensional (3D) down-mix signal from an input bitstream, generating a down-mix signal with 3D effects removed therefrom by performing a 3D rendering operation on the extracted 3D down-mix signal, and generating a 3D down-mix signal with 3D effects by performing a 3D rendering operation on the generated down-mix signal. Accordingly, it is possible to efficiently encode multi-channel signals with 3D effects and to adaptively restore and reproduce audio signals with optimum sound quality according to the characteristics of an audio reproduction environment.
Type: Grant
Filed: February 7, 2007
Date of Patent: December 17, 2013
Assignee: LG Electronics, Inc.
Inventors: Yang Won Jung, Hee Suk Pang, Hyen O Oh, Dong Soo Kim, Jae Hyun Lim
-
Patent number: 8571870
Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
Type: Grant
Filed: August 9, 2010
Date of Patent: October 29, 2013
Assignee: Nuance Communications, Inc.
Inventors: Darren C. Meyer, Stephen R. Springer
-
Patent number: 8560305
Abstract: LOGIFOLG is a system and method for finding implicit information that is not explicitly mentioned in the sentence, not contained in the synonyms of the particular word, not present in the concept the word belongs to, not found with statistical or concordance based analysis. Nevertheless, this implicit information is present and understood, implicitly, consciously or unconsciously, by everybody who reads the text. LOGIFOLG uses a computer software process, such as computer-executable program code, to discover this implicit information. The steps in this process are: analyzing user's written input, up to five successive and non-successive words in a sequence, understanding the meaning of the written input, finding implicit information in the written input and finally, displaying the implicit information as a variant of the original sentence. The subject matter of the invention deals with Artificial Reasoning, namely inductive and deductive reasoning, based on Natural Language written sentences.
Type: Grant
Filed: May 16, 2012
Date of Patent: October 15, 2013
Inventor: Hristo Georgiev
-
Publication number: 20130262096
Abstract: A system-effected method for synthesizing speech, or recognizing speech including a sequence of expressive speech utterances. The method can be computer-implemented and can include system-generating a speech signal embodying the sequence of expressive speech utterances. Other possible steps include: system-marking the speech signal with a pitch marker indicating a pitch change at or near a first zero amplitude crossing point of the speech signal following a glottal closure point, at a minimum, at a maximum or at another location; system marking the speech signal with at least one further pitch marker; system-aligning a sequence of prosodically marked text with the pitch-marked speech signal according to the pitch markers; and system outputting the aligned text or the aligned speech signal, respectively. Computerized systems, and stored programs for implementing method embodiments of the invention are also disclosed.
Type: Application
Filed: September 21, 2012
Publication date: October 3, 2013
Applicant: LESSAC TECHNOLOGIES, INC.
Inventors: Reiner WILHELMS-TRICARICO, Brian MOTTERSHEAD, Rattima NITISAROJ, Michael BAUMGARTNER, John B. REICHENBACH, Gary A. MARPLE
-
Publication number: 20130231927
Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
Type: Application
Filed: August 20, 2012
Publication date: September 5, 2013
Inventors: PIERRE ZAKARAUSKAS, ALEXANDER ESCOTT, CLARENCE S.H. CHU, SHAWN E. STEVENSON
-
Patent number: 8463599
Abstract: A method includes defining a transition band for a signal having a spectrum within a first frequency band, where the transition band is defined as a portion of the first frequency band, and is located near an adjacent frequency band that is adjacent to the first frequency band. The method analyzes the transition band to obtain a transition band spectral envelope and a transition band excitation spectrum; estimates an adjacent frequency band spectral envelope; generates an adjacent frequency band excitation spectrum by periodic repetition of at least a part of the transition band excitation spectrum with a repetition period determined by a pitch frequency of the signal; and combines the adjacent frequency band spectral envelope and the adjacent frequency band excitation spectrum to obtain an adjacent frequency band signal spectrum. A signal processing logic for performing the method is also disclosed.
Type: Grant
Filed: February 4, 2009
Date of Patent: June 11, 2013
Assignee: Motorola Mobility LLC
Inventors: Tenkasi Ramabadran, Mark Jasiuk
-
Patent number: 8447592
Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
Type: Grant
Filed: September 13, 2005
Date of Patent: May 21, 2013
Assignee: Nuance Communications, Inc.
Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
-
Patent number: 8447610
Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
Type: Grant
Filed: August 9, 2010
Date of Patent: May 21, 2013
Assignee: Nuance Communications, Inc.
Inventors: Darren C. Meyer, Stephen R. Springer
-
Patent number: 8407046
Abstract: A method of transmitting an input audio signal is disclosed. A current spectral magnitude of the input audio signal is quantized. A quantization error of a previous spectral magnitude is fed back to influence quantization of the current spectral magnitude. The feeding back includes adaptively modifying a quantization criterion to form a modified quantization criterion. A current quantization error is minimized by using the modified quantization criterion. A quantized spectral envelope is formed based on the minimizing and the quantized spectral envelope is transmitted.
Type: Grant
Filed: September 4, 2009
Date of Patent: March 26, 2013
Assignee: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
-
Patent number: 8396703
Abstract: A band-limited voice signal is processed to reduce its spectral envelope or harmonic structure, or both. The resulting reduced signal is moved into a frequency band above the upper limit frequency of the band-limited voice signal, and then combined with the band-limited voice signal to form a band expanded signal with improved quality and comprehensibility, free of unnatural high-frequency resonances and unnaturally strong high-frequency harmonics.
Type: Grant
Filed: March 5, 2009
Date of Patent: March 12, 2013
Assignee: Oki Electric Industry Co., Ltd.
Inventor: Hiromi Aoyagi
-
Patent number: 8386247
Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.
Type: Grant
Filed: June 18, 2012
Date of Patent: February 26, 2013
Assignee: DTS LLC
Inventors: Jun Yang, Richard J. Oliver, James Tracey, Xing He
-
Patent number: 8386242
Abstract: Provided are a method, medium and apparatus for enhancing an acoustic signal using an auditory property. An acoustic signal is enhanced by generating a plurality of harmonic signals based on a predetermined acoustic signal frequency, selecting harmonic signals, which exist in an area masked by the predetermined harmonic signal, from among the generated plurality of harmonic signals, and outputting harmonic signals remaining after excluding the selected harmonic signals from the generated plurality of harmonic signals. The enhancement results in a bass signal of good sound quality and having a low distortion ratio, without changing the structure of a micro speaker.
Type: Grant
Filed: June 22, 2007
Date of Patent: February 26, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventors: Jung-ho Kim, Sang-wook Kim, Young-tae Kim, Sang-chul Ko
-
Patent number: 8380484
Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) among an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
Type: Grant
Filed: August 10, 2004
Date of Patent: February 19, 2013
Assignee: International Business Machines Corporation
Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
-
Patent number: 8370132
Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
Type: Grant
Filed: November 21, 2005
Date of Patent: February 5, 2013
Assignee: Verizon Services Corp.
Inventor: Adrian E. Conway
-
Patent number: 8364477
Abstract: A method (400, 500) and apparatus (220) seeks to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
Type: Grant
Filed: August 30, 2012
Date of Patent: January 29, 2013
Assignee: Motorola Mobility LLC
Inventors: Jianming J Song, John C Johnson
-
Patent number: 8363854
Abstract: A device and method are provided for automatically adjusting gain, including a conversion module for converting an audio time-domain signal to an audio frequency-domain signal, an analysis module for analyzing the audio frequency-domain signal in accordance with an equal-loudness level contour of human hearing so as to generate strength weightings and generating a signal strength in accordance with the weightings, a calculation module for calculating a gain by analysis of the audio frequency-domain signal when the signal strength falls outside a default range, and a control module for generating an audio output signal in accordance with the gain and the audio time-domain signal.
Type: Grant
Filed: October 17, 2008
Date of Patent: January 29, 2013
Assignee: Realtek Semiconductor Corp.
Inventors: Kai-Hsiang Chou, Wen-Haw Wang, Yu-Heng Chen, Mei-Yu Fan
-
Patent number: 8355921
Abstract: An apparatus for performing improved audio processing may include a processor. The processor may be configured to divide respective signals of each channel of a multi-channel audio input signal into one or more spectral bands corresponding to respective analysis frames, select a leading channel from among channels of the multi-channel audio input signal for at least one spectral band, determine a time shift value for at least one spectral band of at least one channel, and time align the channels based at least in part on the time shift value.
Type: Grant
Filed: June 13, 2008
Date of Patent: January 15, 2013
Assignee: Nokia Corporation
Inventors: Mikko Tapio Tammi, Miikka Tapani Vilermo
-
Patent number: 8346548
Abstract: The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.
Type: Grant
Filed: March 5, 2008
Date of Patent: January 1, 2013
Assignee: Mongoose Ventures Limited
Inventor: Mark Owen
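Illustrative sketch (not taken from the patent): the abstract for 8346548 outlines a pipeline of length equalisation, comparison, scoring and ranking. The Python below shows one way such a pipeline could look, assuming the texts have already been converted to phoneme lists. The padding symbol, the position-by-position scoring rule, and all function names are hypothetical choices made for illustration; the abstract does not say how the adjustment or the score is actually computed.

```python
from typing import Dict, List, Sequence, Tuple

PAD = "_"  # hypothetical filler symbol, not a real phoneme


def pad_to_equal(a: Sequence[str], b: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Extend the shorter phoneme string with PAD so both have equal length."""
    n = max(len(a), len(b))
    return list(a) + [PAD] * (n - len(a)), list(b) + [PAD] * (n - len(b))


def similarity(target: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of positions at which the two equal-length strings agree."""
    a, b = pad_to_equal(target, reference)
    if not a:
        return 1.0  # two empty strings are treated as identical
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def rank_references(
    target: Sequence[str], references: Dict[str, Sequence[str]]
) -> List[Tuple[str, float]]:
    """Score every reference against the target, most similar first."""
    scored = [(name, similarity(target, ph)) for name, ph in references.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A caller would pass the target's phoneme list and a dictionary of named reference phoneme lists to `rank_references` and read the ranking from the top. A real system would more likely use a phonetically weighted alignment than exact position matching.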
-
Patent number: 8321211
Abstract: A method and system for multi-channel detection of pitch may comprise one or more of the following steps and/or means therefore: (a) sampling an audio input stream including at least a first channel and a second channel; (b) setting a search frequency for each of the first channel and the second channel; and (c) detecting a pitch of the first channel and a pitch of the second channel.
Type: Grant
Filed: March 2, 2009
Date of Patent: November 27, 2012
Assignee: University of Kansas-KU Medical Center Research Institute
Inventor: David W. Petr
-
Patent number: 8315855
Abstract: Character extraction section extracts character amounts, pertaining to a prosody of voice, from a voice signal sequentially in a time-serial manner. Difference value calculation calculates a difference value between each of the extracted character amounts and a reference value. Processing values, corresponding to the individual character amounts, are generated in accordance with the respective difference values, and a voice processing section controls the individual character amounts of the voice signal in accordance with the processing values corresponding to the character amounts and thereby generates an output signal having a prosody changed from the prosody of the voice signal.
Type: Grant
Filed: July 22, 2009
Date of Patent: November 20, 2012
Assignee: Yamaha Corporation
Inventor: Yasuo Yoshioka
-
Patent number: 8315856
Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
Type: Grant
Filed: October 23, 2008
Date of Patent: November 20, 2012
Assignee: Red Shift Company, LLC
Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
-
Patent number: 8311812
Abstract: A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.
Type: Grant
Filed: December 1, 2009
Date of Patent: November 13, 2012
Assignee: Eliza Corporation
Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
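Illustrative sketch (not taken from the patent): the abstract for 8311812 does not give the estimator itself. One textbook way to obtain a frequency and bandwidth from a complex filtered signal and its one-sample delay is to model the signal locally as a single damped complex exponential, y[n] = exp((-πB + j2πf)·n/fs). The phase of y[n]·conj(y[n-1]) then yields f and the magnitude ratio yields B. The function name and this signal model are assumptions made for illustration.

```python
import cmath
import math
from typing import Tuple


def resonance_estimate(y_now: complex, y_prev: complex, fs: float) -> Tuple[float, float]:
    """Estimate (frequency_hz, bandwidth_hz) from a complex sample and its
    one-sample-delayed predecessor, assuming a single damped complex exponential."""
    # Phase advance per sample equals 2*pi*f/fs.
    lag_product = y_now * y_prev.conjugate()
    freq_hz = cmath.phase(lag_product) * fs / (2.0 * math.pi)
    # Amplitude decay per sample equals exp(-pi*B/fs).
    bandwidth_hz = -math.log(abs(y_now) / abs(y_prev)) * fs / math.pi
    return freq_hz, bandwidth_hz
```

The frequency estimate is unambiguous only while the true phase advance per sample stays within (-π, π), that is, for |f| < fs/2. Real speech would also need the band-pass filtering stage the abstract mentions, which is omitted here.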
-
Patent number: 8311842
Abstract: A method and apparatus for expanding a bandwidth of an input narrowband voice signal is provided. The narrowband voice signal is analyzed separately for each frame, and a Degree of Voicing (DV) and a Degree of Stationary (DS) are calculated depending on the analysis. A Degree of Difficulty of Bandwidth Expansion (DDBWE) of the narrowband voice signal is calculated based on DV and DS. Bandwidth expansion is controlled according to DDBWE.
Type: Grant
Filed: March 3, 2008
Date of Patent: November 13, 2012
Assignee: Samsung Electronics Co., Ltd
Inventors: Geun-Bae Song, Min-Sung Kim, Hee-Jin Oh, Austin Kim, Jae-Bum Kim
-
Patent number: 8306821
Abstract: A signal enhancement system reinforces signal content and improves the signal-to-noise ratio of a signal. The system detects, tracks, and reinforces non-stationary periodic signal components of a signal. The periodic signal components may represent vowel sounds or other voiced sounds. The system may detect, track, and attenuate quasi-stationary signal components in the signal.
Type: Grant
Filed: June 4, 2007
Date of Patent: November 6, 2012
Assignee: QNX Software Systems Limited
Inventors: Rajeev Nongpiur, Phillip A. Hetherington
-
Patent number: 8280730
Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants are modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
Type: Grant
Filed: May 25, 2005
Date of Patent: October 2, 2012
Assignee: Motorola Mobility LLC
Inventors: Jianming J. Song, John C. Johnson
-
Patent number: 8280732
Abstract: Hand gestures are translated by first detecting the hand gestures with an electronic sensor and converting the detected gestures into respective electrical transfer signals in a frequency band corresponding to that of speech. These transfer signals are inputted in the audible-sound frequency band into a speech-recognition system where they are analyzed.
Type: Grant
Filed: March 26, 2009
Date of Patent: October 2, 2012
Inventors: Wolfgang Richter, Roland Aubauer
-
Patent number: 8280724
Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.
Type: Grant
Filed: January 31, 2005
Date of Patent: October 2, 2012
Assignee: Nuance Communications, Inc.
Inventors: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
-
Patent number: 8280727
Abstract: A voice band expansion device includes a time-frequency converter that calculates a frequency spectrum of a voice signal having a first frequency band; a separator that extracts, from the frequency spectrum, an envelope amplitude spectrum, a periodic amplitude spectrum, and a random amplitude spectrum; an envelope amplitude spectrum band expander that expands a frequency band to a second frequency band that is different from the first frequency band; a periodic amplitude spectrum band expander that expands a frequency band to the second frequency band; a random amplitude spectrum band expander that expands a frequency band of the random amplitude spectrum to the second frequency band; a broadband spectrum calculator that calculates a broadband frequency spectrum having the first frequency band and the second frequency band; and a frequency-time converter that generates a voice signal having the first frequency band and the second frequency band.
Type: Grant
Filed: May 11, 2010
Date of Patent: October 2, 2012
Assignee: Fujitsu Limited
Inventors: Kaori Endo, Takeshi Otani, Taro Togawa, Yasuji Ota
-
Patent number: 8234110
Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.
Type: Grant
Filed: September 29, 2008
Date of Patent: July 31, 2012
Assignee: Nuance Communications, Inc.
Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
-
Patent number: 8219390Abstract: A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.Type: GrantFiled: September 16, 2003Date of Patent: July 10, 2012Assignee: Creative Technology LtdInventor: Jean Laroche
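The enable/disable rule in this abstract behaves like a two-threshold (hysteresis) gate. The following is a minimal Python sketch of that idea only, assuming the level of the secondary signal has already been measured once per block; the function name `modification_gate` and the threshold values are hypothetical and are not taken from the patent.

```python
from typing import Iterable, List


def modification_gate(levels: Iterable[float],
                      enable_threshold: float,
                      disable_threshold: float) -> List[bool]:
    """Return, per block, whether modification of the primary signal is on.

    Hypothetical helper: `levels` holds one secondary-signal level per block.
    """
    modifying = False
    states: List[bool] = []
    for level in levels:
        if not modifying and level > enable_threshold:
            # Secondary level rose above the first threshold while idle.
            modifying = True
        elif modifying and level < disable_threshold:
            # Secondary level dropped below the second threshold while active.
            modifying = False
        states.append(modifying)
    return states
```

For example, `modification_gate([0.1, 0.6, 0.4, 0.2, 0.05, 0.3], 0.5, 0.1)` switches on at the second block and off at the fifth. Keeping the disable threshold below the enable threshold prevents rapid toggling when the level hovers near a single threshold.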
-
Patent number: 8204742Abstract: An adaptive audio system can be implemented in a communication device. The adaptive audio system can enhance voice in an audio signal received by the communication device to increase intelligibility of the voice. The audio system can adapt the audio enhancement based at least in part on levels of environmental content, such as noise, that are received by the communication device. For higher levels of environmental content, for example, the audio system might apply the audio enhancement more aggressively. Additionally, the adaptive audio system can detect substantially periodic content in the environmental content. The adaptive audio system can further adapt the audio enhancement responsive to the environmental content.Type: GrantFiled: September 14, 2009Date of Patent: June 19, 2012Assignee: SRS Labs, Inc.Inventors: Jun Yang, Richard J. Oliver, James Tracey
-
Patent number: 8200477Abstract: A method and system for extracting opinions about a subject of interest from a text document in which each sentence is analyzed individually to identify the opinions. The most relevant feature terms related to the subject are extracted from the document based on their relevancy scores. Candidate feature terms are definite noun phrases at the beginning of the sentences. For each sentence that refers to the subject or a feature term, the invention determines whether the sentence includes an opinion polarity about the subject or the feature term. The opinion polarity is detected by identifying opinion terms in the sentence using an opinion dictionary or an opinion rule base, parsing the sentence with an English parser to identify grammatical components in the sentence and their relationships, and finding a matching entry in the dictionary or the rule base.Type: GrantFiled: October 22, 2003Date of Patent: June 12, 2012Assignee: International Business Machines CorporationInventors: Jeonghee Yi, Tetsuya Nasukawa, Razvan Constantin Bunescu
-
Patent number: 8195449Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.Type: GrantFiled: January 30, 2007Date of Patent: June 5, 2012Assignee: Telefonaktiebolaget L M Ericsson (Publ)Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
-
Patent number: 8185395Abstract: An information transmission device which analyzes a diction of a speaker and provides an utterance in accordance with the diction of the speaker, and which has a microphone detecting a sound signal of the speaker, a feature extraction unit extracting at least one feature value of the diction of the speaker based on the sound signal detected by the microphone, a voice synthesis unit synthesizing a voice signal to be uttered so that the voice signal has the same feature value as the diction of the speaker, based on the feature value extracted by the feature extraction unit, and a voice output unit performing an utterance based on the voice signal synthesized by the voice synthesis unit.Type: GrantFiled: September 13, 2005Date of Patent: May 22, 2012Assignee: Honda Motor Co., Ltd.Inventors: Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino
-
Patent number: 8170885Abstract: Disclosed is a wideband audio signal coding/decoding device and method that may code a wideband audio signal while maintaining a low bit rate. The wideband audio signal coding device includes an enhancement layer that extracts a first spectrum parameter from an inputted wideband signal having a first bandwidth, quantizes the extracted first spectrum parameter, and converts the extracted first spectrum parameter into a second spectrum parameter; and a coding unit that extracts a narrowband signal from the inputted wideband signal and codes the narrowband signal based on the second spectrum parameter provided from the enhancement layer, wherein the narrowband signal has a second bandwidth smaller than the first bandwidth.Type: GrantFiled: October 15, 2008Date of Patent: May 1, 2012Assignee: Gwangju Institute of Science and TechnologyInventors: Hong Kook Kim, Young Han Lee
-
Patent number: 8150682Abstract: An enhancement system extracts pitch from a processed speech signal. The system estimates the pitch of voiced speech by deriving filter coefficients of an adaptive filter and using the obtained filter coefficients to derive pitch. The pitch estimation may be enhanced by using various techniques to condition the input speech signal, such as spectral modification of the background noise and the speech signal, and/or reduction of the tonal noise from the speech signal.Type: GrantFiled: May 11, 2011Date of Patent: April 3, 2012Assignee: QNX Software Systems LimitedInventors: Rajeev Nongpiur, Phillip A. Hetherington
-
Patent number: 8145476Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digitType: GrantFiled: January 10, 2008Date of Patent: March 27, 2012Assignee: Ricoh Company, Ltd.Inventor: Yukihiro Imai
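The per-band gain computation in this abstract can be illustrated with a short Python sketch. It assumes, as a labeled assumption not stated in the abstract, that the first (measured) and second (set) frequency characteristic values are band levels in decibels, so that a level difference maps to a linear amplitude gain of 10^(difference/20). The function name `band_gains` is hypothetical.

```python
from typing import List, Sequence


def band_gains(measured_db: Sequence[float],
               target_db: Sequence[float]) -> List[float]:
    """Linear gain per band from the difference between target and measured levels.

    Assumption: both inputs are per-band levels in dB, one value per band.
    """
    if len(measured_db) != len(target_db):
        raise ValueError("one measured and one target value are needed per band")
    # A difference of +20 dB corresponds to an amplitude gain of 10.
    return [10.0 ** ((t - m) / 20.0) for m, t in zip(measured_db, target_db)]
```

Each band of the signal would then be multiplied by its gain, which moves the measured characteristic toward the set one.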
-
Patent number: 8121835Abstract: Automatic level control of speech portions of an audio signal is provided. An audio signal is received in the form of a sequence of samples and may contain speech portions and non-speech portions. The sequence of samples is divided into a sequence of sub-frames. Multiple sub-frames adjacent to a present sub-frame are examined to determine a peak value of the samples in those sub-frames. A gain factor is computed for the present sub-frame based on the peak value and a desired maximum value for the speech portions, and each sample in the present sub-frame is amplified by the gain factor. In an embodiment, variations in filtered energy values of multiple sub-frames enable determination of whether a sub-frame corresponds to a speech or non-speech/noise portion.Type: GrantFiled: March 6, 2008Date of Patent: February 21, 2012Assignee: Texas Instruments IncorporatedInventor: Fitzgerald John Archibald
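The sub-frame gain step described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented method: the window of `neighbors` sub-frames on each side, the `max_gain` cap, and the function name `level_control` are all hypothetical, and the speech/non-speech decision mentioned in the abstract is omitted.

```python
from typing import List, Sequence


def level_control(samples: Sequence[float],
                  subframe_len: int,
                  neighbors: int,
                  target_peak: float,
                  max_gain: float = 8.0) -> List[float]:
    """Scale each sub-frame so that its local peak approaches `target_peak`."""
    # Divide the sample sequence into consecutive sub-frames.
    frames = [samples[i:i + subframe_len]
              for i in range(0, len(samples), subframe_len)]
    out: List[float] = []
    for idx, frame in enumerate(frames):
        lo = max(0, idx - neighbors)
        hi = min(len(frames), idx + neighbors + 1)
        # Peak magnitude over the present sub-frame and its neighbours.
        peak = max(abs(s) for f in frames[lo:hi] for s in f)
        # Hypothetical cap so that near-silent sub-frames are not over-amplified.
        gain = min(target_peak / peak, max_gain) if peak > 0 else 1.0
        out.extend(s * gain for s in frame)
    return out
```

Taking the peak over neighbouring sub-frames rather than the present one alone makes the gain change more slowly from one sub-frame to the next.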
-
Patent number: 8077893Abstract: The aim of the invention is to provide inter-channel level differences ICLD related to audio signals for hearing aids.Type: GrantFiled: May 30, 2008Date of Patent: December 13, 2011Assignee: Ecole Polytechnique Federale de LausanneInventors: Olivier Roy, Martin Vetterli
-
Patent number: 8065138Abstract: A speech processing apparatus includes a spectrum envelope extracting unit which extracts the spectrum envelope of an input speech signal, a spectrum envelope deforming unit which applies deformation to the spectrum envelope to generate a deformed spectrum envelope, a spectrum fine structure extracting unit which extracts the spectrum fine structure of the input speech signal, a deformed spectrum generating unit which generates a deformed spectrum by combining the deformed spectrum envelope with the spectrum fine structure, and a speech generating unit which generates an output speech signal on the basis of the deformed spectrum. This apparatus emits a disrupting sound based on the output speech signal to prevent a third party from eavesdropping on a conversation.Type: GrantFiled: August 31, 2007Date of Patent: November 22, 2011Assignees: Japan Advanced Institute of Science and Technology, Glory Ltd.Inventors: Masato Akagi, Rieko Futonagane, Yoshihiro Irie, Hisakazu Yanagiuchi, Yoshitane Tanaka
-
Patent number: 8019597Abstract: A scalable encoding apparatus capable of reducing the bit rates of encoded parameters and also capable of efficiently encoding audio signals in which a plurality of harmonic structures coexist. In the apparatus, an MDCT analyzer performs MDCT analysis on an audio signal for transform coding. A pitch frequency converter calculates a pitch frequency as the inverse of a pitch period. A selector selects spectra located at frequencies that are integral multiples of the pitch frequency, and a second layer encoder encodes the selected spectra.Type: GrantFiled: October 26, 2005Date of Patent: September 13, 2011Assignee: Panasonic CorporationInventor: Masahiro Oshikiri
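The selection step in this abstract (pitch frequency as the inverse of the pitch period, then spectra at integer multiples of that frequency) can be sketched in Python. The sketch assumes uniformly spaced bins with bin `i` centred at `i * bin_width_hz`; real MDCT bin centres are offset by half a bin, so this is a simplification. The function name `harmonic_bins` and its parameters are hypothetical.

```python
from typing import List


def harmonic_bins(pitch_period_s: float,
                  bin_width_hz: float,
                  num_bins: int) -> List[int]:
    """Indices of the spectral bins nearest to integer multiples of the pitch."""
    if pitch_period_s <= 0 or bin_width_hz <= 0:
        raise ValueError("pitch period and bin width must be positive")
    pitch_hz = 1.0 / pitch_period_s  # pitch frequency = inverse of pitch period
    bins: List[int] = []
    k = 1
    while True:
        idx = round(k * pitch_hz / bin_width_hz)  # bin nearest the k-th harmonic
        if idx >= num_bins:
            break
        if not bins or idx != bins[-1]:
            bins.append(idx)  # skip duplicates when harmonics share a bin
        k += 1
    return bins
```

With a 5 ms pitch period (200 Hz), 50 Hz bins, and 20 bins, the harmonics fall on bins 4, 8, 12, and 16. Only those coefficients would be passed on to the second-layer encoder in this simplified picture.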
-
Patent number: 8015000Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.Type: GrantFiled: April 13, 2007Date of Patent: September 6, 2011Assignee: Broadcom CorporationInventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
-
Patent number: 8000959Abstract: A formant extraction method capable of precisely obtaining formants, the resonance frequencies of voice, with low computational complexity. The method includes searching for a maximum value by spectral peak-picking, judging whether the number of formants corresponding to a zero at the obtained maximum point is two, and analyzing the pertinent root by root polishing when the number of formants is judged to be two. The number of formants is judged by applying Cauchy's integral formula, which is applied not repeatedly but only once, in a region surrounding the maximum value in the z-domain.Type: GrantFiled: October 6, 2004Date of Patent: August 16, 2011Assignee: LG Electronics Inc.Inventor: Chan-Woo Kim
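Counting the roots of a polynomial inside a closed contour in the z-domain is commonly done with the argument principle, a consequence of Cauchy's integral theorem: the count equals (1/2πi)∮A'(z)/A(z) dz. The Python sketch below evaluates that integral numerically on a circle. It illustrates the general technique and is not necessarily the formulation used in the patent; the function name `count_roots_in_circle`, the circular contour, and the sample count are assumptions.

```python
import cmath
from typing import List, Sequence


def count_roots_in_circle(coeffs: Sequence[complex],
                          center: complex,
                          radius: float,
                          num_points: int = 512) -> int:
    """Count polynomial zeros inside a circle using the argument principle.

    `coeffs` are ordered highest degree first, e.g. [1, -1.7, 0.72]
    for z^2 - 1.7 z + 0.72. No root may lie on the circle itself.
    """
    degree = len(coeffs) - 1
    # Coefficients of the derivative polynomial A'(z).
    deriv: List[complex] = [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

    def horner(poly: Sequence[complex], z: complex) -> complex:
        acc = 0j
        for c in poly:
            acc = acc * z + c
        return acc

    total = 0j
    for k in range(num_points):
        offset = radius * cmath.exp(2j * cmath.pi * k / num_points)
        z = center + offset
        # With dz = i * offset * dtheta, the 1/(2*pi*i) factor reduces the
        # trapezoidal sum to the mean of A'(z)/A(z) * offset over the circle.
        total += horner(deriv, z) / horner(coeffs, z) * offset
    return round((total / num_points).real)
```

For the polynomial with roots at 0.9 and 0.8, a circle of radius 0.2 centred at 0.85 encloses both roots, while a circle of radius 0.1 centred at 0.95 encloses only one. A single evaluation of this kind around a spectral peak indicates whether one or two roots need to be refined.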
-
Patent number: 7957959Abstract: A method for processing speech data includes obtaining a fundamental frequency (pitch) and at least one formant frequency for each of a plurality of pieces of first speech data; constructing a first feature space with the obtained fundamental frequencies and formant frequencies as features; and classifying the plurality of pieces of first speech data using the first feature space, thereby obtaining a plurality of speech data classes and their corresponding descriptions.Type: GrantFiled: July 20, 2007Date of Patent: June 7, 2011Assignee: Nuance Communications, Inc.Inventors: Zhao Bing Han, Guo Kang Fu, Da Lai Yan