Excitation Patents (Class 704/264)

Separable, intelligible, single channel voice communication

Patent number: 12255671

Abstract: The method provides for separable subchannels sharing a communication channel. A processor receives input of a user setting a transmitter device to a first of at least two subchannels of a communication channel in which the first subchannel comprises a first portion of a bandwidth of the communication channel. The processor receives an audio signal as input to the transmitter device. The processor converts a time-series waveform of the audio signal into a frequency-series waveform. The processor determines that the transmitter device is set to the first subchannel. In response to determining the device is set to the first channel, the processor filters the frequency-series waveform through a series of steep shoulder digital bandpass filters set to transmit through the first portion of the bandwidth, and the processor transmits the audio signal as the filtered frequency-series waveform.
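
The frequency-domain step in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the "series of steep shoulder digital bandpass filters" is replaced by an idealized brick-wall mask over DFT bins, and the function names, sample rate, and band edges are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (time series -> frequency series)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex time-domain samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def filter_to_subchannel(samples, sample_rate, low_hz, high_hz):
    """Zero every DFT bin outside [low_hz, high_hz] (assumed brick-wall mask)."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 hold negative frequencies; fold them to |frequency|.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    # The mask is symmetric, so the result is real up to rounding error.
    return [value.real for value in idft(spectrum)]
```

Under these assumptions, a second transmitter set to a different subchannel would call the same function with a non-overlapping band, so that both signals can share one channel.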

Type: Grant

Filed: March 16, 2023

Date of Patent: March 18, 2025

Assignee: International Business Machines Corporation

Inventors: Hyman David Chantz, Robert Lynch, Elijah Swift

Method and system for generating time-frequency representation of a continuous signal

Patent number: 11366012

Abstract: A method and a system for generating a time-frequency representation of an aperiodic continuous input signal comprising generating a periodic train of short pulses having a repetition frequency, and sampling the signal temporally using the periodic train of short pulses to obtain a temporally sampled signal, the temporally sampled signal comprising a plurality of sampled copies of the input signal, each sampled copy being spaced in function of the repetition frequency of the periodic train of short pulses. The temporally sampled signal is delayed based on the repetition frequency to obtain a delayed temporally sampled signal comprising a plurality of delayed sampled copies, a spectral representation of a given delayed sampled copy being delayed in function of the repetition frequency. The delayed temporally sampled signal is evaluated over consecutive time slots to obtain, for each consecutive time slot, a respective output signal in the time-frequency domain.

Type: Grant

Filed: September 26, 2019

Date of Patent: June 21, 2022

Assignees: INSTITUT NATIONAL DE LA RECHERCHE SCIENTIFIQUE (INRS), CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE (CNRS), UNIVERSITÉ GRENOBLE ALPES

Inventors: Jose Azana, Konatham Saikrishna Reddy, Reza Maram, Hugues Guillet De Chatellus

Periodic-combined-envelope-sequence generation device, periodic-combined-envelope-sequence generation method, periodic-combined-envelope-sequence generation program and recording medium

Patent number: 11100938

Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.

Type: Grant

Filed: May 14, 2020

Date of Patent: August 24, 2021

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada

Hybrid concealment method: combination of frequency and time domain packet loss concealment in audio codecs

Patent number: 10984804

Abstract: Embodiments of the invention relate to an error concealment unit for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. The error concealment unit provides a first error concealment audio information component for a first frequency range using a frequency domain concealment. The error concealment unit also provides a second error concealment audio information component for a second frequency range, which includes lower frequencies than the first frequency range, using a time domain concealment. The error concealment unit also combines the first error concealment audio information component and the second error concealment audio information component, to obtain the error concealment audio information. Other embodiments of the invention relate to a decoder including the error concealment unit, as well as related encoders, methods, and computer programs for decoding and/or concealing.

Type: Grant

Filed: September 7, 2018

Date of Patent: April 20, 2021

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Jérémie Lecomte, Adrian Tomasek

Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system

Patent number: 10014007

Abstract: A method is presented for forming the excitation signal for a glottal pulse model based parametric speech synthesis system. In one embodiment, fundamental frequency values are used to form the excitation signal. The excitation is modeled using a voice source pulse selected from a database of a given speaker. The voice source signal is segmented into glottal segments, which are used in vector representation to identify the glottal pulse used for formation of the excitation signal. Use of a novel distance metric and preserving the original signals extracted from the speakers voice samples helps capture low frequency information of the excitation signal. In addition, segment edge artifacts are removed by applying a unique segment joining method to improve the quality of synthetic speech while creating a true representation of the voice quality of a speaker.

Type: Grant

Filed: May 28, 2014

Date of Patent: July 3, 2018

Inventors: Rajesh Dachiraju, Aravind Ganapathiraju

GAIN SHAPE ESTIMATION FOR IMPROVED TRACKING OF HIGH-BAND TEMPORAL CHARACTERISTICS

Publication number: 20150106102

Abstract: A method includes determining, at a speech encoder, first gain shape parameters based on a harmonically extended signal and/or based on a high-band residual signal associated with a high-band portion of an audio signal. The method also includes determining second gain shape parameters based on a synthesized high-band signal and based on the high-band portion of the audio signal. The method further includes inserting the first gain parameters and the second gain shape parameters into an encoded version of the audio signal to enable gain adjustment during reproduction of the audio signal from the encoded version of the audio signal.

Type: Application

Filed: October 7, 2014

Publication date: April 16, 2015

Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman S. Atti

SYSTEMS AND METHODS FOR MITIGATING SPEECH SIGNAL QUALITY DEGRADATION

Publication number: 20150100318

Abstract: A method for decoding a speech signal is described. The method includes obtaining a packet. The method also includes obtaining a previous lag value. The method further includes limiting the previous lag value if the previous lag value is greater than a maximum lag threshold. The method additionally includes disallowing an adjustment to a number of synthesized peaks if a combination of the number of synthesized peaks and an estimated number of peaks is not valid.

Type: Application

Filed: October 4, 2013

Publication date: April 9, 2015

Applicant: QUALCOMM Incorporated

Inventors: Venkatraman Rajagopalan, Venkatesh Krishnan, Alok K. Gupta

Display apparatus and voice conversion method thereof

Patent number: 8949123

Abstract: The voice conversion method of a display apparatus includes: in response to the receipt of a first video frame, detecting one or more entities from the first video frame; in response to the selection of one of the detected entities, storing the selected entity; in response to the selection of one of a plurality of previously-stored voice samples, storing the selected voice sample in connection with the selected entity; and in response to the receipt of a second video frame including the selected entity, changing a voice of the selected entity based on the selected voice sample and outputting the changed voice.

Type: Grant

Filed: April 11, 2012

Date of Patent: February 3, 2015

Assignee: Samsung Electronics Co., Ltd.

Inventors: Aditi Garg, Kasthuri Jayachand Yadlapalli

Vector joint encoding/decoding method and vector joint encoder/decoder

Patent number: 8930200

Abstract: A vector joint encoding/decoding method and a vector joint encoder/decoder are provided, more than two vectors are jointly encoded, and an encoding index of at least one vector is split and then combined between different vectors, so that encoding idle spaces of different vectors can be recombined, thereby facilitating saving of encoding bits, and because an encoding index of a vector is split and then shorter split indexes are recombined, thereby facilitating reduction of requirements for the bit width of operating parts in encoding/decoding calculation.

Type: Grant

Filed: July 24, 2013

Date of Patent: January 6, 2015

Assignee: Huawei Technologies Co., Ltd.

Inventors: Fuwei Ma, Dejun Zhang, Lei Miao, Fengyan Qi

Audio signal bandwidth extension in CELP-based speech coder

Patent number: 8868432

Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
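
The scaling and combining steps of this abstract can be sketched as below. This is a hedged illustration only: it assumes the bandpass filtering has already produced one signal per band, and it interprets the "energy-based parameters" as target energies per band, which the abstract does not state. The function name `combine_scaled_bands` is hypothetical.

```python
import math


def combine_scaled_bands(band_signals, target_energies, celp_signal):
    """Scale each band to its target energy and add it to the CELP-decoded signal.

    band_signals    -- list of equal-length sample lists, one per bandpass filter
    target_energies -- assumed energy-based parameter for each band
    celp_signal     -- signal based on the CELP decoder output
    """
    output = list(celp_signal)
    for band, target in zip(band_signals, target_energies):
        energy = sum(sample * sample for sample in band)
        # Gain that brings the band energy to the target; silent bands stay silent.
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        for i, sample in enumerate(band):
            output[i] += gain * sample
    return output
```

Working per band lets the decoder shape the extended spectrum with a handful of transmitted parameters rather than a full high-band waveform.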

Type: Grant

Filed: September 28, 2011

Date of Patent: October 21, 2014

Assignee: Motorola Mobility LLC

Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal

Decomposition of music signals using basis functions with time-evolution information

Patent number: 8805697

Abstract: Decomposition of a multi-source signal using a basis function inventory and a sparse recovery technique is disclosed.

Type: Grant

Filed: October 24, 2011

Date of Patent: August 12, 2014

Assignee: QUALCOMM Incorporated

Inventors: Erik Visser, Yinyi Guo, Mofei Zhu, Sang-Uk Ryu, Lae-Hoon Kim, Jongwon Shin

Bandwidth Extension via Constrained Synthesis

Publication number: 20130332171

Abstract: Audio signal bandwidth extension may be performed on a narrow bandwidth signal received from a remote source over the audio communication network. The narrow band signal bandwidth may be extended such that the bandwidth is greater than that of the audio communication network. The signal may be extended by synthesizing an audio signal having spectral values within an extended bandwidth from synthetic components. The synthetic components may be generated using parameters derived from original narrowband audio signal. The audio signal may be synthesized in the form of an excitation signal and vocal tract envelope. The excitation signal and vocal tract may be extended independently. In various embodiments, excitation components may be derived from constrained synthesis using a constraint filter with nulls in regions where the extension is desired.

Type: Application

Filed: June 12, 2013

Publication date: December 12, 2013

Inventors: Carlos Avendano, Marios Athineos, Ethan Duni

Encoding and decoding speech signals

Patent number: 8571039

Abstract: A method and apparatus for transmitting an audio signal over a communication channel comprising encoding the audio signal with an encoder 204 using a first sampling rate, filtering the audio signal using a first cut off frequency, the first cut off frequency being chosen in dependence upon the first sampling rate, and transmitting the encoded and filtered audio signal over the communication channel. The presence of a condition in which the sampling rate of the encoder 204 is to be switched to a second sampling rate at a switching time is determined and if the condition has been determined to be present, the cut off frequency used in the filtering step is gradually changed from the first cut off frequency to a second cut off frequency, the second cut off frequency being chosen in dependence upon the second sampling rate, such that the audio bandwidth of the transmitted signal changes gradually when the sampling rate is switched to the second sampling rate.
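
The gradual cut-off change can be sketched as a schedule of intermediate cut-off frequencies. The abstract says only that the change is gradual; the linear ramp, the step count, and the name `cutoff_schedule` are assumptions made for illustration.

```python
def cutoff_schedule(first_cutoff_hz, second_cutoff_hz, num_steps):
    """Return num_steps + 1 cut-off values moving linearly from first to second.

    The linear shape is an assumption; any monotonic ramp would also be
    "gradual" in the sense of the abstract.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    step = (second_cutoff_hz - first_cutoff_hz) / num_steps
    return [first_cutoff_hz + i * step for i in range(num_steps + 1)]
```

For instance, `cutoff_schedule(8000, 4000, 4)` steps the cut-off down in 1000 Hz increments, so the audio bandwidth narrows over several frames instead of jumping at the switching time.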

Type: Grant

Filed: June 23, 2010

Date of Patent: October 29, 2013

Assignee: Skype

Inventors: Stefan Strommer, Karsten Vandborg Sorensen, Soren Skak Jensen, Koen Vos, Jon Bergenheim

Method and device for fast algebraic codebook search in speech and audio coding

Patent number: 8566106

Abstract: A method and device for searching an algebraic codebook during encoding of a sound signal, wherein the algebraic codebook comprises a set of codevectors formed of a number of pulse positions and a number of pulses distributed over the pulse positions. In the algebraic codebook searching method and device, a reference signal for use in searching the algebraic codebook is calculated. In a first stage, a position of a first pulse is determined in relation with the reference signal and among the number of pulse positions. In each of a number of stages subsequent to the first stage, (a) an algebraic codebook gain is recomputed, (b) the reference signal is updated using the recomputed algebraic codebook gain and (c) a position of another pulse is determined in relation with the updated reference signal and among the number of pulse positions.

Type: Grant

Filed: September 11, 2008

Date of Patent: October 22, 2013

Assignee: Voiceage Corporation

Inventors: Redwan Salami, Vaclav Eksler, Milan Jelinek

Method for segmenting audio signals

Patent number: 8521529

Abstract: An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.

Type: Grant

Filed: April 18, 2005

Date of Patent: August 27, 2013

Assignee: Creative Technology Ltd

Inventors: Michael M. Goodwin, Jean Laroche

Speech synthesizer, speech synthesizing method and program product

Patent number: 8494856

Abstract: According to one embodiment, a speech synthesizer includes an analyzer, a first estimator, a selector, a generator, a second estimator, and a synthesizer. The analyzer analyzes text and extracts a linguistic feature. The first estimator selects a first prosody model adapted to the linguistic feature and estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model. The selector selects speech units that minimize a cost function determined in accordance with the prosody information. The generator generates a second prosody model that is a model of the prosody information of the speech units. The second estimator estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model. The synthesizer generates synthetic speech by concatenating the speech units on the basis of the prosody information estimated by the second estimator.

Type: Grant

Filed: October 12, 2011

Date of Patent: July 23, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Javier Latorre, Masami Akamine

Method and arrangement for smoothing of stationary background noise

Patent number: 8457953

Abstract: In a method of smoothing background noise in a telecommunication speech session; receiving and decoding S10 a signal representative of a speech session, the signal comprising both a speech component and a background noise component. Subsequently, determining LPC parameters S20 and an excitation signal S30 for the received signal. Thereafter, synthesizing and outputting S40 an output signal based on the determined LPC parameters and excitation signal. In addition, modifying S35 the determined excitation signal by reducing power and spectral fluctuations of the excitation signal to provide a smoothed output signal.

Type: Grant

Filed: February 13, 2008

Date of Patent: June 4, 2013

Assignee: Telefonaktiebolaget LM Ericsson (Publ)

Inventor: Stefan Bruhn

Fixed codebook searching apparatus and fixed codebook searching method

Patent number: 8452590

Abstract: A fixed codebook searching apparatus, includes a convolution operator, implemented by at least one processor, that convolves an impulse response of a perceptually weighted synthesis filter with an impulse response vector that has values at negative times, to generate a second impulse response vector that has values at negative times. A matrix generator, implemented by at least one processor, generates a Toeplitz-type convolution matrix using the second impulse response vector generated by the convolution operator. A searcher, implemented by at least one processor, performs a codebook search by maximizing a term using the Toeplitz-type convolution matrix.
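
The convolution and matrix-generation steps can be sketched as follows. This is an illustrative reading of the abstract, not the claimed implementation: a two-sided sequence is stored as a list plus the time index of its first element, and the names `convolve_two_sided` and `toeplitz_matrix` are hypothetical. The search step that maximizes a term over the codebook is not shown.

```python
def convolve_two_sided(h, f, f_start):
    """Convolve causal h (starting at time 0) with f, whose first value is at f_start.

    f_start <= 0 means f has values at negative times. Returns the second
    impulse response vector and the time index of its first element.
    """
    out = [0.0] * (len(h) + len(f) - 1)
    for i, h_value in enumerate(h):
        for j, f_value in enumerate(f):
            out[i + j] += h_value * f_value
    return out, f_start


def toeplitz_matrix(h2, start, size):
    """Build a size x size matrix with entry [i][j] = h2 at time (i - j)."""

    def at(time):
        index = time - start
        return h2[index] if 0 <= index < len(h2) else 0.0

    return [[at(i - j) for j in range(size)] for i in range(size)]
```

Because the second impulse response has values at negative times, the matrix has non-zero entries above the diagonal, so it is Toeplitz but not lower triangular.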

Type: Grant

Filed: April 25, 2011

Date of Patent: May 28, 2013

Assignee: Panasonic Corporation

Inventors: Hiroyuki Ehara, Koji Yoshida

Voice conversion apparatus and method and speech synthesis apparatus and method

Patent number: 8438033

Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.

Type: Grant

Filed: July 20, 2009

Date of Patent: May 7, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima

Predictive speech signal coding

Patent number: 8433563

Abstract: A method, system and computer program for encoding speech according to a source-filter model. The method comprises deriving a spectral envelope signal representative of a modelled filter and a first remaining signal representative of a modelled source signal, and deriving a second remaining signal from the first remaining signal by, at intervals during the encoding: exploiting a correlation between approximately periodic portions in the first remaining signal to generate a predicted version of a later portion from a stored version of an earlier portion, and using the predicted-version of the later portion to remove an effect of said periodicity from the first remaining signal. The method further comprises, once every number of intervals, transforming the stored version of the earlier portion of the first remaining signal prior to generating the predicted version of the respective later portion.

Type: Grant

Filed: June 2, 2009

Date of Patent: April 30, 2013

Assignee: Skype

Inventors: Koen Bernard Vos, Soren Skak Jensen

Method, apparatus and computer program product for providing real glottal pulses in HMM-based text-to-speech synthesis

Patent number: 8386256

Abstract: An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech.

Type: Grant

Filed: May 29, 2009

Date of Patent: February 26, 2013

Assignee: Nokia Corporation

Inventors: Tuomo Johannes Raitio, Antti Santeri Suni, Martti Tapani Vainio, Paavo Ilmari Alku, Jani Kristian Nurminen

Method and apparatus for generating an excitation signal for background noise

Patent number: 8370154

Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. Moreover, the apparatus includes: a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. Through the synthesizing scheme of comfortable background noise according to the present invention, the transition of a synthesized signal from speech to background noise could be more natural, smooth and continuous, which makes the listeners feel more comfortable.
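
The weighted sum used in the transition stage can be sketched as a crossfade. The abstract does not specify the weights; the linear, per-sample ramp below and the name `transition_excitation` are assumptions for illustration.

```python
def transition_excitation(quasi, random_excitation, transition_length):
    """Blend the quasi excitation into the random (noise) excitation.

    The weight on the quasi excitation falls linearly from 1 to 0 over
    transition_length samples (assumed shape), after which only the
    random excitation remains.
    """
    if transition_length <= 0:
        raise ValueError("transition_length must be positive")
    blended = []
    for n, (q, r) in enumerate(zip(quasi, random_excitation)):
        weight = max(0.0, 1.0 - n / transition_length)
        blended.append(weight * q + (1.0 - weight) * r)
    return blended
```

Fading between the two excitations, rather than switching abruptly, is what the abstract credits for a more natural and continuous move from speech to background noise.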

Type: Grant

Filed: September 21, 2010

Date of Patent: February 5, 2013

Assignee: Huawei Technologies Co., Ltd.

Inventors: Jinliang Dai, Libin Zhang, Eyal Shlomot, Lin Wang

Generating prosodic contours for synthesized speech

Patent number: 8321225

Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method including receiving text to be synthesized as a spoken utterance. The method includes analyzing the received text to determine attributes of the received text and selecting one or more utterances from a database based on a comparison between the attributes of the received text and attributes of text representing the stored utterances. The method includes determining, for each utterance, a distance between a contour of the utterance and a hypothetical contour of the spoken utterance, the determination based on a model that relates distances between pairs of contours of the utterances to relationships between attributes of text for the pairs. The method includes selecting a final utterance having a contour with a closest distance to the hypothetical contour and generating a contour for the received text based on the contour of the final utterance.

Type: Grant

Filed: November 14, 2008

Date of Patent: November 27, 2012

Assignee: Google Inc.

Inventors: Martin Jansche, Michael D. Riley, Andrew M. Rosenberg, Terry Tai

Hidden Markov model based text to speech systems employing rope-jumping algorithm

Patent number: 8315871

Abstract: A rope-jumping algorithm is employed in a Hidden Markov Model based text to speech system to determine start and end models and to modify the start and end models by setting small co-variances. Disordered acoustic parameters due to violation of parameter constraints are avoided through the modification and result in stable line frequency spectrum for the generated speech.

Type: Grant

Filed: June 4, 2009

Date of Patent: November 20, 2012

Assignee: Microsoft Corporation

Inventors: Wenlin Wang, Guoliang Zhang, Jingyang Xu

OBFUSCATED SPEECH SYNTHESIS

Publication number: 20120239406

Abstract: The present invention relates to a method for synthesizing a speech signal; comprising obtaining a speech sequence input signal comprising semantic content corresponding to a speaker's utterance; analyzing the input speech sequence signal to obtain a first sequence of feature vectors for the input speech sequence signal; synthesizing a second sequence of feature vectors different from and based on the first sequence of feature vectors; generating an excitation signal and filtering the excitation signal based on the second sequence of feature vectors to obtain a synthesized speech signal wherein the semantic content is obfuscated.

Type: Application

Filed: December 2, 2009

Publication date: September 20, 2012

Inventors: Johan Nikolaas Langehoveen Brummer, Avery Maxwell Glasser, Luis Buera Rodriquez

SPEECH SYNTHESIS AND CODING METHODS

Publication number: 20120123782

Abstract: The present invention is related to a method for coding excitation signal of a target speech comprising the steps of: extracting from a set of training normalised residual frames, a set of relevant normalised residual frames, said training residual frames being extracted from a training speech, synchronised on Glottal Closure Instant (GCI), pitch and energy normalised; determining the target excitation signal of the target speech; dividing said target excitation signal into GCI synchronised target frames; determining the local pitch and energy of the GCI synchronised target frames; normalising the GCI synchronised target frames in both energy and pitch, to obtain target normalised residual frames; determining coefficients of linear combination of said extracted set of relevant normalised residual frames to build synthetic normalised residual frames close to each target normalised residual frames; wherein the coding parameters for each target residual frames comprise the determined coefficients.

Type: Application

Filed: March 30, 2010

Publication date: May 17, 2012

Inventors: Geoffrey Wilfart, Thomas Drugman, Thierry Dutoit

Systems and methods for reducing speech intelligibility while preserving environmental sounds

Patent number: 8140326

Abstract: An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced.

Type: Grant

Filed: June 6, 2008

Date of Patent: March 20, 2012

Assignee: Fuji Xerox Co., Ltd.

Inventors: Francine Chen, John Adcock

Scalable encoding device and scalable encoding method

Patent number: 8036390

Abstract: A scalable encoding device prevents sound quality deterioration of a decoded signal, reduces the encoding rate, and reduces the circuit size. The scalable encoding device includes a first layer encoder for generating a monaural signal by using a plurality of channel signals (L channel signal and R channel signal) constituting a stereo signal and encoding the monaural signal to generate a sound source parameter. The scalable encoding device also includes a second layer encoder for generating a first conversion signal by using the channel signal and the monaural signal, generating a synthesis signal by using the sound source parameter and the first conversion signal, and generating a second conversion coefficient index by using the synthesis signal and the first conversion signal.

Type: Grant

Filed: January 30, 2006

Date of Patent: October 11, 2011

Assignee: Panasonic Corporation

Inventors: Michiyo Goto, Koji Yoshida

Low-complexity code excited linear prediction encoding

Patent number: 8000967

Abstract: Information about excitation signals of a first signal encoded by CELP is used to derive a limited set of candidate excitation signals for a second, correlated signal. Preferably, pulse locations of the excitation signals of the first encoded signal are used for determining the set of candidate excitation signals. More preferably, the pulse locations of the set of candidate excitation signals are positioned in the vicinity of the pulse locations of the excitation signals of the first encoded signal. The first and second signals may be multi-channel signals of a common speech or audio signal. However, the first and second signals may also be identical, whereby the coding of the second signal can be utilized for re-encoding at a lower bit rate.

Type: Grant

Filed: March 9, 2005

Date of Patent: August 16, 2011

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventor: Anisse Taleb

Fixed codebook searching apparatus and fixed codebook searching method

Patent number: 7949521

Abstract: A fixed codebook searching apparatus which slightly suppresses an increase in the operation amount, even if the filter applied to the excitation pulse has the characteristic that it cannot be represented by a lower triangular matrix and realizes a quasi-optimal fixed codebook search. This fixed codebook searching apparatus is provided with an algebraic codebook that generates a pulse excitation vector; a convolution operation section that convolutes an impulse response of auditory weighted synthesis filter into an impulse response vector that has a value at negative times, to generate a second impulse response vector that has a value at second negative times; a matrix generating section that generates a Toeplitz-type convolution matrix by means of the second impulse response vector; and a convolution operation section that convolutes the matrix generated by matrix generating section into the pulse excitation vector generated by algebraic codebook.

Type: Grant

Filed: February 25, 2009

Date of Patent: May 24, 2011

Assignee: Panasonic Corporation

Inventors: Hiroyuki Ehara, Koji Yoshida

Sound processing apparatus and method, and program therefor

Patent number: 7945446

Abstract: Spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. Output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. Sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.

Type: Grant

Filed: March 9, 2006

Date of Patent: May 17, 2011

Assignee: Yamaha Corporation

Inventors: Hideki Kemmochi, Yasuo Yoshioka, Jordi Bonada

METHOD AND APPARATUS FOR GENERATING AN EXCITATION SIGNAL FOR BACKGROUND NOISE

Publication number: 20110022391

Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. Moreover, the apparatus includes: a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. Through the synthesizing scheme of comfortable background noise according to the present invention, the transition of a synthesized signal from speech to background noise could be more natural, smooth and continuous, which makes the listeners feel more comfortable.

Type: Application

Filed: September 21, 2010

Publication date: January 27, 2011

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jinliang DAI, Libin ZHANG, Eyal SHLOMOT, Lin WANG
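
The weighted sum described in the abstract above can be sketched as a crossfade over the transition length. The linear weight schedule is an assumption made for illustration; the abstract does not state how the weights are chosen, and the function and parameter names are hypothetical.

```python
def transition_excitation(quasi, random_exc, transition_len):
    """Blend a quasi excitation into a random background-noise excitation.

    Assumes a linearly decaying weight for the quasi excitation
    (an illustrative choice, not specified in the abstract).
    """
    out = []
    for n, (q, r) in enumerate(zip(quasi, random_exc)):
        w = max(0.0, 1.0 - n / transition_len)   # weight of the quasi excitation
        out.append(w * q + (1.0 - w) * r)        # weighted sum of the two signals
    return out


blended = transition_excitation([1.0] * 4, [0.0] * 4, transition_len=4)
```

With these inputs the quasi excitation fades out sample by sample while the random excitation fades in.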
PERIODIC SIGNAL PROCESSING METHOD, PERIODIC SIGNAL CONVERSION METHOD, PERIODIC SIGNAL PROCESSING DEVICE, AND PERIODIC SIGNAL ANALYSIS METHOD

Publication number: 20110015931

Abstract: The invention relates to a periodic signal processing method, a periodic signal conversion method, and a periodic signal processing device capable of reducing the influence of periodicity without using a spectral model. Time windows are arranged such that a center of each of the time windows is at a division position which divides a fundamental period in a temporal direction into fractions 1/n (where n is an integer equal to or larger than 2) so as to extract a plurality of portions of different ranges from a signal having periodicity. A power spectrum for the plurality of portions extracted by the respective time windows is calculated, and the calculated power spectra are added with the same ratio.

Type: Application

Filed: July 18, 2008

Publication date: January 20, 2011

Inventors: Hideki Kawahara, Masanori Morise, Toru Takahashi, Toshio Irino
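
A rough sketch of the windowing scheme in the abstract above: window centers are offset by fractions of one fundamental period, a power spectrum is computed for each window, and the spectra are averaged with equal weights. The Hann window, the default window length of two periods, the naive DFT, and all names are assumptions chosen for illustration only.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0 .. len(frame) // 2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def windowed(signal, center, length):
    """Extract `length` samples around `center` with a Hann window."""
    start = center - length // 2
    out = []
    for i in range(length):
        idx = start + i
        s = signal[idx] if 0 <= idx < len(signal) else 0.0
        w = 0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / length)
        out.append(s * w)
    return out


def averaged_power_spectrum(signal, center, period, n=2, win_len=None):
    """Average power spectra of n windows spaced period / n apart."""
    win_len = win_len or 2 * period
    spectra = [
        power_spectrum(windowed(signal, center + round(m * period / n), win_len))
        for m in range(n)
    ]
    return [sum(vals) / n for vals in zip(*spectra)]


# Illustrative input: a pulse train with a period of 20 samples.
pulse_train = [1.0 if t % 20 == 0 else 0.0 for t in range(200)]
spectrum = averaged_power_spectrum(pulse_train, center=100, period=20, n=2)
```

Setting `n=1` reduces the function to a single windowed power spectrum, which makes it easy to compare against the averaged result.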
Method and apparatus to encode and/or decode signal using bandwidth extension technology

Patent number: 7864843

Abstract: A method and apparatus to perform bandwidth extension encoding and decoding encodes and/or decodes a high frequency signal using an excitation signal for a low frequency signal encoded in a time domain or a frequency domain or using an excitation spectrum for the low frequency signal. Accordingly, although an audio signal is encoded or decoded using a small number of bits, the quality of sound corresponding to a signal in a high frequency band does not degrade. Therefore, a coding efficiency of the audio signal can be maximized.

Type: Grant

Filed: June 4, 2007

Date of Patent: January 4, 2011

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ki-hyun Choo, Jung-hoe Kim, Eun-mi Oh, Miao Lei, Chang-yong Son
Method and apparatus for speech decoding based on a parameter of the adaptive code vector

Patent number: 7747441

Abstract: A high quality speech is reproduced with a small data amount in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal. In speech coding method according to a code-excited linear prediction (CELP) speech coding, a noise level of a speech in a concerning coding period is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and various excitation codebooks are used based on an evaluation result.

Type: Grant

Filed: January 16, 2007

Date of Patent: June 29, 2010

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventor: Tadashi Yamaura
VOICE CONVERSION APPARATUS AND METHOD AND SPEECH SYNTHESIS APPARATUS AND METHOD

Publication number: 20100049522

Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.

Type: Application

Filed: July 20, 2009

Publication date: February 25, 2010

Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING IMPROVED SPEECH SYNTHESIS

Publication number: 20090299747

Abstract: An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech.

Type: Application

Filed: May 29, 2009

Publication date: December 3, 2009

Inventors: Tuomo Johannes Raitio, Antti Santeri Suni, Martti Tapani Vainio, Paavo Ilmari Alku, Jani Kristian Nurminen
Voice synthesizer of multi sounds

Patent number: 7613612

Abstract: In a voice synthesizer, an envelope acquisition portion obtains a spectral envelope of a reference frequency spectrum of a given voice. A spectrum acquisition portion obtains a collective frequency spectrum of a plurality of voices which are generated in parallel to one another. An envelope adjustment portion adjusts a spectral envelope of the collective frequency spectrum obtained by the spectrum acquisition portion so as to approximately match with the spectral envelope of the reference frequency spectrum obtained by the envelope acquisition portion. A voice generation portion generates an output voice signal from the collective frequency spectrum having the spectral envelope adjusted by the envelope adjustment portion.

Type: Grant

Filed: January 31, 2006

Date of Patent: November 3, 2009

Assignee: Yamaha Corporation

Inventors: Hideki Kemmochi, Jordi Bonada
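
The envelope adjustment in the abstract above can be sketched as scaling each bin of the collective spectrum by the ratio of the two envelopes. Estimating the envelope with a moving average over magnitude bins is an assumption made here for brevity; the abstract does not say how the envelope is obtained, and the names and values are hypothetical.

```python
def envelope(magnitudes, half_width=2):
    """Crude spectral envelope: moving average over neighbouring bins."""
    out = []
    for k in range(len(magnitudes)):
        lo = max(0, k - half_width)
        hi = min(len(magnitudes), k + half_width + 1)
        out.append(sum(magnitudes[lo:hi]) / (hi - lo))
    return out


def match_envelope(collective, reference, half_width=2, eps=1e-12):
    """Rescale `collective` so its envelope approximates that of `reference`."""
    env_ref = envelope(reference, half_width)
    env_col = envelope(collective, half_width)
    return [c * r / max(e, eps) for c, r, e in zip(collective, env_ref, env_col)]


# Illustrative magnitudes: a flat collective spectrum takes on the reference envelope.
reference = [0.0, 2.0, 4.0, 2.0, 0.0, 0.0]
collective = [1.0] * 6
adjusted = match_envelope(collective, reference)
```

The fine structure of the collective spectrum (its bin-to-bin variation) is preserved, while its overall shape follows the reference.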
VOICE RECOGNITION SYSTEM

Publication number: 20090187406

Abstract: A voice recognition system is provided that outputs a talk-back voice in a manner such that a user can distinguish the accuracy of a voice-recognized character string more easily. A voice recognition unit performs voice recognition on a user's articulation in which a character string such as the telephone number “024 636 0123” is entered via a microphone. Based on each sound existing period delimited by silent intervals, each recognized partial character string “024”, “636” and “0123” is obtained. A talk-back voice data generating unit connects each recognized partial character string “024”, “636” and “0123” together in a manner such that space characters are inserted, and generates a character string “024 636 0123”. The generated character string “024 636 0123” is supplied to a voice generating device as talk-back voice data. A voice signal to be produced by the speaker 2 is generated in the form of the talk-back voice.

Type: Application

Filed: December 3, 2008

Publication date: July 23, 2009

Inventors: Kazunori Sakuma, Nozomu Saito, Tohru Masumoto
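
The talk-back string construction in the abstract above can be sketched as follows, assuming the recognizer output is a flat sequence of symbols in which a marker value stands for a silent interval. That representation, the `None` marker, and the function name are hypothetical.

```python
def talk_back_string(recognized, silence=None):
    """Group symbols between silence markers and join groups with spaces."""
    groups, current = [], []
    for symbol in recognized:
        if symbol is silence:
            if current:                      # close the current sound period
                groups.append("".join(current))
                current = []
        else:
            current.append(symbol)
    if current:
        groups.append("".join(current))
    return groups, " ".join(groups)


# The telephone number from the abstract, spoken with two pauses.
recognized = list("024") + [None] + list("636") + [None] + list("0123")
parts, text = talk_back_string(recognized)
```

The returned `text` is the space-separated string that would be handed to the voice generating device as talk-back data.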
Bandwidth extension of narrowband speech

Patent number: 7546237

Abstract: A system extends the bandwidth of a narrowband speech signal into a wideband spectrum. The system includes a high-band generator that generates a high frequency spectrum based on a narrowband spectrum. A background noise generator generates a high frequency background noise spectrum based on a background noise within the narrowband spectrum. A summing circuit linked to the high-band generator and the background noise generator combines the high frequency spectrum and narrowband spectrum and the high frequency background noise spectrum.

Type: Grant

Filed: December 23, 2005

Date of Patent: June 9, 2009

Assignee: QNX Software Systems (Wavemakers), Inc.

Inventors: Rajeev Nongpiur, Xueman Li, Phillip A. Hetherington
SPEECH PROCESSING APPARATUS AND SPEECH SYNTHESIS APPARATUS

Publication number: 20090144053

Abstract: An information extraction unit extracts spectral envelope information of L-dimension from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L>N>1). Each basis is differently a frequency band having a maximum as a peak frequency in a spectral domain having L-dimension. A value corresponding to a frequency outside the frequency band along a frequency axis of the spectral domain is zero. Two frequency bands of which two peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of each basis with a coefficient by changing the coefficient, and sets the coefficient of each basis from which the distortion is minimized to a spectral envelope parameter of the spectral envelope information.

Type: Application

Filed: December 3, 2008

Publication date: June 4, 2009

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Katsumi TSUCHIYA, Takehiko KAGOSHIMA
Speech synthesis using concatenation of speech waveforms

Patent number: 7529672

Abstract: A method of synthesizing a speech signal by providing a first speech unit signal having an end interval and a second speech unit signal having a front interval, wherein at least some of the periods of the end interval are appended in inverted order at the end of the first speech unit signal in order to provide a fade-out interval, and at least some of the periods of the front interval are appended in inverted order at the beginning of the second speech unit signal to provide a fade-in interval. An overlap and add operation is performed on the end and fade-in intervals and the fade-out and front intervals.

Type: Grant

Filed: August 8, 2003

Date of Patent: May 5, 2009

Assignee: Koninklijke Philips Electronics N.V.

Inventor: Ercan Ferit Gigi
SYSTEM AND METHOD FOR IMPROVING SYNTHESIZED SPEECH INTERACTIONS OF A SPOKEN DIALOG SYSTEM

Publication number: 20090112596

Abstract: A system and method are disclosed for synthesizing speech based on a selected speech act. A method includes modifying synthesized speech of a spoken dialogue system by (1) receiving a user utterance, (2) analyzing the user utterance to determine an appropriate speech act, and (3) generating a response of a type associated with the appropriate speech act, wherein linguistic variables in the response are selected based on the appropriate speech act.

Type: Application

Filed: October 30, 2007

Publication date: April 30, 2009

Applicant: AT&T Lab, Inc.

Inventors: Ann K. Syrdal, Mark Beutnagel, Alistair D. Conkie, Yeon-Jun Kim
Annotating programs for automatic summary generations

Patent number: 7403894

Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.

Type: Grant

Filed: March 15, 2005

Date of Patent: July 22, 2008

Assignee: Microsoft Corporation

Inventors: Yong Rui, Anoop Gupta, Alejandro Acero
Telephone communication with silent response feature

Patent number: 7305068

Abstract: A telephone call may be received or made by the user of telephony-enabled apparatus in circumstances, such as during a meeting, where spoken responses by the user to what the other party to the call has said are unacceptable. A telephony method and arrangement are disclosed which permits a user to use silent input to the telephony-enabled apparatus in order to generate a response to the other party to the call. Response generation is facilitated by enabling the user to effect a selection from the content of the other party's input, or from options derived from that input, with this selection then being used in forming the response.

Type: Grant

Filed: February 25, 2005

Date of Patent: December 4, 2007

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Roger Cecil Ferry Tucker, Paul St John Brittan
High-quality speech synthesis device and method by classification and prediction processing of synthesized sound

Patent number: 7283961

Abstract: There is disclosed a speech processing device in which prediction taps for finding prediction values of the speech of high sound quality are extracted from the synthesized sound obtained on affording linear prediction coefficients and residual signals, generated from a preset code, to a speech synthesis filter, speech of high sound quality being higher in sound quality than the synthesized sound, and in which the prediction taps are used along with preset tap coefficients to perform preset predictive calculations to find the prediction values of the speech of high sound quality. The speech of high sound quality is higher in sound quality than the synthesized sound.

Type: Grant

Filed: August 3, 2001

Date of Patent: October 16, 2007

Assignee: Sony Corporation

Inventors: Tetsujiro Kondo, Tsutomu Watanabe, Masaaki Hattori, Hiroto Kimura, Yasuhiro Fujimori
Fine granularity scalability speech coding for multi-pulses CELP-based algorithm

Patent number: 7272555

Abstract: A method for speech processing in a code excitation linear prediction (CELP) based speech system having a plurality of modes including at least a first mode and a consecutive second mode. The method includes providing an input speech signal, dividing the speech signal into a plurality of frames, dividing at least one of the plurality of frames into sub-frames including a plurality of pulses, selecting a first number of pulses for the first mode, with a second number of remaining pulses in the frame plus the first number of pulses in the first mode for the second mode, providing a plurality of sub-modes between the first mode and the second mode, forming a base layer, forming an enhancement layer, generating a bit stream including a basic bit stream and an enhancement bit stream, wherein the basic bit stream is used to update memory states of the speech system.

Type: Grant

Filed: July 28, 2003

Date of Patent: September 18, 2007

Assignee: Industrial Technology Research Institute

Inventors: I-Hsien Lee, Fang-Chu Chen
Speech decoding apparatus and method using prediction and class taps

Patent number: 7269559

Abstract: The present invention relates to a data processing apparatus capable of obtaining high-quality sound, etc. A tap generation section 121 generates a prediction tap from synthesized speech data for 40 samples in a subframe of subject data of interest within the synthesized speech data such that speech coded data coded by a CELP method, and synthesized speech data in which a position in the past from a subject subframe by a lag indicated by an L code located in that subject subframe is a starting point. Then, a prediction section 125 decodes high-quality sound data by performing a predetermined prediction computation by using the prediction tap and a tap coefficient stored in a coefficient memory 124. The present invention can be applied to mobile phones for transmitting and receiving speech.

Type: Grant

Filed: January 24, 2002

Date of Patent: September 11, 2007

Assignee: Sony Corporation

Inventors: Tetsujiro Kondo, Hiroto Kimura, Tsutomu Watanabe, Masaaki Hattori
Parametric speech codec for representing synthetic speech in the presence of background noise

Patent number: 7257535

Abstract: A system and method are provided for processing audio and speech signals using a pitch and voicing dependent spectral estimation algorithm (voicing algorithm) to accurately represent voiced speech, unvoiced speech, and mixed speech in the presence of background noise, and background noise with a single model. The present invention also modifies the synthesis model based on an estimate of the current input signal to improve the perceptual quality of the speech and background noise under a variety of input conditions. The present invention also improves the voicing dependent spectral estimation algorithm robustness by introducing the use of a Multi-Layer Neural Network in the estimation process. The voicing dependent spectral estimation algorithm provides an accurate and robust estimate of the voicing probability under a variety of background noise conditions. This is essential to providing high quality intelligible speech in the presence of background noise.

Type: Grant

Filed: October 28, 2005

Date of Patent: August 14, 2007

Assignee: Lucent Technologies Inc.

Inventors: Joseph Gerard Aguilar, Juin-Hwey Chen, Wei Wang, Robert W. Zopf
Sound encoding method and sound decoding method, and sound encoding device and sound decoding device

Patent number: 7092885

Abstract: A high quality speech is reproduced with a small data amount in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal. In speech coding method according to a code-excited linear prediction (CELP) speech coding, a noise level of a speech in a concerning coding period is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and various excitation codebooks are used based on an evaluation result.

Type: Grant

Filed: December 7, 1998

Date of Patent: August 15, 2006

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventor: Tadashi Yamaura
