Excitation Patents (Class 704/264)
-
Patent number: 12255671
Abstract: The method provides for separable subchannels sharing a communication channel. A processor receives input of a user setting a transmitter device to a first of at least two subchannels of a communication channel in which the first subchannel comprises a first portion of a bandwidth of the communication channel. The processor receives an audio signal as input to the transmitter device. The processor converts a time-series waveform of the audio signal into a frequency-series waveform. The processor determines that the transmitter device is set to the first subchannel. In response to determining the device is set to the first channel, the processor filters the frequency-series waveform through a series of steep shoulder digital bandpass filters set to transmit through the first portion of the bandwidth, and the processor transmits the audio signal as the filtered frequency-series waveform.
Type: Grant
Filed: March 16, 2023
Date of Patent: March 18, 2025
Assignee: International Business Machines Corporation
Inventors: Hyman David Chantz, Robert Lynch, Elijah Swift
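The time-to-frequency conversion and subchannel filtering described in this abstract can be illustrated with a minimal sketch. It assumes an idealized brick-wall mask over DFT bins rather than the "steep shoulder" filter bank of the patent, and the function names and the bin-index parameters `lo_bin` and `hi_bin` are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Plain O(n^2) discrete Fourier transform (time series -> frequency series)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n)
                for m, x in enumerate(samples))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * math.pi * k * m / n)
                for k, c in enumerate(spectrum)) / n
            for m in range(n)]


def subchannel_filter(samples, lo_bin, hi_bin):
    """Keep only the DFT bins lo_bin..hi_bin (hypothetical subchannel limits).

    Assumption: an ideal mask stands in for the bandpass filters; bins that
    mirror the kept band at negative frequencies are kept as well so the
    output stays real.
    """
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        freq = min(k, n - k)  # fold negative-frequency bins onto positive ones
        if not (lo_bin <= freq <= hi_bin):
            spectrum[k] = 0
    return [value.real for value in idft(spectrum)]
```

With a 16-sample input containing tones at bins 2 and 6, `subchannel_filter(x, 1, 3)` leaves only the bin-2 tone, which is the behaviour expected of a device set to the lower subchannel.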
-
Patent number: 11366012
Abstract: A method and a system for generating a time-frequency representation of an aperiodic continuous input signal comprising generating a periodic train of short pulses having a repetition frequency, and sampling the signal temporally using the periodic train of short pulses to obtain a temporally sampled signal, the temporally sampled signal comprising a plurality of sampled copies of the input signal, each sampled copy being spaced in function of the repetition frequency of the periodic train of short pulses. The temporally sampled signal is delayed based on the repetition frequency to obtain a delayed temporally sampled signal comprising a plurality of delayed sampled copies, a spectral representation of a given delayed sampled copy being delayed in function of the repetition frequency. The delayed temporally sampled signal is evaluated over consecutive time slots to obtain, for each consecutive time slot, a respective output signal in the time-frequency domain.
Type: Grant
Filed: September 26, 2019
Date of Patent: June 21, 2022
Assignees: INSTITUT NATIONAL DE LA RECHERCHE SCIENTIFIQUE (INRS), CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE (CNRS), UNIVERSITÉ GRENOBLE ALPES
Inventors: Jose Azana, Konatham Saikrishna Reddy, Reza Maram, Hugues Guillet De Chatellus
-
Patent number: 11100938
Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
Type: Grant
Filed: May 14, 2020
Date of Patent: August 24, 2021
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
-
Patent number: 10984804
Abstract: Embodiments of the invention relate to an error concealment unit for providing an error concealment audio information for concealing a loss of an audio frame in an encoded audio information. The error concealment unit provides a first error concealment audio information component for a first frequency range using a frequency domain concealment. The error concealment unit also provides a second error concealment audio information component for a second frequency range, which includes lower frequencies than the first frequency range, using a time domain concealment. The error concealment unit also combines the first error concealment audio information component and the second error concealment audio information component, to obtain the error concealment audio information. Other embodiments of the invention relate to a decoder including the error concealment unit, as well as related encoders, methods, and computer programs for decoding and/or concealing.
Type: Grant
Filed: September 7, 2018
Date of Patent: April 20, 2021
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Jérémie Lecomte, Adrian Tomasek
-
Patent number: 10014007
Abstract: A method is presented for forming the excitation signal for a glottal pulse model based parametric speech synthesis system. In one embodiment, fundamental frequency values are used to form the excitation signal. The excitation is modeled using a voice source pulse selected from a database of a given speaker. The voice source signal is segmented into glottal segments, which are used in vector representation to identify the glottal pulse used for formation of the excitation signal. Use of a novel distance metric and preserving the original signals extracted from the speaker's voice samples helps capture low frequency information of the excitation signal. In addition, segment edge artifacts are removed by applying a unique segment joining method to improve the quality of synthetic speech while creating a true representation of the voice quality of a speaker.
Type: Grant
Filed: May 28, 2014
Date of Patent: July 3, 2018
Inventors: Rajesh Dachiraju, Aravind Ganapathiraju
-
Publication number: 20150106102
Abstract: A method includes determining, at a speech encoder, first gain shape parameters based on a harmonically extended signal and/or based on a high-band residual signal associated with a high-band portion of an audio signal. The method also includes determining second gain shape parameters based on a synthesized high-band signal and based on the high-band portion of the audio signal. The method further includes inserting the first gain shape parameters and the second gain shape parameters into an encoded version of the audio signal to enable gain adjustment during reproduction of the audio signal from the encoded version of the audio signal.
Type: Application
Filed: October 7, 2014
Publication date: April 16, 2015
Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman S. Atti
-
Publication number: 20150100318
Abstract: A method for decoding a speech signal is described. The method includes obtaining a packet. The method also includes obtaining a previous lag value. The method further includes limiting the previous lag value if the previous lag value is greater than a maximum lag threshold. The method additionally includes disallowing an adjustment to a number of synthesized peaks if a combination of the number of synthesized peaks and an estimated number of peaks is not valid.
Type: Application
Filed: October 4, 2013
Publication date: April 9, 2015
Applicant: QUALCOMM Incorporated
Inventors: Venkatraman Rajagopalan, Venkatesh Krishnan, Alok K. Gupta
-
Patent number: 8949123
Abstract: The voice conversion method of a display apparatus includes: in response to the receipt of a first video frame, detecting one or more entities from the first video frame; in response to the selection of one of the detected entities, storing the selected entity; in response to the selection of one of a plurality of previously-stored voice samples, storing the selected voice sample in connection with the selected entity; and in response to the receipt of a second video frame including the selected entity, changing a voice of the selected entity based on the selected voice sample and outputting the changed voice.
Type: Grant
Filed: April 11, 2012
Date of Patent: February 3, 2015
Assignee: Samsung Electronics Co., Ltd.
Inventors: Aditi Garg, Kasthuri Jayachand Yadlapalli
-
Patent number: 8930200
Abstract: A vector joint encoding/decoding method and a vector joint encoder/decoder are provided, more than two vectors are jointly encoded, and an encoding index of at least one vector is split and then combined between different vectors, so that encoding idle spaces of different vectors can be recombined, thereby facilitating saving of encoding bits, and because an encoding index of a vector is split and then shorter split indexes are recombined, thereby facilitating reduction of requirements for the bit width of operating parts in encoding/decoding calculation.
Type: Grant
Filed: July 24, 2013
Date of Patent: January 6, 2015
Assignee: Huawei Technologies Co., Ltd
Inventors: Fuwei Ma, Dejun Zhang, Lei Miao, Fengyan Qi
-
Patent number: 8868432
Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: October 21, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8805697
Abstract: Decomposition of a multi-source signal using a basis function inventory and a sparse recovery technique is disclosed.
Type: Grant
Filed: October 24, 2011
Date of Patent: August 12, 2014
Assignee: QUALCOMM Incorporated
Inventors: Erik Visser, Yinyi Guo, Mofei Zhu, Sang-Uk Ryu, Lae-Hoon Kim, Jongwon Shin
-
Publication number: 20130332171
Abstract: Audio signal bandwidth extension may be performed on a narrow bandwidth signal received from a remote source over the audio communication network. The narrow band signal bandwidth may be extended such that the bandwidth is greater than that of the audio communication network. The signal may be extended by synthesizing an audio signal having spectral values within an extended bandwidth from synthetic components. The synthetic components may be generated using parameters derived from original narrowband audio signal. The audio signal may be synthesized in the form of an excitation signal and vocal tract envelope. The excitation signal and vocal tract may be extended independently. In various embodiments, excitation components may be derived from constrained synthesis using a constraint filter with nulls in regions where the extension is desired.
Type: Application
Filed: June 12, 2013
Publication date: December 12, 2013
Inventors: Carlos Avendano, Marios Athineos, Ethan Duni
-
Patent number: 8571039
Abstract: A method and apparatus for transmitting an audio signal over a communication channel comprising encoding the audio signal with an encoder 204 using a first sampling rate, filtering the audio signal using a first cut off frequency, the first cut off frequency being chosen in dependence upon the first sampling rate, and transmitting the encoded and filtered audio signal over the communication channel. The presence of a condition in which the sampling rate of the encoder 204 is to be switched to a second sampling rate at a switching time is determined and if the condition has been determined to be present, the cut off frequency used in the filtering step is gradually changed from the first cut off frequency to a second cut off frequency, the second cut off frequency being chosen in dependence upon the second sampling rate, such that the audio bandwidth of the transmitted signal changes gradually when the sampling rate is switched to the second sampling rate.
Type: Grant
Filed: June 23, 2010
Date of Patent: October 29, 2013
Assignee: Skype
Inventors: Stefan Strommer, Karsten Vandborg Sorensen, Soren Skak Jensen, Koen Vos, Jon Bergenheim
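The gradual cut-off change in this abstract can be pictured as a per-frame schedule. The sketch below is illustrative only: it assumes a linear ramp and a cut-off fixed at a fraction `margin` of the Nyquist frequency, neither of which is stated in the abstract, and `cutoff_schedule` is a hypothetical name.

```python
def cutoff_schedule(fs_old, fs_new, n_frames, margin=0.9):
    """Return one cut-off frequency (Hz) per frame for a sampling-rate switch.

    Assumptions (not from the abstract): the cut-off is `margin` times the
    Nyquist frequency of each sampling rate, and it moves linearly between
    the two values over `n_frames` frames.
    """
    f_start = margin * fs_old / 2.0
    f_end = margin * fs_new / 2.0
    if n_frames < 2:
        return [f_end]
    step = (f_end - f_start) / (n_frames - 1)
    return [f_start + i * step for i in range(n_frames)]


# Switching from 16 kHz to 8 kHz over five frames.
schedule = cutoff_schedule(16000, 8000, 5)
```

Each frame's low-pass filter would then be designed for the corresponding entry of `schedule`, so the audio bandwidth narrows step by step instead of jumping at the switching time.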
-
Patent number: 8566106
Abstract: A method and device for searching an algebraic codebook during encoding of a sound signal, wherein the algebraic codebook comprises a set of codevectors formed of a number of pulse positions and a number of pulses distributed over the pulse positions. In the algebraic codebook searching method and device, a reference signal for use in searching the algebraic codebook is calculated. In a first stage, a position of a first pulse is determined in relation with the reference signal and among the number of pulse positions. In each of a number of stages subsequent to the first stage, (a) an algebraic codebook gain is recomputed, (b) the reference signal is updated using the recomputed algebraic codebook gain and (c) a position of another pulse is determined in relation with the updated reference signal and among the number of pulse positions.
Type: Grant
Filed: September 11, 2008
Date of Patent: October 22, 2013
Assignee: Voiceage Corporation
Inventors: Redwan Salami, Vaclav Eksler, Milan Jelinek
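The stage-by-stage search in this abstract resembles a greedy pulse placement loop. The following is a minimal sketch under stated assumptions, not the patented procedure: the reference signal is taken to be the correlation of the current target residual with the filter impulse response `h`, pulses have unit amplitude with a sign, and all function names are hypothetical.

```python
def filter_code(code, h):
    """Convolve the pulse codevector with impulse response h, truncated to len(code)."""
    n = len(code)
    out = [0.0] * n
    for i, pulse in enumerate(code):
        if pulse != 0.0:
            for j in range(min(len(h), n - i)):
                out[i + j] += pulse * h[j]
    return out


def correlate(signal, h):
    """d[i] = sum_j signal[i + j] * h[j], the reference signal assumed here."""
    n = len(signal)
    return [sum(signal[i + j] * h[j] for j in range(min(len(h), n - i)))
            for i in range(n)]


def greedy_pulse_search(target, h, n_pulses):
    """Place signed unit pulses one per stage; return (codevector, gain)."""
    n = len(target)
    code = [0.0] * n
    gain = 0.0
    reference = correlate(target, h)
    for _ in range(n_pulses):
        # Pick the free position where the reference signal is largest.
        free = [i for i in range(n) if code[i] == 0.0]
        pos = max(free, key=lambda i: abs(reference[i]))
        code[pos] = 1.0 if reference[pos] >= 0.0 else -1.0
        # (a) recompute the codebook gain for the pulses placed so far.
        filtered = filter_code(code, h)
        energy = sum(v * v for v in filtered)
        gain = sum(t * v for t, v in zip(target, filtered)) / energy if energy else 0.0
        # (b) update the reference signal from the gain-scaled residual.
        residual = [t - gain * v for t, v in zip(target, filtered)]
        reference = correlate(residual, h)
    return code, gain
```

For a target built from pulses at positions 2 (positive) and 5 (negative) filtered by `h = [1.0, 0.5]` and scaled by 2, the loop recovers both positions and a gain of 2.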
-
Patent number: 8521529
Abstract: An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.
Type: Grant
Filed: April 18, 2005
Date of Patent: August 27, 2013
Assignee: Creative Technology Ltd
Inventors: Michael M. Goodwin, Jean Laroche
-
Patent number: 8494856
Abstract: According to one embodiment, a speech synthesizer includes an analyzer, a first estimator, a selector, a generator, a second estimator, and a synthesizer. The analyzer analyzes text and extracts a linguistic feature. The first estimator selects a first prosody model adapted to the linguistic feature and estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model. The selector selects speech units that minimize a cost function determined in accordance with the prosody information. The generator generates a second prosody model that is a model of the prosody information of the speech units. The second estimator estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model. The synthesizer generates synthetic speech by concatenating the speech units on the basis of the prosody information estimated by the second estimator.
Type: Grant
Filed: October 12, 2011
Date of Patent: July 23, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Javier Latorre, Masami Akamine
-
Patent number: 8457953
Abstract: In a method of smoothing background noise in a telecommunication speech session; receiving and decoding S10 a signal representative of a speech session, the signal comprising both a speech component and a background noise component. Subsequently, determining LPC parameters S20 and an excitation signal S30 for the received signal. Thereafter, synthesizing and outputting S40 an output signal based on the determined LPC parameters and excitation signal. In addition, modifying S35 the determined excitation signal by reducing power and spectral fluctuations of the excitation signal to provide a smoothed output signal.
Type: Grant
Filed: February 13, 2008
Date of Patent: June 4, 2013
Assignee: Telefonaktiebolaget LM Ericsson (Publ)
Inventor: Stefan Bruhn
-
Patent number: 8452590
Abstract: A fixed codebook searching apparatus, includes a convolution operator, implemented by at least one processor, that convolves an impulse response of a perceptually weighted synthesis filter with an impulse response vector that has values at negative times, to generate a second impulse response vector that has values at negative times. A matrix generator, implemented by at least one processor, generates a Toeplitz-type convolution matrix using the second impulse response vector generated by the convolution operator. A searcher, implemented by at least one processor, performs a codebook search by maximizing a term using the Toeplitz-type convolution matrix.
Type: Grant
Filed: April 25, 2011
Date of Patent: May 28, 2013
Assignee: Panasonic Corporation
Inventors: Hiroyuki Ehara, Koji Yoshida
-
Patent number: 8438033
Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.
Type: Grant
Filed: July 20, 2009
Date of Patent: May 7, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Patent number: 8433563
Abstract: A method, system and computer program for encoding speech according to a source-filter model. The method comprises deriving a spectral envelope signal representative of a modelled filter and a first remaining signal representative of a modelled source signal, and deriving a second remaining signal from the first remaining signal by, at intervals during the encoding: exploiting a correlation between approximately periodic portions in the first remaining signal to generate a predicted version of a later portion from a stored version of an earlier portion, and using the predicted-version of the later portion to remove an effect of said periodicity from the first remaining signal. The method further comprises, once every number of intervals, transforming the stored version of the earlier portion of the first remaining signal prior to generating the predicted version of the respective later portion.
Type: Grant
Filed: June 2, 2009
Date of Patent: April 30, 2013
Assignee: Skype
Inventors: Koen Bernard Vos, Soren Skak Jensen
-
Patent number: 8386256
Abstract: An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech.
Type: Grant
Filed: May 29, 2009
Date of Patent: February 26, 2013
Assignee: Nokia Corporation
Inventors: Tuomo Johannes Raitio, Antti Santeri Suni, Martti Tapani Vainio, Paavo Ilmari Alku, Jani Kristian Nurminen
-
Patent number: 8370154
Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. Moreover, the apparatus includes: a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. Through the synthesizing scheme of comfortable background noise according to the present invention, the transition of a synthesized signal from speech to background noise could be more natural, smooth and continuous, which makes the listeners feel more comfortable.
Type: Grant
Filed: September 21, 2010
Date of Patent: February 5, 2013
Assignee: Huawei Technologies Co., Ltd.
Inventors: Jinliang Dai, Libin Zhang, Eyal Shlomot, Lin Wang
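The weighted sum described in this abstract amounts to a cross-fade over the transition stage. The sketch below assumes a linearly decaying weight, which the abstract does not specify; `blend_excitation` and its parameters are hypothetical names.

```python
def blend_excitation(quasi, random_exc, transition_len):
    """Weighted sum of a quasi excitation and a random (noise) excitation.

    Assumption: the weight on the quasi excitation falls linearly from 1 to 0
    over `transition_len` samples; past that point only the random
    excitation remains.
    """
    out = []
    for n, (q, r) in enumerate(zip(quasi, random_exc)):
        weight = max(0.0, 1.0 - n / transition_len)
        out.append(weight * q + (1.0 - weight) * r)
    return out


# A constant quasi excitation fading into silence over four samples.
faded = blend_excitation([1.0] * 4, [0.0] * 4, 4)
```

In practice `random_exc` would be the noise excitation of the background noise frame, and `quasi` would be derived from the coding parameters of the last speech frames.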
-
Patent number: 8321225
Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method including receiving text to be synthesized as a spoken utterance. The method includes analyzing the received text to determine attributes of the received text and selecting one or more utterances from a database based on a comparison between the attributes of the received text and attributes of text representing the stored utterances. The method includes determining, for each utterance, a distance between a contour of the utterance and a hypothetical contour of the spoken utterance, the determination based on a model that relates distances between pairs of contours of the utterances to relationships between attributes of text for the pairs. The method includes selecting a final utterance having a contour with a closest distance to the hypothetical contour and generating a contour for the received text based on the contour of the final utterance.
Type: Grant
Filed: November 14, 2008
Date of Patent: November 27, 2012
Assignee: Google Inc.
Inventors: Martin Jansche, Michael D. Riley, Andrew M. Rosenberg, Terry Tai
-
Patent number: 8315871
Abstract: A rope-jumping algorithm is employed in a Hidden Markov Model based text to speech system to determine start and end models and to modify the start and end models by setting small co-variances. Disordered acoustic parameters due to violation of parameter constraints are avoided through the modification and result in stable line frequency spectrum for the generated speech.
Type: Grant
Filed: June 4, 2009
Date of Patent: November 20, 2012
Assignee: Microsoft Corporation
Inventors: Wenlin Wang, Guoliang Zhang, Jingyang Xu
-
Publication number: 20120239406
Abstract: The present invention relates to a method for synthesizing a speech signal; comprising obtaining a speech sequence input signal comprising semantic content corresponding to a speaker's utterance; analyzing the input speech sequence signal to obtain a first sequence of feature vectors for the input speech sequence signal; synthesizing a second sequence of feature vectors different from and based on the first sequence of feature vectors; generating an excitation signal and filtering the excitation signal based on the second sequence of feature vectors to obtain a synthesized speech signal wherein the semantic content is obfuscated.
Type: Application
Filed: December 2, 2009
Publication date: September 20, 2012
Inventors: Johan Nikolaas Langehoveen Brummer, Avery Maxwell Glasser, Luis Buera Rodriquez
-
Publication number: 20120123782
Abstract: The present invention is related to a method for coding excitation signal of a target speech comprising the steps of: extracting from a set of training normalised residual frames, a set of relevant normalised residual frames, said training residual frames being extracted from a training speech, synchronised on Glottal Closure Instant (GCI), pitch and energy normalised; determining the target excitation signal of the target speech; dividing said target excitation signal into GCI synchronised target frames; determining the local pitch and energy of the GCI synchronised target frames; normalising the GCI synchronised target frames in both energy and pitch, to obtain target normalised residual frames; determining coefficients of linear combination of said extracted set of relevant normalised residual frames to build synthetic normalised residual frames close to each target normalised residual frames; wherein the coding parameters for each target residual frames comprise the determined coefficients.
Type: Application
Filed: March 30, 2010
Publication date: May 17, 2012
Inventors: Geoffrey Wilfart, Thomas Drugman, Thierry Dutoit
-
Patent number: 8140326
Abstract: An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced.
Type: Grant
Filed: June 6, 2008
Date of Patent: March 20, 2012
Assignee: Fuji Xerox Co., Ltd.
Inventors: Francine Chen, John Adcock
-
Patent number: 8036390
Abstract: A scalable encoding device prevents sound quality deterioration of a decoded signal, reduces the encoding rate, and reduces the circuit size. The scalable encoding device includes a first layer encoder for generating a monaural signal by using a plurality of channel signals (L channel signal and R channel signal) constituting a stereo signal and encoding the monaural signal to generate a sound source parameter. The scalable encoding device also includes a second layer encoder for generating a first conversion signal by using the channel signal and the monaural signal, generating a synthesis signal by using the sound source parameter and the first conversion signal, and generating a second conversion coefficient index by using the synthesis signal and the first conversion signal.
Type: Grant
Filed: January 30, 2006
Date of Patent: October 11, 2011
Assignee: Panasonic Corporation
Inventors: Michiyo Goto, Koji Yoshida
-
Patent number: 8000967
Abstract: Information about excitation signals of a first signal encoded by CELP is used to derive a limited set of candidate excitation signals for a second, correlated signal. Preferably, pulse locations of the excitation signals of the first encoded signal are used for determining the set of candidate excitation signals. More preferably, the pulse locations of the set of candidate excitation signals are positioned in the vicinity of the pulse locations of the excitation signals of the first encoded signal. The first and second signals may be multi-channel signals of a common speech or audio signal. However, the first and second signals may also be identical, whereby the coding of the second signal can be utilized for re-encoding at a lower bit rate.
Type: Grant
Filed: March 9, 2005
Date of Patent: August 16, 2011
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventor: Anisse Taleb
-
Patent number: 7949521
Abstract: A fixed codebook searching apparatus which slightly suppresses an increase in the operation amount, even if the filter applied to the excitation pulse has the characteristic that it cannot be represented by a lower triangular matrix and realizes a quasi-optimal fixed codebook search. This fixed codebook searching apparatus is provided with an algebraic codebook that generates a pulse excitation vector; a convolution operation section that convolutes an impulse response of auditory weighted synthesis filter into an impulse response vector that has a value at negative times, to generate a second impulse response vector that has a value at second negative times; a matrix generating section that generates a Toeplitz-type convolution matrix by means of the second impulse response vector; and a convolution operation section that convolutes the matrix generated by matrix generating section into the pulse excitation vector generated by algebraic codebook.
Type: Grant
Filed: February 25, 2009
Date of Patent: May 24, 2011
Assignee: Panasonic Corporation
Inventors: Hiroyuki Ehara, Koji Yoshida
-
Patent number: 7945446
Abstract: Spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. Output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. Sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.
Type: Grant
Filed: March 9, 2006
Date of Patent: May 17, 2011
Assignee: Yamaha Corporation
Inventors: Hideki Kemmochi, Yasuo Yoshioka, Jordi Bonada
-
Publication number: 20110022391
Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. Moreover, the apparatus includes: a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. Through the synthesizing scheme of comfortable background noise according to the present invention, the transition of a synthesized signal from speech to background noise could be more natural, smooth and continuous, which makes the listeners feel more comfortable.
Type: Application
Filed: September 21, 2010
Publication date: January 27, 2011
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Jinliang DAI, Libin ZHANG, Eyal SHLOMOT, Lin WANG
-
Publication number: 20110015931
Abstract: The invention relates to a periodic signal processing method, a periodic signal conversion method, and a periodic signal processing device capable of reducing the influence of periodicity without using a spectral model. Time windows are arranged such that a center of each of the time windows is at a division position which divides a fundamental frequency in a temporal direction into fractions 1/n (where n is an integer equal to or larger than 2) so as to extract a plurality of portions of different ranges from a signal having periodicity. A power spectrum for the plurality of portions extracted by the respective time windows is calculated, and the calculated power spectrum is added with a same ratio.
Type: Application
Filed: July 18, 2008
Publication date: January 20, 2011
Inventors: Hideki Kawahara, Masanori Morise, Toru Takahashi, Toshio Irino
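The windowing scheme in this abstract can be sketched as averaging power spectra taken at window centers spaced by 1/n of one period. The code below is a rough illustration under assumptions not in the abstract: the division is read as dividing the fundamental period, a Hann window is used, and `period` is given in samples. All names are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of each non-negative-frequency DFT bin of `frame`."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def averaged_power_spectrum(signal, center, period, win_len, n=2):
    """Average n power spectra whose windows are period/n samples apart.

    Assumption: equal weights (1/n each) implement "added with a same ratio".
    """
    half = win_len // 2
    hann = [0.5 * (1.0 - math.cos(2.0 * math.pi * m / (win_len - 1)))
            for m in range(win_len)]
    total = [0.0] * (win_len // 2 + 1)
    for i in range(n):
        start = center + round(i * period / n) - half
        if start < 0 or start + win_len > len(signal):
            raise ValueError("window does not fit inside the signal")
        frame = [signal[start + m] * hann[m] for m in range(win_len)]
        spectrum = power_spectrum(frame)
        total = [t + p / n for t, p in zip(total, spectrum)]
    return total
```

For a periodic input, the single-window power spectrum depends on where the window falls within the period; averaging over the shifted windows is meant to reduce that dependence.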
-
Patent number: 7864843
Abstract: A method and apparatus to perform bandwidth extension encoding and decoding encodes and/or decodes a high frequency signal using an excitation signal for a low frequency signal encoded in a time domain or a frequency domain or using an excitation spectrum for the low frequency signal. Accordingly, although an audio signal is encoded or decoded using a small number of bits, the quality of sound corresponding to a signal in a high frequency band does not degrade. Therefore, a coding efficiency of the audio signal can be maximized.
Type: Grant
Filed: June 4, 2007
Date of Patent: January 4, 2011
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ki-hyun Choo, Jung-hoe Kim, Eun-mi Oh, Miao Lei, Chang-yong Son
-
Patent number: 7747441Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding, which perform compression coding and decoding of a speech signal to a digital signal. In a speech coding method based on code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period of interest is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used based on the evaluation result.Type: GrantFiled: January 16, 2007Date of Patent: June 29, 2010Assignee: Mitsubishi Denki Kabushiki KaishaInventor: Tadashi Yamaura
-
Publication number: 20100049522Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts the extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter from the selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.Type: ApplicationFiled: July 20, 2009Publication date: February 25, 2010Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Publication number: 20090299747Abstract: An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech.Type: ApplicationFiled: May 29, 2009Publication date: December 3, 2009Inventors: Tuomo Johannes Raitio, Antti Santeri Suni, Martti Tapani Vainio, Paavo Ilmari Alku, Jani Kristian Nurminen
-
Patent number: 7613612Abstract: In a voice synthesizer, an envelope acquisition portion obtains a spectral envelope of a reference frequency spectrum of a given voice. A spectrum acquisition portion obtains a collective frequency spectrum of a plurality of voices which are generated in parallel to one another. An envelope adjustment portion adjusts a spectral envelope of the collective frequency spectrum obtained by the spectrum acquisition portion so as to approximately match with the spectral envelope of the reference frequency spectrum obtained by the envelope acquisition portion. A voice generation portion generates an output voice signal from the collective frequency spectrum having the spectral envelope adjusted by the envelope adjustment portion.Type: GrantFiled: January 31, 2006Date of Patent: November 3, 2009Assignee: Yamaha CorporationInventors: Hideki Kemmochi, Jordi Bonada
-
Publication number: 20090187406Abstract: A voice recognition system is provided that outputs a talk-back voice in a manner such that a user can distinguish the accuracy of a voice-recognized character string more easily. A voice recognition unit performs voice recognition on a user's articulation in which a character string such as the telephone number “024 636 0123” is entered via a microphone. Based on each sound existing period delimited by silent intervals, each recognized partial character string “024”, “636” and “0123” is obtained. A talk-back voice data generating unit connects each recognized partial character string “024”, “636” and “0123” together in a manner such that space characters are inserted, and generates a character string “024 636 0123”. The generated character string “024 636 0123” is supplied to a voice generating device as talk-back voice data. A voice signal to be produced by the speaker 2 is generated in the form of the talk-back voice.Type: ApplicationFiled: December 3, 2008Publication date: July 23, 2009Inventors: Kazunori Sakuma, Nozomu Saito, Tohru Masumoto
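The string-joining step in this abstract is simple enough to show directly. This sketch uses the partial strings from the abstract; the function name `build_talk_back_text` is hypothetical.

```python
def build_talk_back_text(partial_strings):
    """Join recognized partial character strings with single space characters."""
    return " ".join(partial_strings)


# Partial strings taken from the abstract's telephone-number example.
talk_back = build_talk_back_text(["024", "636", "0123"])
```

The spaces mark where the user paused, which a voice generating device could render as short pauses in the talk-back voice.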
-
Patent number: 7546237Abstract: A system extends the bandwidth of a narrowband speech signal into a wideband spectrum. The system includes a high-band generator that generates a high frequency spectrum based on a narrowband spectrum. A background noise generator generates a high frequency background noise spectrum based on a background noise within the narrowband spectrum. A summing circuit linked to the high-band generator and the background noise generator combines the high frequency spectrum and narrowband spectrum and the high frequency background noise spectrum.Type: GrantFiled: December 23, 2005Date of Patent: June 9, 2009Assignee: QNX Software Systems (Wavemakers), Inc.Inventors: Rajeev Nongpiur, Xueman Li, Phillip A. Hetherington
-
Publication number: 20090144053Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L>N>1). Each basis covers a different frequency band, with its maximum at a peak frequency, in the L-dimensional spectral domain. A value corresponding to a frequency outside the frequency band along a frequency axis of the spectral domain is zero. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of the bases, each weighted by a coefficient, by changing the coefficients, and sets the coefficients for which the distortion is minimized as the spectral envelope parameter of the spectral envelope information.Type: ApplicationFiled: December 3, 2008Publication date: June 4, 2009Applicant: KABUSHIKI KAISHA TOSHIBAInventors: Masatsune TAMURA, Katsumi TSUCHIYA, Takehiko KAGOSHIMA
-
Patent number: 7529672Abstract: A method of synthesizing a speech signal by providing a first speech unit signal having an end interval and a second speech unit signal having a front interval, wherein at least some of the periods of the end interval are appended in inverted order at the end of the first speech unit signal in order to provide a fade-out interval, and at least some of the periods of the front interval are appended in inverted order at the beginning of the second speech unit signal to provide a fade-in interval. An overlap and add operation is performed on the end and fade-in intervals and the fade-out and front intervals.Type: GrantFiled: August 8, 2003Date of Patent: May 5, 2009Assignee: Koninklijke Philips Electronics N.V.Inventor: Ercan Ferit Gigi
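The joining scheme in this abstract can be sketched as follows, under several assumptions: each unit is given as a list of pitch periods of equal length, "inverted order" is read as reversing the sequence of periods rather than the samples inside them, and the overlap-add uses linear cross-fade weights. The function names and the parameter `k` (the number of periods in each interval) are hypothetical.

```python
def flatten(periods):
    """Concatenate a list of pitch periods into one list of samples."""
    return [sample for period in periods for sample in period]


def join_units(first_periods, second_periods, k):
    """Join two speech units over an overlap of 2*k pitch periods."""
    if k < 1 or k > len(first_periods) or k > len(second_periods):
        raise ValueError("k must be between 1 and the number of periods in each unit")
    end = first_periods[-k:]       # end interval of the first unit
    front = second_periods[:k]     # front interval of the second unit
    fade_out = end[::-1]           # end periods appended in inverted order
    fade_in = front[::-1]          # front periods prepended in inverted order
    left = flatten(end + fade_out)
    right = flatten(fade_in + front)
    if len(left) != len(right) or len(left) < 2:
        raise ValueError("this sketch needs equal-length overlaps of at least 2 samples")
    last = len(left) - 1
    # Overlap-add with an assumed linear cross-fade from the first unit to the second.
    mixed = [(1 - i / last) * a + (i / last) * b for i, (a, b) in enumerate(zip(left, right))]
    return flatten(first_periods[:-k]) + mixed + flatten(second_periods[k:])


# Example: three 2-sample periods per unit, overlapping over k = 1 period each.
joined = join_units([[1.0, 1.0]] * 3, [[0.0, 0.0]] * 3, 1)
```

With these inputs the output has the same total length as the two units combined, because the appended fade intervals are absorbed by the overlap.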
-
Publication number: 20090112596Abstract: A system and method are disclosed for synthesizing speech based on a selected speech act. A method includes modifying synthesized speech of a spoken dialogue system, by (1) receiving a user utterance, (2) analyzing the user utterance to determine an appropriate speech act, and (3) generating a response of a type associated with the appropriate speech act, wherein linguistic variables in the response are selected based on the appropriate speech act.Type: ApplicationFiled: October 30, 2007Publication date: April 30, 2009Applicant: AT&T Lab, Inc.Inventors: Ann K. Syrdal, Mark Beutnagel, Alistair D. Conkie, Yeon-Jun Kim
-
Patent number: 7403894Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.Type: GrantFiled: March 15, 2005Date of Patent: July 22, 2008Assignee: Microsoft CorporationInventors: Yong Rui, Anoop Gupta, Alejandro Acero
-
Patent number: 7305068Abstract: A telephone call may be received or made by the user of telephony-enabled apparatus in circumstances, such as during a meeting, where spoken responses by the user to what the other party to the call has said are unacceptable. A telephony method and arrangement are disclosed which permits a user to use silent input to the telephony-enabled apparatus in order to generate a response to the other party to the call. Response generation is facilitated by enabling the user to effect a selection from the content of the other party's input, or from options derived from that input, with this selection then being used in forming the response.Type: GrantFiled: February 25, 2005Date of Patent: December 4, 2007Assignee: Hewlett-Packard Development Company, L.P.Inventors: Roger Cecil Ferry Tucker, Paul St John Brittan
-
Patent number: 7283961Abstract: A speech processing device is disclosed in which prediction taps for finding prediction values of speech of high sound quality are extracted from the synthesized sound obtained by supplying linear prediction coefficients and residual signals, generated from a preset code, to a speech synthesis filter, and in which the prediction taps are used along with preset tap coefficients to perform preset predictive calculations to find the prediction values of the speech of high sound quality. The speech of high sound quality is higher in sound quality than the synthesized sound.Type: GrantFiled: August 3, 2001Date of Patent: October 16, 2007Assignee: Sony CorporationInventors: Tetsujiro Kondo, Tsutomu Watanabe, Masaaki Hattori, Hiroto Kimura, Yasuhiro Fujimori
-
Patent number: 7272555Abstract: A method for speech processing in a code excitation linear prediction (CELP) based speech system having a plurality of modes including at least a first mode and a consecutive second mode. The method includes providing an input speech signal, dividing the speech signal into a plurality of frames, dividing at least one of the plurality of frames into sub-frames including a plurality of pulses, selecting a first number of pulses for the first mode, with a second number of remaining pulses in the frame plus the first number of pulses in the first mode for the second mode, providing a plurality of sub-modes between the first mode and the second mode, forming a base layer, forming an enhancement layer, generating a bit stream including a basic bit stream and an enhancement bit stream, wherein the basic bit stream is used to update memory states of the speech system.Type: GrantFiled: July 28, 2003Date of Patent: September 18, 2007Assignee: Industrial Technology Research InstituteInventors: I-Hsien Lee, Fang-Chu Chen
-
Patent number: 7269559Abstract: The present invention relates to a data processing apparatus capable of obtaining high-quality sound, etc. A tap generation section 121 generates a prediction tap from synthesized speech data for 40 samples in a subframe of subject data of interest within the synthesized speech data obtained by decoding speech coded data coded by a CELP method, and from synthesized speech data whose starting point is a position in the past from the subject subframe by a lag indicated by an L code located in that subject subframe. Then, a prediction section 125 decodes high-quality sound data by performing a predetermined prediction computation by using the prediction tap and a tap coefficient stored in a coefficient memory 124. The present invention can be applied to mobile phones for transmitting and receiving speech.Type: GrantFiled: January 24, 2002Date of Patent: September 11, 2007Assignee: Sony CorporationInventors: Tetsujiro Kondo, Hiroto Kimura, Tsutomu Watanabe, Masaaki Hattori
-
Patent number: 7257535Abstract: A system and method are provided for processing audio and speech signals using a pitch and voicing dependent spectral estimation algorithm (voicing algorithm) to accurately represent voiced speech, unvoiced speech, and mixed speech in the presence of background noise, and background noise with a single model. The present invention also modifies the synthesis model based on an estimate of the current input signal to improve the perceptual quality of the speech and background noise under a variety of input conditions. The present invention also improves the voicing dependent spectral estimation algorithm robustness by introducing the use of a Multi-Layer Neural Network in the estimation process. The voicing dependent spectral estimation algorithm provides an accurate and robust estimate of the voicing probability under a variety of background noise conditions. This is essential to providing high quality intelligible speech in the presence of background noise.Type: GrantFiled: October 28, 2005Date of Patent: August 14, 2007Assignee: Lucent Technologies Inc.Inventors: Joseph Gerard Aguilar, Juin-Hwey Chen, Wei Wang, Robert W. Zopf
-
Sound encoding method and sound decoding method, and sound encoding device and sound decoding device
Patent number: 7092885Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding, which perform compression coding and decoding of a speech signal to a digital signal. In a speech coding method based on code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period of interest is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used based on the evaluation result.Type: GrantFiled: December 7, 1998Date of Patent: August 15, 2006Assignee: Mitsubishi Denki Kabushiki KaishaInventor: Tadashi Yamaura