Excitation (epo) Patents (Class 704/E13.007)
  • Patent number: 11848005
    Abstract: There is provided a computer-implemented method of training a speech-to-speech (S2S) machine learning (ML) model for adapting at least one voice attribute of speech, comprising: creating an S2S training dataset of a plurality of S2S records, wherein an S2S record comprises: a first audio content comprising speech having at least one first voice attribute, and a ground truth label of a second audio content comprising speech having at least one second voice attribute, wherein the first audio content and the second audio content have the same lexical content and are time-synchronized, and training the S2S ML model using the S2S training dataset, wherein the S2S ML model is fed an input of a source audio content with at least one source voice attribute and generates an outcome of the source audio content with at least one target voice attribute.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: December 19, 2023
    Assignee: Meaning.Team, Inc.
    Inventors: Yishay Carmiel, Lukasz Wojciak, Piotr Zelasko, Jan Vainer, Tomas Nekvinda, Ondrej Platek
  • Patent number: 11646044
    Abstract: A method obtains a first sound signal representative of a first sound, including a first spectrum envelope contour and a first reference spectrum envelope contour; obtains a second sound signal, representative of a second sound differing in sound characteristics from the first sound, including a second spectrum envelope contour and a second reference spectrum envelope contour; generates a synthesis spectrum envelope contour by transforming the first spectrum envelope contour based on a first difference between the first spectrum envelope contour and the first reference spectrum envelope contour at a first time point of the first sound signal, and a second difference between the second spectrum envelope contour and the second reference spectrum envelope contour at a second time point of the second sound signal; and generates a third sound signal representative of the first sound that has been transformed using the generated synthesis spectrum envelope contour.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: May 9, 2023
    Assignee: YAMAHA CORPORATION
    Inventors: Ryunosuke Daido, Hiraku Kayama
  • Patent number: 11636850
    Abstract: A method and system for performing real-time sentiment modulation in conversation systems are disclosed. The method includes generating an impact table comprising a plurality of sentiment vectors and a plurality of emotion vectors associated with a plurality of sentences. The method further includes generating, for each of the plurality of sentences, a dependency vector based on the associated sentiment vector and the associated emotion vector. The method further includes stacking the generated dependency vectors to generate a waveform representing variance in sentiment and emotions across words within the plurality of sentences. The method further includes altering at least one portion of the waveform based on a desired emotional output to generate a reshaped waveform. The method further includes generating a set of rephrased sentences associated with the at least one portion, based on the reshaped waveform, the set of sentences, and a user-defined sentiment output.
    Type: Grant
    Filed: July 24, 2020
    Date of Patent: April 25, 2023
    Assignee: Wipro Limited
    Inventor: Manjunath Ramachandra Iyer
  • Publication number: 20120123782
    Abstract: The present invention relates to a method for coding an excitation signal of a target speech, comprising the steps of: extracting, from a set of training normalised residual frames, a set of relevant normalised residual frames, said training residual frames being extracted from a training speech, synchronised on Glottal Closure Instants (GCI), and pitch and energy normalised; determining the target excitation signal of the target speech; dividing said target excitation signal into GCI-synchronised target frames; determining the local pitch and energy of the GCI-synchronised target frames; normalising the GCI-synchronised target frames in both energy and pitch to obtain target normalised residual frames; determining coefficients of a linear combination of said extracted set of relevant normalised residual frames to build synthetic normalised residual frames close to each target normalised residual frame; wherein the coding parameters for each target residual frame comprise the determined coefficients.
    Type: Application
    Filed: March 30, 2010
    Publication date: May 17, 2012
    Inventors: Geoffrey Wilfart, Thomas Drugman, Thierry Dutoit
  • Publication number: 20110022391
    Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. The apparatus includes a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. With the comfort noise synthesis scheme of the present invention, the transition of a synthesized signal from speech to background noise can be more natural, smooth, and continuous, which is more comfortable for listeners.
    Type: Application
    Filed: September 21, 2010
    Publication date: January 27, 2011
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jinliang DAI, Libin ZHANG, Eyal SHLOMOT, Lin WANG
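
The weighted sum described in the abstract above can be sketched as a crossfade from the quasi excitation to the random noise excitation. This is only an illustration: the function name `transition_excitation`, the linear ramp, and the sample values are assumptions, since the abstract specifies a weighted sum over a transition length but not the weighting curve.

```python
import random


def transition_excitation(quasi, noise, transition_len):
    """Blend a quasi excitation into a random noise excitation.

    The linear ramp is an assumption; the abstract only states that a
    weighted sum of the two signals is formed during the transition stage.
    """
    out = []
    for n, (q, r) in enumerate(zip(quasi, noise)):
        # Weight on the noise excitation grows from 0 to 1 over the transition.
        w = min(n / transition_len, 1.0)
        out.append((1.0 - w) * q + w * r)
    return out


# Hypothetical inputs: a constant "speech-like" excitation and low-level noise.
rng = random.Random(0)
quasi = [1.0] * 8
noise = [rng.uniform(-0.1, 0.1) for _ in range(8)]
blended = transition_excitation(quasi, noise, transition_len=4)
```

In this sketch the first sample is pure quasi excitation, and every sample from index `transition_len` onward is pure noise excitation.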
  • Publication number: 20110015931
    Abstract: The invention relates to a periodic signal processing method, a periodic signal conversion method, and a periodic signal processing device capable of reducing the influence of periodicity without using a spectral model. Time windows are arranged such that the center of each time window is at a division position which divides a fundamental frequency in a temporal direction into fractions 1/n (where n is an integer equal to or larger than 2), so as to extract a plurality of portions of different ranges from a signal having periodicity. A power spectrum is calculated for each of the portions extracted by the respective time windows, and the calculated power spectra are summed with equal weights.
    Type: Application
    Filed: July 18, 2008
    Publication date: January 20, 2011
    Inventors: Hideki Kawahara, Masanori Morise, Toru Takahashi, Toshio Irino
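
A minimal sketch of the windowing idea in the periodic signal processing abstract above, under stated assumptions: the division positions are read as offsets of `period / n_div` samples (a fraction of the fundamental period), the window is a Hann window two periods long, and the equal-ratio sum is normalised to a mean. The function names and the impulse-train test signal are hypothetical, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum (O(N^2)); adequate for short frames."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def averaged_power_spectrum(signal, center, period, n_div=2):
    """Average power spectra from n_div windows spaced period / n_div apart."""
    win_len = 2 * period  # assumed window length
    half = win_len // 2
    # Hann window (an assumed choice; the abstract does not name a window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / win_len)
              for i in range(win_len)]
    acc = [0.0] * (win_len // 2 + 1)
    for j in range(n_div):
        c = center + round(j * period / n_div)
        frame = [signal[c - half + i] * window[i] for i in range(win_len)]
        for k, p in enumerate(power_spectrum(frame)):
            acc[k] += p / n_div  # equal weight for every window
    return acc


# Hypothetical periodic signal: an impulse train with a 16-sample period.
period = 16
signal = [1.0 if t % period == 0 else 0.0 for t in range(128)]
spec_a = averaged_power_spectrum(signal, center=64, period=period)
spec_b = averaged_power_spectrum(signal, center=67, period=period)
```

For this impulse train, `spec_a` and `spec_b` agree to within rounding error even though the analysis position differs by three samples, which is the kind of insensitivity to window placement the abstract aims for.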
  • Publication number: 20100049522
    Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech; stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into the voice quality of the target speech; extracts, from an input source speech, a source speech spectral parameter; converts the extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule; selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory; generates an aperiodic component spectral parameter from the selected target speech spectral parameter; mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter; and generates a speech waveform from the second conversion spectral parameter.
    Type: Application
    Filed: July 20, 2009
    Publication date: February 25, 2010
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Publication number: 20090144053
    Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L>N>1). Each basis covers a different frequency band, with its maximum at a peak frequency, in the L-dimensional spectral domain. The value of a basis at any frequency outside its frequency band along the frequency axis of the spectral domain is zero. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes the distortion between the spectral envelope information and a linear combination of the bases by changing the coefficient of each basis, and sets the coefficients that minimize the distortion as the spectral envelope parameters of the spectral envelope information.
    Type: Application
    Filed: December 3, 2008
    Publication date: June 4, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune TAMURA, Katsumi TSUCHIYA, Takehiko KAGOSHIMA
  • Publication number: 20080004861
    Abstract: A system and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources is disclosed. Propagating wave electromagnetic sensors monitor excitation sources in sound producing systems, such as machines, musical instruments, and various other structures. Acoustical output from these sound producing systems is also monitored. From such information, a transfer function characterizing the sound producing system is generated. From the transfer function, acoustical output from the sound producing system may be synthesized or canceled. The methods disclosed enable accurate calculation of matched transfer functions relating specific excitations to specific acoustical outputs. Knowledge of such signals and functions can be used to effect various sound replication, sound source identification, and sound cancellation applications.
    Type: Application
    Filed: September 6, 2007
    Publication date: January 3, 2008
    Inventors: John Holzrichter, Greg Burnett, Lawrence Ng
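
The transfer-function idea in the last abstract above (publication 20080004861) can be illustrated with a minimal frequency-domain sketch. Everything here is an assumption made for illustration: the sound-producing system is replaced by a short circular FIR filter, the excitation and output are short sample lists rather than sensor measurements, and the transfer function is estimated as a regularised ratio of DFTs. The helper names are hypothetical and not taken from the publication.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(x)) for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(v * cmath.exp(2j * math.pi * k * t / n)
                 for k, v in enumerate(spec)) / n).real for t in range(n)]


def estimate_transfer_function(excitation, output, eps=1e-12):
    """Estimate H(k) = Y(k) / X(k), regularised to avoid division by zero."""
    X, Y = dft(excitation), dft(output)
    return [y * x.conjugate() / (abs(x) ** 2 + eps) for x, y in zip(X, Y)]


def synthesize(transfer, excitation):
    """Predict the acoustic output for a new excitation."""
    return idft([h * x for h, x in zip(transfer, dft(excitation))])


def circular_filter(x, taps):
    """Stand-in for the physical system (an assumption of this sketch)."""
    n = len(x)
    return [sum(taps[m] * x[(t - m) % n] for m in range(len(taps)))
            for t in range(n)]


taps = [1.0, -0.6, 0.2]                       # hypothetical system response
x_train = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
H = estimate_transfer_function(x_train, circular_filter(x_train, taps))

x_new = [0.0, 1.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0]
y_new = circular_filter(x_new, taps)          # "measured" output
y_hat = synthesize(H, x_new)                  # synthesized output
residual = [a - b for a, b in zip(y_new, y_hat)]  # cancellation result
```

Subtracting the synthesized output from the measured output leaves a residual near zero in this idealised case. Real systems add noise and nonlinearity, so the estimate would normally be averaged over many frames.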