Excitation (EPO) Patents (Class 704/E13.007)

Upsampling of audio using generative adversarial networks

Patent number: 12170096

Abstract: Introduced here are approaches to training and then employing computer-implemented models designed to upsample discrete audio signals to higher sampling rates. Assume, for example, that a media production platform obtains a first discrete signal at a relatively low sampling rate. The relatively low sampling frequency may make the first discrete audio signal unsuitable for inclusion in media compilations, so the media production platform may attempt to improve its quality through upsampling. To accomplish this, the media production platform can apply a transform to the first discrete signal to produce a first magnitude spectrogram. Then, the media production platform can apply a computer-implemented model to the first magnitude spectrogram to produce a second magnitude spectrogram. Thereafter, the media production platform can apply an inverse transform to the second magnitude spectrogram to create a second discrete signal that has a higher sampling rate than the first discrete audio signal.
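
The data flow in this abstract (transform, magnitude spectrogram, model, inverse transform) can be illustrated on a single frame. The following is a minimal sketch, not the patented method: `placeholder_model` is a hypothetical stand-in that only widens the magnitude frame and leaves the new high-frequency bins at zero, where a trained generative model would predict content. Reusing the input phase is also an assumption, since the abstract does not say how phase is obtained. All function names are illustrative, and the input frame length is assumed to be even.

```python
import cmath
import math

def half_spectrum(frame):
    """Transform step: DFT bins 0..N/2 of a real frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def placeholder_model(magnitudes, factor):
    """Hypothetical stand-in for the trained model.

    It widens the magnitude frame to the bin count of the higher sampling
    rate and leaves every new high-frequency bin at zero.
    """
    n_in = 2 * (len(magnitudes) - 1)
    n_out = n_in * factor
    return magnitudes + [0.0] * (n_out // 2 + 1 - len(magnitudes))

def inverse_half_spectrum(magnitudes, phases, gain):
    """Inverse transform: rebuild a real frame from bins 0..M/2."""
    m = 2 * (len(magnitudes) - 1)
    bins = [mag * cmath.exp(1j * ph) for mag, ph in zip(magnitudes, phases)]
    out = []
    for t in range(m):
        # DC and Nyquist bins appear once; all other bins appear twice.
        acc = bins[0].real + bins[-1].real * (-1) ** t
        for k in range(1, m // 2):
            acc += 2.0 * (bins[k] * cmath.exp(2j * math.pi * k * t / m)).real
        out.append(gain * acc / m)
    return out

def upsample_frame(frame, factor=2):
    """Upsample one even-length frame by an integer factor >= 2."""
    if factor < 2 or len(frame) % 2 != 0:
        raise ValueError("expects an even-length frame and factor >= 2")
    spec = half_spectrum(frame)
    mags = [abs(c) for c in spec]
    phases = [cmath.phase(c) for c in spec]  # assumption: reuse input phase
    wide_mags = placeholder_model(mags, factor)
    # The old Nyquist bin becomes an ordinary (doubled) bin, so halve it.
    wide_mags[len(mags) - 1] *= 0.5
    wide_phases = phases + [0.0] * (len(wide_mags) - len(phases))
    return inverse_half_spectrum(wide_mags, wide_phases, gain=factor)
```

With this placeholder the result is ordinary band-limited interpolation: every `factor`-th output sample reproduces the corresponding input sample, and no new high-frequency content is created.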

Type: Grant

Filed: September 17, 2021

Date of Patent: December 17, 2024

Assignee: Descript, Inc.

Inventors: Rithesh Kumar, Kundan Kumar

Voice attribute conversion using speech to speech

Patent number: 11848005

Abstract: There is provided a computer-implemented method of training a speech-to-speech (S2S) machine learning (ML) model for adapting at least one voice attribute of speech, comprising: creating an S2S training dataset of a plurality of S2S records, wherein an S2S record comprises: a first audio content comprising speech having at least one first voice attribute, and a ground truth label of a second audio content comprising speech having at least one second voice attribute, wherein the first audio content and the second audio content have the same lexical content and are time-synchronized, and training the S2S ML model using the S2S training dataset, wherein the S2S ML model is fed an input of a source audio content with at least one source voice attribute and generates an outcome of the source audio content with at least one target voice attribute.

Type: Grant

Filed: April 28, 2022

Date of Patent: December 19, 2023

Assignee: Meaning.Team, Inc

Inventors: Yishay Carmiel, Lukasz Wojciak, Piotr Zelasko, Jan Vainer, Tomas Nekvinda, Ondrej Platek

Sound processing method, sound processing apparatus, and recording medium

Patent number: 11646044

Abstract: A method obtains a first sound signal representative of a first sound, including a first spectrum envelope contour and a first reference spectrum envelope contour; obtains a second sound signal, representative of a second sound differing in sound characteristics from the first sound, including a second spectrum envelope contour and a second reference spectrum envelope contour; generates a synthesis spectrum envelope contour by transforming the first spectrum envelope contour based on a first difference between the first spectrum envelope contour and the first reference spectrum envelope contour at a first time point of the first sound signal, and a second difference between the second spectrum envelope contour and the second reference spectrum envelope contour at a second time point of the second sound signal; and generates a third sound signal representative of the first sound that has been transformed using the generated synthesis spectrum envelope contour.

Type: Grant

Filed: September 8, 2020

Date of Patent: May 9, 2023

Assignee: YAMAHA CORPORATION

Inventors: Ryunosuke Daido, Hiraku Kayama

Method, system, and device for performing real-time sentiment modulation in conversation systems

Patent number: 11636850

Abstract: A method and system for performing real-time sentiment modulation in conversation systems is disclosed. The method includes generating an impact table comprising a plurality of sentiment vectors and a plurality of emotion vectors associated with the plurality of sentences. The method further includes generating for each of the plurality of sentences, a dependency vector based on the associated sentiment vector and the associated emotion vector. The method further includes stacking the dependency vector generated to generate a waveform representing variance in sentiment and emotions across words within the plurality of sentences. The method further includes altering at least one portion of the waveform based on a desired emotional output to generate a reshaped waveform. The method further includes generating a set of rephrased sentences associated with the at least one portion, based on the reshaped waveform, the set of sentences, and a user-defined sentiment output.

Type: Grant

Filed: July 24, 2020

Date of Patent: April 25, 2023

Assignee: Wipro Limited

Inventor: Manjunath Ramachandra Iyer

SPEECH SYNTHESIS AND CODING METHODS

Publication number: 20120123782

Abstract: The present invention is related to a method for coding excitation signal of a target speech comprising the steps of: extracting from a set of training normalised residual frames, a set of relevant normalised residual frames, said training residual frames being extracted from a training speech, synchronised on Glottal Closure Instant (GCI), pitch and energy normalised; determining the target excitation signal of the target speech; dividing said target excitation signal into GCI synchronised target frames; determining the local pitch and energy of the GCI synchronised target frames; normalising the GCI synchronised target frames in both energy and pitch, to obtain target normalised residual frames; determining coefficients of linear combination of said extracted set of relevant normalised residual frames to build synthetic normalised residual frames close to each target normalised residual frames; wherein the coding parameters for each target residual frames comprise the determined coefficients.
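
The final coding step in this abstract, finding coefficients of a linear combination of reference frames that approximates a normalised target frame, can be sketched as follows. This is an illustration under a simplifying assumption that is not stated in the abstract: the set of relevant frames is treated as orthonormal, so the least-squares coefficients reduce to dot products. The function names are hypothetical, and pitch normalisation (resampling each frame to a common length) is omitted.

```python
import math

def dot(a, b):
    """Inner product of two equal-length frames."""
    return sum(x * y for x, y in zip(a, b))

def normalise_energy(frame):
    """Scale a frame to unit energy (sum of squares equal to 1)."""
    norm = math.sqrt(dot(frame, frame))
    if norm == 0.0:
        raise ValueError("cannot normalise a silent frame")
    return [v / norm for v in frame]

def encode_frame(target, basis):
    """Coefficients of the linear combination closest to `target`.

    Assumes `basis` is orthonormal; a general basis would need a
    least-squares solve instead of plain dot products.
    """
    return [dot(target, b) for b in basis]

def decode_frame(coeffs, basis):
    """Rebuild the synthetic frame from its coefficients."""
    length = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(length)]
```

In use, a target frame would first pass through `normalise_energy`, then `encode_frame`; the resulting coefficients are the coding parameters, and `decode_frame` rebuilds the synthetic frame from them.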

Type: Application

Filed: March 30, 2010

Publication date: May 17, 2012

Inventors: Geoffrey Wilfart, Thomas Drugman, Thierry Dutoit

METHOD AND APPARATUS FOR GENERATING AN EXCITATION SIGNAL FOR BACKGROUND NOISE

Publication number: 20110022391

Abstract: A method and apparatus for generating an excitation signal for background noise are provided. The method includes: generating a quasi excitation signal by utilizing coding parameters in a speech coding/decoding stage and a transition length of an excitation signal; and obtaining the excitation signal for background noise in a transition stage by generating a weighted sum of the quasi excitation signal and a random excitation signal of a background noise frame. Moreover, the apparatus includes: a quasi excitation signal generation unit and a transition stage excitation signal acquisition unit. Through the synthesizing scheme of comfortable background noise according to the present invention, the transition of a synthesized signal from speech to background noise could be more natural, smooth and continuous, which makes the listeners feel more comfortable.
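
The weighted sum described in this abstract can be sketched as a crossfade over the transition length. The linear ramp below is an assumption, since the abstract says only "weighted sum" and does not give the weights; the function name and parameters are illustrative.

```python
def transition_excitation(quasi, noise, transition_len):
    """Blend a quasi excitation into a random background-noise excitation.

    Assumption: the noise weight rises linearly from 0 to 1 across
    `transition_len` samples and stays at 1 afterwards.
    """
    if len(quasi) != len(noise):
        raise ValueError("signals must have the same length")
    out = []
    for n, (q, r) in enumerate(zip(quasi, noise)):
        if transition_len > 0:
            w = min(n / transition_len, 1.0)
        else:
            w = 1.0  # no transition: use the noise excitation directly
        out.append((1.0 - w) * q + w * r)
    return out
```

The first output sample equals the quasi excitation and every sample from `transition_len` onward equals the random excitation, so the output moves gradually from the speech-derived excitation to the noise excitation.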

Type: Application

Filed: September 21, 2010

Publication date: January 27, 2011

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jinliang DAI, Libin ZHANG, Eyal SHLOMOT, Lin WANG

PERIODIC SIGNAL PROCESSING METHOD, PERIODIC SIGNAL CONVERSION METHOD, PERIODIC SIGNAL PROCESSING DEVICE, AND PERIODIC SIGNAL ANALYSIS METHOD

Publication number: 20110015931

Abstract: The invention relates to a periodic signal processing method, a periodic signal conversion method, and a periodic signal processing device capable of reducing the influence of periodicity without using a spectral model. Time windows are arranged such that a center of each of the time windows is at a division position which divides a fundamental frequency in a temporal direction into fractions 1/n (where n is an integer equal to or larger than 2) so as to extract a plurality of portions of different ranges from a signal having periodicity. A power spectrum for the plurality of portions extracted by the respective time windows is calculated, and the calculated power spectrum is added with a same ratio.
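
The windowing scheme in this abstract can be sketched as follows: n windows are placed at offsets that divide one fundamental period into n equal parts, a power spectrum is computed for each, and the spectra are averaged with equal weight. This is a rough illustration that reads "fundamental frequency in a temporal direction" as the fundamental period in samples, which is an assumption. The Hann window, the default window length of two periods, and all function names are also assumptions.

```python
import cmath
import math

def power_spectrum(segment):
    """Squared DFT magnitude for bins 0..N/2 of a real segment."""
    n = len(segment)
    return [abs(sum(segment[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def hann(length):
    """Hann window; the window shape is an assumption of this sketch."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]

def averaged_power_spectrum(signal, start, period, n=2, win_len=None):
    """Average n power spectra taken at offsets of period/n samples.

    `start` is the first sample of the first window. Shifting every
    window start by the same amount shifts the window centres equally.
    """
    win_len = win_len or 2 * period
    window = hann(win_len)
    total = [0.0] * (win_len // 2 + 1)
    for j in range(n):
        offset = start + (j * period) // n
        segment = [signal[offset + i] * window[i] for i in range(win_len)]
        for k, p in enumerate(power_spectrum(segment)):
            total[k] += p / n  # equal weight for every window
    return total
```

Setting `n=1` reduces this to a single windowed power spectrum, which gives a baseline for comparing how much the result depends on where the window is placed.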

Type: Application

Filed: July 18, 2008

Publication date: January 20, 2011

Inventors: Hideki Kawahara, Masanori Morise, Toru Takahashi, Toshio Irino

VOICE CONVERSION APPARATUS AND METHOD AND SPEECH SYNTHESIS APPARATUS AND METHOD

Publication number: 20100049522

Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.

Type: Application

Filed: July 20, 2009

Publication date: February 25, 2010

Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima

SPEECH PROCESSING APPARATUS AND SPEECH SYNTHESIS APPARATUS

Publication number: 20090144053

Abstract: An information extraction unit extracts spectral envelope information of L-dimension from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L>N>1). Each basis is differently a frequency band having a maximum as a peak frequency in a spectral domain having L-dimension. A value corresponding to a frequency outside the frequency band along a frequency axis of the spectral domain is zero. Two frequency bands of which two peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of each basis with a coefficient by changing the coefficient, and sets the coefficient of each basis from which the distortion is minimized to a spectral envelope parameter of the spectral envelope information.

Type: Application

Filed: December 3, 2008

Publication date: June 4, 2009

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Katsumi TSUCHIYA, Takehiko KAGOSHIMA

System and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources

Publication number: 20080004861

Abstract: A system and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources is disclosed. Propagating wave electromagnetic sensors monitor excitation sources in sound producing systems, such as machines, musical instruments, and various other structures. Acoustical output from these sound producing systems is also monitored. From such information, a transfer function characterizing the sound producing system is generated. From the transfer function, acoustical output from the sound producing system may be synthesized or canceled. The methods disclosed enable accurate calculation of matched transfer functions relating specific excitations to specific acoustical outputs. Knowledge of such signals and functions can be used to effect various sound replication, sound source identification, and sound cancellation applications.
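
The idea of deriving a transfer function from a measured excitation and a measured acoustic output, then using it to synthesize or cancel the output, can be sketched in the frequency domain as H[k] = Y[k] / X[k]. This is an illustrative toy under stated assumptions, not the method claimed in the application: the system is modelled as a circular convolution, measurements are noise-free, and all function names are hypothetical.

```python
import cmath

def dft(x):
    """Discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def estimate_transfer_function(excitation, output, eps=1e-12):
    """Per-bin ratio of output spectrum to excitation spectrum."""
    X, Y = dft(excitation), dft(output)
    # Bins with no excitation energy carry no information; set them to 0.
    return [y / x if abs(x) > eps else 0j for x, y in zip(X, Y)]

def synthesize(excitation, H):
    """Predict the acoustic output for a new excitation."""
    X = dft(excitation)
    return [v.real for v in idft([h * x for h, x in zip(H, X)])]

def cancel(observed, predicted):
    """Residual left after subtracting the predicted output."""
    return [o - p for o, p in zip(observed, predicted)]

def circular_convolve(x, h):
    """Toy stand-in for the physical sound-producing system."""
    n = len(x)
    return [sum(x[(t - m) % n] * h[m] for m in range(len(h))) for t in range(n)]
```

In use, `estimate_transfer_function` would be fed one measured excitation/output pair; `synthesize` then predicts the output for a different excitation, and `cancel` shows how much of an observed output the prediction removes.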

Type: Application

Filed: September 6, 2007

Publication date: January 3, 2008

Inventors: John Holzrichter, Greg Burnett, Lawrence Ng