Patents by Inventor Byung Ha Chun
Byung Ha Chun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10249289
Abstract: Methods, systems, and computer-readable media for text-to-speech synthesis using an autoencoder. In some implementations, data indicating a text for text-to-speech synthesis is obtained. Data indicating a linguistic unit of the text is provided as input to an encoder. The encoder is configured to output speech unit representations indicative of acoustic characteristics based on linguistic information. A speech unit representation that the encoder outputs is received. A speech unit is selected to represent the linguistic unit, the speech unit being selected from among a collection of speech units based on the speech unit representation output by the encoder. Audio data for a synthesized utterance of the text that includes the selected speech unit is provided.
Type: Grant
Filed: July 13, 2017
Date of Patent: April 2, 2019
Assignee: Google LLC
Inventors: Byung Ha Chun, Javier Gonzalvo, Chun-an Chan, Ioannis Agiomyrgiannakis, Vincent Ping Leung Wan, Robert Andrew James Clark, Jakub Vit
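The unit-selection step described in the abstract above can be illustrated with a toy sketch. This is not the patented method: the encoder output, the two-dimensional embeddings, and the `unit_17`/`unit_42` database entries are all hypothetical, chosen only to show the idea of picking a stored speech unit whose representation is closest to the encoder's output.

```python
import math

def select_speech_unit(unit_embedding, candidate_units):
    """Pick the stored speech unit whose embedding is closest
    (Euclidean distance) to the encoder's output for a linguistic unit."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(candidate_units, key=lambda u: dist(u["embedding"], unit_embedding))

# Hypothetical encoder output for one linguistic unit, and a toy unit database.
encoder_output = [0.9, 0.1]
database = [
    {"id": "unit_17", "embedding": [1.0, 0.0]},
    {"id": "unit_42", "embedding": [0.0, 1.0]},
]
best = select_speech_unit(encoder_output, database)
# best["id"] is "unit_17", the nearer embedding
```

In a real system the embeddings would be learned jointly with the encoder and the collection would hold many thousands of recorded units; the nearest-neighbour search here stands in for that selection.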
-
Publication number: 20180268806
Abstract: Methods, systems, and computer-readable media for text-to-speech synthesis using an autoencoder. In some implementations, data indicating a text for text-to-speech synthesis is obtained. Data indicating a linguistic unit of the text is provided as input to an encoder. The encoder is configured to output speech unit representations indicative of acoustic characteristics based on linguistic information. A speech unit representation that the encoder outputs is received. A speech unit is selected to represent the linguistic unit, the speech unit being selected from among a collection of speech units based on the speech unit representation output by the encoder. Audio data for a synthesized utterance of the text that includes the selected speech unit is provided.
Type: Application
Filed: July 13, 2017
Publication date: September 20, 2018
Inventors: Byung Ha Chun, Javier Gonzalvo, Chun-an Chan, Ioannis Agiomyrgiannakis, Vincent Ping Leung Wan, Robert Andrew James Clark, Jakub Vit
-
Patent number: 9922641
Abstract: The subject matter of the disclosure is embodied in a method that includes receiving input speech data from a speaker in a first language, and estimating, based on a universal speech model, a speaker transform representing speaker characteristics associated with the input speech data. The method also includes accessing a speaker-independent speech model for generating speech data in a second language that is different from the first language. The method further includes modifying the speaker-independent speech model using the speaker transform to obtain a speaker-specific speech model, and generating speech data in the second language using the speaker-specific speech model.
Type: Grant
Filed: October 31, 2012
Date of Patent: March 20, 2018
Assignee: Google LLC
Inventor: Byung Ha Chun
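A minimal sketch of the idea above, under strong simplifying assumptions: the "speaker transform" is reduced to a per-dimension mean offset between the speaker's frames and a universal model, and "modifying the speaker-independent model" becomes adding that offset to the target-language model means. Real systems would use richer transforms (e.g. full linear transforms over Gaussian parameters); the numbers are toy data.

```python
def estimate_speaker_shift(speaker_frames, universal_mean):
    """Simplified 'speaker transform': a per-dimension offset between the
    mean of the speaker's frames and a universal speech model's mean."""
    n = len(speaker_frames)
    dims = len(universal_mean)
    speaker_mean = [sum(f[d] for f in speaker_frames) / n for d in range(dims)]
    return [speaker_mean[d] - universal_mean[d] for d in range(dims)]

def adapt_model(model_means, shift):
    """Apply the speaker shift to a speaker-independent model for the
    second language, yielding a speaker-specific model."""
    return [[m + s for m, s in zip(mean, shift)] for mean in model_means]

universal_mean = [0.0, 0.0]
speaker_frames = [[1.0, 2.0], [3.0, 4.0]]      # toy feature frames, language A
shift = estimate_speaker_shift(speaker_frames, universal_mean)   # [2.0, 3.0]
target_lang_means = [[0.5, 0.5]]               # speaker-independent, language B
adapted = adapt_model(target_lang_means, shift)  # [[2.5, 3.5]]
```

The point the abstract makes is that the transform is estimated once, in the first language, and then reused to personalise a model for a language the speaker never provided data in.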
-
Patent number: 9865247
Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
Type: Grant
Filed: February 25, 2015
Date of Patent: January 9, 2018
Assignee: Google Inc.
Inventors: Ioannis Agiomyrgiannakis, Byung Ha Chun
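Why a circular representation for phase at all? Phase is an angle, so −π and +π are the same point, yet as scalars they differ by about 2π. A common remedy, sketched below as an illustration rather than as the patented representation, is to embed each phase value on the unit circle as (cos θ, sin θ), where nearby angles stay nearby regardless of wrap-around.

```python
import math

def phase_to_circular(phase):
    """Embed a phase angle on the unit circle, so values near -pi and +pi,
    which differ hugely as scalars, map to nearly the same point."""
    return (math.cos(phase), math.sin(phase))

def circular_distance(p, q):
    """Euclidean distance between two phases in the circular embedding."""
    return math.dist(phase_to_circular(p), phase_to_circular(q))

# Scalar difference ~2*pi, but on the circle the points almost coincide:
near_pi = circular_distance(math.pi - 0.01, -math.pi + 0.01)   # tiny
opposite = circular_distance(0.0, math.pi)                      # 2.0
```

This makes phase usable as an ordinary feature in statistical models, which is what allows the mapping to linguistic features described in the abstract.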
-
Patent number: 9043213
Abstract: A speech recognition method including the steps of receiving a speech input from a known speaker of a sequence of observations and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. The acoustic model has a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation and has been trained using first training data and adapted using second training data to said speaker. The speech recognition method also determines the likelihood of a sequence of observations occurring in a given language using a language model and combines the likelihoods determined by the acoustic model and the language model and outputs a sequence of words identified from said speech input signal. The acoustic model is context based for the speaker, the context based information being contained in the model using a plurality of decision trees and the structure of the decision trees is based on second training data.
Type: Grant
Filed: January 26, 2011
Date of Patent: May 26, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventor: Byung Ha Chun
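The "combines the likelihoods" step is standard in speech recognition and easy to sketch: acoustic and language-model log-likelihoods are added, usually with a language-model weight, and the best-scoring word sequence wins. The candidate hypotheses and scores below are invented for illustration; nothing here reproduces the decision-tree adaptation that is the patent's actual contribution.

```python
def decode(hypotheses, lm_weight=1.0):
    """Combine acoustic and language-model log-likelihoods for each
    candidate word sequence and return the best-scoring hypothesis."""
    def score(h):
        return h["acoustic_logp"] + lm_weight * h["lm_logp"]
    return max(hypotheses, key=score)

# Toy lattice: acoustically similar word sequences with different LM scores.
candidates = [
    {"words": "recognise speech",    "acoustic_logp": -10.0, "lm_logp": -2.0},
    {"words": "wreck a nice beach",  "acoustic_logp": -9.5,  "lm_logp": -6.0},
]
best = decode(candidates, lm_weight=0.5)
# combined scores: -11.0 vs -12.5, so "recognise speech" wins
```

Tuning `lm_weight` trades off trust in the audio against trust in the language prior; with `lm_weight=0` the acoustically better "wreck a nice beach" would win instead.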
-
Patent number: 8930183
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Grant
Filed: August 25, 2011
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
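One simple way to use per-frame kernel similarities for voice conversion, shown here as an illustrative kernel-regression sketch rather than the patented mapping, is to weight stored target-voice frames by a Gaussian kernel between the input frame and the paired source-voice training frames, then average. All frames below are two-dimensional toy features.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Similarity between two feature frames (1.0 when identical)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * bandwidth ** 2))

def convert_frame(frame, source_frames, target_frames, bandwidth=1.0):
    """Map one source-voice frame to the target voice as a kernel-weighted
    average of stored target-voice frames."""
    weights = [gaussian_kernel(frame, s, bandwidth) for s in source_frames]
    total = sum(weights)
    dims = len(target_frames[0])
    return [sum(w * t[d] for w, t in zip(weights, target_frames)) / total
            for d in range(dims)]

src = [[0.0, 0.0], [10.0, 10.0]]   # stored source-voice training frames
tgt = [[1.0, 1.0], [5.0, 5.0]]     # paired target-voice frames
out = convert_frame([0.0, 0.0], src, tgt)
# the input matches the first source frame, so out is ~[1.0, 1.0]
```

Because the weights depend only on similarity to stored source frames, the training data need not contain the same text as the input, which matches the text-independence the abstract emphasises.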
-
Patent number: 8825485
Abstract: A text-to-speech method for use in a plurality of languages, including: inputting text in a selected language; dividing the inputted text into a sequence of acoustic units; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein the model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio in the selected language. A parameter of a predetermined type of each probability distribution in the selected language is expressed as a weighted sum of language independent parameters of the same type. The weighting used is language dependent, such that converting the sequence of acoustic units to a sequence of speech vectors includes retrieving the language dependent weights for the selected language.
Type: Grant
Filed: June 10, 2009
Date of Patent: September 2, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Sacha Krstulovic
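The "weighted sum of language independent parameters" can be sketched directly. In this toy example, hypothetical shared cluster means are combined with per-language weight vectors to produce each language's distribution mean; the cluster values and weights are invented for illustration.

```python
def language_mean(cluster_means, language_weights):
    """A language-specific distribution mean expressed as a weighted sum
    of shared, language-independent cluster means."""
    dims = len(cluster_means[0])
    return [sum(w * mu[d] for w, mu in zip(language_weights, cluster_means))
            for d in range(dims)]

clusters = [[1.0, 0.0], [0.0, 1.0]]              # language-independent parameters
weights = {"en": [0.8, 0.2], "fr": [0.3, 0.7]}   # language-dependent weights
en_mean = language_mean(clusters, weights["en"])  # [0.8, 0.2]
fr_mean = language_mean(clusters, weights["fr"])  # [0.3, 0.7]
```

The appeal of this factorisation is that adding a language only requires estimating a small weight vector, while the bulk of the model parameters stays shared across all languages.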
-
Publication number: 20120278081
Abstract: A text-to-speech method for use in a plurality of languages, including: inputting text in a selected language; dividing the inputted text into a sequence of acoustic units; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein the model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio in the selected language. A parameter of a predetermined type of each probability distribution in the selected language is expressed as a weighted sum of language independent parameters of the same type. The weighting used is language dependent, such that converting the sequence of acoustic units to a sequence of speech vectors includes retrieving the language dependent weights for the selected language.
Type: Application
Filed: June 10, 2009
Publication date: November 1, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Sacha Krstulovic
-
Publication number: 20120253794
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Application
Filed: August 25, 2011
Publication date: October 4, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
-
Publication number: 20110276332
Abstract: A speech synthesis method comprising: receiving a text input and outputting speech corresponding to said text input using a stochastic model, said stochastic model comprising an acoustic model and an excitation model, said acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to a feature, said excitation model comprising excitation model parameters which are used to model the vocal cords and lungs to output the speech using said features; wherein said acoustic parameters and excitation parameters have been jointly estimated; and outputting said speech.
Type: Application
Filed: May 6, 2011
Publication date: November 10, 2011
Applicant: Kabushiki Kaisha Toshiba
Inventors: Ranniery Maia, Byung Ha Chun
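The excitation model in the abstract above follows the classic source-filter view of speech production: a glottal source signal is shaped by a vocal-tract filter. The sketch below is only that textbook view in miniature, with a pulse train standing in for the glottal source and a two-tap filter for the spectral envelope; it does not reproduce the joint estimation that the application claims.

```python
def pulse_excitation(n_samples, period):
    """Toy excitation signal: a pulse train modelling glottal pulses."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def apply_filter(excitation, impulse_response):
    """Convolve the excitation with a filter impulse response, the
    'filter' half of the source-filter model."""
    out = [0.0] * len(excitation)
    for i in range(len(excitation)):
        for j, h in enumerate(impulse_response):
            if i - j >= 0:
                out[i] += h * excitation[i - j]
    return out

exc = pulse_excitation(8, period=4)      # pulses at samples 0 and 4
speech = apply_filter(exc, [1.0, 0.5])   # simple decaying two-tap filter
# speech: [1.0, 0.5, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
```

In the patented setup, estimating the acoustic (filter) and excitation (source) parameters jointly, rather than fixing one while fitting the other, is what the abstract identifies as the key step.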
-
Publication number: 20110218804
Abstract: A speech recognition method, the method involving: receiving a speech input from a known speaker of a sequence of observations; and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, the acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation, the acoustic model having been trained using first training data and adapted using second training data to said speaker, the speech recognition method also determining the likelihood of a sequence of observations occurring in a given language using a language model; and combining the likelihoods determined by the acoustic model and the language model and outputting a sequence of words identified from said speech input signal, wherein said acoustic model is context based for said speaker, said context based information being contained in said model using a plurality of decision trees, wherein the structure of said decision trees is based on said second training data.
Type: Application
Filed: January 26, 2011
Publication date: September 8, 2011
Applicant: Kabushiki Kaisha Toshiba
Inventor: Byung Ha Chun