Patents by Inventor Byung Ha Chun
Byung Ha Chun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10249289
Abstract: Methods, systems, and computer-readable media for text-to-speech synthesis using an autoencoder. In some implementations, data indicating a text for text-to-speech synthesis is obtained. Data indicating a linguistic unit of the text is provided as input to an encoder. The encoder is configured to output speech unit representations indicative of acoustic characteristics based on linguistic information. A speech unit representation that the encoder outputs is received. A speech unit is selected to represent the linguistic unit, the speech unit being selected from among a collection of speech units based on the speech unit representation output by the encoder. Audio data for a synthesized utterance of the text that includes the selected speech unit is provided.
Type: Grant
Filed: July 13, 2017
Date of Patent: April 2, 2019
Assignee: Google LLC
Inventors: Byung Ha Chun, Javier Gonzalvo, Chun-an Chan, Ioannis Agiomyrgiannakis, Vincent Ping Leung Wan, Robert Andrew James Clark, Jakub Vit
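The unit-selection step described in the abstract above can be illustrated with a toy sketch. This is not the patented method: the encoder output, the two-dimensional embeddings, and the `unit_17`/`unit_42` database entries are all hypothetical, chosen only to show the idea of picking a stored speech unit whose representation is closest to the encoder's output.

```python
import math

def select_speech_unit(unit_embedding, candidate_units):
    """Pick the stored speech unit whose embedding is closest
    (Euclidean distance) to the encoder's output for a linguistic unit."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(candidate_units, key=lambda u: dist(u["embedding"], unit_embedding))

# Hypothetical encoder output for one linguistic unit, and a toy unit database.
encoder_output = [0.9, 0.1]
database = [
    {"id": "unit_17", "embedding": [1.0, 0.0]},
    {"id": "unit_42", "embedding": [0.0, 1.0]},
]
best = select_speech_unit(encoder_output, database)
# best["id"] is "unit_17", the nearer embedding
```

In a real system the embeddings would be learned jointly with the encoder and the collection would hold many thousands of recorded units; the nearest-neighbour search here stands in for that selection.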
-
Publication number: 20180268806
Abstract: Methods, systems, and computer-readable media for text-to-speech synthesis using an autoencoder. In some implementations, data indicating a text for text-to-speech synthesis is obtained. Data indicating a linguistic unit of the text is provided as input to an encoder. The encoder is configured to output speech unit representations indicative of acoustic characteristics based on linguistic information. A speech unit representation that the encoder outputs is received. A speech unit is selected to represent the linguistic unit, the speech unit being selected from among a collection of speech units based on the speech unit representation output by the encoder. Audio data for a synthesized utterance of the text that includes the selected speech unit is provided.
Type: Application
Filed: July 13, 2017
Publication date: September 20, 2018
Inventors: Byung Ha Chun, Javier Gonzalvo, Chun-an Chan, Ioannis Agiomyrgiannakis, Vincent Ping Leung Wan, Robert Andrew James Clark, Jakub Vit
-
Patent number: 9922641
Abstract: The subject matter of the disclosure is embodied in a method that includes receiving input speech data from a speaker in a first language, and estimating, based on a universal speech model, a speaker transform representing speaker characteristics associated with the input speech data. The method also includes accessing a speaker-independent speech model for generating speech data in a second language that is different from the first language. The method further includes modifying the speaker-independent speech model using the speaker transform to obtain a speaker-specific speech model, and generating speech data in the second language using the speaker-specific speech model.
Type: Grant
Filed: October 31, 2012
Date of Patent: March 20, 2018
Assignee: Google LLC
Inventor: Byung Ha Chun
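A minimal sketch of the idea above, under strong simplifying assumptions: the "speaker transform" is reduced to a per-dimension mean offset between the speaker's frames and a universal model, and "modifying the speaker-independent model" becomes adding that offset to the target-language model means. Real systems would use richer transforms (e.g. full linear transforms over Gaussian parameters); the numbers are toy data.

```python
def estimate_speaker_shift(speaker_frames, universal_mean):
    """Simplified 'speaker transform': a per-dimension offset between the
    mean of the speaker's frames and a universal speech model's mean."""
    n = len(speaker_frames)
    dims = len(universal_mean)
    speaker_mean = [sum(f[d] for f in speaker_frames) / n for d in range(dims)]
    return [speaker_mean[d] - universal_mean[d] for d in range(dims)]

def adapt_model(model_means, shift):
    """Apply the speaker shift to a speaker-independent model for the
    second language, yielding a speaker-specific model."""
    return [[m + s for m, s in zip(mean, shift)] for mean in model_means]

universal_mean = [0.0, 0.0]
speaker_frames = [[1.0, 2.0], [3.0, 4.0]]      # toy feature frames, language A
shift = estimate_speaker_shift(speaker_frames, universal_mean)   # [2.0, 3.0]
target_lang_means = [[0.5, 0.5]]               # speaker-independent, language B
adapted = adapt_model(target_lang_means, shift)  # [[2.5, 3.5]]
```

The point the abstract makes is that the transform is estimated once, in the first language, and then reused to personalise a model for a language the speaker never provided data in.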
-
Patent number: 9865247
Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
Type: Grant
Filed: February 25, 2015
Date of Patent: January 9, 2018
Assignee: Google Inc.
Inventors: Ioannis Agiomyrgiannakis, Byung Ha Chun
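Why a circular representation for phase at all? Phase is an angle, so −π and +π are the same point, yet as scalars they differ by about 2π. A common remedy, sketched below as an illustration rather than as the patented representation, is to embed each phase value on the unit circle as (cos θ, sin θ), where nearby angles stay nearby regardless of wrap-around.

```python
import math

def phase_to_circular(phase):
    """Embed a phase angle on the unit circle, so values near -pi and +pi,
    which differ hugely as scalars, map to nearly the same point."""
    return (math.cos(phase), math.sin(phase))

def circular_distance(p, q):
    """Euclidean distance between two phases in the circular embedding."""
    return math.dist(phase_to_circular(p), phase_to_circular(q))

# Scalar difference ~2*pi, but on the circle the points almost coincide:
near_pi = circular_distance(math.pi - 0.01, -math.pi + 0.01)   # tiny
opposite = circular_distance(0.0, math.pi)                      # 2.0
```

This makes phase usable as an ordinary feature in statistical models, which is what allows the mapping to linguistic features described in the abstract.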
-
Patent number: 9043213
Abstract: A speech recognition method including the steps of receiving a speech input from a known speaker of a sequence of observations and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model. The acoustic model has a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation and has been trained using first training data and adapted using second training data to said speaker. The speech recognition method also determines the likelihood of a sequence of observations occurring in a given language using a language model and combines the likelihoods determined by the acoustic model and the language model and outputs a sequence of words identified from said speech input signal. The acoustic model is context based for the speaker, the context based information being contained in the model using a plurality of decision trees and the structure of the decision trees is based on second training data.
Type: Grant
Filed: January 26, 2011
Date of Patent: May 26, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventor: Byung Ha Chun
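The "combines the likelihoods" step is standard in speech recognition and easy to sketch: acoustic and language-model log-likelihoods are added, usually with a language-model weight, and the best-scoring word sequence wins. The candidate hypotheses and scores below are invented for illustration; nothing here reproduces the decision-tree adaptation that is the patent's actual contribution.

```python
def decode(hypotheses, lm_weight=1.0):
    """Combine acoustic and language-model log-likelihoods for each
    candidate word sequence and return the best-scoring hypothesis."""
    def score(h):
        return h["acoustic_logp"] + lm_weight * h["lm_logp"]
    return max(hypotheses, key=score)

# Toy lattice: acoustically similar word sequences with different LM scores.
candidates = [
    {"words": "recognise speech",    "acoustic_logp": -10.0, "lm_logp": -2.0},
    {"words": "wreck a nice beach",  "acoustic_logp": -9.5,  "lm_logp": -6.0},
]
best = decode(candidates, lm_weight=0.5)
# combined scores: -11.0 vs -12.5, so "recognise speech" wins
```

Tuning `lm_weight` trades off trust in the audio against trust in the language prior; with `lm_weight=0` the acoustically better "wreck a nice beach" would win instead.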
-
Patent number: 8930183
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Grant
Filed: August 25, 2011
Date of Patent: January 6, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
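One simple way to use per-frame kernel similarities for voice conversion, shown here as an illustrative kernel-regression sketch rather than the patented mapping, is to weight stored target-voice frames by a Gaussian kernel between the input frame and the paired source-voice training frames, then average. All frames below are two-dimensional toy features.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Similarity between two feature frames (1.0 when identical)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * bandwidth ** 2))

def convert_frame(frame, source_frames, target_frames, bandwidth=1.0):
    """Map one source-voice frame to the target voice as a kernel-weighted
    average of stored target-voice frames."""
    weights = [gaussian_kernel(frame, s, bandwidth) for s in source_frames]
    total = sum(weights)
    dims = len(target_frames[0])
    return [sum(w * t[d] for w, t in zip(weights, target_frames)) / total
            for d in range(dims)]

src = [[0.0, 0.0], [10.0, 10.0]]   # stored source-voice training frames
tgt = [[1.0, 1.0], [5.0, 5.0]]     # paired target-voice frames
out = convert_frame([0.0, 0.0], src, tgt)
# the input matches the first source frame, so out is ~[1.0, 1.0]
```

Because the weights depend only on similarity to stored source frames, the training data need not contain the same text as the input, which matches the text-independence the abstract emphasises.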
-
Patent number: 8825485
Abstract: A text-to-speech method for use in a plurality of languages, including: inputting text in a selected language; dividing the inputted text into a sequence of acoustic units; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein the model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio in the selected language. A parameter of a predetermined type of each probability distribution in the selected language is expressed as a weighted sum of language independent parameters of the same type. The weighting used is language dependent, such that converting the sequence of acoustic units to a sequence of speech vectors includes retrieving the language dependent weights for the selected language.
Type: Grant
Filed: June 10, 2009
Date of Patent: September 2, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Sacha Krstulovic
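The "weighted sum of language independent parameters" can be sketched directly. In this toy example, hypothetical shared cluster means are combined with per-language weight vectors to produce each language's distribution mean; the cluster values and weights are invented for illustration.

```python
def language_mean(cluster_means, language_weights):
    """A language-specific distribution mean expressed as a weighted sum
    of shared, language-independent cluster means."""
    dims = len(cluster_means[0])
    return [sum(w * mu[d] for w, mu in zip(language_weights, cluster_means))
            for d in range(dims)]

clusters = [[1.0, 0.0], [0.0, 1.0]]              # language-independent parameters
weights = {"en": [0.8, 0.2], "fr": [0.3, 0.7]}   # language-dependent weights
en_mean = language_mean(clusters, weights["en"])  # [0.8, 0.2]
fr_mean = language_mean(clusters, weights["fr"])  # [0.3, 0.7]
```

The appeal of this factorisation is that adding a language only requires estimating a small weight vector, while the bulk of the model parameters stays shared across all languages.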
-
Publication number: 20120278081
Abstract: A text-to-speech method for use in a plurality of languages, including: inputting text in a selected language; dividing the inputted text into a sequence of acoustic units; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein the model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio in the selected language. A parameter of a predetermined type of each probability distribution in the selected language is expressed as a weighted sum of language independent parameters of the same type. The weighting used is language dependent, such that converting the sequence of acoustic units to a sequence of speech vectors includes retrieving the language dependent weights for the selected language.
Type: Application
Filed: June 10, 2009
Publication date: November 1, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Sacha Krstulovic
-
Publication number: 20120253794
Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
Type: Application
Filed: August 25, 2011
Publication date: October 4, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Byung Ha Chun, Mark John Francis Gales
-
Publication number: 20110276332
Abstract: A speech synthesis method comprising: receiving a text input and outputting speech corresponding to said text input using a stochastic model, said stochastic model comprising an acoustic model and an excitation model, said acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to a feature, said excitation model comprising excitation model parameters which are used to model the vocal cords and lungs to output the speech using said features; wherein said acoustic parameters and excitation parameters have been jointly estimated; and outputting said speech.
Type: Application
Filed: May 6, 2011
Publication date: November 10, 2011
Applicant: Kabushiki Kaisha Toshiba
Inventors: Ranniery Maia, Byung Ha Chun
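The excitation model in the abstract above follows the classic source-filter view of speech production: a glottal source signal is shaped by a vocal-tract filter. The sketch below is only that textbook view in miniature, with a pulse train standing in for the glottal source and a two-tap filter for the spectral envelope; it does not reproduce the joint estimation that the application claims.

```python
def pulse_excitation(n_samples, period):
    """Toy excitation signal: a pulse train modelling glottal pulses."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def apply_filter(excitation, impulse_response):
    """Convolve the excitation with a filter impulse response, the
    'filter' half of the source-filter model."""
    out = [0.0] * len(excitation)
    for i in range(len(excitation)):
        for j, h in enumerate(impulse_response):
            if i - j >= 0:
                out[i] += h * excitation[i - j]
    return out

exc = pulse_excitation(8, period=4)      # pulses at samples 0 and 4
speech = apply_filter(exc, [1.0, 0.5])   # simple decaying two-tap filter
# speech: [1.0, 0.5, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0]
```

In the patented setup, estimating the acoustic (filter) and excitation (source) parameters jointly, rather than fixing one while fitting the other, is what the abstract identifies as the key step.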
-
Publication number: 20110218804
Abstract: A speech recognition method, the method involving: receiving a speech input from a known speaker of a sequence of observations; and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, the acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation, the acoustic model having been trained using first training data and adapted using second training data to said speaker, the speech recognition method also determining the likelihood of a sequence of observations occurring in a given language using a language model; and combining the likelihoods determined by the acoustic model and the language model and outputting a sequence of words identified from said speech input signal, wherein said acoustic model is context based for said speaker, said context based information being contained in said model using a plurality of decision trees, wherein the structure of said decision trees is based on said second training data.
Type: Application
Filed: January 26, 2011
Publication date: September 8, 2011
Applicant: Kabushiki Kaisha Toshiba
Inventor: Byung Ha Chun