Patents by Inventor Katherine Mary Knill

Katherine Mary Knill has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Text to speech method and system using voice characteristic dependent weighting

Patent number: 9454963

Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.

Type: Grant

Filed: March 13, 2013

Date of Patent: September 27, 2016

Assignee: Kabushiki Kaisha Toshiba

Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung
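
The weighted-sum step in the abstract above can be illustrated with a minimal Python sketch. It assumes the parameter in question is a mean vector, that each sub-cluster contributes one candidate mean, and that each voice characteristic stores one weight per sub-cluster. The names `SUBCLUSTER_MEANS`, `VOICE_WEIGHTS`, and `mean_for_voice`, and all numbers, are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Sequence

def weighted_mean(subcluster_means: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Combine sub-cluster mean vectors into one mean using one weight each."""
    if len(subcluster_means) != len(weights):
        raise ValueError("need exactly one weight per sub-cluster")
    dim = len(subcluster_means[0])
    mean = [0.0] * dim
    for mu, w in zip(subcluster_means, weights):
        for d in range(dim):
            mean[d] += w * mu[d]
    return mean

# Hypothetical sub-cluster means (three sub-clusters, two-dimensional vectors).
SUBCLUSTER_MEANS: List[List[float]] = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]

# Hypothetical voice characteristic dependent weights, one per sub-cluster.
VOICE_WEIGHTS: Dict[str, List[float]] = {
    "neutral": [1.0, 0.0, 0.0],
    "bright": [0.5, 0.5, 0.25],
}

def mean_for_voice(voice: str) -> List[float]:
    """Retrieve the weights for the selected voice and form the weighted sum."""
    return weighted_mean(SUBCLUSTER_MEANS, VOICE_WEIGHTS[voice])
```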

Text to speech system

Patent number: 9269347

Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.

Type: Grant

Filed: March 15, 2013

Date of Patent: February 23, 2016

Assignee: Kabushiki Kaisha Toshiba

Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
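
As a rough illustration of the two non-overlapping parameter sets described in the abstract above, the following Python sketch keeps speaker parameters and attribute parameters in separate tables and selects from each independently. The table contents, the key names, and the function `select_parameters` are hypothetical; the abstract does not say how the parameters are stored or combined.

```python
from typing import Dict

# Hypothetical first set: parameters relating to speaker voice.
SPEAKER_PARAMS: Dict[str, Dict[str, float]] = {
    "speaker_a": {"spk_w0": 0.9, "spk_w1": 0.1},
    "speaker_b": {"spk_w0": 0.2, "spk_w1": 0.8},
}

# Hypothetical second set: parameters relating to speaker attributes.
ATTRIBUTE_PARAMS: Dict[str, Dict[str, float]] = {
    "neutral": {"attr_w0": 1.0, "attr_w1": 0.0},
    "happy": {"attr_w0": 0.3, "attr_w1": 0.7},
}

def select_parameters(speaker: str, attribute: str) -> Dict[str, float]:
    """Pick one entry from each set and check that the two do not overlap."""
    voice = SPEAKER_PARAMS[speaker]
    style = ATTRIBUTE_PARAMS[attribute]
    overlap = voice.keys() & style.keys()
    if overlap:
        raise ValueError(f"parameter sets overlap: {sorted(overlap)}")
    # Because the sets are disjoint, any speaker can pair with any attribute.
    return {**voice, **style}
```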

Speech processing system

Publication number: 20140025382

Abstract: A text to speech method, the method comprising: receiving input text; dividing said inputted text into a sequence of acoustic units; converting said sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein said model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting said sequence of speech vectors as audio, the method further comprising determining at least some of said model parameters by: extracting expressive features from said input text to form an expressive linguistic feature vector constructed in a first space; and mapping said expressive linguistic feature vector to an expressive synthesis feature vector which is constructed in a second space.

Type: Application

Filed: July 15, 2013

Publication date: January 23, 2014

Inventors: Langzhou Chen, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
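
The two-space mapping in the abstract above can be sketched in Python as follows. The abstract does not specify how expressive features are extracted or what form the mapping takes, so this sketch assumes simple cue counts for the first space and a fixed linear map into the second space. `CUE_WORDS`, `MAPPING`, and both function names are hypothetical.

```python
from typing import List

# Hypothetical expressive cues; each defines one dimension of the first space.
CUE_WORDS: List[str] = ["!", "?", "sorry", "great"]

# Hypothetical linear map from the 4-dimensional first space
# to a 2-dimensional second (synthesis) space.
MAPPING: List[List[float]] = [
    [0.5, 0.0, -1.0, 1.0],
    [0.5, 1.0, 0.0, 0.0],
]

def linguistic_features(text: str) -> List[float]:
    """Build an expressive linguistic feature vector by counting cues."""
    tokens = text.lower().replace("!", " ! ").replace("?", " ? ").split()
    return [float(tokens.count(cue)) for cue in CUE_WORDS]

def to_synthesis_space(features: List[float]) -> List[float]:
    """Map a first-space vector to a second-space vector (assumed linear)."""
    return [sum(m * f for m, f in zip(row, features)) for row in MAPPING]
```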

Speech processing system and method

Patent number: 8612224

Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are upd

Type: Grant

Filed: August 23, 2011

Date of Patent: December 17, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
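
The profile comparison and update steps in the abstract above might look like the following Python sketch. It covers only those two steps, not the decoding passes. The Euclidean distance measure, the running-mean update, and the rule that opens a new profile when no stored profile is close enough are all assumptions made for illustration; `SpeakerProfile`, `identify_and_update`, and `new_speaker_threshold` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

@dataclass
class SpeakerProfile:
    name: str
    params: List[float]
    count: int = 1  # number of segments folded into this profile

def identify_and_update(segment_params: Sequence[float],
                        profiles: List[SpeakerProfile],
                        new_speaker_threshold: float = 1.0) -> SpeakerProfile:
    """Select the closest stored profile for a segment and update it."""
    best: Optional[SpeakerProfile] = None
    best_dist = math.inf
    for profile in profiles:
        dist = math.dist(segment_params, profile.params)
        if dist < best_dist:
            best, best_dist = profile, dist
    if best is None or best_dist > new_speaker_threshold:
        # Assumption: an unmatched segment starts a new speaker profile.
        best = SpeakerProfile(f"speaker_{len(profiles) + 1}", list(segment_params))
        profiles.append(best)
        return best
    # Assumption: the profile is updated as a running mean of its segments.
    best.count += 1
    best.params = [old + (new - old) / best.count
                   for old, new in zip(best.params, segment_params)]
    return best
```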

Text to speech method and system

Publication number: 20130262109

Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.

Type: Application

Filed: March 13, 2013

Publication date: October 3, 2013

Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung

Text to speech system

Publication number: 20130262119

Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.

Type: Application

Filed: March 15, 2013

Publication date: October 3, 2013

Applicant: Kabushiki Kaisha Toshiba

Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine

Speech processing system and method

Publication number: 20120253811

Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are upd

Type: Application

Filed: August 23, 2011

Publication date: October 4, 2012

Applicant: Kabushiki Kaisha Toshiba

Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill

Word spotting using both filler and phone recognition

Patent number: 5950159

Abstract: The present invention relates to a word-spotting system and a method for finding a keyword in acoustic data. The method includes a filler recognition phase and a keyword recognition phase wherein: during the filler recognition phase the acoustic data is processed to identify phones and to generate temporal delimiters and likelihood scores for the phones; during the keyword recognition phase, the acoustic data is processed to identify instances of a specified keyword including a sequence of phones; wherein the temporal delimiters and likelihood scores generated in the filler recognition phase are used in the keyword recognition phase.

Type: Grant

Filed: March 19, 1997

Date of Patent: September 7, 1999

Assignee: Hewlett-Packard Company

Inventor: Katherine Mary Knill
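
The reuse of filler-phase output in the keyword phase, as described in the abstract above, can be sketched in Python. This is a deliberate simplification: it assumes the filler phase has already produced a single best phone sequence with start and end times and log-likelihood scores, and it finds a keyword by exact matching of its phone sequence against that output. The types `PhoneSegment` and `KeywordHit` and the function `spot_keyword` are hypothetical and are not taken from the patent.

```python
from typing import List, NamedTuple, Sequence

class PhoneSegment(NamedTuple):
    phone: str
    start: float           # temporal delimiter: segment start, in seconds
    end: float             # temporal delimiter: segment end, in seconds
    log_likelihood: float  # score assigned during the filler phase

class KeywordHit(NamedTuple):
    start: float
    end: float
    score: float

def spot_keyword(segments: Sequence[PhoneSegment],
                 keyword_phones: Sequence[str]) -> List[KeywordHit]:
    """Find keyword instances using filler-phase delimiters and scores."""
    n = len(keyword_phones)
    if n == 0:
        return []
    target = list(keyword_phones)
    hits: List[KeywordHit] = []
    for i in range(len(segments) - n + 1):
        window = segments[i:i + n]
        if [seg.phone for seg in window] == target:
            # The hit inherits its boundaries and score from the filler phase.
            score = sum(seg.log_likelihood for seg in window)
            hits.append(KeywordHit(window[0].start, window[-1].end, score))
    return hits
```

Each hit spans the delimiters of its first and last matching phone, and its score is the sum of the matching phones' log-likelihoods.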