Patents by Inventor Katherine Mary Knill
Katherine Mary Knill has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9454963Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.Type: GrantFiled: March 13, 2013Date of Patent: September 27, 2016Assignee: KABUSHIKI KAISHA TOSHIBAInventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung
-
Patent number: 9269347Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.Type: GrantFiled: March 15, 2013Date of Patent: February 23, 2016Assignee: Kabushiki Kaisha ToshibaInventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
-
Publication number: 20140025382Abstract: A text to speech method, the method comprising: receiving input text; dividing said inputted text into a sequence of acoustic units; converting said sequence of acoustic units to a sequence of speech vectors using an acoustic model, wherein said model has a plurality of model parameters describing probability distributions which relate an acoustic unit to a speech vector; and outputting said sequence of speech vectors as audio, the method further comprising determining at least some of said model parameters by: extracting expressive features from said input text to form an expressive linguistic feature vector constructed in a first space; and mapping said expressive linguistic feature vector to an expressive synthesis feature vector which is constructed in a second space.Type: ApplicationFiled: July 15, 2013Publication date: January 23, 2014Inventors: Langzhou CHEN, Mark John Francis Gales, Katherine Mary Knill, Akamine Masami
-
Patent number: 8612224Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are updType: GrantFiled: August 23, 2011Date of Patent: December 17, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
-
Publication number: 20130262119Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.Type: ApplicationFiled: March 15, 2013Publication date: October 3, 2013Applicant: Kabushiki Kaisha ToshibaInventors: Javier LATORRE-MARTINEZ, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
-
Publication number: 20130262109Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.Type: ApplicationFiled: March 13, 2013Publication date: October 3, 2013Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung
-
Publication number: 20120253811Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are updType: ApplicationFiled: August 23, 2011Publication date: October 4, 2012Applicant: Kabushiki Kaisha ToshibaInventors: Catherine BRESLIN, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
-
Patent number: 5950159Abstract: The present invention relates to a word-spotting system and a method for finding a keyword in ascoustic data. The method includes a filler recognition phase and a keyword recognition phase wherein: during the filler recognition phase the acoustic data is processed to identify phones and to generate temporal delimiters and likelihood scores for the phones; during the keyword recognition phase, the acoustic data is processed to identify instances of a specified keyword including a sequence of phones; wherein the temporal delimiters and likelihood scores generated in the filler recognition phase are used in the keyword recognition phase.Type: GrantFiled: March 19, 1997Date of Patent: September 7, 1999Assignee: Hewlett-Packard CompanyInventor: Katherine Mary Knill