Patents by Inventor Mark Beutnagel
Mark Beutnagel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9564121
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
Type: Grant
Filed: August 7, 2014
Date of Patent: February 7, 2017
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
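The preselection step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the diacritic characters, label scheme, and unit database layout are all assumptions invented for the example. The idea shown is that units carrying a supplemental-phoneset label (e.g. a word-boundary mark) are preferred, with a fallback to units matching only the base phone.

```python
# Hypothetical sketch of unit preselection over a base phoneset plus a
# supplemental phoneset whose labels carry diacritic marks (word boundary,
# cluster onset, pre-vocalic). Names and data layout are illustrative only.

def strip_diacritics(label: str) -> str:
    """Map a supplemental-phoneset label back to its base phone."""
    return label.rstrip("#+^")  # '#'=word boundary, '+'=cluster, '^'=pre-vocalic

def preselect_units(target_labels, unit_db):
    """For each target label, prefer exact supplemental-phoneset matches,
    falling back to units that match only the base phone."""
    candidates = []
    for label in target_labels:
        exact = unit_db.get(label, [])
        base = unit_db.get(strip_diacritics(label), []) if not exact else []
        candidates.append(exact or base)
    return candidates

unit_db = {
    "t#": ["unit_12", "unit_47"],   # word-final /t/ (supplemental label)
    "t": ["unit_03", "unit_19"],    # plain /t/ (base label)
    "ae": ["unit_08"],
}
print(preselect_units(["t#", "ae", "k"], unit_db))
```

Here "t#" matches the supplemental label directly, "ae" matches the base label, and "k" yields no candidates at all.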
-
Publication number: 20140350940
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
Type: Application
Filed: August 7, 2014
Publication date: November 27, 2014
Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
-
Patent number: 8805687
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
Type: Grant
Filed: September 21, 2009
Date of Patent: August 12, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
-
Patent number: 8566098
Abstract: A system and method are disclosed for synthesizing speech based on a selected speech act. A method includes modifying synthesized speech of a spoken dialogue system by (1) receiving a user utterance, (2) analyzing the user utterance to determine an appropriate speech act, and (3) generating a response of a type associated with the appropriate speech act, wherein linguistic variables in the response are selected based on the appropriate speech act.
Type: Grant
Filed: October 30, 2007
Date of Patent: October 22, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Ann K. Syrdal, Mark Beutnagel, Alistair D. Conkie, Yeon-Jun Kim
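The three numbered steps above can be sketched as a toy pipeline. Everything here is an assumption for illustration: the keyword-based speech-act classifier, the template table, and the variable names stand in for whatever analysis and generation the patented system actually performs.

```python
# Illustrative sketch: infer a speech act from a user utterance, then fill a
# response template whose linguistic variables depend on that act. The
# classifier and templates are invented stand-ins, not the patent's method.

SPEECH_ACT_TEMPLATES = {
    "question": "Let me check that for you, {name}.",
    "request": "Certainly, {name}, I will {action} right away.",
    "greeting": "Hello, {name}! How can I help?",
}

def classify_speech_act(utterance: str) -> str:
    """Crude surface-cue classification of the utterance's speech act."""
    text = utterance.lower().strip()
    if text.endswith("?"):
        return "question"
    if text.startswith(("please", "could you")):
        return "request"
    return "greeting"

def generate_response(utterance: str, **variables) -> str:
    """Steps (1)-(3): receive, analyze, generate a response of the act's type."""
    act = classify_speech_act(utterance)
    return SPEECH_ACT_TEMPLATES[act].format(**variables)

print(generate_response("please call home", name="Ann", action="call home"))
```

The response type (and which linguistic variables it uses) is determined entirely by the inferred speech act, mirroring the abstract's "response of a type associated with the appropriate speech act".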
-
Publication number: 20110071836
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
Type: Application
Filed: September 21, 2009
Publication date: March 24, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
-
Publication number: 20090112596
Abstract: A system and method are disclosed for synthesizing speech based on a selected speech act. A method includes modifying synthesized speech of a spoken dialogue system by (1) receiving a user utterance, (2) analyzing the user utterance to determine an appropriate speech act, and (3) generating a response of a type associated with the appropriate speech act, wherein linguistic variables in the response are selected based on the appropriate speech act.
Type: Application
Filed: October 30, 2007
Publication date: April 30, 2009
Applicant: AT&T Lab, Inc.
Inventors: Ann K. Syrdal, Mark Beutnagel, Alistair D. Conkie, Yeon-Jun Kim
-
Publication number: 20080077407
Abstract: A system, method, and computer-readable media are disclosed for improving speech synthesis. A text-to-speech (TTS) voice database for use in a TTS system is generated by a method comprising labeling a voice database phonemically and applying a pre-/post-vocalic distinction to the phonemic labels to generate a TTS voice database. When a system synthesizes speech using speech units from the TTS voice database, the database provides phonemes for selection using the pre-/post-vocalic distinctions, which improve unit selection to render the synthetic speech more natural.
Type: Application
Filed: September 26, 2006
Publication date: March 27, 2008
Applicant: AT&T Corp.
Inventors: Mark Beutnagel, Alistair Conkie, Yeon-Jun Kim, Ann K. Syrdal
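The pre-/post-vocalic labeling step can be illustrated with a deliberately simplified sketch: mark each consonant according to whether it precedes or follows the first vowel. The vowel inventory (a few ARPAbet-style symbols) and the "_pre"/"_post" suffixes are assumptions; real labeling would rest on proper syllabification, which the abstract does not detail.

```python
# A rough sketch of applying a pre-/post-vocalic distinction to phonemic
# labels. Vowel set and label suffixes are illustrative assumptions.

VOWELS = {"aa", "ae", "ah", "eh", "ih", "iy", "uw"}

def mark_vocalic(phones):
    """Label each consonant '_pre' until the first vowel is reached,
    '_post' afterwards (a gross simplification of syllabification)."""
    labeled, seen_vowel = [], False
    for p in phones:
        if p in VOWELS:
            labeled.append(p)
            seen_vowel = True
        else:
            labeled.append(p + ("_post" if seen_vowel else "_pre"))
    return labeled

print(mark_vocalic(["k", "ae", "t"]))  # ['k_pre', 'ae', 't_post']
```

With such labels, the /t/ units excised from pre-vocalic contexts are kept distinct from post-vocalic ones at selection time, which is the distinction the abstract credits with more natural output.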
-
Publication number: 20080059194
Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously: text and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter; these bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text.
Type: Application
Filed: October 31, 2007
Publication date: March 6, 2008
Applicant: AT&T Corp.
Inventors: Andrea Basso, Mark Beutnagel, Joern Ostermann
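The bookmark mechanism above can be sketched as follows. The bookmark syntax `\FAP{n}` is invented for this example (the abstract does not specify one); the point shown is that the embedded time stamps act as counters that can later be matched against the same stamps in the Facial Animation Parameter stream, while the remaining plain text goes to the TTS converter.

```python
# Sketch of extracting bookmark codes (with encoder time stamps) embedded in
# a text string bound for a TTS converter. The '\FAP{n}' syntax is an
# illustrative assumption; the stamp is a counter, not real-world time.

import re

BOOKMARK = re.compile(r"\\FAP\{(\d+)\}")

def split_text_and_bookmarks(text):
    """Return the plain text for the TTS front end together with
    (character position, counter) pairs for the embedded bookmarks."""
    bookmarks, plain, pos = [], [], 0
    for i, part in enumerate(BOOKMARK.split(text)):
        if i % 2 == 1:            # odd split parts are the captured counters
            bookmarks.append((pos, int(part)))
        else:
            plain.append(part)
            pos += len(part)
    return "".join(plain), bookmarks

text = r"Hello \FAP{3}wor\FAP{4}ld"
print(split_text_and_bookmarks(text))
```

Note the second bookmark sits inside the word "world", matching the abstract's statement that bookmarks may be placed between words as well as inside them.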
-
Publication number: 20050119877
Abstract: According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously: text and Facial Animation Parameters. In this architecture, text input is sent to a Text-To-Speech converter at a decoder that drives the mouth shapes of the face. Facial Animation Parameters are sent from an encoder to the face over the communication channel. The present invention includes codes (known as bookmarks) in the text string transmitted to the Text-to-Speech converter; these bookmarks are placed between words as well as inside them. According to the present invention, the bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time and should be interpreted as a counter. In addition, the Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text.
Type: Application
Filed: January 7, 2005
Publication date: June 2, 2005
Applicant: AT&T Corp.
Inventors: Andrea Basso, Mark Beutnagel, Joern Ostermann