Patents by Inventor Ellen Marie Eide

Ellen Marie Eide has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for prompt construction for selection from a list of acoustically confusable items in spoken dialog systems

Patent number: 8909528

Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.

Type: Grant

Filed: May 9, 2007

Date of Patent: December 9, 2014

Assignee: Nuance Communications, Inc.

Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
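
The identify-and-change steps in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: string similarity from Python's difflib stands in for a real acoustic confusability measure, the 0.75 threshold is arbitrary, and spelling out a flagged item is a hypothetical way of changing it. None of these choices comes from the patent.

```python
from difflib import SequenceMatcher


def confusable_pairs(items, threshold=0.75):
    """Return index pairs of list items that look confusable.

    Assumption: spelling similarity is used as a crude proxy for
    acoustic similarity; a real system would compare pronunciations.
    """
    pairs = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            ratio = SequenceMatcher(None, items[i].lower(), items[j].lower()).ratio()
            if ratio >= threshold:
                pairs.append((i, j))
    return pairs


def disambiguate(items, threshold=0.75):
    """Append a spelled-out form to every item involved in a confusion."""
    flagged = {k for pair in confusable_pairs(items, threshold) for k in pair}
    return [
        f"{item}, spelled {' '.join(item.upper())}" if k in flagged else item
        for k, item in enumerate(items)
    ]


prompts = disambiguate(["Dustin", "Justin", "Denver"])
```

Here "Dustin" and "Justin" are flagged and played back with their spellings, while "Denver" is left unchanged.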
System for tuning synthesized speech

Patent number: 8849669

Abstract: An embodiment of the invention is a software tool used to convert text, speech synthesis markup language (SSML), and/or extended SSML to synthesized audio. Provisions are provided to create, view, play, and edit the synthesized speech, including editing pitch and duration targets, speaking type, paralinguistic events, and prosody. Prosody can be provided by way of a sample recording. Users can interact with the software tool by way of a graphical user interface (GUI). The software tool can produce synthesized audio file output in many file formats.

Type: Grant

Filed: April 3, 2013

Date of Patent: September 30, 2014

Assignee: Nuance Communications, Inc.

Inventors: Raimo Bakis, Ellen Marie Eide, Roberto Pieraccini, Maria E. Smith, Jie Z. Zeng

SYSTEM FOR TUNING SYNTHESIZED SPEECH

Publication number: 20140058734

Abstract: An embodiment of the invention is a software tool used to convert text, speech synthesis markup language (SSML), and/or extended SSML to synthesized audio. Provisions are provided to create, view, play, and edit the synthesized speech, including editing pitch and duration targets, speaking type, paralinguistic events, and prosody. Prosody can be provided by way of a sample recording. Users can interact with the software tool by way of a graphical user interface (GUI). The software tool can produce synthesized audio file output in many file formats.

Type: Application

Filed: April 3, 2013

Publication date: February 27, 2014

Inventors: Raimo Bakis, Ellen Marie Eide, Roberto Pieraccini, Maria E. Smith, Jie Z. Zeng

Generating a frequency warping function based on phoneme and context

Patent number: 8401861

Abstract: A method for generating a frequency warping function comprising preparing the training speech of a source and a target speaker; performing frame alignment on the training speech of the speakers; selecting aligned frames from the frame-aligned training speech of the speakers; extracting corresponding sets of formant parameters from the selected aligned frames; and generating a frequency warping function based on the corresponding sets of formant parameters. The step of selecting aligned frames preferably selects a pair of aligned frames in the middle of the same or similar frame-aligned phonemes with the same or similar contexts in the speech of the source speaker and target speaker. The step of generating a frequency warping function preferably uses the various pairs of corresponding formant parameters in the corresponding sets of formant parameters as key positions in a piecewise linear frequency warping function to generate the frequency warping function.

Type: Grant

Filed: January 17, 2007

Date of Patent: March 19, 2013

Assignee: Nuance Communications, Inc.

Inventors: Shuang Zhi Wei, Raimo Bakis, Ellen Marie Eide, Liqin Shen
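
The last step of the abstract above, using corresponding formant pairs as key positions in a piecewise linear warping function, can be sketched as follows. The function name, the 8000 Hz band edge, the added endpoints (0, 0) and (max_freq, max_freq), and the formant values are assumptions made for illustration; frame alignment and formant extraction are not shown.

```python
def make_warping_function(source_formants, target_formants, max_freq=8000.0):
    """Build a piecewise linear map from source to target frequency (Hz).

    Each (source, target) formant pair is a key position. The endpoints
    (0, 0) and (max_freq, max_freq) are an assumption added here so the
    map covers the whole band.
    """
    if len(source_formants) != len(target_formants):
        raise ValueError("formant sets must have the same length")
    knots = [(0.0, 0.0)]
    knots += sorted(zip(source_formants, target_formants))
    knots += [(max_freq, max_freq)]

    def warp(freq):
        # Clamp to the covered band, then interpolate within the segment.
        freq = min(max(freq, 0.0), max_freq)
        for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
            if x0 <= freq <= x1:
                if x1 == x0:
                    return y0
                return y0 + (y1 - y0) * (freq - x0) / (x1 - x0)
        return freq

    return warp


# Hypothetical formant values (Hz) from one pair of aligned frames.
warp = make_warping_function([500.0, 1500.0, 2500.0], [600.0, 1700.0, 2600.0])
```

With these values a source formant at 500 Hz maps to 600 Hz, and 1000 Hz, halfway between the first two key positions, maps to 1150 Hz.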
Methods and apparatus for masking latency in text-to-speech systems

Patent number: 8355484

Abstract: A technique for masking latency in an automatic dialog system is provided. A communication is received from a user at the automatic dialog system. The communication is processed in the automatic dialog system to provide a response. At least one transitional message is provided to the user from the automatic dialog system while processing the communication. A response is provided to the user from the automatic dialog system in accordance with the received communication from the user.

Type: Grant

Filed: January 8, 2007

Date of Patent: January 15, 2013

Assignee: Nuance Communications, Inc.

Inventors: Ellen Marie Eide, Wael Mohamed Hamza
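
The technique in the abstract above can be illustrated with a short sketch: the request is processed in a background thread while the caller emits transitional messages until the response is ready. The function and parameter names, the filler phrases, and the 0.1 second wait are hypothetical; the patent does not specify them.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def respond_with_masking(process, communication, say, wait=0.1,
                         fillers=("One moment, please.", "Still working on that.")):
    """Process a communication and mask the delay with transitional messages.

    process: callable that turns the communication into a response.
    say: callable that delivers a message to the user.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(process, communication)
        count = 0
        while True:
            try:
                # Wait briefly for the response; time out if it is not ready.
                response = future.result(timeout=wait)
                break
            except TimeoutError:
                say(fillers[count % len(fillers)])
                count += 1
    say(response)
    return response


def slow_lookup(text):
    # Stand-in for a slow back-end query.
    time.sleep(0.35)
    return f"Result for {text}"


spoken = []
answer = respond_with_masking(slow_lookup, "weather", spoken.append)
```

After the call, spoken holds one or more transitional messages followed by the response itself.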
On demand TTS vocabulary for a telematics system

Patent number: 8311804

Abstract: A driving directions system loads into memory a limited subset of prerecorded, spoken utterances of geographic names from a mass media storage. The subset of spoken utterances may be limited, for example, to the geographic names within a predetermined radius (e.g., a few miles) of the driver's present location. The present location of the driver may be manually entered into the driving directions system by the driver, or automatically determined using a global positioning system (“GPS”) receiver. As the vehicle moves from its present location, the driving directions system loads into memory new names from the mass media storage and overwrites, if necessary, those which are now geographically out of range. Based on the current location of the driver, the driving directions system can audibly output geographic names from the run-time memory.

Type: Grant

Filed: October 24, 2011

Date of Patent: November 13, 2012

Assignee: Nuance Communications, Inc.

Inventors: Raimo Bakis, Ellen Marie Eide, Wael Mohamed Hamza
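
A minimal sketch of the loading policy in the abstract above: only names within a radius of the current location are held in memory, and the set is refreshed as the location changes. The class name, the dictionary standing in for mass storage, the 5 mile radius, and the coordinates are all assumptions for illustration.

```python
import math


def distance_miles(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = Earth radius in miles


class NameCache:
    """Keeps in memory only the recordings near the current location."""

    def __init__(self, storage, radius_miles=5.0):
        # storage maps name -> ((lat, lon), audio); a stand-in for mass storage.
        self.storage = storage
        self.radius = radius_miles
        self.loaded = {}

    def update(self, location):
        """Load names now in range and drop those out of range."""
        self.loaded = {
            name: audio
            for name, (position, audio) in self.storage.items()
            if distance_miles(location, position) <= self.radius
        }
        return sorted(self.loaded)


storage = {
    "Main Street": ((40.00, -75.0), b"audio-1"),
    "Oak Avenue": ((40.02, -75.0), b"audio-2"),
    "Far Road": ((41.00, -75.0), b"audio-3"),
}
cache = NameCache(storage)
nearby = cache.update((40.0, -75.0))
```

At the first location only "Main Street" and "Oak Avenue" are loaded; calling update again near "Far Road" replaces them.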
Methods and apparatus for conveying synthetic speech style from a text-to-speech system

Patent number: 7747440

Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.

Type: Grant

Filed: July 1, 2008

Date of Patent: June 29, 2010

Assignee: Nuance Communications, Inc.

Inventors: Ellen Marie Eide, Wael Mohamed Hamza

APPLYING VOCAL CHARACTERISTICS FROM A TARGET SPEAKER TO A SOURCE SPEAKER FOR SYNTHETIC SPEECH

Publication number: 20090177473

Abstract: A computer implemented method, system and computer usable program code for synthesizing speech. A computer implemented method for synthesizing speech includes providing a database of speech of a source speaker, and providing a prosody model of speech of a target speaker different from the source speaker. Text input to be synthesized is received, and the prosody model of speech of the target speaker is applied to the text input to select segments of the speech of the source speaker in the database to form synthesized speech of the text input. The synthesized speech of the text input is then output.

Type: Application

Filed: January 7, 2008

Publication date: July 9, 2009

Inventors: Andrew S. Aaron, Ellen Marie Eide, Raul Fernandez

Methods and apparatus for adapting output speech in accordance with context of communication

Patent number: 7490042

Abstract: A technique for producing speech output in an automatic dialog system in accordance with a detected context is provided. Communication is received from a user at the automatic dialog system. A context of the communication from the user is detected in a context detector of the automatic dialog system. A message is created in a natural language generator of the automatic dialog system in communication with the context detector. The message is conveyed to the user through a speech synthesis system of the automatic dialog system, in communication with the natural language generator and the context detector. Responsive to a detected level of ambient noise, the context detector provides at least one command in a markup language to cause the natural language generator to create the message using maximally intelligible words and to cause the speech synthesis system to convey the message with increased volume and decreased speed.

Type: Grant

Filed: March 29, 2005

Date of Patent: February 10, 2009

Assignee: International Business Machines Corporation

Inventors: Ellen Marie Eide, Wael Mohamed Hamza, Michael Alan Picheny

Methods and Apparatus for Conveying Synthetic Speech Style from a Text-to-Speech System

Publication number: 20080300882

Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.

Type: Application

Filed: July 1, 2008

Publication date: December 4, 2008

Applicant: International Business Machines Corporation

Inventors: Ellen Marie Eide, Wael Mohamed Hamza

METHOD AND SYSTEM FOR PROMPT CONSTRUCTION FOR SELECTION FROM A LIST OF ACOUSTICALLY CONFUSABLE ITEMS IN SPOKEN DIALOG SYSTEMS

Publication number: 20080281598

Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.

Type: Application

Filed: May 9, 2007

Publication date: November 13, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart

Methods for conveying synthetic speech style from a text-to-speech system

Patent number: 7415413

Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.

Type: Grant

Filed: March 29, 2005

Date of Patent: August 19, 2008

Assignee: International Business Machines Corporation

Inventors: Ellen Marie Eide, Wael Mohamed Hamza

Methods and Apparatus for Masking Latency in Text-to-Speech Systems

Publication number: 20080167874

Abstract: A technique for masking latency in an automatic dialog system is provided. A communication is received from a user at the automatic dialog system. The communication is processed in the automatic dialog system to provide a response. At least one transitional message is provided to the user from the automatic dialog system while processing the communication. A response is provided to the user from the automatic dialog system in accordance with the received communication from the user.

Type: Application

Filed: January 8, 2007

Publication date: July 10, 2008

Inventors: Ellen Marie Eide, Wael Mohamed Hamza

Method and apparatus for producing natural sounding pitch contours in a speech synthesizer

Patent number: 7280969

Abstract: A speech synthesis system is disclosed that utilizes a pitch contour resulting in more natural-sounding speech. The present invention modifies the predicted pitch, b(t), for synthesized speech using a low frequency energy booster. The low frequency energy booster interpolates the discrete pitch values, if necessary, and increases the amount of energy of the pitch contour associated with low frequency values, such as all frequency values below 10 Hertz. The amount of energy of the pitch contour associated with low frequency values can be increased, for example, by adding band-limited noise (a carrier signal) to the pitch contour, b(t), or by filtering the pitch values with an impulse response filter having a pole at the desired low frequency value. The present invention serves to add vibrato to the original pitch contour, b(t), and thereby improves the naturalness of the synthetic waveform.

Type: Grant

Filed: December 7, 2000

Date of Patent: October 9, 2007

Assignee: International Business Machines Corporation

Inventors: Ellen Marie Eide, Raimo Bakis
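
One of the two options named in the abstract above, adding band-limited noise to the pitch contour b(t), can be sketched as follows. The noise is built here as a sum of sinusoids with random frequencies below the cutoff and random phases. That construction, the 100 frames per second rate, the 2 Hz depth, and the function name are illustrative assumptions and are not taken from the patent.

```python
import math
import random


def boost_low_frequency(pitch, frame_rate=100.0, cutoff_hz=10.0,
                        depth_hz=2.0, components=8, seed=0):
    """Add noise band-limited below cutoff_hz to a pitch contour b(t).

    pitch: pitch values in Hz, one per frame, at frame_rate frames per second.
    depth_hz bounds the size of the added modulation.
    """
    rng = random.Random(seed)
    # Each component has a frequency below the cutoff and a random phase.
    parts = [(rng.uniform(0.5, cutoff_hz), rng.uniform(0.0, 2 * math.pi))
             for _ in range(components)]
    scale = depth_hz / components
    boosted = []
    for n, b in enumerate(pitch):
        t = n / frame_rate
        noise = sum(math.sin(2 * math.pi * f * t + phase) for f, phase in parts)
        boosted.append(b + scale * noise)
    return boosted


flat_contour = [120.0] * 200          # two seconds of a flat 120 Hz pitch
modulated = boost_low_frequency(flat_contour)
```

A flat contour comes back with a slow wobble of at most depth_hz around the original value, which is the vibrato-like effect the abstract describes.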
Speech and signal digitization by using recognition metrics to select from multiple techniques

Patent number: 7016835

Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.

Type: Grant

Filed: December 19, 2002

Date of Patent: March 21, 2006

Assignee: International Business Machines Corporation

Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen

Method and apparatus for masking unnatural phenomena in synthetic speech using a simulated environmental effect

Publication number: 20040102975

Abstract: A speech synthesis system is disclosed that masks any unnatural phenomena in the synthetic speech. A disclosed environmental effect processor manipulates the background environment into which the synthesized speech is embedded to thereby mask any unnatural phenomena in the synthesized speech. The environmental effect processor can manipulate the background environment, for example, by (i) adding a low level of background noise to the synthesized speech; (ii) superimposing the synthetic speech on a music waveform; or (iii) adding reverberation to the synthesized signal. The speech segments can be recorded in a quiet environment, and the background environment is manipulated in accordance with the present invention at the time of synthesis.

Type: Application

Filed: November 26, 2002

Publication date: May 27, 2004

Applicant: International Business Machines Corporation

Inventor: Ellen Marie Eide

Methods and apparatus for improving automatic digitization techniques using recognition metrics

Publication number: 20030115053

Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduce the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.

Type: Application

Filed: December 19, 2002

Publication date: June 19, 2003

Applicant: International Business Machines Corporation, Inc.

Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen

Method and apparatus for producing natural sounding pitch contours in a speech synthesizer

Publication number: 20020072909

Abstract: A speech synthesis system is disclosed that utilizes a pitch contour resulting in more natural-sounding speech. The present invention modifies the predicted pitch, b(t), for synthesized speech using a low frequency energy booster. The low frequency energy booster interpolates the discrete pitch values, if necessary, and increases the amount of energy of the pitch contour associated with low frequency values, such as all frequency values below 10 Hertz. The amount of energy of the pitch contour associated with low frequency values can be increased, for example, by adding band-limited noise (a carrier signal) to the pitch contour, b(t), or by filtering the pitch values with an impulse response filter having a pole at the desired low frequency value. The present invention serves to add vibrato to the original pitch contour, b(t), and thereby improves the naturalness of the synthetic waveform.

Type: Application

Filed: December 7, 2000

Publication date: June 13, 2002

Inventors: Ellen Marie Eide, Raimo Bakis

Fast vocabulary independent method and apparatus for spotting words in speech

Patent number: 6073095

Abstract: A fast vocabulary independent method for spotting words in speech utilizes a preprocessing step and a coarse-to-detailed search strategy for spotting a word/phone sequence in speech. The preprocessing includes a Viterbi-beam phone level decoding using a tree-based phone language model. The coarse search matches phone-ngrams to identify regions of speech as putative word hits, and the detailed search performs an acoustic match at the putative hits with a model of the given word included in the vocabulary of the recognizer.

Type: Grant

Filed: October 15, 1997

Date of Patent: June 6, 2000

Assignee: International Business Machines Corporation

Inventors: Satyanarayana Dharanipragada, Ellen Marie Eide, Salim Estephan Roukos

Method and apparatus for a time-synchronous tree-based search strategy

Patent number: 5884259

Abstract: A method and apparatus for using a tree structure to constrain a time-synchronous, fast search for candidate words in an acoustic stream is described. A minimum stay of three frames in each graph node visited is imposed by allowing transitions only every third frame. This constraint enables the simplest possible Markov model for each phoneme while enforcing the desired minimum duration. The fast, time-synchronous search for likely words is done for an entire sentence/utterance. The list of hypotheses beginning at each time frame is stored for providing, on-demand, lists of contender/candidate words to the asynchronous, detailed match phase of decoding.

Type: Grant

Filed: February 12, 1997

Date of Patent: March 16, 1999

Assignee: International Business Machines Corporation

Inventors: Lalit Rai Bahl, Ellen Marie Eide
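
The duration constraint in the abstract above, a minimum stay of three frames enforced by allowing transitions only every third frame, can be illustrated with a small Viterbi-style search. As a simplifying assumption the sketch uses a single left-to-right chain of nodes with one state per node rather than a full tree, and the function name and scores are hypothetical.

```python
def best_path(frame_scores, stay=3):
    """Find the best left-to-right node sequence under a minimum-stay rule.

    frame_scores[t][s] is the log score of node s at frame t. A move from
    node s-1 to node s is allowed only at frames that are multiples of
    `stay`, so every node visited lasts at least `stay` frames. The path
    must start in the first node and end in the last one, which assumes
    there are enough frames to reach it.
    """
    n_nodes = len(frame_scores[0])
    neg_inf = float("-inf")
    score = [neg_inf] * n_nodes
    score[0] = frame_scores[0][0]
    back = []
    for t in range(1, len(frame_scores)):
        new_score = [neg_inf] * n_nodes
        pointers = [0] * n_nodes
        for s in range(n_nodes):
            best, arg = score[s], s                 # remain in the same node
            if t % stay == 0 and s > 0 and score[s - 1] > best:
                best, arg = score[s - 1], s - 1     # transition frame
            new_score[s] = best + frame_scores[t][s]
            pointers[s] = arg
        back.append(pointers)
        score = new_score
    # Trace back from the last node at the final frame.
    s = n_nodes - 1
    path = [s]
    for pointers in reversed(back):
        s = pointers[s]
        path.append(s)
    return path[::-1]


# Node 1 scores better from frame 2 on, but the switch must wait for frame 3.
scores = [[0, -1], [0, -1], [-1, 0], [-1, 0], [-1, 0], [-1, 0]]
path = best_path(scores)
```

Although the scores favor node 1 from frame 2 onward, the path stays in node 0 for three frames and only then moves on, giving [0, 0, 0, 1, 1, 1].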