Patents by Inventor Ellen Marie Eide
Ellen Marie Eide has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8909528Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.Type: GrantFiled: May 9, 2007Date of Patent: December 9, 2014Assignee: Nuance Communications, Inc.Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
-
Patent number: 8849669Abstract: An embodiment of the invention is a software tool used to convert text, speech synthesis markup language (SSML), and/or extended SSML to synthesized audio. Provisions are provided to create, view, play, and edit the synthesized speech, including editing pitch and duration targets, speaking type, paralinguistic events, and prosody. Prosody can be provided by way of a sample recording. Users can interact with the software tool by way of a graphical user interface (GUI). The software tool can produce synthesized audio file output in many file formats.Type: GrantFiled: April 3, 2013Date of Patent: September 30, 2014Assignee: Nuance Communications, Inc.Inventors: Raimo Bakis, Ellen Marie Eide, Roberto Pieraccini, Maria E. Smith, Jie Z. Zeng
-
Publication number: 20140058734Abstract: An embodiment of the invention is a software tool used to convert text, speech synthesis markup language (SSML), and/or extended SSML to synthesized audio. Provisions are provided to create, view, play, and edit the synthesized speech, including editing pitch and duration targets, speaking type, paralinguistic events, and prosody. Prosody can be provided by way of a sample recording. Users can interact with the software tool by way of a graphical user interface (GUI). The software tool can produce synthesized audio file output in many file formats.Type: ApplicationFiled: April 3, 2013Publication date: February 27, 2014Inventors: Raimo Bakis, Ellen Marie Eide, Roberto Pieraccini, Maria E. Smith, Jie Z. Zeng
-
Patent number: 8401861Abstract: A method for generating a frequency warping function comprising preparing the training speech of a source and a target speaker; performing frame alignment on the training speech of the speakers; selecting aligned frames from the frame-aligned training speech of the speakers; extracting corresponding sets of formant parameters from the selected aligned frames; and generating a frequency warping function based on the corresponding sets of formant parameters. The step of selecting aligned frames preferably selects a pair of aligned frames in the middle of the same or similar frame-aligned phonemes with the same or similar contexts in the speech of the source speaker and target speaker. The step of generating a frequency warping function preferably uses the various pairs of corresponding formant parameters in the corresponding sets of formant parameters as key positions in a piecewise linear frequency warping function to generate the frequency warping function.Type: GrantFiled: January 17, 2007Date of Patent: March 19, 2013Assignee: Nuance Communications, Inc.Inventors: Shuang Zhi Wei, Raimo Bakis, Ellen Marie Eide, Liqin Shen
-
Patent number: 8355484Abstract: A technique for masking latency in an automatic dialog system is provided. A communication is received from a user at the automatic dialog system. The communication is processed in the automatic dialog system to provide a response. At least one transitional message is provided to the user from the automatic dialog system while processing the communication. A response is provided to the user from the automatic dialog system in accordance with the received communication from the user.Type: GrantFiled: January 8, 2007Date of Patent: January 15, 2013Assignee: Nuance Communications, Inc.Inventors: Ellen Marie Eide, Wael Mohamed Hamza
-
Patent number: 8311804Abstract: A driving directions system loads into memory a limited subset of prerecorded, spoken utterances of geographic names from a mass media storage. The subset of spoken utterances may be limited, for example, to the geographic names within a predetermined radius (e.g., a few miles) of the driver's present location. The present location of the driver may be manually entered into the driving directions system by the driver, or automatically determined using a global positioning system (“GPS”) receiver. As the vehicle moves from its present location, the driving directions system loads into memory new names from the mass media storage and overwrites, if necessary, those which are now geographically out of range. Based on the current location of the driving, the driving directions system can audibly output geographic names from the run-time memory.Type: GrantFiled: October 24, 2011Date of Patent: November 13, 2012Assignee: Nuance Communications, Inc.Inventors: Raimo Bakis, Ellen Marie Eide, Wael Mohamed Hamza
-
Patent number: 7747440Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.Type: GrantFiled: July 1, 2008Date of Patent: June 29, 2010Assignee: Nuance Communications, Inc.Inventors: Ellen Marie Eide, Wael Mohamed Hamza
-
Publication number: 20090177473Abstract: A computer implemented method, system and computer usable program code for synthesizing speech. A computer implemented method for synthesizing speech includes providing a database of speech of a source speaker, and providing a prosody model of speech of a target speaker different from the source speaker. Text input to be synthesized is received, and the prosody model of speech of the target speaker is applied to the text input to select segments of the speech of the source speaker in the database to form synthesized speech of the text input. The synthesized speech of the text input is then output.Type: ApplicationFiled: January 7, 2008Publication date: July 9, 2009Inventors: Andrew S. Aaron, Ellen Marie Eide, Raul Fernandez
-
Patent number: 7490042Abstract: A technique for producing speech output in an automatic dialog system in accordance with a detected context is provided. Communication is received from a user at the automatic dialog system. A context of the communication from the user is detected in a context detector of the automatic dialog system. A message is created in a natural language generator of the automatic dialog system in communication with the context detector. The message is conveyed to the user through a speech synthesis system of the automatic dialog system, in communication with the natural language generator and the context detector. Responsive to a detected level of ambient noise, the context detector provides at least one command in a markup language to cause the natural language generator to create the message using maximally intelligible words and to cause the speech synthesis system to convey the message with increased volume and decreased speed.Type: GrantFiled: March 29, 2005Date of Patent: February 10, 2009Assignee: International Business Machines CorporationInventors: Ellen Marie Eide, Wael Mohamed Hamza, Michael Alan Picheny
-
Publication number: 20080300882Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.Type: ApplicationFiled: July 1, 2008Publication date: December 4, 2008Applicant: International Business Machines CorporationInventors: Ellen Marie Eide, Wael Mohamed Hamza
-
Publication number: 20080281598Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.Type: ApplicationFiled: May 9, 2007Publication date: November 13, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
-
Patent number: 7415413Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.Type: GrantFiled: March 29, 2005Date of Patent: August 19, 2008Assignee: International Business Machines CorporationInventors: Ellen Marie Eide, Wael Mohamed Hamza
-
Publication number: 20080167874Abstract: A technique for masking latency in an automatic dialog system is provided. A communication is received from a user at the automatic dialog system. The communication is processed in the automatic dialog system to provide a response. At least one transitional message is provided to the user from the automatic dialog system while processing the communication. A response is provided to the user from the automatic dialog system in accordance with the received communication from the user.Type: ApplicationFiled: January 8, 2007Publication date: July 10, 2008Inventors: Ellen Marie Eide, Wael Mohamed Hamza
-
Patent number: 7280969Abstract: A speech synthesis system is disclosed that utilizes a pitch contour resulting in a more natural-sounding speech. The present invention modifies the predicted pitch, b(t), for synthesized speech using a low frequency energy booster. The low frequency energy booster interpolates the discrete pitch values, if necessary, and increase the amount of energy of the pitch contour associated with low frequency values, such as all frequency values below 10 Hertz. The amount of energy of the pitch contour associated with low frequency values can be increased, for example, by adding band-limited noise (a carrier signal) to the pitch contour, b(t), or by filtering the pitch values with an impulse response filter having a pole at the desired low frequency value. The present invention serves to add vibrato to the to the original pitch contour, b(t), and thereby improves the naturalness of the synthetic waveform.Type: GrantFiled: December 7, 2000Date of Patent: October 9, 2007Assignee: International Business Machines CorporationInventors: Ellen Marie Eide, Raimo Bakis
-
Patent number: 7016835Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduces the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.Type: GrantFiled: December 19, 2002Date of Patent: March 21, 2006Assignee: International Business Machines CorporationInventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
-
Publication number: 20040102975Abstract: A speech synthesis system is disclosed that masks any unnatural phenomena in the synthetic speech. A disclosed environmental effect processor manipulates the background environment into which the synthesized speech is embedded to thereby mask any unnatural phenomena in the synthesized speech. The environmental effect processor can manipulate the background environment, for example, by (i) adding a low level of background noise to the synthesized speech; (ii) superimposing the synthetic speech on a music waveform; or (iii) adding reverberation to the synthesized signal. The speech segments can be recorded in a quiet environment, and the background environment is manipulated in accordance with the present invention at the time of synthesis.Type: ApplicationFiled: November 26, 2002Publication date: May 27, 2004Applicant: International Business Machines CorporationInventor: Ellen Marie Eide
-
Publication number: 20030115053Abstract: A characteristic-specific digitization method and apparatus are disclosed that reduces the error rate in converting input information into a computer-readable format. The input information is analyzed and subsets of the input information are classified according to whether the input information exhibits a specific physical parameter affecting recognition accuracy. If the input information exhibits the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a characteristic-specific recognizer that demonstrates improved performance for the given physical parameter. If the input information does not exhibit the specific physical parameter affecting recognition accuracy, the characteristic-specific digitization system recognizes the input information using a general recognizer that performs well for typical input information.Type: ApplicationFiled: December 19, 2002Publication date: June 19, 2003Applicant: International Business Machines Corporation, Inc.Inventors: Ellen Marie Eide, Ramesh Ambat Gopinath, Dimitri Kanevsky, Peder Andreas Olsen
-
Publication number: 20020072909Abstract: A speech synthesis system is disclosed that utilizes a pitch contour resulting in a more natural-sounding speech. The present invention modifies the predicted pitch, b(t), for synthesized speech using a low frequency energy booster. The low frequency energy booster interpolates the discrete pitch values, if necessary, and increase the amount of energy of the pitch contour associated with low frequency values, such as all frequency values below 10 Hertz. The amount of energy of the pitch contour associated with low frequency values can be increased, for example, by adding band-limited noise (a carrier signal) to the pitch contour, b(t), or by filtering the pitch values with an impulse response filter having a pole at the desired low frequency value. The present invention serves to add vibrato to the to the original pitch contour, b(t), and thereby improves the naturalness of the synthetic waveform.Type: ApplicationFiled: December 7, 2000Publication date: June 13, 2002Inventors: Ellen Marie Eide, Raimo Bakis
-
Patent number: 6073095Abstract: A fast vocabulary independent method for spotting words in speech utilizes a preprocessing step and a coarse-to-detailed search strategy for spotting a word/phone sequence in speech. The preprocessing includes a Viterbi-beam phone level decoding using a tree-based phone language model. The coarse search matches phone-ngrams to identify regions of speech as putative word hits, and the detailed search performs an acoustic match at the putative hits with a model of the given word included in the vocabulary of the recognizer.Type: GrantFiled: October 15, 1997Date of Patent: June 6, 2000Assignee: International Business Machines CorporationInventors: Satyanarayana Dharanipragada, Ellen Marie Eide, Salim Estephan Roukos
-
Patent number: 5884259Abstract: A method and apparatus for using a tree structure to constrain a time-synchronous, fast search for candidate words in an acoustic stream is described. A minimum stay of three frames in each graph node visited is imposed by allowing transitions only every third frame. This constraint enables the simplest possible Markov model for each phoneme while enforcing the desired minimum duration. The fast, time-synchronous search for likely words is done for an entire sentence/utterance. The list of hypotheses beginning at each time frame is stored for providing, on-demand, lists of contender/candidate words to the asynchronous, detailed match phase of decoding.Type: GrantFiled: February 12, 1997Date of Patent: March 16, 1999Assignee: International Business Machines CorporationInventors: Lalit Rai Bahl, Ellen Marie Eide