Patents by Inventor Michael A. Picheny

Michael A. Picheny has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160267906
    Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.
    Type: Application
    Filed: March 11, 2015
    Publication date: September 15, 2016
    Inventors: Brian E.D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
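
    Illustrative aside (not part of the patent record): the abstract above describes indexing a time-marked word list and searching it for keyword hits. The minimal Python sketch below approximates only the surrounding workflow with a plain inverted index; it is not the patented word-loop WFST construction, and all data, names, and structures are assumptions.

    ```python
    # Hypothetical sketch: index a time-marked word list and search it for
    # keyword hits. A plain inverted index stands in for the patented
    # word-loop weighted finite state transducer per utterance.
    from collections import defaultdict

    # Assumed shape: one (word, start_sec, end_sec, score) tuple per
    # recognized word, grouped by utterance id.
    tmwl = {
        "utt1": [("please", 0.00, 0.31, 0.97), ("call", 0.31, 0.58, 0.94),
                 ("michael", 0.58, 1.02, 0.91)],
        "utt2": [("call", 0.10, 0.40, 0.88), ("home", 0.40, 0.75, 0.95)],
    }

    def build_index(time_marked_word_list):
        """Map each word to a list of (utterance, start, end, score) hits."""
        index = defaultdict(list)
        for utt_id, words in time_marked_word_list.items():
            for word, start, end, score in words:
                index[word].append((utt_id, start, end, score))
        return index

    def search(index, queries):
        """Return keyword hits for each query, best-scoring first."""
        return {q: sorted(index.get(q, []), key=lambda h: -h[3]) for q in queries}

    index = build_index(tmwl)
    print(search(index, ["call", "michael", "weather"]))
    ```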
  • Patent number: 9195650
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: September 23, 2014
    Date of Patent: November 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
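
    Illustrative aside (not part of the patent record): the abstract above, shared by several entries in this family, describes a three-step conversion from spoken to written language. The sketch below mirrors those steps with toy placeholder rules; the disfluency list, contraction table, and stub recognizer are assumptions, not the patented mappings.

    ```python
    # Hypothetical three-stage pipeline: recognize speech, map each spoken
    # utterance to a formal utterance, then format it as written text.
    DISFLUENCIES = {"um", "uh"}
    CONTRACTIONS = {"gonna": "going to", "wanna": "want to", "kinda": "kind of"}

    def transcribe(audio):
        # Stand-in for an ASR system; returns one string per spoken utterance.
        return ["um so we're gonna meet at three", "uh yeah that kinda works"]

    def to_formal(utterance):
        # Drop disfluencies and expand informal contractions.
        words = [CONTRACTIONS.get(w, w) for w in utterance.split()
                 if w not in DISFLUENCIES]
        return " ".join(words)

    def to_written(formal):
        # Stylistic formatting: capitalize and add terminal punctuation.
        return formal[:1].upper() + formal[1:] + "."

    for spoken in transcribe(audio=None):
        print(to_written(to_formal(spoken)))
    ```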
  • Publication number: 20150120275
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: September 23, 2014
    Publication date: April 30, 2015
    Applicant: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8924210
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8856004
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Grant
    Filed: May 13, 2011
    Date of Patent: October 7, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20140278410
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 28, 2014
    Publication date: September 18, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20120290299
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 13, 2011
    Publication date: November 15, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 7716052
    Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: May 11, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
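
    Illustrative aside (not part of the patent record): the abstract above concerns selecting and concatenating pre-recorded segments from multiple speakers under a cost function, with speaker identity carried in an attribute vector. The toy sketch below shows one plausible cost-minimizing selection under invented costs and data; the patent's actual cost functions and attributes may differ.

    ```python
    # Hypothetical unit-selection sketch: choose one candidate segment per
    # target unit so that target cost plus a cross-speaker join penalty is
    # minimized. All candidates, costs, and attributes are invented.
    import itertools

    # Each candidate carries an attribute vector that includes the speaker id.
    candidates = {
        "HH": [{"speaker": "A", "pitch": 120}, {"speaker": "B", "pitch": 180}],
        "EH": [{"speaker": "A", "pitch": 118}, {"speaker": "B", "pitch": 182}],
        "L":  [{"speaker": "B", "pitch": 179}],
        "OW": [{"speaker": "A", "pitch": 121}, {"speaker": "B", "pitch": 178}],
    }
    target_pitch = 125

    def target_cost(seg):
        return abs(seg["pitch"] - target_pitch)

    def join_cost(prev, seg):
        # Penalize concatenating segments taken from different speakers.
        return 0 if prev["speaker"] == seg["speaker"] else 50

    def select(units):
        best = None
        for path in itertools.product(*(candidates[u] for u in units)):
            cost = sum(target_cost(s) for s in path)
            cost += sum(join_cost(a, b) for a, b in zip(path, path[1:]))
            if best is None or cost < best[0]:
                best = (cost, path)
        return best

    cost, path = select(["HH", "EH", "L", "OW"])
    print(cost, [s["speaker"] for s in path])
    ```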
  • Patent number: 7702510
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: April 20, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
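
    Illustrative aside (not part of the patent record): the abstract above describes synthesizing the same text with three TTS systems, scoring each result, and keeping the waveform whose score best satisfies a criterion. A minimal sketch of that selection logic, with stub engines and invented scores, follows.

    ```python
    # Hypothetical selection among three TTS engines. Each stub returns a
    # (waveform, score) pair; real systems would return audio plus a
    # confidence or quality score.
    def tts_concatenative(text): return (b"wav-concat", 0.82)
    def tts_formant(text):       return (b"wav-formant", 0.64)
    def tts_hmm(text):           return (b"wav-hmm", 0.77)

    def select_waveform(text, criterion=max):
        engines = (tts_concatenative, tts_formant, tts_hmm)
        results = [engine(text) for engine in engines]
        scores = [score for _, score in results]
        best_score = criterion(scores)           # compare the three scores
        return results[scores.index(best_score)][0]

    print(select_waveform("Hello, world."))
    ```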
  • Patent number: 7475015
    Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
    Type: Grant
    Filed: September 5, 2003
    Date of Patent: January 6, 2009
    Assignee: International Business Machines Corporation
    Inventors: Mark E. Epstein, Hakan Erdogan, Yuqing Gao, Michael A. Picheny, Ruhi Sarikaya
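
    Illustrative aside (not part of the patent record): the abstract above describes a two-pass scheme in which first-pass hypotheses are rescored with semantic structured language models. The toy sketch below only illustrates the rescoring step; the "semantic" score is a fake slot heuristic and all scores are invented.

    ```python
    # Hypothetical rescoring of first-pass hypotheses with an added
    # semantic score; not the patented semantic structured language model.
    hypotheses = [
        {"text": "book a flight to boston", "asr_score": -12.4},
        {"text": "book a fight to boston",  "asr_score": -12.1},
    ]

    SEMANTIC_SLOTS = {"flight", "hotel", "car"}

    def semantic_lm_score(text):
        # Reward hypotheses whose parse would fill a known semantic slot.
        return 5.0 if SEMANTIC_SLOTS & set(text.split()) else 0.0

    def rescore(hyps, lm_weight=1.0):
        return max(hyps, key=lambda h: h["asr_score"]
                   + lm_weight * semantic_lm_score(h["text"]))

    print(rescore(hypotheses)["text"])   # -> "book a flight to boston"
    ```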
  • Publication number: 20080172234
    Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criterion, and selecting one of the three waveforms based on the selected score.
    Type: Application
    Filed: January 12, 2007
    Publication date: July 17, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
  • Publication number: 20080167876
    Abstract: A method and computer program product for providing paraphrasing in a text-to-speech (TTS) system is provided. The method includes receiving an input text, parsing the input text, and determining a paraphrase of the input text. The method also includes synthesizing the paraphrase into synthesized speech. The method further includes selecting synthesized speech to output, which includes: assigning a score to each synthesized speech associated with each paraphrase, comparing the score of each synthesized speech associated with each paraphrase, and selecting the top-scoring synthesized speech to output. Furthermore, the method includes outputting the selected synthesized speech.
    Type: Application
    Filed: January 4, 2007
    Publication date: July 10, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Raimo Bakis, Ellen M. Eide, Wael Hamza, Michael A. Picheny
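
    Illustrative aside (not part of the patent record): the abstract above describes generating paraphrases of the input text, synthesizing each, scoring the syntheses, and outputting the top-scoring one. The sketch below follows that flow with an assumed paraphrase table and an arbitrary stand-in for the synthesis score.

    ```python
    # Hypothetical paraphrase-and-select flow for a TTS front end.
    PARAPHRASES = {
        "turn left in 200 meters": [
            "turn left in 200 meters",
            "in 200 meters, turn left",
            "make a left in 200 meters",
        ],
    }

    def synthesize(text):
        return b"fake-audio:" + text.encode()

    def synthesis_score(text):
        # Stand-in for a synthesizer's join/prosody score; shorter texts
        # with fewer pauses are (arbitrarily) scored higher here.
        return 1.0 / (len(text) + 5 * text.count(","))

    def speak(input_text):
        # Score every paraphrase and synthesize the top-scoring one.
        best = max(PARAPHRASES.get(input_text, [input_text]), key=synthesis_score)
        return synthesize(best)

    print(speak("turn left in 200 meters"))
    ```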
  • Publication number: 20060229873
    Abstract: A technique for producing speech output in an automatic dialog system is provided. Communication is received from a user at the automatic dialog system. A context of the communication from the user is detected in a context detector of the automatic dialog system. A message is provided to the user from a text-to-speech system of the automatic dialog system in communication with the context detector, wherein the message is provided in accordance with the detected context of the communication.
    Type: Application
    Filed: March 29, 2005
    Publication date: October 12, 2006
    Applicant: International Business Machines Corporation
    Inventors: Ellen Eide, Wael Hamza, Michael Picheny
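
    Illustrative aside (not part of the patent record): the abstract above describes a context detector that conditions the dialog system's spoken message on the user's communication. The sketch below uses an assumed keyword-based detector and canned responses; a real system might instead adapt prosody or speaking style.

    ```python
    # Hypothetical context detection plus context-conditioned response.
    def detect_context(user_text):
        text = user_text.lower()
        if any(w in text for w in ("wrong", "not what", "frustrat")):
            return "frustrated"
        if "?" in text:
            return "inquiry"
        return "neutral"

    RESPONSES = {
        "frustrated": "I'm sorry about that. Let me connect you with an agent.",
        "inquiry":    "Sure, here is the information you asked for.",
        "neutral":    "Okay, your request has been processed.",
    }

    def respond(user_text):
        # The message handed to the TTS system depends on detected context.
        return RESPONSES[detect_context(user_text)]

    print(respond("That is the wrong account again!"))
    ```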
  • Publication number: 20060229876
    Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.
    Type: Application
    Filed: April 7, 2005
    Publication date: October 12, 2006
    Inventors: Andrew Aaron, Ellen Eide, Wael Hamza, Michael Picheny, Charles Rutherfoord, Zhi Shuang, Maria Smith
  • Patent number: 7054810
    Abstract: N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complementary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated.
    Type: Grant
    Filed: October 1, 2001
    Date of Patent: May 30, 2006
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Michael A. Picheny, Bhuvana Ramabhadran
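
    Illustrative aside (not part of the patent record): the abstract above mentions a weighted likelihood combination over multiple feature streams. The numeric sketch below shows one plausible reading, combining per-stream log-likelihoods with fixed stream weights; the values and weights are invented and the patent's actual combination equation may differ.

    ```python
    # Hypothetical weighted likelihood combination over two feature streams.
    # log-likelihoods of each candidate state under two feature streams
    loglik_stream1 = {"s1": -4.1, "s2": -3.0, "s3": -5.2}
    loglik_stream2 = {"s1": -2.9, "s2": -3.8, "s3": -4.0}
    weights = (0.6, 0.4)   # assumed stream weights, summing to 1

    def combined_loglik(state):
        return (weights[0] * loglik_stream1[state]
                + weights[1] * loglik_stream2[state])

    best_state = max(loglik_stream1, key=combined_loglik)
    print(best_state, round(combined_loglik(best_state), 3))   # -> s2 -3.32
    ```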
  • Patent number: 7043432
    Abstract: In a text-to-speech system, a method of converting text-to-speech can include receiving a text input and comparing the received text input to at least one entry in a text-to-speech cache memory. Each entry in the text-to-speech cache memory can specify a corresponding spoken output. If the text input matches one of the entries in the text-to-speech cache memory, the cached speech output specified by the matching entry can be provided.
    Type: Grant
    Filed: August 29, 2001
    Date of Patent: May 9, 2006
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Hari Chittaluru, Edward A. Epstein, Steven J. Friedland, Abraham Ittycheriah, Stephen G. Lawrence, Michael A. Picheny, Charles Rutherfoord, Maria E. Smith
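
    Illustrative aside (not part of the patent record): the abstract above describes checking incoming text against a text-to-speech cache and returning the stored spoken output on a match. A minimal sketch of that lookup-or-synthesize step, with a stub synthesizer, follows.

    ```python
    # Hypothetical TTS cache: return stored audio on a hit, otherwise
    # synthesize (stubbed here) and store the result for next time.
    class TTSCache:
        def __init__(self):
            self._cache = {}

        def synthesize(self, text):
            if text in self._cache:             # cache hit: reuse spoken output
                return self._cache[text]
            audio = self._full_synthesis(text)  # cache miss: synthesize, store
            self._cache[text] = audio
            return audio

        def _full_synthesis(self, text):
            return b"pcm-audio-for:" + text.encode()

    tts = TTSCache()
    tts.synthesize("Your call is important to us.")   # miss, synthesized
    tts.synthesize("Your call is important to us.")   # hit, served from cache
    ```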
  • Publication number: 20060074634
    Abstract: A method, apparatus and computer instructions are provided for fast semi-automatic semantic annotation. Given a limited annotated corpus, the present invention assigns a tag and a label to each word of the next limited annotated corpus using a parser engine, a similarity engine, and an SVM engine. A rover then combines the parse trees from the three engines and annotates the next chunk of limited annotated corpus with confidence, such that the effort required for human annotation is reduced.
    Type: Application
    Filed: October 6, 2004
    Publication date: April 6, 2006
    Applicant: International Business Machines Corporation
    Inventors: Yuqing Gao, Michael Picheny, Ruhi Sarikaya
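
    Illustrative aside (not part of the patent record): the abstract above has three engines propose annotations that a rover combines with a confidence, so that humans only review uncertain cases. The sketch below replaces the parser/similarity/SVM trio with stub label sequences and uses a simple vote; it illustrates the combination idea only, under those assumptions.

    ```python
    # Hypothetical rover: vote over three engines' labels per word and
    # route low-agreement words to a human annotator.
    from collections import Counter

    words = ["fly", "to", "boston", "tomorrow"]
    engine_outputs = {
        "parser":     ["ACTION", "O", "CITY", "DATE"],
        "similarity": ["ACTION", "O", "CITY", "O"],
        "svm":        ["O",      "O", "CITY", "TIME"],
    }

    def rover(words, outputs, min_votes=2):
        combined, needs_review = [], []
        for i, word in enumerate(words):
            votes = Counter(out[i] for out in outputs.values())
            label, count = votes.most_common(1)[0]
            combined.append((word, label, count / len(outputs)))
            if count < min_votes:
                needs_review.append(word)
        return combined, needs_review

    annotated, review = rover(words, engine_outputs)
    print(annotated)   # each word with its voted label and a confidence
    print(review)      # words a human annotator should still check
    ```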
  • Patent number: 6925154
    Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.
    Type: Grant
    Filed: May 3, 2002
    Date of Patent: August 2, 2005
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
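
    Illustrative aside (not part of the patent record): the abstract above describes matching the caller's request against a vocabulary, computing a confidence for the best candidate, and asking for more information when the confidence is low. The sketch below uses difflib string similarity as a stand-in confidence; the directory, threshold, and matching method are assumptions.

    ```python
    # Hypothetical name-dialing dialog: match, score confidence, and either
    # dial or ask a clarifying question.
    import difflib

    DIRECTORY = {"michael picheny": "x1234", "michael pitney": "x5678",
                 "michaela pinsky": "x9012"}

    def best_candidates(spoken_name, n=3):
        names = list(DIRECTORY)
        matches = difflib.get_close_matches(spoken_name, names, n=n, cutoff=0.0)
        scored = [(m, difflib.SequenceMatcher(None, spoken_name, m).ratio())
                  for m in matches]
        return sorted(scored, key=lambda x: -x[1])

    def dial(spoken_name, threshold=0.95):
        name, confidence = best_candidates(spoken_name)[0]
        if confidence >= threshold:
            return f"Dialing {name} at {DIRECTORY[name]}"
        # Low confidence: obtain additional information from the user.
        return f"Did you mean {name}? Please also say the department."

    print(dial("michael picheny"))   # high confidence, call placed
    print(dial("mike pichini"))      # low confidence, clarification requested
    ```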
  • Publication number: 20050119885
    Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
    Type: Application
    Filed: November 28, 2003
    Publication date: June 2, 2005
    Inventors: Scott Axelrod, Sreeram Balakrishnan, Stanley Chen, Yuqing Gao, Ramesh Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Picheny, George Saon, Geoffrey Zweig
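
    Illustrative aside (not part of the patent record): the abstract above models the posterior probability of a linguistic unit given observed speech features with a log-linear model. The numeric sketch below shows the general log-linear form P(w | x) proportional to exp of a weighted feature sum, normalized over candidates; the features, weights, and candidates are invented for illustration.

    ```python
    # Hypothetical log-linear posterior over candidate words given a set of
    # (possibly overlapping, non-independent) feature values.
    import math

    CANDIDATES = ["yes", "no"]
    WEIGHTS = {"acoustic_match": 1.5, "lm_score": 0.8, "duration_ok": 0.3}

    def features(word, observation):
        # Feature functions may overlap and need not be independent.
        return {
            "acoustic_match": observation["acoustic"][word],
            "lm_score": observation["lm"][word],
            "duration_ok": 1.0 if observation["duration"] > 0.2 else 0.0,
        }

    def posterior(observation):
        scores = {w: sum(WEIGHTS[k] * v
                         for k, v in features(w, observation).items())
                  for w in CANDIDATES}
        z = sum(math.exp(s) for s in scores.values())
        return {w: math.exp(s) / z for w, s in scores.items()}

    obs = {"acoustic": {"yes": 0.9, "no": 0.2},
           "lm": {"yes": 0.4, "no": 0.6},
           "duration": 0.35}
    print(posterior(obs))   # normalized posterior over the candidate words
    ```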
  • Publication number: 20050055209
    Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.
    Type: Application
    Filed: September 5, 2003
    Publication date: March 10, 2005
    Inventors: Mark Epstein, Hakan Erdogan, Yuqing Gao, Michael Picheny, Ruhi Sarikaya