Patents by Inventor Horacio Franco

Horacio Franco has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10777188
    Abstract: A computing system determines whether a reference audio signal contains a query. A time-frequency convolutional neural network (TFCNN) comprises time and frequency convolutional layers and a series of additional layers, which include a bottleneck layer. A computation engine of the computing system applies the TFCNN to samples of a query utterance at least through the bottleneck layer. A query feature vector comprises output values of the bottleneck layer generated when the computation engine applies the TFCNN to the samples of the query utterance. The computation engine also applies the TFCNN to samples of the reference audio signal at least through the bottleneck layer. A reference feature vector comprises output values of the bottleneck layer generated when the computation engine applies the TFCNN to the samples of the reference audio signal. The computation engine determines at least one detection score based on the query feature vector and the reference feature vector.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: September 15, 2020
    Assignee: SRI International
    Inventors: Julien van Hout, Vikramjit Mitra, Horacio Franco, Emre Yilmaz
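
A minimal sketch of the bottleneck-feature matching described in the abstract of patent 10777188 above. The TFCNN is replaced by a placeholder feature extractor, and the sliding cosine-similarity scoring, frame size, and function names are illustrative assumptions rather than the patented method:

```python
# Sketch of the scoring step: bottleneck-layer outputs from a query and a
# reference recording are compared to produce a detection score. The TFCNN is
# replaced by a placeholder; names and parameters are illustrative only.
import numpy as np

def bottleneck_features(samples: np.ndarray, dim: int = 40) -> np.ndarray:
    """Placeholder for running the TFCNN up through its bottleneck layer.
    Returns one feature vector per (toy) analysis frame of 160 samples."""
    frames = samples.reshape(-1, 160)
    rng = np.random.default_rng(0)                    # fixed seed: same "network" every call
    projection = rng.standard_normal((frames.shape[1], dim))
    return frames @ projection                        # shape: (num_frames, dim)

def detection_score(query: np.ndarray, reference: np.ndarray) -> float:
    """Slide the query features over the reference features and return the
    best average cosine similarity, used here as the detection score."""
    q = bottleneck_features(query)
    r = bottleneck_features(reference)
    best = -1.0
    for start in range(len(r) - len(q) + 1):
        window = r[start:start + len(q)]
        cos = np.sum(q * window, axis=1) / (
            np.linalg.norm(q, axis=1) * np.linalg.norm(window, axis=1) + 1e-9)
        best = max(best, float(cos.mean()))
    return best

query_audio = np.random.randn(160 * 20)       # ~20 frames of fake audio
reference_audio = np.random.randn(160 * 100)  # ~100 frames of fake audio
print("detection score:", detection_score(query_audio, reference_audio))
```
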
  • Publication number: 20200152179
    Abstract: A computing system determines whether a reference audio signal contains a query. A time-frequency convolutional neural network (TFCNN) comprises time and frequency convolutional layers and a series of additional layers, which include a bottleneck layer. A computation engine of the computing system applies the TFCNN to samples of a query utterance at least through the bottleneck layer. A query feature vector comprises output values of the bottleneck layer generated when the computation engine applies the TFCNN to the samples of the query utterance. The computation engine also applies the TFCNN to samples of the reference audio signal at least through the bottleneck layer. A reference feature vector comprises output values of the bottleneck layer generated when the computation engine applies the TFCNN to the samples of the reference audio signal. The computation engine determines at least one detection score based on the query feature vector and the reference feature vector.
    Type: Application
    Filed: November 14, 2018
    Publication date: May 14, 2020
    Inventors: Julien van Hout, Vikramjit Mitra, Horacio Franco, Emre Yilmaz
  • Patent number: 9576570
    Abstract: The present invention relates to a method and apparatus for adding new vocabulary to interactive translation and dialog systems. In one embodiment, a method for adding a new word to a vocabulary of an interactive dialog includes receiving an input signal that includes at least one word not currently in the vocabulary, inserting the word into a dynamic component of a search graph associated with the vocabulary, and compiling the dynamic component independently of a permanent component of the search graph to produce a new sub-grammar, where the permanent component comprises a plurality of words that are permanently part of the search graph.
    Type: Grant
    Filed: July 30, 2010
    Date of Patent: February 21, 2017
    Assignee: SRI International
    Inventors: Kristin Precoda, Horacio Franco, Jing Zheng, Michael Frandsen, Victor Abrash, Murat Akbacak, Andreas Stolcke
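
The following toy sketch illustrates the permanent/dynamic split described in the abstract of patent 9576570 above. The class, the stand-in "compilation" step, and the example words are invented for illustration; they are not SRI's implementation:

```python
# Illustrative sketch: the recognition search graph keeps a permanent component
# plus a small dynamic component, so adding a new word only requires
# recompiling the dynamic sub-grammar, never the permanent graph.
class SearchGraph:
    def __init__(self, permanent_words):
        self.permanent = self._compile(permanent_words)   # compiled once
        self.dynamic_words = []
        self.dynamic = self._compile([])                   # small, cheap to rebuild

    @staticmethod
    def _compile(words):
        # Stand-in for real grammar compilation: map each word to a pronunciation arc.
        return {w: list(w) for w in words}

    def add_word(self, word):
        if word in self.permanent or word in self.dynamic:
            return
        self.dynamic_words.append(word)
        # Only the dynamic sub-grammar is recompiled; the permanent component is untouched.
        self.dynamic = self._compile(self.dynamic_words)

    def in_vocabulary(self, word):
        return word in self.permanent or word in self.dynamic

graph = SearchGraph(["hello", "world"])
graph.add_word("quinoa")
print(graph.in_vocabulary("quinoa"))  # True, without touching the permanent graph
```
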
  • Patent number: 8527270
    Abstract: The present invention relates to a method and apparatus for enhancing interactive translation and dialogue systems. In one embodiment, a method for conducting an interactive dialogue includes receiving an input signal in a first language, where the input signal includes one or more words, processing the words in accordance with a vocabulary, and adjusting a probability relating to at least one of the words in the vocabulary for an output signal. Subsequently, the method may output a translation of the input signal in a second language, in accordance with the vocabulary. In one embodiment, adjusting the probability involves adjusting a probability that the word will be used in actual conversation.
    Type: Grant
    Filed: July 30, 2010
    Date of Patent: September 3, 2013
    Assignee: SRI International
    Inventors: Kristin Precoda, Horacio Franco
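
A minimal sketch, assuming a simple unigram vocabulary model, of the probability adjustment described in the abstract of patent 8527270 above. The boost factor and renormalization are assumptions made for illustration, not the patent's formulation:

```python
# Toy unigram model: after processing input words, the probability that a
# vocabulary word appears in actual conversation is boosted and the
# distribution is renormalized. Boost value is an arbitrary illustration.
def adjust_probabilities(vocab_probs, observed_words, boost=1.5):
    adjusted = dict(vocab_probs)
    for word in observed_words:
        if word in adjusted:
            adjusted[word] *= boost       # raise probability of words seen in use
    total = sum(adjusted.values())
    return {w: p / total for w, p in adjusted.items()}  # renormalize to sum to 1

vocab = {"river": 0.2, "bank": 0.2, "money": 0.3, "water": 0.3}
print(adjust_probabilities(vocab, ["river", "water"]))
```
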
  • Publication number: 20120029904
    Abstract: The present invention relates to a method and apparatus for adding new vocabulary to interactive translation and dialogue systems. In one embodiment, a method for adding a new word to a vocabulary of an interactive dialogue includes receiving an input signal that includes at least one word not currently in the vocabulary, inserting the word into a dynamic component of a search graph associated with the vocabulary, and compiling the dynamic component independently of a permanent component of the search graph to produce a new sub-grammar, where the permanent component comprises a plurality of words that are permanently part of the search graph.
    Type: Application
    Filed: July 30, 2010
    Publication date: February 2, 2012
    Inventors: Kristin Precoda, Horacio Franco, Jing Zheng, Michael Frandsen, Victor Abrash, Murat Akbacak, Andreas Stolcke
  • Publication number: 20120029903
    Abstract: The present invention relates to a method and apparatus for enhancing interactive translation and dialogue systems. In one embodiment, a method for conducting an interactive dialogue includes receiving an input signal in a first language, where the input signal includes one or more words, processing the words in accordance with a vocabulary, and adjusting a probability relating to at least one of the words in the vocabulary for an output signal. Subsequently, the method may output a translation of the input signal in a second language, in accordance with the vocabulary. In one embodiment, adjusting the probability involves adjusting a probability that the word will be used in actual conversation.
    Type: Application
    Filed: July 30, 2010
    Publication date: February 2, 2012
    Inventors: Kristin Precoda, Horacio Franco
  • Patent number: 7756710
    Abstract: In one embodiment, the present invention is a method and apparatus for error correction in speech recognition applications. In one embodiment, a method for recognizing user speech includes receiving a first utterance from the user, receiving a subsequent utterance from the user, and combining acoustic evidence from the first utterance with acoustic evidence from the subsequent utterance in order to recognize the first utterance. It is assumed that, if the first utterance has been incorrectly recognized on a first attempt, the user will repeat the first utterance (or at least the incorrectly recognized portion of the first utterance) in the subsequent utterance.
    Type: Grant
    Filed: July 13, 2006
    Date of Patent: July 13, 2010
    Assignee: SRI International
    Inventors: Horacio Franco, Gregory Myers, Jing Zheng, Federico Cesari, Cregg Cowan
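
The sketch below illustrates the evidence-combination idea in the abstract of patent 7756710 above with a toy example: per-hypothesis acoustic scores from the first and repeated utterances are summed before re-deciding the first utterance. The hypothesis sets and score values are invented:

```python
# Combine acoustic log-likelihood scores for candidate hypotheses from a first
# utterance and a repeated utterance (here, by simple addition), then pick the
# best-scoring hypothesis for the first utterance.
def combine_evidence(first_scores, repeat_scores):
    combined = {}
    for hyp in set(first_scores) | set(repeat_scores):
        combined[hyp] = first_scores.get(hyp, float("-inf")) + \
                        repeat_scores.get(hyp, float("-inf"))
    return max(combined, key=combined.get), combined

first_pass  = {"recognize speech": -12.0, "wreck a nice beach": -11.5}
second_pass = {"recognize speech": -10.0, "wreck a nice beach": -14.0}
best, scores = combine_evidence(first_pass, second_pass)
print(best)   # "recognize speech" wins once both utterances are considered
```
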
  • Publication number: 20100125458
    Abstract: In one embodiment, the present invention is a method and apparatus for error correction in speech recognition applications. In one embodiment, a method for recognizing user speech includes receiving a first utterance from the user, receiving a subsequent utterance from the user, and combining acoustic evidence from the first utterance with acoustic evidence from the subsequent utterance in order to recognize the first utterance. It is assumed that, if the first utterance has been incorrectly recognized on a first attempt, the user will repeat the first utterance (or at least the incorrectly recognized portion of the first utterance) in the subsequent utterance.
    Type: Application
    Filed: July 13, 2006
    Publication date: May 20, 2010
    Inventors: Horacio Franco, Gregory Myers, Jing Zheng, Federico Cesari, Cregg Cowan
  • Patent number: 7610199
    Abstract: The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
    Type: Grant
    Filed: September 1, 2005
    Date of Patent: October 27, 2009
    Assignee: SRI International
    Inventors: Victor Abrash, Federico Cesari, Horacio Franco, Christopher George, Jing Zheng
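
A sketch of the circular-buffer strategy in the abstract of patent 7610199 above. The frame size, pre-roll length, and the crude energy-based endpoint check (standing in for the patent's HMM endpointer) are illustrative assumptions:

```python
# Audio frames are written continuously to a circular buffer, so when the user
# issues a "start" command, frames captured shortly before the command can be
# prepended to the signal handed to the recognizer.
from collections import deque
import numpy as np

FRAME_LEN = 160                     # samples per frame (assumed 10 ms at 16 kHz)
ring = deque(maxlen=100)            # circular buffer holding roughly the last second

def record_frame(samples: np.ndarray) -> None:
    ring.append(samples)            # the oldest frame is dropped automatically

def on_start_command(preroll_frames: int = 30) -> np.ndarray:
    """Return an augmented signal that includes audio captured before the command."""
    frames = list(ring)[-preroll_frames:]
    return np.concatenate(frames) if frames else np.zeros(0)

def crude_endpoints(signal: np.ndarray, threshold: float = 0.5):
    """Toy stand-in for HMM endpointing: first/last frame above an energy threshold."""
    energies = [float(np.mean(f ** 2)) for f in signal.reshape(-1, FRAME_LEN)]
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    return (voiced[0], voiced[-1]) if voiced else (None, None)

for _ in range(50):                                  # simulate continuous recording
    record_frame(np.random.randn(FRAME_LEN))
augmented = on_start_command()
print(len(augmented), crude_endpoints(augmented))
```
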
  • Patent number: 7571095
    Abstract: An apparatus and a concomitant method for recognizing speech in a noisy environment are provided. The present method includes applying a first interpolation weight to a clean speech model to produce a weighted clean speech model, applying a second interpolation weight to a noise model to produce a weighted noise model, and deriving a noisy speech model directly from the weighted clean speech model and the weighted noise model. At least one of the first interpolation weight and the second interpolation weight is computed in a maximum likelihood framework.
    Type: Grant
    Filed: August 31, 2005
    Date of Patent: August 4, 2009
    Assignee: SRI International
    Inventors: Martin Graciarena, Horacio Franco, Venkata Ramana Rao Gadde
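
A hedged sketch of the core idea in the abstract of patent 7571095 above: interpolate clean-speech and noise models and choose the interpolation weight that maximizes likelihood on observed noisy data. Single one-dimensional Gaussians and a grid search stand in for the real HMM and maximum-likelihood machinery:

```python
# Form a noisy-speech model as a weighted combination of a clean-speech model
# and a noise model, picking the weight that maximizes likelihood of the data.
import numpy as np

def gaussian_loglik(x, mean, var):
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)))

def fit_interpolation_weight(noisy_data, clean, noise, grid=np.linspace(0, 1, 101)):
    best_w, best_ll = None, -np.inf
    for w in grid:
        mean = w * clean["mean"] + (1 - w) * noise["mean"]   # weighted clean + noise
        var = w * clean["var"] + (1 - w) * noise["var"]
        ll = gaussian_loglik(noisy_data, mean, var)
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w

clean_model = {"mean": 5.0, "var": 1.0}
noise_model = {"mean": 0.0, "var": 2.0}
noisy_observations = np.random.normal(3.5, 1.3, size=200)
print("estimated weight:", fit_interpolation_weight(noisy_observations, clean_model, noise_model))
```
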
  • Patent number: 7533020
    Abstract: A method and apparatus are provided for performing speech recognition using observable and meaningful relationships between words within a single utterance and using a structured data source as a source of constraints on the recognition process. Results from a first constrained speech recognition pass can be combined with information about the observable and meaningful word relationships to constrain or simplify subsequent recognition passes. This iterative process greatly reduces the search space required for each recognition pass, making the speech recognition process more efficient, faster and accurate.
    Type: Grant
    Filed: February 23, 2005
    Date of Patent: May 12, 2009
    Assignee: Nuance Communications, Inc.
    Inventors: James F. Arnold, Michael W. Frandsen, Anand Venkataraman, Douglas A. Bercow, Gregory K. Myers, David J. Israel, Venkata Ramana Rao Gadde, Horacio Franco, Harry Bratt
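
An illustrative sketch of the iterative constraint idea in the abstract of patent 7533020 above: a first pass fixes one field, a structured data source supplies the values that can co-occur with it, and the second pass searches only that reduced set. The directory data and the toy word-overlap "recognizer" are invented:

```python
# A first constrained pass identifies a city; the structured data source then
# restricts the second pass to streets that actually exist in that city.
DIRECTORY = {                                    # stand-in structured data source
    "menlo park": ["ravenswood avenue", "willow road"],
    "palo alto": ["university avenue", "embarcadero road"],
}

def recognize(utterance_words, allowed_phrases):
    """Toy 'recognizer': return the allowed phrase sharing the most words."""
    return max(allowed_phrases,
               key=lambda p: len(set(p.split()) & set(utterance_words)))

utterance = "willow road in menlo park".split()
city = recognize(utterance, list(DIRECTORY))            # first, constrained pass
street = recognize(utterance, DIRECTORY[city])          # second pass, much smaller search space
print(city, "/", street)
```
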
  • Publication number: 20060241948
    Abstract: The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
    Type: Application
    Filed: September 1, 2005
    Publication date: October 26, 2006
    Inventors: Victor Abrash, Federico Cesari, Horacio Franco, Christopher George, Jing Zheng
  • Patent number: 7120580
    Abstract: An apparatus and a concomitant method for speech recognition. In one embodiment, the present method is referred to as a “Dynamic Noise Compensation” (DNC) method, in which the models for noisy speech are estimated using models for clean speech and a noise model. Specifically, the model for the noisy speech is estimated by interpolation between the clean speech model and the noise model. This approach reduces computational cycles and does not require large memory capacity.
    Type: Grant
    Filed: August 15, 2001
    Date of Patent: October 10, 2006
    Assignee: SRI International
    Inventors: Venkata Ramana Rao Gadde, Horacio Franco, John Butzberger
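
A compact way to state the interpolation the abstract of patent 7120580 describes, with notation assumed here rather than taken from the patent (α is the interpolation weight and μ denotes model parameters such as Gaussian means):

```latex
% Assumed notation, not the patent's: noisy-speech model parameters are a
% weighted combination of clean-speech and noise model parameters.
\[
  \hat{\mu}_{\text{noisy}} = \alpha\,\mu_{\text{clean}} + (1-\alpha)\,\mu_{\text{noise}},
  \qquad 0 \le \alpha \le 1 .
\]
```
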
  • Publication number: 20060195317
    Abstract: An apparatus and a concomitant method for recognizing speech in a noisy environment are provided. The present method includes applying a first interpolation weight to a clean speech model to produce a weighted clean speech model, applying a second interpolation weight to a noise model to produce a weighted noise model, and deriving a noisy speech model directly from the weighted clean speech model and the weighted noise model. At least one of the first interpolation weight and the second interpolation weight is computed in a maximum likelihood framework.
    Type: Application
    Filed: August 31, 2005
    Publication date: August 31, 2006
    Inventors: Martin Graciarena, Horacio Franco, Venkata Gadde
  • Publication number: 20050234723
    Abstract: A method and apparatus are provided for performing speech recognition using observable and meaningful relationships between words within a single utterance and using a structured data source as a source of constraints on the recognition process. Results from a first constrained speech recognition pass can be combined with information about the observable and meaningful word relationships to constrain or simplify subsequent recognition passes. This iterative process greatly reduces the search space required for each recognition pass, making the speech recognition process more efficient, faster and accurate.
    Type: Application
    Filed: February 23, 2005
    Publication date: October 20, 2005
    Inventors: James Arnold, Michael Frandsen, Anand Venkataraman, Douglas Bercow, Gregory Myers, David Israel, Venkata Gadde, Horacio Franco, Harry Bratt
  • Publication number: 20050125224
    Abstract: A method and apparatus are provided for fusion of recognition results from multiple types of data sources. In one embodiment, the inventive method includes implementing a first processing technique to recognize at least a portion of terms contained in a first media source, implementing a second processing technique to recognize at least a portion of terms contained in a second media source that contains a different type of data than that contained in the first media source, and adapting the first processing technique based at least in part on results generated by the second processing technique.
    Type: Application
    Filed: November 8, 2004
    Publication date: June 9, 2005
    Inventors: Gregory Myers, Harry Bratt, Anand Venkataraman, Andreas Stolcke, Horacio Franco, Venkata Rao Gadde
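
A loose sketch of the cross-source adaptation described in the abstract of publication 20050125224 above: terms recognized in one media source (for example, on-screen text) adapt the model applied to another source (the audio). The unigram model, boost factor, and example terms are invented for illustration:

```python
# Terms recognized by one processing technique (e.g., video text recognition)
# are used to adapt the model of another technique (e.g., speech recognition),
# here by boosting their weights in a toy unigram model and renormalizing.
def adapt_audio_model(audio_unigrams, video_text_terms, boost=2.0):
    adapted = dict(audio_unigrams)
    for term in video_text_terms:
        adapted[term] = adapted.get(term, 1e-4) * boost   # favor cross-source terms
    total = sum(adapted.values())
    return {w: p / total for w, p in adapted.items()}

audio_model = {"election": 0.01, "collection": 0.01, "weather": 0.02}
ocr_terms = ["election", "senate"]                        # recognized from video frames
print(adapt_audio_model(audio_model, ocr_terms))
```
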
  • Publication number: 20050055210
    Abstract: A method and apparatus are provided for performing speech recognition using a dynamic vocabulary. Results from a preliminary speech recognition pass can be used to update or refine a language model in order to improve the accuracy of search results and to simplify subsequent recognition passes. This iterative process greatly reduces the number of alternative hypotheses produced during each speech recognition pass, as well as the time required to process subsequent passes, making the speech recognition process faster, more efficient and more accurate. The iterative process is characterized by the use of results from one or more data set queries, where the keys used to query the data set, as well as the queries themselves, are constructed in a manner that produces more effective language models for use in subsequent attempts at decoding a given speech signal.
    Type: Application
    Filed: August 5, 2004
    Publication date: March 10, 2005
    Inventors: Anand Venkataraman, Horacio Franco, Douglas A. Bercow
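
A toy sketch of the iterative loop in the abstract of publication 20050055210 above: keys from a preliminary recognition pass query a data set, and the returned records yield a narrower vocabulary for the next pass. The catalog and the crude word-filter "recognizer" are placeholders, not SRI's components:

```python
# Preliminary pass -> query keys -> data set lookup -> refined vocabulary ->
# second pass over a much smaller model.
CATALOG = [
    "beethoven symphony number five",
    "beethoven piano sonata fourteen",
    "brahms symphony number four",
]

def query_catalog(keys):
    return [entry for entry in CATALOG if all(k in entry for k in keys)]

def recognize(utterance_words, vocabulary):
    return [w for w in utterance_words if w in vocabulary]  # crude word filter

utterance = "play beethoven symphony number five".split()
broad_vocab = {"beethoven", "brahms", "symphony", "sonata"}
keys = recognize(utterance, broad_vocab)                 # preliminary pass
refined_vocab = {w for entry in query_catalog(keys) for w in entry.split()}
print(recognize(utterance, refined_vocab))               # second pass, refined vocabulary
```
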
  • Patent number: 6226611
    Abstract: Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model such as an HMM, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
    Type: Grant
    Filed: January 26, 2000
    Date of Patent: May 1, 2001
    Assignee: SRI International
    Inventors: Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price, Vassilios Digalakis
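
A rough sketch of the two score types named in the abstract of patent 6226611 above: a duration-based score comparing phone durations to native reference statistics, and a posterior-style score from per-phone acoustic log-likelihoods, combined and mapped to a grade. The reference durations, weighting, and grading scale are assumptions, not the patented formulas:

```python
# Duration score: how close each phone's duration is to a native reference.
# Posterior-style score: averaged per-phone likelihoods. Combined -> 1-5 grade.
import math

NATIVE_DURATIONS = {"ae": 0.09, "t": 0.05, "k": 0.06}    # seconds, illustrative

def duration_score(phones):
    ratios = [min(d, NATIVE_DURATIONS[p]) / max(d, NATIVE_DURATIONS[p])
              for p, d in phones]
    return sum(ratios) / len(ratios)                     # 1.0 = native-like timing

def posterior_score(log_likelihoods):
    return sum(math.exp(ll) for ll in log_likelihoods) / len(log_likelihoods)

def grade(phones, log_likelihoods):
    score = 0.5 * duration_score(phones) + 0.5 * posterior_score(log_likelihoods)
    return round(1 + 4 * score)                          # map to a 1-5 grade scale

phones = [("ae", 0.12), ("t", 0.05), ("k", 0.07)]        # (phone, observed duration)
print(grade(phones, [-0.2, -0.4, -0.1]))
```
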
  • Patent number: 6055498
    Abstract: Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model, such as a hidden Markov model, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
    Type: Grant
    Filed: October 2, 1997
    Date of Patent: April 25, 2000
    Assignee: SRI International
    Inventors: Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price, Vassilios Digalakis