Patents by Inventor Franz Gerl

Franz Gerl has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9530432
    Abstract: This invention provides a method for determining, in a speech dialog system issuing speech prompts, a score value as an indicator for the presence of a wanted signal component in an input signal stemming from a microphone, comprising the steps of: using a first likelihood function to determine a first likelihood value for the presence of the wanted signal component in the input signal, using a second likelihood function to determine a second likelihood value for the presence of a noise signal component in the input signal, and determining a score value based on the first and the second likelihood values, wherein the first likelihood function is based on a predetermined reference wanted signal, and the second likelihood function is based on a predetermined reference noise signal.
    Type: Grant
    Filed: July 22, 2009
    Date of Patent: December 27, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Franz Gerl
  • Patent number: 9026438
    Abstract: A method for detecting barge-in in a speech dialog system comprising determining whether a speech prompt is output by the speech dialog system, and detecting whether speech activity is present in an input signal based on a time-varying sensitivity threshold of a speech activity detector and/or based on speaker information, where the sensitivity threshold is increased if output of a speech prompt is determined and decreased if no output of a speech prompt is determined. If speech activity is detected in the input signal, the speech prompt may be interrupted or faded out. A speech dialog system configured to detect barge-in is also disclosed.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: May 5, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Buck, Franz Gerl, Tim Haulick, Tobias Herbig, Gerhard Uwe Schmidt, Matthias Schulz
  • Patent number: 8706483
    Abstract: A system enhances the quality of a digital speech signal that may include noise. The system identifies vocal expressions that correspond to the digital speech signal. A signal-to-noise ratio of the digital speech signal is measured before a portion of the digital speech signal is synthesized. The selected portion of the digital speech signal may have a signal-to-noise ratio below a predetermined level and the synthesis of the digital speech signal may be based on speaker identification.
    Type: Grant
    Filed: October 20, 2008
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Franz Gerl, Tobias Herbig, Mohamed Krini, Gerhard Uwe Schmidt
  • Patent number: 8666743
    Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
    Type: Grant
    Filed: June 2, 2010
    Date of Patent: March 4, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
  • Patent number: 8346551
    Abstract: A method for adapting a codebook for speech recognition, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker-dependent codebook, is disclosed. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process and taking into account the determined feature vector. For each of the Gaussian densities, a second mean vector is determined using an Eigenvoice adaptation, taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is set to a convex combination of the first and the second mean vector. Thus, this process allows for adaptation during operation and does not require a lengthy training phase.
    Type: Grant
    Filed: November 20, 2009
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Franz Gerl
  • Patent number: 8131544
    Abstract: A system distinguishes a primary audio source and background noise to improve the quality of an audio signal. A speech signal from a microphone may be improved by identifying and dampening background noise to enhance speech. Stochastic models may be used to model speech and to model background noise. The models may determine which portions of the signal are speech and which portions are noise. The distinction may be used to improve the signal's quality, and for speaker identification or verification.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: March 6, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Oliver Gaupp, Franz Gerl
  • Publication number: 20100305947
    Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
    Type: Application
    Filed: June 2, 2010
    Publication date: December 2, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
  • Publication number: 20100198598
    Abstract: A method for recognizing a speaker of an utterance in a speech recognition system is disclosed. A likelihood score for each of a plurality of speaker models for different speakers is determined, the likelihood score indicating how well the speaker model corresponds to the utterance. For each of the plurality of speaker models, a probability that the utterance originates from that speaker is determined. The probability is determined based on the likelihood score for the speaker model and requires the estimation of a distribution of likelihood scores expected based at least in part on the training state of the speaker model.
    Type: Application
    Filed: February 4, 2010
    Publication date: August 5, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Franz Gerl
  • Publication number: 20100138222
    Abstract: A method for adapting a codebook for speech recognition, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker-dependent codebook is disclosed. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process and taking into account the determined feature vector. For each of the Gaussian densities, a second mean vector using an Eigenvoice adaptation is determined taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is set to a convex combination of the first and the second mean vector. Thus, this process allows for adaptation during operation and does not require a lengthy training phase.
    Type: Application
    Filed: November 20, 2009
    Publication date: June 3, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Franz Gerl
  • Publication number: 20100030558
    Abstract: This invention provides a method for determining, in a speech dialogue system issuing speech prompts, a score value as an indicator for the presence of a wanted signal component in an input signal stemming from a microphone, comprising the steps of: using a first likelihood function to determine a first likelihood value for the presence of the wanted signal component in the input signal, using a second likelihood function to determine a second likelihood value for the presence of a noise signal component in the input signal, and determining a score value based on the first and the second likelihood values, wherein the first likelihood function is based on a predetermined reference wanted signal, and the second likelihood function is based on a predetermined reference noise signal.
    Type: Application
    Filed: July 22, 2009
    Publication date: February 4, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Franz Gerl
  • Publication number: 20090254342
    Abstract: A method for detecting barge-in in a speech dialogue system comprising determining whether a speech prompt is output by the speech dialogue system, and detecting whether speech activity is present in an input signal based on a time-varying sensitivity threshold of a speech activity detector and/or based on speaker information, where the sensitivity threshold is increased if output of a speech prompt is determined and decreased if no output of a speech prompt is determined. If speech activity is detected in the input signal, the speech prompt may be interrupted or faded out. A speech dialogue system configured to detect barge-in is also disclosed.
    Type: Application
    Filed: March 31, 2009
    Publication date: October 8, 2009
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Markus Buck, Franz Gerl, Tim Haulick, Tobias Herbig, Gerhard Uwe Schmidt, Matthias Schulz
  • Publication number: 20090228272
    Abstract: A system distinguishes a primary audio source and background noise to improve the quality of an audio signal. A speech signal from a microphone may be improved by identifying and dampening background noise to enhance speech. Stochastic models may be used to model speech and to model background noise. The models may determine which portions of the signal are speech and which portions are noise. The distinction may be used to improve the signal's quality, and for speaker identification or verification.
    Type: Application
    Filed: November 12, 2008
    Publication date: September 10, 2009
    Inventors: Tobias Herbig, Oliver Gaupp, Franz Gerl
  • Publication number: 20090182559
    Abstract: A system enables devices to recognize and process speech. The system includes a database that retains one or more lexical lists. A speech input detects a verbal utterance and generates a speech signal corresponding to the detected verbal utterance. A processor generates a phonetic representation of the speech signal that is designated a first recognition result. The processor generates variants of the phonetic representation based on context information provided by the phonetic representation. One or more of the variants of the phonetic representation selected by the processor are designated as a second recognition result. The processor matches the second recognition result with stored phonetic representations of one or more of the stored lexical lists.
    Type: Application
    Filed: October 7, 2008
    Publication date: July 16, 2009
    Inventors: Franz Gerl, Christian Hillebrecht, Roland Romer, Ulrich Schatz
  • Publication number: 20090119103
    Abstract: A method automatically recognizes speech received through an input. The method accesses one or more speaker-independent speaker models. The method detects whether the received speech input matches a speaker model according to an adaptable predetermined criterion. The method creates a speaker model assigned to a speaker model set when no match occurs based on the input.
    Type: Application
    Filed: October 10, 2008
    Publication date: May 7, 2009
    Inventors: Franz Gerl, Tobias Herbig
  • Publication number: 20090119096
    Abstract: A system enhances the quality of a digital speech signal that may include noise. The system identifies vocal expressions that correspond to the digital speech signal. A signal-to-noise ratio of the digital speech signal is measured before a portion of the digital speech signal is synthesized. The selected portion of the digital speech signal may have a signal-to-noise ratio below a predetermined level and the synthesis of the digital speech signal may be based on speaker identification.
    Type: Application
    Filed: October 20, 2008
    Publication date: May 7, 2009
    Inventors: Franz Gerl, Tobias Herbig, Mohamed Krini, Gerhard Uwe Schmidt
  • Publication number: 20080065382
    Abstract: A system and method for detecting a refrain in an audio file having vocal components are disclosed. The method and system include generating a phonetic transcription of a portion of the audio file, analyzing the phonetic transcription, and identifying a vocal segment in the generated phonetic transcription that is repeated frequently. The method and system further relate to speech-driven selection based on the similarity between the detected refrain and user input.
    Type: Application
    Filed: February 12, 2007
    Publication date: March 13, 2008
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Franz Gerl, Daniel Willett, Raymond Brueckner
  • Publication number: 20070156405
    Abstract: A speech recognition system receives digital data. The system determines whether a memory contains some or all of the digital data. When some or all of the digital data does not exist in the memory, the system generates a transcription of the missing portion and stores the missing portion together with its transcription in the memory.
    Type: Application
    Filed: November 21, 2006
    Publication date: July 5, 2007
    Inventors: Matthias Schulz, Franz Gerl, Markus Schwarz, Andreas Kosmala, Barbel Jeschke
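
As an illustration of the barge-in abstract listed above (patent 9026438 and publication 20090254342), the time-varying sensitivity threshold might be sketched in Python roughly as follows. This is a minimal sketch, not the patented method: the class name, the use of a single frame-energy value as the detector input, and every numeric value are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BargeInDetector:
    """Toy speech-activity detector with a prompt-dependent threshold.

    Hypothetical helper: names and numbers are illustrative assumptions,
    not values taken from the patent.
    """

    low_threshold: float = 0.25   # floor used while no prompt is playing
    high_threshold: float = 0.75  # ceiling used while a prompt is playing
    step: float = 0.125           # threshold change per processed frame
    threshold: float = 0.25       # current time-varying threshold

    def update(self, frame_energy: float, prompt_active: bool) -> bool:
        """Adapt the threshold, then report whether speech is detected."""
        if prompt_active:
            # Raise the threshold so that echo of the prompt itself is
            # less likely to be mistaken for the user speaking.
            self.threshold = min(self.high_threshold,
                                 self.threshold + self.step)
        else:
            # Lower the threshold again once no prompt is being output.
            self.threshold = max(self.low_threshold,
                                 self.threshold - self.step)
        return frame_energy > self.threshold


if __name__ == "__main__":
    detector = BargeInDetector()
    # (frame energy, is a prompt currently being output?)
    frames = [(0.5, True), (0.5, True), (0.9, True), (0.3, False)]
    for energy, prompt in frames:
        detected = detector.update(energy, prompt)
        print(f"energy={energy} prompt={prompt} "
              f"threshold={detector.threshold} speech={detected}")
```

In a dialog system along the lines the abstract describes, a positive detection while a prompt is active would be the cue to interrupt or fade out the prompt. The optional speaker-information criterion mentioned in the abstract is not modelled here.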