Patents by Inventor Rainer Gruhn

Rainer Gruhn has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9595257
    Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.
    Type: Grant
    Filed: September 28, 2009
    Date of Patent: March 14, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
  • Patent number: 8554555
    Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is trained using the common phoneme label.
    Type: Grant
    Filed: February 17, 2010
    Date of Patent: October 8, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
  • Patent number: 8340958
    Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: December 25, 2012
    Assignee: Harman Becker Automotive Systems GmbH
    Inventors: Rainer Gruhn, Andreas Marcel Riechert
  • Patent number: 8301445
    Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions of the main acoustic model and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with the state of the probabilistic state sequence model of the main acoustic model based on a criteria set to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: October 30, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
  • Publication number: 20120245919
    Abstract: An automatic speech recognition (ASR) apparatus for an embedded device application is described. A speech decoder receives an input sequence of speech feature vectors in a first language and outputs an acoustic segment lattice representing a probabilistic combination of basic linguistic units in a second language. A vocabulary matching module compares the acoustic segment lattice to vocabulary models in the first language to determine an output set of probability-ranked recognition hypotheses. A detailed matching module compares the set of probability-ranked recognition hypotheses to detailed match models in the first language to determine a recognition output representing a vocabulary word most likely to correspond to the input sequence of speech feature vectors.
    Type: Application
    Filed: September 23, 2009
    Publication date: September 27, 2012
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Guillermo Aradilla, Rainer Gruhn
  • Patent number: 8275619
    Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level and processing the speech signal for speech recognition comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
    Type: Grant
    Filed: September 2, 2009
    Date of Patent: September 25, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
  • Publication number: 20120239403
    Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.
    Type: Application
    Filed: September 28, 2009
    Publication date: September 20, 2012
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
  • Publication number: 20110161079
    Abstract: The present invention relates to a communication system, comprising a database including classes of speech templates, in particular, classified according to a predetermined grammar; an input configured to receive and to digitize speech signals corresponding to a spoken utterance; a speech recognizer configured to receive and recognize the digitized speech signals; and wherein the speech recognizer is configured to recognize the digitized speech signals based on speech templates stored in the database and a predetermined grammatical structure.
    Type: Application
    Filed: December 9, 2009
    Publication date: June 30, 2011
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Stefan Hamerich
  • Publication number: 20100217589
    Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is trained using the common phoneme label.
    Type: Application
    Filed: February 17, 2010
    Publication date: August 26, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
  • Publication number: 20100191520
    Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Rainer Gruhn, Andreas Marcel Riechert
  • Publication number: 20100131262
    Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions of the main acoustic model and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with the state of the probabilistic state sequence model of the main acoustic model based on a criteria set to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 27, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
  • Publication number: 20100057462
    Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level and processing the speech signal for speech recognition comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
    Type: Application
    Filed: September 2, 2009
    Publication date: March 4, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
  • Publication number: 20090254335
    Abstract: Examples of methods are provided for generating a multilingual codebook. According to an example method, a main language codebook and at least one additional codebook corresponding to a language different from the main language are provided. A multilingual codebook is generated from the main language codebook and the at least one additional codebook by adding a sub-set of code vectors of the at least one additional codebook to the main codebook based on distances between the code vectors of the at least one additional codebook to code vectors of the main language codebook. Systems and methods for speech recognition using the multilingual codebook and applications that use speech recognition based on the multilingual codebook are also provided.
    Type: Application
    Filed: April 1, 2009
    Publication date: October 8, 2009
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Raymond Brückner, Martin Raab, Rainer Gruhn
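
The multilingual codebook abstract (publication 20090254335) describes adding a subset of an additional-language codebook to a main-language codebook based on distances between code vectors. The sketch below is one possible reading of that idea, not the method claimed in the application. It assumes plain Euclidean distance, assumes that the vectors farthest from the current codebook are the ones worth adding, and uses hypothetical names (`build_multilingual_codebook`, `max_added`) that do not appear in the source.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_distance(vector, codebook):
    """Distance from `vector` to its closest entry in `codebook`."""
    return min(euclidean(vector, entry) for entry in codebook)


def build_multilingual_codebook(main_codebook, additional_codebook, max_added):
    """Greedily add up to `max_added` additional-language code vectors.

    Assumption: at each step the candidate that is farthest from every
    vector already in the codebook is added, since it covers a region
    the main language does not.
    """
    multilingual = [tuple(v) for v in main_codebook]
    candidates = [tuple(v) for v in additional_codebook]
    for _ in range(min(max_added, len(candidates))):
        best = max(candidates, key=lambda v: nearest_distance(v, multilingual))
        multilingual.append(best)
        candidates.remove(best)
    return multilingual


# Toy 2-D example: only (5.0, 5.0) is far from the main-language vectors.
main = [(0.0, 0.0), (1.0, 1.0)]
extra = [(0.1, 0.0), (5.0, 5.0), (1.0, 1.2)]
result = build_multilingual_codebook(main, extra, max_added=1)
print(result)
```

Capping the number of added vectors keeps the combined codebook close to the size of the main-language one, which is one plausible motivation for adding only a subset rather than merging both codebooks in full.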