Patents by Inventor Rainer Gruhn
Rainer Gruhn has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9595257
Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.
Type: Grant
Filed: September 28, 2009
Date of Patent: March 14, 2017
Assignee: Nuance Communications, Inc.
Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
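The data flow in the abstract above (first perceptron, downsampling, second perceptron, decoding) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names `downsample`, `decode_argmax`, and `recognize` are hypothetical, the two perceptrons are passed in as plain callables, downsampling is uniform (every `factor`-th vector), and decoding is a simple frame-wise argmax with repeat merging rather than whatever decoder the patent actually uses.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def downsample(posteriors: Sequence[Vector], factor: int) -> List[Vector]:
    """Keep every `factor`-th posterior vector (uniform downsampling, an assumption)."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [list(p) for p in posteriors[::factor]]


def decode_argmax(posteriors: Sequence[Vector], phonemes: Sequence[str]) -> List[str]:
    """Pick the most probable phoneme per frame and merge consecutive repeats."""
    decoded: List[str] = []
    for p in posteriors:
        best = phonemes[max(range(len(p)), key=p.__getitem__)]
        if not decoded or decoded[-1] != best:
            decoded.append(best)
    return decoded


def recognize(
    features: Sequence[Vector],
    first_mlp: Callable[[Vector], Vector],   # hypothetical first-layer perceptron
    second_mlp: Callable[[Vector], Vector],  # hypothetical second-layer perceptron
    factor: int,
    phonemes: Sequence[str],
) -> List[str]:
    intermediate = [first_mlp(f) for f in features]   # intermediate posteriors
    reduced = downsample(intermediate, factor)        # reduced input set
    final = [second_mlp(v) for v in reduced]          # final posteriors
    return decode_argmax(final, phonemes)


if __name__ == "__main__":
    identity = lambda v: v  # stand-in for a trained perceptron
    frames = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.6, 0.4]]
    print(recognize(frames, identity, identity, 2, ["a", "b"]))
```

The point of the sketch is only that the second stage sees fewer vectors than the first stage produced, which reduces its input size.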
-
Patent number: 8554555
Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is trained using the common phoneme label.
Type: Grant
Filed: February 17, 2010
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
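As a rough illustration of the data preparation described in the abstract above, the sketch below assigns each network its own subsequence of frames and pairs every subsequence with one common label. The function names are hypothetical, the subsequences are taken as consecutive, non-overlapping windows, and the common label is chosen by majority vote over the frame labels; the abstract only says the label is based on the labels of one or more frames, so the majority vote is an assumption.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def split_into_subsequences(
    frames: Sequence, num_networks: int, frames_per_network: int
) -> List[list]:
    """Give each network a different, consecutive window of frames (an assumption)."""
    needed = num_networks * frames_per_network
    if len(frames) < needed:
        raise ValueError("not enough frames for the requested subsequences")
    return [
        list(frames[i * frames_per_network:(i + 1) * frames_per_network])
        for i in range(num_networks)
    ]


def common_label(labels: Sequence[str]) -> str:
    """Majority vote over frame labels (one possible rule, assumed here)."""
    return Counter(labels).most_common(1)[0][0]


def training_pairs(
    frames: Sequence, labels: Sequence[str], num_networks: int, frames_per_network: int
) -> List[Tuple[list, str]]:
    """Return one (subsequence, common label) training pair per network."""
    subsequences = split_into_subsequences(frames, num_networks, frames_per_network)
    label = common_label(labels[: num_networks * frames_per_network])
    return [(sub, label) for sub in subsequences]


if __name__ == "__main__":
    pairs = training_pairs(list(range(6)), ["a", "a", "a", "a", "b", "b"], 3, 2)
    for net_index, (sub, label) in enumerate(pairs):
        print(net_index, sub, label)
```

Each network would then be trained on its own pair; the training step itself is omitted because the abstract does not specify it.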
-
Patent number: 8340958
Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
Type: Grant
Filed: January 25, 2010
Date of Patent: December 25, 2012
Assignee: Harman Becker Automotive Systems GmbH
Inventors: Rainer Gruhn, Andreas Marcel Riechert
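One way to picture "modifying the language model statistics taking into account the navigation information" is to raise the prior of list entries that relate to the current trip and then renormalize. The sketch below does exactly that; the multiplicative `boost`, the function names, and the product of acoustic score and prior are all assumptions chosen for illustration, not the method claimed in the patent.

```python
from typing import Dict, Set


def adapt_language_model(
    lm_probs: Dict[str, float], route_entries: Set[str], boost: float = 2.0
) -> Dict[str, float]:
    """Scale up entries related to the trip, then renormalize to sum to 1."""
    weighted = {
        entry: prob * (boost if entry in route_entries else 1.0)
        for entry, prob in lm_probs.items()
    }
    total = sum(weighted.values())
    return {entry: weight / total for entry, weight in weighted.items()}


def most_likely_entry(
    acoustic_scores: Dict[str, float], lm_probs: Dict[str, float]
) -> str:
    """Pick the entry maximizing acoustic score times language model probability."""
    return max(acoustic_scores, key=lambda e: acoustic_scores[e] * lm_probs.get(e, 0.0))


if __name__ == "__main__":
    acoustic = {"Bonn": 0.45, "Born": 0.55}   # hypothetical recognizer scores
    prior = {"Bonn": 0.5, "Born": 0.5}        # hypothetical language model statistics
    print(most_likely_entry(acoustic, prior))
    adapted = adapt_language_model(prior, {"Bonn"})  # trip passes near Bonn
    print(most_likely_entry(acoustic, adapted))
```

In this toy setup the acoustically weaker candidate wins once the trip information shifts the prior toward it.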
-
Patent number: 8301445
Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with the state of the probabilistic state sequence model of the main acoustic model based on a criteria set to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
Type: Grant
Filed: November 25, 2009
Date of Patent: October 30, 2012
Assignee: Nuance Communications, Inc.
Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
-
Publication number: 20120245919
Abstract: An automatic speech recognition (ASR) apparatus for an embedded device application is described. A speech decoder receives an input sequence of speech feature vectors in a first language and outputs an acoustic segment lattice representing a probabilistic combination of basic linguistic units in a second language. A vocabulary matching module compares the acoustic segment lattice to vocabulary models in the first language to determine an output set of probability-ranked recognition hypotheses. A detailed matching module compares the set of probability-ranked recognition hypotheses to detailed match models in the first language to determine a recognition output representing a vocabulary word most likely to correspond to the input sequence of speech feature vectors.
Type: Application
Filed: September 23, 2009
Publication date: September 27, 2012
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Guillermo Aradilla, Rainer Gruhn
-
Patent number: 8275619
Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level, and processing the speech signal for speech recognition, comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
Type: Grant
Filed: September 2, 2009
Date of Patent: September 25, 2012
Assignee: Nuance Communications, Inc.
Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
-
Publication number: 20120239403
Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.
Type: Application
Filed: September 28, 2009
Publication date: September 20, 2012
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
-
Publication number: 20110161079
Abstract: The present invention relates to a communication system, comprising a database including classes of speech templates, in particular, classified according to a predetermined grammar; an input configured to receive and to digitize speech signals corresponding to a spoken utterance; a speech recognizer configured to receive and recognize the digitized speech signals; and wherein the speech recognizer is configured to recognize the digitized speech signals based on speech templates stored in the database and a predetermined grammatical structure.
Type: Application
Filed: December 9, 2009
Publication date: June 30, 2011
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Rainer Gruhn, Stefan Hamerich
-
Publication number: 20100217589
Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is trained using the common phoneme label.
Type: Application
Filed: February 17, 2010
Publication date: August 26, 2010
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
-
Publication number: 20100191520
Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
Type: Application
Filed: January 25, 2010
Publication date: July 29, 2010
Applicant: Harman Becker Automotive Systems GmbH
Inventors: Rainer Gruhn, Andreas Marcel Riechert
-
Publication number: 20100131262
Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with the state of the probabilistic state sequence model of the main acoustic model based on a criteria set to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
Type: Application
Filed: November 25, 2009
Publication date: May 27, 2010
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
-
Publication number: 20100057462
Abstract: The present invention relates to a method for speech recognition of a speech signal comprising the steps of providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level, and processing the speech signal for speech recognition, comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
Type: Application
Filed: September 2, 2009
Publication date: March 4, 2010
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
-
Publication number: 20090254335
Abstract: Examples of methods are provided for generating a multilingual codebook. According to an example method, a main language codebook and at least one additional codebook corresponding to a language different from the main language are provided. A multilingual codebook is generated from the main language codebook and the at least one additional codebook by adding a sub-set of code vectors of the at least one additional codebook to the main codebook based on distances between the code vectors of the at least one additional codebook and the code vectors of the main language codebook. Systems and methods for speech recognition using the multilingual codebook and applications that use speech recognition based on the multilingual codebook are also provided.
Type: Application
Filed: April 1, 2009
Publication date: October 8, 2009
Applicant: Harman Becker Automotive Systems GmbH
Inventors: Raymond Brückner, Martin Raab, Rainer Gruhn
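The distance-based selection in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: code vectors are plain points compared with Euclidean distance (real codebooks of Gaussians would likely need a different distance), the vectors added are those farthest from the main codebook, and the names `min_distance`, `build_multilingual_codebook`, and `num_to_add` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def min_distance(vector: Vector, codebook: Sequence[Vector]) -> float:
    """Distance from `vector` to its nearest neighbor in `codebook`."""
    return min(math.dist(vector, code) for code in codebook)


def build_multilingual_codebook(
    main: Sequence[Vector], additional: Sequence[Vector], num_to_add: int
) -> List[Vector]:
    """Append the `num_to_add` additional vectors least covered by the main codebook."""
    ranked = sorted(additional, key=lambda v: min_distance(v, main), reverse=True)
    return list(main) + ranked[:num_to_add]


if __name__ == "__main__":
    main_codebook = [(0.0, 0.0), (1.0, 1.0)]
    other_language = [(0.1, 0.0), (5.0, 5.0), (1.0, 1.2)]
    print(build_multilingual_codebook(main_codebook, other_language, 1))
```

Choosing the farthest vectors reflects the intuition that sounds already well represented by the main language need no extra entries; whether the patent ranks vectors this way is not stated in the abstract.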