Patents by Inventor Rainer Gruhn

Rainer Gruhn has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Downsampling schemes in a hierarchical neural network structure for phoneme recognition

Patent number: 9595257

Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.

Type: Grant

Filed: September 28, 2009

Date of Patent: March 14, 2017

Assignee: Nuance Communications, Inc.

Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
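
The data flow in the abstract above (first-layer posteriors, downsampling, second layer, decoding) can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the uniform "keep every rate-th vector" scheme, and the greedy decoder are all assumptions made for illustration, and the two layers are passed in as plain callables rather than trained perceptrons.

```python
import math


def softmax(scores):
    """Turn raw scores into a posterior vector that sums to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def downsample(vectors, rate):
    """Keep every `rate`-th vector (uniform decimation; an assumed scheme)."""
    if rate < 1:
        raise ValueError("rate must be >= 1")
    return vectors[::rate]


def decode(posteriors, phonemes):
    """Greedy stand-in for a decoder: per-frame argmax, then merge repeats."""
    out = []
    for p in posteriors:
        best = phonemes[max(range(len(p)), key=p.__getitem__)]
        if not out or out[-1] != best:
            out.append(best)
    return out


def recognize(features, first_layer, second_layer, phonemes, rate=3):
    """Hypothetical pipeline: layer 1 -> downsample -> layer 2 -> decode."""
    intermediate = [first_layer(f) for f in features]   # intermediate posteriors
    reduced = downsample(intermediate, rate)            # reduced input set
    final = [second_layer(v) for v in reduced]          # final posteriors
    return decode(final, phonemes)


# Toy usage: softmax stands in for the first layer, identity for the second.
frames = [[2.0, 0.0]] * 3 + [[0.0, 2.0]] * 3
result = recognize(frames, softmax, lambda v: v, ["a", "b"], rate=3)
```

With a rate of 3 the second layer sees only two of the six intermediate vectors, which is the computational saving the downsampling step is meant to provide.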

Method for automated training of a plurality of artificial neural networks

Patent number: 8554555

Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is then trained using the common phoneme label.

Type: Grant

Filed: February 17, 2010

Date of Patent: October 8, 2013

Assignee: Nuance Communications, Inc.

Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla
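
As a rough illustration of the training-data preparation described in the abstract above, the sketch below splits one labelled frame sequence into per-network subsequences and attaches a single common label to all of them. The helper names, the use of consecutive non-overlapping subsequences, and the rule for picking the common label (most frequent label in the central subsequence) are assumptions; the abstract only requires that the label be derived from frames of one or more subsequences.

```python
from collections import Counter


def split_into_subsequences(frames, n_networks, sub_len):
    """Give each network its own run of `sub_len` consecutive frames (assumed layout)."""
    if len(frames) < n_networks * sub_len:
        raise ValueError("sequence too short for the requested split")
    return [frames[i * sub_len:(i + 1) * sub_len] for i in range(n_networks)]


def common_label(subsequences):
    """Assumed rule: the most frequent phoneme label in the central subsequence."""
    centre = subsequences[len(subsequences) // 2]
    return Counter(label for _, label in centre).most_common(1)[0][0]


def training_examples(frames, n_networks, sub_len):
    """Return one (input frames, target label) pair per network."""
    subs = split_into_subsequences(frames, n_networks, sub_len)
    label = common_label(subs)
    return [([feat for feat, _ in sub], label) for sub in subs]


# Toy usage: each frame is (feature, phoneme label); features are just indices here.
frames = list(enumerate("aabbbc"))
examples = training_examples(frames, n_networks=3, sub_len=2)
```

Every network receives different input frames but the same target, so the networks learn to predict one phoneme from different parts of its temporal context.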

Text and speech recognition system using navigation information

Patent number: 8340958

Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.

Type: Grant

Filed: January 25, 2010

Date of Patent: December 25, 2012

Assignee: Harman Becker Automotive Systems GmbH

Inventors: Rainer Gruhn, Andreas Marcel Riechert
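
One way to picture "modifying the language model statistics taking into account the navigation information" from the abstract above is to boost the prior of list entries that relate to the current trip and then renormalize. The sketch below does exactly that; the function names, the multiplicative boost, and the way match scores are combined with the prior are illustrative assumptions, not details taken from the patent.

```python
def adapt_language_model(lm_probs, trip_terms, boost=5.0):
    """Raise the prior of entries tied to the trip, then renormalize (assumed scheme)."""
    weighted = {
        entry: p * (boost if entry in trip_terms else 1.0)
        for entry, p in lm_probs.items()
    }
    total = sum(weighted.values())
    return {entry: w / total for entry, w in weighted.items()}


def most_likely_entry(match_scores, lm_probs):
    """Pick the entry with the highest product of match score and prior."""
    return max(match_scores, key=lambda e: match_scores[e] * lm_probs.get(e, 0.0))


# Toy usage: the input matches both city names equally well.
lm = {"Bonn": 0.6, "Berlin": 0.4}
scores = {"Bonn": 0.5, "Berlin": 0.5}

baseline = most_likely_entry(scores, lm)
adapted = most_likely_entry(scores, adapt_language_model(lm, {"Berlin"}))
```

In this toy case the unadapted model prefers the entry with the larger prior, while the trip-aware model prefers the entry associated with the destination.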

Speech recognition based on a multilingual acoustic model

Patent number: 8301445

Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model with one of the probability distribution functions of the main acoustic model, and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, based on a criteria set, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.

Type: Grant

Filed: November 25, 2009

Date of Patent: October 30, 2012

Assignee: Nuance Communications, Inc.

Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
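
The replacement step in the abstract above can be sketched as a nearest-neighbour mapping: each probability distribution function of the second model is replaced by the closest one in the main model. The sketch assumes one-dimensional Gaussians stored as (mean, standard deviation) pairs and uses the 2-Wasserstein distance between them as the distance measurement; both choices, and all names, are illustrative assumptions.

```python
import math


def gaussian_distance(a, b):
    """2-Wasserstein distance between two 1-D Gaussians given as (mean, std)."""
    (m1, s1), (m2, s2) = a, b
    return math.hypot(m1 - m2, s1 - s2)


def map_to_main(second_pdfs, main_pdfs):
    """For each second-model PDF, return the name of the closest main-model PDF."""
    return {
        name: min(main_pdfs, key=lambda k: gaussian_distance(pdf, main_pdfs[k]))
        for name, pdf in second_pdfs.items()
    }


# Toy usage with hypothetical PDF names.
main = {"m_a": (0.0, 1.0), "m_b": (5.0, 1.0)}
second = {"s_x": (0.4, 1.2), "s_y": (4.5, 0.8)}
mapping = map_to_main(second, main)
```

After the mapping, the second model needs no distributions of its own, since every one of its states can point at a main-model distribution.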

Probabilistic Representation of Acoustic Segments

Publication number: 20120245919

Abstract: An automatic speech recognition (ASR) apparatus for an embedded device application is described. A speech decoder receives an input sequence of speech feature vectors in a first language and outputs an acoustic segment lattice representing a probabilistic combination of basic linguistic units in a second language. A vocabulary matching module compares the acoustic segment lattice to vocabulary models in the first language to determine an output set of probability-ranked recognition hypotheses. A detailed matching module compares the set of probability-ranked recognition hypotheses to detailed match models in the first language to determine a recognition output representing a vocabulary word most likely to correspond to the input sequence of speech feature vectors.

Type: Application

Filed: September 23, 2009

Publication date: September 27, 2012

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Guillermo Aradilla, Rainer Gruhn

Speech recognition

Patent number: 8275619

Abstract: The present invention relates to a method for speech recognition of a speech signal, comprising the steps of: providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level; and processing the speech signal for speech recognition, comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.

Type: Grant

Filed: September 2, 2009

Date of Patent: September 25, 2012

Assignee: Nuance Communications, Inc.

Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
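
The effect of frequency weighting on codebook matching, as described in the abstract above, can be shown with a small sketch. It interprets the weighting as a per-dimension weight in a squared distance, with feature dimensions below a cutoff frequency counting more than those above it. That interpretation, the weight values, and all names are assumptions for illustration; the abstract itself speaks of weighted codebook entries (multivariate Gaussians) and does not specify this form.

```python
def frequency_weights(bin_freqs_hz, cutoff_hz, low=2.0, high=1.0):
    """Assign the larger weight to dimensions below the cutoff frequency."""
    return [low if f < cutoff_hz else high for f in bin_freqs_hz]


def weighted_sq_distance(x, y, weights):
    """Squared Euclidean distance with one weight per feature dimension."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def best_entry(feature, codebook, weights):
    """Index of the codebook entry closest to the feature vector."""
    return min(
        range(len(codebook)),
        key=lambda i: weighted_sq_distance(feature, codebook[i], weights),
    )


# Toy usage: dimension 0 is a low-frequency band, dimension 1 a high-frequency band.
weights = frequency_weights([500.0, 4000.0], cutoff_hz=1000.0)
codebook = [(1.0, 3.0), (2.5, 1.0)]
feature = (1.0, 1.0)

weighted_choice = best_entry(feature, codebook, weights)
unweighted_choice = best_entry(feature, codebook, [1.0, 1.0])
```

The two calls select different entries, which shows how emphasizing the low-frequency dimension changes the match.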

Downsampling Schemes in a Hierarchical Neural Network Structure for Phoneme Recognition

Publication number: 20120239403

Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.

Type: Application

Filed: September 28, 2009

Publication date: September 20, 2012

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn

Grammar and Template-Based Speech Recognition of Spoken Utterances

Publication number: 20110161079

Abstract: The present invention relates to a communication system, comprising a database including classes of speech templates, in particular, classified according to a predetermined grammar; an input configured to receive and to digitize speech signals corresponding to a spoken utterance; and a speech recognizer configured to receive and recognize the digitized speech signals, wherein the speech recognizer is configured to recognize the digitized speech signals based on speech templates stored in the database and a predetermined grammatical structure.

Type: Application

Filed: December 9, 2009

Publication date: June 30, 2011

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Rainer Gruhn, Stefan Hamerich

Method for Automated Training of a Plurality of Artificial Neural Networks

Publication number: 20100217589

Abstract: The invention provides a method for automated training of a plurality of artificial neural networks for phoneme recognition using training data, wherein the training data comprises speech signals subdivided into frames, each frame associated with a phoneme label, wherein the phoneme label indicates a phoneme associated with the frame. A sequence of frames from the training data is provided, wherein the number of frames in the sequence of frames is at least equal to the number of artificial neural networks. Each of the artificial neural networks is assigned a different subsequence of the provided sequence, wherein each subsequence comprises a predetermined number of frames. A common phoneme label for the sequence of frames is determined based on the phoneme labels of one or more frames of one or more subsequences of the provided sequence. Each artificial neural network is then trained using the common phoneme label.

Type: Application

Filed: February 17, 2010

Publication date: August 26, 2010

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Rainer Gruhn, Daniel Vasquez, Guillermo Aradilla

Text and Speech Recognition System Using Navigation Information

Publication number: 20100191520

Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.

Type: Application

Filed: January 25, 2010

Publication date: July 29, 2010

Applicant: Harman Becker Automotive Systems GmbH

Inventors: Rainer Gruhn, Andreas Marcel Riechert

Speech Recognition Based on a Multilingual Acoustic Model

Publication number: 20100131262

Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model with one of the probability distribution functions of the main acoustic model, and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, based on a criteria set, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.

Type: Application

Filed: November 25, 2009

Publication date: May 27, 2010

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner

Speech Recognition

Publication number: 20100057462

Abstract: The present invention relates to a method for speech recognition of a speech signal, comprising the steps of: providing at least one codebook comprising codebook entries, in particular, multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level; and processing the speech signal for speech recognition, comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.

Type: Application

Filed: September 2, 2009

Publication date: March 4, 2010

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn

Multilingual Weighted Codebooks

Publication number: 20090254335

Abstract: Examples of methods are provided for generating a multilingual codebook. According to an example method, a main language codebook and at least one additional codebook corresponding to a language different from the main language are provided. A multilingual codebook is generated from the main language codebook and the at least one additional codebook by adding a sub-set of code vectors of the at least one additional codebook to the main language codebook based on distances between the code vectors of the at least one additional codebook and the code vectors of the main language codebook. Systems and methods for speech recognition using the multilingual codebook and applications that use speech recognition based on the multilingual codebook are also provided.

Type: Application

Filed: April 1, 2009

Publication date: October 8, 2009

Applicant: Harman Becker Automotive Systems GmbH

Inventors: Raymond Brückner, Martin Raab, Rainer Gruhn
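
The codebook-merging step in the abstract above can be sketched as a distance-based selection: rank the code vectors of the additional codebook by their distance to the nearest main-language code vector, and add a subset to the main codebook. The sketch assumes that the vectors farthest from the main codebook are the ones added, since they cover sounds the main language lacks; that selection rule, the Euclidean distance, and the function names are assumptions, as the abstract only says the subset is chosen "based on distances".

```python
import math


def nearest_distance(vec, codebook):
    """Euclidean distance from `vec` to its closest vector in `codebook`."""
    return min(math.dist(vec, c) for c in codebook)


def extend_codebook(main, additional, n_add):
    """Append the `n_add` additional vectors that lie farthest from the main codebook."""
    ranked = sorted(
        additional,
        key=lambda v: nearest_distance(v, main),
        reverse=True,
    )
    return list(main) + ranked[:n_add]


# Toy usage with two-dimensional code vectors.
main_codebook = [(0.0, 0.0), (1.0, 0.0)]
additional_codebook = [(0.1, 0.0), (5.0, 5.0), (1.0, 0.2)]
multilingual = extend_codebook(main_codebook, additional_codebook, n_add=1)
```

Only the outlying vector is added, so the multilingual codebook grows by one entry while the main-language vectors stay unchanged.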