Patents by Inventor Stephan Kanthak

Stephan Kanthak has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Fast deep neural network feature transformation via optimized memory bandwidth utilization

Patent number: 10013652

Abstract: Deep Neural Networks (DNNs) with many hidden layers and many units per layer are very flexible models with a very large number of parameters. As such, DNNs are challenging to optimize. To achieve real-time computation, embodiments disclosed herein enable fast DNN feature transformation via optimized memory bandwidth utilization. To optimize memory bandwidth utilization, a rate of accessing memory may be reduced based on a batch setting. A memory, corresponding to a selected given output neuron of a current layer of the DNN, may be updated with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein a number of the selected few corresponds to the batch setting.

Type: Grant

Filed: April 29, 2015

Date of Patent: July 3, 2018

Assignee: Nuance Communications, Inc.

Inventors: Jan Vlietinck, Stephan Kanthak, Rudi Vuerinckx, Christophe Ris
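The batched update described in the abstract of patent 10013652 can be illustrated with a small sketch. This is not the patented implementation: the function name `sparse_layer_forward`, the `weights[i][j]` layout, and the plain-Python loops are assumptions made for illustration. It shows only the general idea of skipping zero-valued inputs and writing each output accumulator once per batch of non-zero inputs rather than once per input.

```python
from typing import List, Sequence


def sparse_layer_forward(
    inputs: Sequence[float],
    weights: Sequence[Sequence[float]],  # hypothetical layout: weights[i][j] links input i to output j
    batch_size: int,
) -> List[float]:
    """Compute one layer's pre-activation outputs from the non-zero inputs only."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    num_outputs = len(weights[0]) if weights else 0
    outputs = [0.0] * num_outputs

    # Zero-valued inputs contribute nothing, so only non-zero ones are visited.
    nonzero = [i for i, x in enumerate(inputs) if x != 0.0]

    for start in range(0, len(nonzero), batch_size):
        batch = nonzero[start:start + batch_size]
        for j in range(num_outputs):
            # Accumulate the contribution of the whole batch locally ...
            increment = 0.0
            for i in batch:
                increment += inputs[i] * weights[i][j]
            # ... then touch the output accumulator once per batch.
            outputs[j] += increment
    return outputs
```

In this sketch a larger `batch_size` means fewer writes to `outputs` for the same result; any activation function would be applied afterwards.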

Multiple pass automatic speech recognition methods and apparatus

Patent number: 9940927

Abstract: In some aspects, a method of recognizing speech that comprises natural language and at least one word specified in at least one domain-specific vocabulary is provided. The method comprises performing a first speech processing pass comprising identifying, in the speech, a first portion including the natural language and a second portion including the at least one word specified in the at least one domain-specific vocabulary, and recognizing the first portion including the natural language. The method further comprises performing a second speech processing pass comprising recognizing the second portion including the at least one word specified in the at least one domain-specific vocabulary.

Type: Grant

Filed: August 23, 2013

Date of Patent: April 10, 2018

Assignee: Nuance Communications, Inc.

Inventors: Munir Nikolai Alexander Georges, Stephan Kanthak

Efficient incremental modification of optimized finite-state transducers (FSTs) for use in speech applications

Patent number: 9837073

Abstract: Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.

Type: Grant

Filed: September 21, 2011

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Stephan Kanthak, Oliver Bender

Method and apparatus for improving speech recognition processing performance

Patent number: 9792910

Abstract: Computing the feature Maximum Mutual Information (fMMI) method requires multiplication of vectors with a huge matrix. The huge matrix is subdivided into block sub-matrices. The sub-matrices are quantized into different values and compressed by replacing the quantized element values with 1 or 2 bit indices. Fast multiplication with those compressed matrices with far fewer multiply/accumulate operations compared to standard matrix computation is enabled and additionally obviates a de-compression method for decompressing the sub-matrices before use.

Type: Grant

Filed: April 29, 2015

Date of Patent: October 17, 2017

Assignee: Nuance Communications, Inc.

Inventors: Jan Vlietinck, Stephan Kanthak
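The abstract of patent 9792910 says that multiplying with the compressed sub-matrices needs far fewer multiply/accumulate operations and no decompression step. One way this can work is sketched below, under assumptions that are not taken from the patent: each block has a small `codebook` of quantized values, `indices` holds one small integer per matrix element (stored here as ordinary integers rather than packed 1 or 2 bit fields), and the function name is hypothetical.

```python
from typing import List, Sequence


def quantized_block_matvec(
    codebook: Sequence[float],          # e.g. 4 values when indices are 2 bits wide
    indices: Sequence[Sequence[int]],   # indices[r][c] selects a codebook entry
    x: Sequence[float],
) -> List[float]:
    """Multiply one quantized block by a vector without expanding the block."""
    result = []
    for row in indices:
        # Additions only: sum the inputs that share the same quantized value.
        buckets = [0.0] * len(codebook)
        for k, value in zip(row, x):
            buckets[k] += value
        # One multiplication per codebook entry instead of one per column.
        result.append(sum(c * b for c, b in zip(codebook, buckets)))
    return result
```

With a four-entry codebook, each row of the block costs four multiplications regardless of how many columns it has, which is where the saving in this sketch comes from.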

Techniques for updating an automatic speech recognition system using finite-state transducers

Patent number: 9715874

Abstract: Techniques are described for updating an automatic speech recognition (ASR) system that, prior to the update, is configured to perform ASR using a first finite-state transducer (FST) comprising a first set of paths representing recognizable speech sequences. A second FST may be accessed, comprising a second set of paths representing speech sequences to be recognized by the updated ASR system. By analyzing the second FST together with the first FST, a patch may be extracted and provided to the ASR system as an update, capable of being applied non-destructively to the first FST at the ASR system to cause the ASR system using the first FST with the patch to recognize speech using the second set of paths from the second FST. In some embodiments, the patch may be configured such that destructively applying the patch to the first FST creates a modified FST that is globally minimized.

Type: Grant

Filed: October 30, 2015

Date of Patent: July 25, 2017

Assignee: Nuance Communications, Inc.

Inventors: Stephan Kanthak, Jan Vlietinck, Johan Vantieghem, Stijn Verschaeren

Techniques for updating an automatic speech recognition system using finite-state transducers

Publication number: 20170125012

Abstract: Techniques are described for updating an automatic speech recognition (ASR) system that, prior to the update, is configured to perform ASR using a first finite-state transducer (FST) comprising a first set of paths representing recognizable speech sequences. A second FST may be accessed, comprising a second set of paths representing speech sequences to be recognized by the updated ASR system. By analyzing the second FST together with the first FST, a patch may be extracted and provided to the ASR system as an update, capable of being applied non-destructively to the first FST at the ASR system to cause the ASR system using the first FST with the patch to recognize speech using the second set of paths from the second FST. In some embodiments, the patch may be configured such that destructively applying the patch to the first FST creates a modified FST that is globally minimized.

Type: Application

Filed: October 30, 2015

Publication date: May 4, 2017

Applicant: Nuance Communications, Inc.

Inventors: Stephan Kanthak, Jan Vlietinck, Johan Vantieghem, Stijn Verschaeren

Fast deep neural network feature transformation via optimized memory bandwidth utilization

Publication number: 20160322042

Abstract: Deep Neural Networks (DNNs) with many hidden layers and many units per layer are very flexible models with a very large number of parameters. As such, DNNs are challenging to optimize. To achieve real-time computation, embodiments disclosed herein enable fast DNN feature transformation via optimized memory bandwidth utilization. To optimize memory bandwidth utilization, a rate of accessing memory may be reduced based on a batch setting. A memory, corresponding to a selected given output neuron of a current layer of the DNN, may be updated with an incremental output value computed for the selected given output neuron as a function of input values of a selected few non-zero input neurons of a previous layer of the DNN in combination with weights between the selected few non-zero input neurons and the selected given output neuron, wherein a number of the selected few corresponds to the batch setting.

Type: Application

Filed: April 29, 2015

Publication date: November 3, 2016

Inventors: Jan Vlietinck, Stephan Kanthak, Rudi Vuerinckx, Christophe Ris

Method and apparatus for improving speech recognition processing performance

Publication number: 20160322059

Abstract: Computing the feature Maximum Mutual Information (fMMI) method requires multiplication of vectors with a huge matrix. The huge matrix is subdivided into block sub-matrices. The sub-matrices are quantized into different values and compressed by replacing the quantized element values with 1 or 2 bit indices. Fast multiplication with those compressed matrices with far fewer multiply/accumulate operations compared to standard matrix computation is enabled and additionally obviates a de-compression method for decompressing the sub-matrices before use.

Type: Application

Filed: April 29, 2015

Publication date: November 3, 2016

Inventors: Jan Vlietinck, Stephan Kanthak

Machine translation using global lexical selection and sentence reconstruction

Patent number: 9323745

Abstract: Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.

Type: Grant

Filed: July 21, 2014

Date of Patent: April 26, 2016

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
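The two steps named in the abstract of patent 9323745, selecting a bag of target words above a probability threshold and then reconstructing an ordering by considering permutations, can be sketched as follows. The abstract does not say how permutations are ranked, so the toy bigram scores, the `default` penalty, and both function names are assumptions for illustration only.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def select_bag(word_probs: Dict[str, float], threshold: float) -> List[str]:
    """Keep target words whose conditional probability exceeds the threshold."""
    return sorted(w for w, p in word_probs.items() if p > threshold)


def reconstruct(
    bag: List[str],
    bigram_scores: Dict[Tuple[str, str], float],  # hypothetical log-score table
    default: float = -5.0,                        # assumed penalty for unseen bigrams
) -> Tuple[str, ...]:
    """Return the permutation of the bag with the highest total bigram score."""
    def score(seq: Tuple[str, ...]) -> float:
        padded = ("<s>",) + seq + ("</s>",)
        return sum(bigram_scores.get(pair, default) for pair in zip(padded, padded[1:]))

    return max(permutations(bag), key=score)
```

Enumerating every permutation grows factorially with the size of the bag, so a sketch like this is only practical for short phrases.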

Method and system for providing an automated web transcription service

Patent number: 9070368

Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

Type: Grant

Filed: July 2, 2014

Date of Patent: June 30, 2015

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Stephan Kanthak

Multiple pass automatic speech recognition methods and apparatus

Publication number: 20150058018

Abstract: In some aspects, a method of recognizing speech that comprises natural language and at least one word specified in at least one domain-specific vocabulary is provided. The method comprises performing a first speech processing pass comprising identifying, in the speech, a first portion including the natural language and a second portion including the at least one word specified in the at least one domain-specific vocabulary, and recognizing the first portion including the natural language. The method further comprises performing a second speech processing pass comprising recognizing the second portion including the at least one word specified in the at least one domain-specific vocabulary.

Type: Application

Filed: August 23, 2013

Publication date: February 26, 2015

Applicant: Nuance Communications, Inc.

Inventors: Munir Nikolai Alexander Georges, Stephan Kanthak

Machine translation using global lexical selection and sentence reconstruction

Publication number: 20140330552

Abstract: Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.

Type: Application

Filed: July 21, 2014

Publication date: November 6, 2014

Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak

Method and system for providing an automated web transcription service

Publication number: 20140316780

Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

Type: Application

Filed: July 2, 2014

Publication date: October 23, 2014

Inventors: Mazin Gilbert, Stephan Kanthak

Efficient incremental modification of optimized finite-state transducers (FSTs) for use in speech applications

Publication number: 20140229177

Abstract: Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.

Type: Application

Filed: September 21, 2011

Publication date: August 14, 2014

Applicant: Nuance Communications, Inc.

Inventors: Stephan Kanthak, Oliver Bender

Machine translation using global lexical selection and sentence reconstruction

Patent number: 8788258

Abstract: Disclosed are systems, methods, and computer-readable media for performing translations from a source language to a target language. The method comprises receiving a source phrase, generating a target bag of words based on a global lexical selection of words that loosely couples the source words/phrases and target words/phrases, and reconstructing a target phrase or sentence by considering all permutations of words with a conditional probability greater than a threshold.

Type: Grant

Filed: March 15, 2007

Date of Patent: July 22, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak

Method and system for providing an automated web transcription service

Patent number: 8775176

Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

Type: Grant

Filed: August 26, 2013

Date of Patent: July 8, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Stephan Kanthak

Method and system for providing an automated web transcription service

Publication number: 20130346086

Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

Type: Application

Filed: August 26, 2013

Publication date: December 26, 2013

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Stephan Kanthak

Method and system for providing an automated web transcription service

Patent number: 8521510

Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

Type: Grant

Filed: August 31, 2006

Date of Patent: August 27, 2013

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Mazin Gilbert, Stephan Kanthak

Sequence classification for machine translation

Patent number: 7783473

Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word.

Type: Grant

Filed: December 28, 2006

Date of Patent: August 24, 2010

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
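The independence assumption in the abstract of patent 7783473 means each source word can be translated on its own, using one weight vector per target vocabulary word and features that include the source word's context. A minimal sketch follows. The feature templates (current, previous, and next word), the sparse dictionary representation of the weight vectors, and the function names are assumptions, and the discriminative training that would produce the weights is omitted.

```python
from typing import Dict, List, Set


def features(source: List[str], i: int) -> Set[str]:
    """Hypothetical feature set: the source word plus its immediate neighbours."""
    prev_word = source[i - 1] if i > 0 else "<s>"
    next_word = source[i + 1] if i + 1 < len(source) else "</s>"
    return {f"word={source[i]}", f"prev={prev_word}", f"next={next_word}"}


def translate(source: List[str], models: Dict[str, Dict[str, float]]) -> List[str]:
    """Pick, for each source position independently, the best-scoring target word.

    models maps each target word to a sparse weight vector (feature -> weight).
    """
    output = []
    for i in range(len(source)):
        feats = features(source, i)
        # Score every target word by the sum of its weights on the active features.
        best = max(
            sorted(models),
            key=lambda target: sum(models[target].get(f, 0.0) for f in feats),
        )
        output.append(best)
    return output
```

Because no decision depends on another, the positions could be scored in any order or in parallel; the context features are the only place where neighbouring source words influence the choice.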

Sequence classification for machine translation

Publication number: 20080162111

Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word.

Type: Application

Filed: December 28, 2006

Publication date: July 3, 2008

Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
