Patents by Inventor Daniel Martin Bikel

Daniel Martin Bikel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Sampling training data for an automatic speech recognition system based on a benchmark classification distribution

Patent number: 9202461

Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.

Type: Grant

Filed: January 18, 2013

Date of Patent: December 1, 2015

Assignee: Google Inc.

Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar, Kaisuke Nakajima, Daniel Martin Bikel
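
The sampling step described in the abstract above can be illustrated with a minimal sketch. It assumes a single-label classifier and per-class random sampling; the function names, the word-count classifier `length_class`, and the toy strings are all hypothetical and are not taken from the patent, which does not specify a classifier here.

```python
import random
from collections import Counter, defaultdict


def classification_distribution(labels):
    """Return the fraction of items that fall in each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def sample_training_corpus(corpus, classify, benchmark_dist, size, seed=0):
    """Sample text strings so class shares follow benchmark_dist (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text in corpus:
        by_class[classify(text)].append(text)
    training = []
    for label, share in benchmark_dist.items():
        pool = by_class.get(label, [])
        # Never request more strings than the corpus holds for this class.
        k = min(len(pool), round(share * size))
        training.extend(rng.sample(pool, k))
    return training


def length_class(text):
    """Toy stand-in classifier (an assumption): bucket strings by word count."""
    return "short" if len(text.split()) <= 3 else "long"


benchmark = ["call mom", "text dad", "play music",
             "navigate to the nearest gas station"]
benchmark_dist = classification_distribution([length_class(t) for t in benchmark])

corpus = ["call contact %d" % i for i in range(10)]
corpus += ["what is the weather forecast for day %d" % i for i in range(10)]

training = sample_training_corpus(corpus, length_class, benchmark_dist, size=8)
training_dist = classification_distribution([length_class(t) for t in training])
```

With these toy inputs the training corpus ends up with the same class shares as the benchmark set. A real system would likely need a policy for classes that are under-represented in the corpus, which this sketch handles only by capping the sample size.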

Securely classifying data

Patent number: 8903090

Abstract: Techniques are disclosed for securely classifying or decoding data. By way of example, a method of determining a most likely sequence for a given data set comprises a computer system associated with a first party performing the following steps. An encrypted model is obtained from a second party. The encrypted model is utilized to determine cost values associated with a particular sequence of observed outputs associated with the given data set. The cost values are sent to the second party. At least one index of a minimum cost value determined by the second party from the cost values sent thereto is obtained from the second party. A minimum cost sequence resulting from the at least one index is determined as the most likely sequence.

Type: Grant

Filed: April 29, 2008

Date of Patent: December 2, 2014

Assignee: International Business Machines Corporation

Inventors: Daniel Martin Bikel, Jeffrey Scott Sorensen
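
The message flow in the abstract above can be traced with a toy sketch covering one minimum-cost decision. A per-entry additive mask stands in for encryption here; this is an assumption made for illustration only, it is not secure, and it is not the cryptosystem of the patent. All function names and cost values are hypothetical.

```python
import random


def second_party_encrypt(model_costs, rng):
    """Second party: hide each model cost behind a secret additive mask."""
    masks = [rng.randrange(1, 10**6) for _ in model_costs]
    encrypted = [c + m for c, m in zip(model_costs, masks)]
    return encrypted, masks


def first_party_costs(encrypted_model, observation_costs):
    """First party: combine its own observation costs with the masked model."""
    return [e + o for e, o in zip(encrypted_model, observation_costs)]


def second_party_argmin(masked_costs, masks):
    """Second party: remove the masks and return only the index of the minimum."""
    plain = [c - m for c, m in zip(masked_costs, masks)]
    return min(range(len(plain)), key=plain.__getitem__)


rng = random.Random(0)
model_costs = [5, 2, 9]        # held by the second party (toy values)
observation_costs = [1, 6, 0]  # held by the first party (toy values)

encrypted, masks = second_party_encrypt(model_costs, rng)
masked = first_party_costs(encrypted, observation_costs)
best_index = second_party_argmin(masked, masks)
```

The first party never sees the unmasked model costs and receives only an index. Repeating this exchange per step would yield a sequence of indices, which the abstract describes as determining the minimum cost sequence.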

Sampling Training Data for an Automatic Speech Recognition System Based on a Benchmark Classification Distribution

Publication number: 20130289989

Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.

Type: Application

Filed: January 18, 2013

Publication date: October 31, 2013

Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar, Kaisuke Nakajima, Daniel Martin Bikel

Sampling training data for an automatic speech recognition system based on a benchmark classification distribution

Patent number: 8374865

Abstract: A set of benchmark text strings may be classified to provide a set of benchmark classifications. The benchmark text strings in the set may correspond to a benchmark corpus of benchmark utterances in a particular language. A benchmark classification distribution of the set of benchmark classifications may be determined. A respective classification for each text string in a corpus of text strings may also be determined. Text strings from the corpus of text strings may be sampled to form a training corpus of training text strings such that the classifications of the training text strings have a training text string classification distribution that is based on the benchmark classification distribution. The training corpus of training text strings may be used to train an automatic speech recognition (ASR) system.

Type: Grant

Filed: April 26, 2012

Date of Patent: February 12, 2013

Assignee: Google Inc.

Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar, Kaisuke Nakajima, Daniel Martin Bikel

Methods and Apparatus for Securely Classifying Data

Publication number: 20090268908

Abstract: Techniques are disclosed for securely classifying or decoding data. By way of example, a method of determining a most likely sequence for a given data set comprises a computer system associated with a first party performing the following steps. An encrypted model is obtained from a second party. The encrypted model is utilized to determine cost values associated with a particular sequence of observed outputs associated with the given data set. The cost values are sent to the second party. At least one index of a minimum cost value determined by the second party from the cost values sent thereto is obtained from the second party. A minimum cost sequence resulting from the at least one index is determined as the most likely sequence.

Type: Application

Filed: April 29, 2008

Publication date: October 29, 2009

Inventors: Daniel Martin Bikel, Jeffrey Scott Sorensen
