Patents by Inventor Kevin Knight

Kevin Knight has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Statistical machine translation

Patent number: 7624005

Abstract: A method includes detecting a syntactic chunk in a source string in a first language, assigning a syntactic label to the detected syntactic chunk in the source string, mapping the detected syntactic chunk in the source string to a syntactic chunk in a target string in a second language, said mapping based on the assigned syntactic label, and translating the source string into a possible translation in the second language.

Type: Grant

Filed: March 28, 2003

Date of Patent: November 24, 2009

Assignee: University of Southern California

Inventors: Philipp Koehn, Kevin Knight
Adapter for allowing both online and offline training of a text to text system

Patent number: 7624020

Abstract: An adapter for a text to text training. A main corpus is used for training, and a domain specific corpus is used to adapt the main corpus according to the training information in the domain specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain specific part.

Type: Grant

Filed: September 9, 2005

Date of Patent: November 24, 2009

Assignee: Language Weaver, Inc.

Inventors: Kenji Yamada, Kevin Knight, Greg Langmead
Constructing a translation lexicon from comparable, non-parallel corpora

Patent number: 7620538

Abstract: A machine translation system may use non-parallel monolingual corpora to generate a translation lexicon. The system may identify identically spelled words in the two corpora, and use them as a seed lexicon. The system may use various clues, e.g., context and frequency, to identify and score other possible translation pairs, using the seed lexicon as a basis. An alternative system may use a small bilingual lexicon in addition to non-parallel corpora to learn translations of unknown words and to generate a parallel corpus.

Type: Grant

Filed: March 26, 2003

Date of Patent: November 17, 2009

Assignee: University of Southern California

Inventors: Daniel Marcu, Kevin Knight, Dragos Stefan Munteanu, Philipp Koehn
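
The seed-lexicon step described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `seed_lexicon` and the `min_length` filter are assumptions introduced here for illustration, and the sketch covers only the identical-spelling step, not the later scoring by context and frequency.

```python
def seed_lexicon(source_tokens, target_tokens, min_length=4):
    """Pair up words that are spelled identically in both corpora.

    min_length is a hypothetical filter (not from the abstract) that drops
    short words, which are more likely to match by coincidence.
    """
    shared = set(source_tokens) & set(target_tokens)
    return {word: word for word in shared if len(word) >= min_length}


german = "die information ist digital".split()
english = "the information is digital".split()
lexicon = seed_lexicon(german, english)
```

Here `lexicon` maps `information` and `digital` to themselves; such pairs would then serve as the starting point for scoring other candidate translation pairs.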
Named entity translation

Patent number: 7580830

Abstract: Translating named entities from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model, searching a monolingual resource in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency information.

Type: Grant

Filed: June 7, 2007

Date of Patent: August 25, 2009

Assignee: University of Southern California

Inventors: Yaser Al-Onaizan, Kevin Knight
Phrase to phrase joint probability model for statistical machine translation

Patent number: 7454326

Abstract: A machine translation (MT) system may utilize a phrase-based joint probability model. The model may be used to generate source and target language sentences simultaneously. In an embodiment, the model may learn phrase-to-phrase alignments from word-to-word alignments generated by a word-to-word statistical MT system. The system may utilize the joint probability model for both source-to-target and target-to-source translation applications.

Type: Grant

Filed: March 27, 2003

Date of Patent: November 18, 2008

Assignee: University of Southern California

Inventors: Daniel Marcu, William Wong, Kevin Knight, Philipp Koehn
Task parallelization in a text-to-text system

Patent number: 7389222

Abstract: Parallelization of word alignment for a text-to-text operation. The training data is divided into multiple groups, and training is carried out on each group on separate processors. Different techniques can be carried out to increase the speed of the processing. The hookups can be done only once for all of multiple different iterations. Moreover, parallel operations can apply only to the counts, since this may be the most time-consuming part.

Type: Grant

Filed: April 26, 2006

Date of Patent: June 17, 2008

Assignee: Language Weaver, Inc.

Inventors: Greg Langmead, Kenji Yamada, Kevin Knight, Daniel Marcu
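
The idea of splitting training data into groups and parallelizing only the counting can be sketched roughly as below. This is an illustration under stated assumptions, not the patented implementation: `count_pairs` and `parallel_counts` are hypothetical names, the counts are plain word co-occurrence counts rather than real alignment statistics, and a thread pool stands in for the separate processors.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_pairs(sentence_pairs):
    """Count source/target word co-occurrences in one group of sentence pairs."""
    counts = Counter()
    for source, target in sentence_pairs:
        for s in source.split():
            for t in target.split():
                counts[(s, t)] += 1
    return counts


def parallel_counts(sentence_pairs, groups=2):
    """Split the data into groups, count each group separately, then merge."""
    # Round-robin split into `groups` shards (an assumption; any split works).
    shards = [sentence_pairs[i::groups] for i in range(groups)]
    # A thread pool stands in here for the separate processors.
    with ThreadPoolExecutor(max_workers=groups) as pool:
        partial_counts = list(pool.map(count_pairs, shards))
    # Merging is cheap compared with counting, so it stays sequential.
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


pairs = [("das haus", "the house"), ("das buch", "the book")]
merged = parallel_counts(pairs, groups=2)
```

Because the counts are additive, merging the per-group tables gives the same result as counting over the whole data set in one pass.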
Named entity translation

Publication number: 20080114583

Abstract: Translating named entities from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model, searching a monolingual resource in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency information.

Type: Application

Filed: June 7, 2007

Publication date: May 15, 2008

Inventors: Yaser Al-Onaizan, Kevin Knight
Statistical translation using a large monolingual corpus

Patent number: 7340388

Abstract: A statistical machine translation (MT) system may use a large monolingual corpus to improve the accuracy of translated phrases/sentences. The MT system may produce alternative translations and use the large monolingual corpus to (re)rank the alternative translations.

Type: Grant

Filed: March 26, 2003

Date of Patent: March 4, 2008

Assignee: University of Southern California

Inventors: Radu Soricut, Daniel Marcu, Kevin Knight
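
A rough sketch of reranking alternative translations against a monolingual corpus follows. The scoring rule (the fraction of a candidate's bigrams that are attested in the corpus) and the names `bigrams` and `rerank` are assumptions made for illustration; the abstract does not specify how the corpus is used for ranking.

```python
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))


def rerank(candidates, corpus_sentences):
    """Order candidate translations by how well the corpus attests their bigrams."""
    counts = Counter(
        bigram
        for sentence in corpus_sentences
        for bigram in bigrams(sentence.split())
    )

    def score(candidate):
        pairs = bigrams(candidate.split())
        if not pairs:
            return 0.0
        # Fraction of the candidate's bigrams seen at least once in the corpus.
        return sum(1 for pair in pairs if counts[pair] > 0) / len(pairs)

    return sorted(candidates, key=score, reverse=True)


corpus = ["the cat sat on the mat", "a dog sat on the rug"]
alternatives = ["cat the on sat mat", "the cat sat on the mat"]
ranked = rerank(alternatives, corpus)
```

In this toy run the fluent candidate moves to the front because all of its bigrams occur in the corpus, while none of the scrambled candidate's do.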
Named entity translation

Patent number: 7249013

Abstract: Translating named entities from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model, searching a monolingual resource in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency information.

Type: Grant

Filed: March 11, 2003

Date of Patent: July 24, 2007

Assignee: University of Southern California

Inventors: Yaser Al-Onaizan, Kevin Knight
Language capability assessment and training apparatus and techniques

Publication number: 20070122792

Abstract: A learning system for a text-to-text application such as a machine translation system. The system has questions, and a matrix of correct answers to those questions. Any of the many different correct answers within the matrix can be considered as perfectly correct answers to the question. The system operates by displaying a question, which may be a phrase to be translated, and obtaining an answer to the question from the user. The answer is compared against the matrix and scored. Feedback may also be provided to the user.

Type: Application

Filed: November 9, 2005

Publication date: May 31, 2007

Inventors: Michel Galley, Kevin Knight, Daniel Marcu
Adapter for allowing both online and offline training of a text to text system

Publication number: 20070094169

Abstract: An adapter for a text to text training. A main corpus is used for training, and a domain specific corpus is used to adapt the main corpus according to the training information in the domain specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain specific part.

Type: Application

Filed: September 9, 2005

Publication date: April 26, 2007

Inventors: Kenji Yamada, Kevin Knight, Greg Langmead
Integer programming decoder for machine translation

Patent number: 7177792

Abstract: A machine translation (MT) decoder may transform a translation problem into an integer programming problem, such as a Traveling Salesman Problem (TSP). The decoder may invoke an integer program (IP) solver to solve the integer programming problem and output a likely decoding based on the solution.

Type: Grant

Filed: May 31, 2002

Date of Patent: February 13, 2007

Assignee: University of Southern California

Inventors: Kevin Knight, Kenji Yamada
Identifying documents which form translated pairs, within a document collection

Publication number: 20070033001

Abstract: A training system for a text to text application. The training system finds groups of documents, and automatically identifies similar documents within those groups. The automatically identified documents can then be used for training of the text to text application. The comparison uses reduced size versions of the documents in order to minimize the amount of processing.

Type: Application

Filed: August 3, 2005

Publication date: February 8, 2007

Inventors: Ion Muslea, Kevin Knight, Daniel Marcu
Integer programming decoder for machine translation

Publication number: 20060195312

Abstract: A machine translation (MT) decoder may transform a translation problem into an integer programming problem, such as a Traveling Salesman Problem (TSP). The decoder may invoke an integer program (IP) solver to solve the integer programming problem and output a likely decoding based on the solution.

Type: Application

Filed: April 28, 2006

Publication date: August 31, 2006

Inventors: Kevin Knight, Kenji Yamada
Training for a text-to-text application which uses string to tree conversion for training and decoding

Publication number: 20060142995

Abstract: Training and translation using trees and/or subtrees as parts of the rules. A target language is word aligned with a source language, and at least one of the languages is parsed into trees. The trees are used for training, by aligning conversion steps, forming a manual set of information representing the conversion steps and then learning rules from that reduced set. The rules include subtrees as parts thereof, and are used for decoding, along with an n-gram language model and a syntax based language model.

Type: Application

Filed: October 12, 2005

Publication date: June 29, 2006

Inventors: Kevin Knight, Michel Galley, Mark Hopkins, Daniel Marcu, Ignacio Thayer
Training tree transducers

Publication number: 20050234701

Abstract: Training using tree transducers is described. Given sample input/output pairs as training, and given a set of tree transducer rules, the information is combined to yield locally optimal weights for those rules. This combination is carried out by building a weighted derivation forest for each input/output pair and applying counting methods to those forests.

Type: Application

Filed: March 15, 2005

Publication date: October 20, 2005

Inventors: Jonathan Graehl, Kevin Knight
Sentence generator

Publication number: 20040034520

Abstract: Systems and techniques for generating language from an input use a symbolic generator and a statistical ranker. The symbolic generator may use a transformation algorithm to transform one or more portions of the input. For example, mapping rules such as morph rules, recasting rules, filling rules, and/or ordering rules may be used. The symbolic generator may output a plurality of possible expressions, while the statistical ranker may rank at least some of the possible expressions to determine the best output.

Type: Application

Filed: March 4, 2003

Publication date: February 19, 2004

Inventors: Irene Langkilde-Geary, Kevin Knight
Phrase to phrase joint probability model for statistical machine translation

Publication number: 20040030551

Abstract: A machine translation (MT) system may utilize a phrase-based joint probability model. The model may be used to generate source and target language sentences simultaneously. In an embodiment, the model may learn phrase-to-phrase alignments from word-to-word alignments generated by a word-to-word statistical MT system. The system may utilize the joint probability model for both source-to-target and target-to-source translation applications.

Type: Application

Filed: March 27, 2003

Publication date: February 12, 2004

Inventors: Daniel Marcu, William Wong, Kevin Knight, Philipp Koehn
Statistical machine translation

Publication number: 20040024581

Abstract: A method includes detecting a syntactic chunk in a source string in a first language, assigning a syntactic label to the detected syntactic chunk in the source string, mapping the detected syntactic chunk in the source string to a syntactic chunk in a target string in a second language, said mapping based on the assigned syntactic label, and translating the source string into a possible translation in the second language.

Type: Application

Filed: March 28, 2003

Publication date: February 5, 2004

Inventors: Philipp Koehn, Kevin Knight
Statistical translation using a large monolingual corpus

Publication number: 20030233222

Abstract: A statistical machine translation (MT) system may use a large monolingual corpus to improve the accuracy of translated phrases/sentences. The MT system may produce alternative translations and use the large monolingual corpus to (re)rank the alternative translations.

Type: Application

Filed: March 26, 2003

Publication date: December 18, 2003

Inventors: Radu Soricut, Daniel Marcu, Kevin Knight
