Abstract: A statistical translation memory (TMEM) may be generated by training a translation model with a naturally generated TMEM. A number of tuples may be extracted from each translation pair in the TMEM. The tuples may include a phrase in a source language and a corresponding phrase in a target language. The tuples may also include probability information relating to the phrases generated by the translation model.
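As an illustration of the tuple extraction described in this abstract, the following is a minimal sketch and not the patented method. It assumes a word alignment is already available for each translation pair, uses the standard alignment-consistency criterion to pick phrase pairs, and scores each pair with a product of word translation probabilities. The names `TmemTuple`, `extract_tuples`, `lex_prob`, and `max_len` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class TmemTuple(NamedTuple):
    source: str   # phrase in the source language
    target: str   # corresponding phrase in the target language
    prob: float   # score derived from the (assumed) translation model


def extract_tuples(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
    lex_prob: Dict[Tuple[str, str], float],
    max_len: int = 3,
) -> List[TmemTuple]:
    """Extract alignment-consistent phrase pairs from one translation pair."""
    tuples = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Reject the span if a target word inside [j1, j2]
            # is aligned to a source word outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in alignment):
                continue
            # Assumption: score = product of word-level probabilities,
            # with a small floor for pairs missing from the table.
            prob = 1.0
            for (i, j) in alignment:
                if i1 <= i <= i2:
                    prob *= lex_prob.get((src[i], tgt[j]), 1e-6)
            tuples.append(
                TmemTuple(" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1]), prob)
            )
    return tuples
```

Each returned tuple holds a source phrase, a target phrase, and a probability, matching the three elements the abstract lists. A real system would take the probabilities from its trained translation model rather than from a hand-built table.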
Abstract: An adapter for text-to-text training. A main corpus is used for training, and a domain-specific corpus is used to adapt the main corpus according to the training information in the domain-specific corpus. The adaptation is carried out using a technique that may be faster than the main training. The parameter set from the main training is adapted using the domain-specific part.
Type:
Grant
Filed:
September 9, 2005
Date of Patent:
November 24, 2009
Assignee:
Language Weaver, Inc.
Inventors:
Kenji Yamada, Kevin Knight, Greg Langmead
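The abstract above does not say which adaptation technique is used. As one possible reading, the sketch below adapts a main parameter set by linear interpolation with relative-frequency estimates from a small domain-specific corpus. Linear interpolation is a swapped-in assumption, and `adapt_parameters`, `main_params`, `domain_pairs`, and `weight` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Params = Dict[str, Dict[str, float]]  # source word -> {target word: probability}


def adapt_parameters(
    main_params: Params,
    domain_pairs: Iterable[Tuple[str, str]],
    weight: float = 0.5,
) -> Params:
    """Interpolate main-model probabilities with domain-specific estimates."""
    # Cheap pass over the domain corpus: count aligned word pairs.
    counts = defaultdict(Counter)
    for s, t in domain_pairs:
        counts[s][t] += 1

    # Start from a copy of the main parameters; only words seen
    # in the domain corpus are changed.
    adapted = {s: dict(dist) for s, dist in main_params.items()}
    for s, target_counts in counts.items():
        total = sum(target_counts.values())
        old = main_params.get(s, {})
        # With no main-model estimate, rely on the domain data alone.
        w = weight if old else 1.0
        adapted[s] = {
            t: (1.0 - w) * old.get(t, 0.0) + w * target_counts[t] / total
            for t in set(old) | set(target_counts)
        }
    return adapted
```

The counting pass touches only the domain-specific data, which is why a scheme of this kind can be much cheaper than rerunning the main training.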
Abstract: Parallelization of word alignment for a text-to-text operation. The training data is divided into multiple groups, and each group is trained on a separate processor. Several techniques can be used to speed up the processing. The hookups can be done only once for all of multiple different iterations. Moreover, parallel operations can apply only to the counts, since this may be the most time-consuming part.
Type:
Grant
Filed:
April 26, 2006
Date of Patent:
June 17, 2008
Assignee:
Language Weaver, Inc.
Inventors:
Greg Langmead, Kenji Yamada, Kevin Knight, Daniel Marcu
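The idea of parallelizing only the count collection can be sketched as follows. This is a minimal illustration that assumes an IBM Model 1 style alignment model, which the abstract does not name. The function names and the `pool_map` parameter are hypothetical. Each group of sentence pairs yields partial counts independently, and only the cheap merge and normalization step runs serially.

```python
from collections import defaultdict
from functools import partial
from typing import Callable, Dict, List, Tuple

SentencePair = Tuple[List[str], List[str]]   # (source tokens, target tokens)
TTable = Dict[Tuple[str, str], float]        # (source word, target word) -> prob


def collect_counts(shard: List[SentencePair], t_table: TTable) -> TTable:
    """Expected alignment counts for one group (the expensive step)."""
    counts = defaultdict(float)
    for src, tgt in shard:
        for t in tgt:
            # Unseen pairs get a uniform starting value (assumption).
            norm = sum(t_table.get((s, t), 1e-3) for s in src)
            for s in src:
                counts[(s, t)] += t_table.get((s, t), 1e-3) / norm
    return dict(counts)


def merge_and_normalize(partials) -> TTable:
    """Sum the per-group counts and renormalize per source word (serial)."""
    total = defaultdict(float)
    for part in partials:
        for key, c in part.items():
            total[key] += c
    per_source = defaultdict(float)
    for (s, _t), c in total.items():
        per_source[s] += c
    return {(s, t): c / per_source[s] for (s, t), c in total.items()}


def em_iteration(
    shards: List[List[SentencePair]],
    t_table: TTable,
    pool_map: Callable = map,
) -> TTable:
    """One training iteration; pool_map decides where the counting runs."""
    partials = pool_map(partial(collect_counts, t_table=t_table), shards)
    return merge_and_normalize(partials)
```

With the default `pool_map=map` the groups are processed one after another. Passing the `map` method of a `multiprocessing.Pool` instead would send each group to a separate worker process while leaving the merge step unchanged.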