Patents by Inventor Robert Carter Moore
Robert Carter Moore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9141622
Abstract: Techniques of creating a classifier model for a multi-class linear classifier are disclosed. The classifier model includes feature weights for each of a plurality of feature and label combinations. One technique includes selecting an update set of feature weights to be updated from the classifier model; determining an update for each of the feature weights in the selected update set using a processor, with a plurality of the updates determined independently of all other updates and based on a largest reduction in an output of a loss function; modifying each of the updates using a step size; and updating the classifier model feature weights using the modified, determined updates.
Type: Grant
Filed: September 16, 2011
Date of Patent: September 22, 2015
Assignee: Google Inc.
Inventor: Robert Carter Moore
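The update scheme described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the patented method: the squared-error loss, the data layout, and all names are assumptions. It shows the key idea of scoring each coordinate update independently by the loss reduction it would yield, selecting an update set of the best candidates, and scaling the selected updates by a step size.

```python
def loss(w, rows, targets):
    """0.5 * sum of squared residuals for a linear model (illustrative loss)."""
    total = 0.0
    for x, y in zip(rows, targets):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += 0.5 * (pred - y) ** 2
    return total

def update_step(w, rows, targets, step_size=0.1, k=2):
    """One round: pick the k feature weights with the largest loss reduction."""
    # Per-coordinate gradient; each entry is computed independently of the others.
    grad = [0.0] * len(w)
    for x, y in zip(rows, targets):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    # For a fixed step along -grad[j], the first-order loss reduction is
    # proportional to grad[j]**2, so the "update set" is the k coordinates
    # with the largest gradient magnitude.
    update_set = sorted(range(len(w)), key=lambda j: abs(grad[j]))[-k:]
    new_w = list(w)
    for j in update_set:
        new_w[j] = w[j] - step_size * grad[j]  # update modified by the step size
    return new_w
```

Repeating `update_step` drives the loss down while touching only a small subset of weights per round, which is the point of selecting an update set rather than updating every weight.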
-
Patent number: 9098812
Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
Type: Grant
Filed: April 14, 2009
Date of Patent: August 4, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Robert Carter Moore, Christopher Brian Quirk
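The optimization loop this abstract outlines — choose a line through the current point, find the best point on that line, reset the current point to it — can be sketched with a toy stand-in objective. In the real system the objective would be a translation quality metric evaluated over the hypothesis lists; the grid scan, the random line directions, and all names here are assumptions for illustration.

```python
import random

def optimize_weights(objective, start, n_lines=50, seed=0):
    """Repeated line search: maximize `objective` over the weight space."""
    rng = random.Random(seed)
    best = list(start)
    best_score = objective(best)
    current = list(start)
    for _ in range(n_lines):
        # Choose a line in the weight space that passes through the current point.
        direction = [rng.uniform(-1.0, 1.0) for _ in current]
        # Scan points along the line; a real system would optimize exactly
        # over the piecewise-constant metric instead of using a grid.
        for i in range(21):
            t = i / 10.0 - 1.0
            cand = [c + t * d for c, d in zip(current, direction)]
            score = objective(cand)
            if score > best_score:
                best_score, best = score, list(cand)
        current = list(best)  # reset the current point to the best point attained
    return best, best_score
```

Restarting from randomly selected points (mentioned in the abstract) would wrap this loop in an outer loop with fresh `start` values, keeping the overall best.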
-
Patent number: 9069755
Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon the number of times (zero or more; the actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.
Type: Grant
Filed: March 11, 2010
Date of Patent: June 30, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventor: Robert Carter Moore
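A minimal bigram sketch of the estimation scheme described: observed sequences get a discounted probability plus an interpolated lower-order term, and unseen sequences fall back to a backoff probability. The discount value, the unigram backoff, and the omission of backoff-weight normalization are all simplifications, not the patent's actual parameterization.

```python
from collections import Counter

def bigram_prob(w, prev, bigrams, unigrams, total, D=0.5):
    """Estimate P(w | prev) with absolute discounting and interpolation."""
    context_count = unigrams[prev]
    count = bigrams[(prev, w)]          # Counter returns 0 for unseen pairs
    p_unigram = unigrams[w] / total     # lower-order (unigram) estimate
    if count > 0:
        # Discount probability: subtract a fixed discount D from the count.
        discounted = (count - D) / context_count
        # Interpolation probability: redistribute the discounted mass,
        # weighted by how many distinct continuations this context has.
        n_types = sum(1 for (a, _), c in bigrams.items() if a == prev and c > 0)
        interp = (D * n_types / context_count) * p_unigram
        return discounted + interp      # sum of the two, per the abstract
    # Sequence never observed: estimate via a backoff probability
    # (unnormalized here for brevity).
    return p_unigram
```

Note that `D` is fixed independently of the interpolation weight, mirroring the abstract's point that the discount parameter is set independently of the interpolation parameters.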
-
Patent number: 8655647
Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
Type: Grant
Filed: March 11, 2010
Date of Patent: February 18, 2014
Assignee: Microsoft Corporation
Inventor: Robert Carter Moore
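A loose sketch of the selection test. The "maximum likelihood set" is approximated here by a binomial standard-deviation band around the empirical estimate; the patented criterion differs in detail, and the additional condition for backoff estimates lying between the higher-order estimate and the set is omitted. All names and the tolerance `z` are assumptions.

```python
import math

def keep_higher_order(count, context_count, p_backoff, z=1.0):
    """True if the higher-order N-gram estimate should enter the model."""
    mle = count / context_count
    # Approximate the set of probabilities consistent with observing
    # `count` out of `context_count` by a binomial standard-deviation band.
    sd = math.sqrt(max(mle * (1.0 - mle), 1e-12) / context_count)
    lo, hi = mle - z * sd, mle + z * sd
    # If the backoff estimate already explains the counts (falls inside
    # the band), adding the higher-order estimate is not justified.
    return not (lo <= p_backoff <= hi)
```

The effect is that higher-order parameters are stored only where the training counts actually contradict the backoff estimate, which is what shrinks the final model.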
-
Patent number: 8620838
Abstract: Techniques of line-searching to find a minimum value of a loss function l projected onto a line are disclosed, wherein the loss function l is continuous and piece-wise linear. One technique includes identifying points for at least one output derived from the loss function l at which the slope of the output changes, determining at least one derivative value for an initial point and at least one change in derivative value for at least some of the points using a processor, determining a cumulative reduction in loss for at least some of the points using at least one of the determined derivative values or at least one of the change in derivative values, and selecting a point corresponding to the minimum value of the loss function l based on the cumulative reduction in loss.
Type: Grant
Filed: September 16, 2011
Date of Patent: December 31, 2013
Assignee: Google Inc.
Inventor: Robert Carter Moore
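For a continuous piecewise-linear function, the technique the abstract describes amounts to a single sweep over the sorted slope-change points: start from the derivative at the initial point, accumulate the change in loss segment by segment, and keep the point with the lowest cumulative value. A sketch, assuming the breakpoints and their slope deltas are already computed:

```python
def line_search_min(x0, f0, slope0, breakpoints):
    """Minimize a piecewise-linear function along a line.

    breakpoints: list of (x, slope_delta) pairs sorted by x, where
    slope_delta is the change in derivative at that point.
    """
    best_x, best_f = x0, f0
    x, f, slope = x0, f0, slope0
    for bx, dslope in breakpoints:
        f += slope * (bx - x)           # cumulative change in loss up to bx
        x, slope = bx, slope + dslope   # the derivative changes here
        if f < best_f:                  # track the minimum seen so far
            best_f, best_x = f, x
    return best_x, best_f
```

Because the function is linear between breakpoints, only the breakpoints themselves need to be evaluated, so the whole search is linear in the number of slope changes once they are sorted.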
-
Publication number: 20130018650
Abstract: An intelligent selection system selects language model training data to obtain in-domain training datasets. The selection is accomplished by estimating a cross-entropy difference for each candidate text segment from a generic language dataset. The cross-entropy difference is a difference between the cross-entropy of the text segment according to the in-domain language model and the cross-entropy of the text segment according to a language model trained on a random sample of the data source from which the text segment is drawn. If the difference satisfies a threshold condition, the text segment is added as an in-domain text segment to a training dataset.
Type: Application
Filed: February 1, 2012
Publication date: January 17, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Robert Carter Moore, William Duncan Lewis
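The scoring rule can be sketched compactly: compute the per-word cross-entropy of each candidate segment under the in-domain model and under the general model, and keep segments whose difference falls below a threshold (lower means more in-domain). The two log-probability functions here are stand-ins for trained language models, and the threshold value is an assumption.

```python
def select_in_domain(segments, in_domain_logprob, general_logprob, threshold=0.0):
    """Keep segments scoring below `threshold` on cross-entropy difference."""
    selected = []
    for seg in segments:
        n = max(len(seg.split()), 1)
        h_in = -in_domain_logprob(seg) / n    # cross-entropy, in-domain LM
        h_gen = -general_logprob(seg) / n     # cross-entropy, general-sample LM
        if h_in - h_gen <= threshold:         # difference satisfies the condition
            selected.append(seg)
    return selected
```

Subtracting the general model's cross-entropy corrects for segments that are merely short or common: a segment is kept only if the in-domain model likes it more than a model of the source it came from does.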
-
Publication number: 20110224971
Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
Type: Application
Filed: March 11, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventor: Robert Carter Moore
-
Publication number: 20110224983
Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon the number of times (zero or more; the actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.
Type: Application
Filed: March 11, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventor: Robert Carter Moore
-
Publication number: 20100262575
Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
Type: Application
Filed: April 14, 2009
Publication date: October 14, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Robert Carter Moore, Christopher Brian Quirk