Patents by Inventor Robert Carter Moore
Robert Carter Moore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9141622
Abstract: Techniques of creating a classifier model for a multi-class linear classifier are disclosed. The classifier model includes feature weights for each of a plurality of feature and label combinations. One technique includes selecting an update set of feature weights to be updated from the classifier model; determining an update for each of the feature weights in the selected update set using a processor, with a plurality of the updates determined independently of all other updates and based on a largest reduction in an output of a loss function; modifying each of the updates using a step size; and updating the classifier model feature weights using the modified, determined updates.
Type: Grant
Filed: September 16, 2011
Date of Patent: September 22, 2015
Assignee: Google Inc.
Inventor: Robert Carter Moore
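The update scheme described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the patented method: the squared-error loss, the data layout, and all names are assumptions. It shows the key idea of scoring each coordinate update independently by the loss reduction it would yield, selecting an update set of the best candidates, and scaling the selected updates by a step size.

```python
def loss(w, rows, targets):
    """0.5 * sum of squared residuals for a linear model (illustrative loss)."""
    total = 0.0
    for x, y in zip(rows, targets):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += 0.5 * (pred - y) ** 2
    return total

def update_step(w, rows, targets, step_size=0.1, k=2):
    """One round: pick the k feature weights with the largest loss reduction."""
    # Per-coordinate gradient; each entry is computed independently of the others.
    grad = [0.0] * len(w)
    for x, y in zip(rows, targets):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    # For a fixed step along -grad[j], the first-order loss reduction is
    # proportional to grad[j]**2, so the "update set" is the k coordinates
    # with the largest gradient magnitude.
    update_set = sorted(range(len(w)), key=lambda j: abs(grad[j]))[-k:]
    new_w = list(w)
    for j in update_set:
        new_w[j] = w[j] - step_size * grad[j]  # update modified by the step size
    return new_w
```

Repeating `update_step` drives the loss down while touching only a small subset of weights per round, which is the point of selecting an update set rather than updating every weight.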
-
Patent number: 9098812
Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
Type: Grant
Filed: April 14, 2009
Date of Patent: August 4, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Robert Carter Moore, Christopher Brian Quirk
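The optimization loop this abstract outlines — choose a line through the current point, find the best point on that line, reset the current point to it — can be sketched with a toy stand-in objective. In the real system the objective would be a translation quality metric evaluated over the hypothesis lists; the grid scan, the random line directions, and all names here are assumptions for illustration.

```python
import random

def optimize_weights(objective, start, n_lines=50, seed=0):
    """Repeated line search: maximize `objective` over the weight space."""
    rng = random.Random(seed)
    best = list(start)
    best_score = objective(best)
    current = list(start)
    for _ in range(n_lines):
        # Choose a line in the weight space that passes through the current point.
        direction = [rng.uniform(-1.0, 1.0) for _ in current]
        # Scan points along the line; a real system would optimize exactly
        # over the piecewise-constant metric instead of using a grid.
        for i in range(21):
            t = i / 10.0 - 1.0
            cand = [c + t * d for c, d in zip(current, direction)]
            score = objective(cand)
            if score > best_score:
                best_score, best = score, list(cand)
        current = list(best)  # reset the current point to the best point attained
    return best, best_score
```

Restarting from randomly selected points (mentioned in the abstract) would wrap this loop in an outer loop with fresh `start` values, keeping the overall best.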
-
Patent number: 9069755
Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon the number of times (zero or more; the actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.
Type: Grant
Filed: March 11, 2010
Date of Patent: June 30, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventor: Robert Carter Moore
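A minimal bigram sketch of the estimation scheme described: observed sequences get a discounted probability plus an interpolated lower-order term, and unseen sequences fall back to a backoff probability. The discount value, the unigram backoff, and the omission of backoff-weight normalization are all simplifications, not the patent's actual parameterization.

```python
from collections import Counter

def bigram_prob(w, prev, bigrams, unigrams, total, D=0.5):
    """Estimate P(w | prev) with absolute discounting and interpolation."""
    context_count = unigrams[prev]
    count = bigrams[(prev, w)]          # Counter returns 0 for unseen pairs
    p_unigram = unigrams[w] / total     # lower-order (unigram) estimate
    if count > 0:
        # Discount probability: subtract a fixed discount D from the count.
        discounted = (count - D) / context_count
        # Interpolation probability: redistribute the discounted mass,
        # weighted by how many distinct continuations this context has.
        n_types = sum(1 for (a, _), c in bigrams.items() if a == prev and c > 0)
        interp = (D * n_types / context_count) * p_unigram
        return discounted + interp      # sum of the two, per the abstract
    # Sequence never observed: estimate via a backoff probability
    # (unnormalized here for brevity).
    return p_unigram
```

Note that `D` is fixed independently of the interpolation weight, mirroring the abstract's point that the discount parameter is set independently of the interpolation parameters.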
-
Patent number: 8655647
Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
Type: Grant
Filed: March 11, 2010
Date of Patent: February 18, 2014
Assignee: Microsoft Corporation
Inventor: Robert Carter Moore
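A loose sketch of the selection test. The "maximum likelihood set" is approximated here by a binomial standard-deviation band around the empirical estimate; the patented criterion differs in detail, and the additional condition for backoff estimates lying between the higher-order estimate and the set is omitted. All names and the tolerance `z` are assumptions.

```python
import math

def keep_higher_order(count, context_count, p_backoff, z=1.0):
    """True if the higher-order N-gram estimate should enter the model."""
    mle = count / context_count
    # Approximate the set of probabilities consistent with observing
    # `count` out of `context_count` by a binomial standard-deviation band.
    sd = math.sqrt(max(mle * (1.0 - mle), 1e-12) / context_count)
    lo, hi = mle - z * sd, mle + z * sd
    # If the backoff estimate already explains the counts (falls inside
    # the band), adding the higher-order estimate is not justified.
    return not (lo <= p_backoff <= hi)
```

The effect is that higher-order parameters are stored only where the training counts actually contradict the backoff estimate, which is what shrinks the final model.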
-
Patent number: 8620838
Abstract: Techniques of line-searching to find a minimum value of a loss function l projected onto a line are disclosed, wherein the loss function l is continuous and piece-wise linear. One technique includes identifying points for at least one output derived from the loss function l at which the slope of the output changes, determining at least one derivative value for an initial point and at least one change in derivative value for at least some of the points using a processor, determining a cumulative reduction in loss for at least some of the points using at least one of the determined derivative values or at least one of the change in derivative values, and selecting a point corresponding to the minimum value of the loss function l based on the cumulative reduction in loss.
Type: Grant
Filed: September 16, 2011
Date of Patent: December 31, 2013
Assignee: Google Inc.
Inventor: Robert Carter Moore
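For a continuous piecewise-linear function, the technique the abstract describes amounts to a single sweep over the sorted slope-change points: start from the derivative at the initial point, accumulate the change in loss segment by segment, and keep the point with the lowest cumulative value. A sketch, assuming the breakpoints and their slope deltas are already computed:

```python
def line_search_min(x0, f0, slope0, breakpoints):
    """Minimize a piecewise-linear function along a line.

    breakpoints: list of (x, slope_delta) pairs sorted by x, where
    slope_delta is the change in derivative at that point.
    """
    best_x, best_f = x0, f0
    x, f, slope = x0, f0, slope0
    for bx, dslope in breakpoints:
        f += slope * (bx - x)           # cumulative change in loss up to bx
        x, slope = bx, slope + dslope   # the derivative changes here
        if f < best_f:                  # track the minimum seen so far
            best_f, best_x = f, x
    return best_x, best_f
```

Because the function is linear between breakpoints, only the breakpoints themselves need to be evaluated, so the whole search is linear in the number of slope changes once they are sorted.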
-
Publication number: 20130018650
Abstract: An intelligent selection system selects language model training data to obtain in-domain training datasets. The selection is accomplished by estimating a cross-entropy difference for each candidate text segment from a generic language dataset. The cross-entropy difference is a difference between the cross-entropy of the text segment according to the in-domain language model and the cross-entropy of the text segment according to a language model trained on a random sample of the data source from which the text segment is drawn. If the difference satisfies a threshold condition, the text segment is added as an in-domain text segment to a training dataset.
Type: Application
Filed: February 1, 2012
Publication date: January 17, 2013
Applicant: MICROSOFT CORPORATION
Inventors: Robert Carter Moore, William Duncan Lewis
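The scoring rule can be sketched compactly: compute the per-word cross-entropy of each candidate segment under the in-domain model and under the general model, and keep segments whose difference falls below a threshold (lower means more in-domain). The two log-probability functions here are stand-ins for trained language models, and the threshold value is an assumption.

```python
def select_in_domain(segments, in_domain_logprob, general_logprob, threshold=0.0):
    """Keep segments scoring below `threshold` on cross-entropy difference."""
    selected = []
    for seg in segments:
        n = max(len(seg.split()), 1)
        h_in = -in_domain_logprob(seg) / n    # cross-entropy, in-domain LM
        h_gen = -general_logprob(seg) / n     # cross-entropy, general-sample LM
        if h_in - h_gen <= threshold:         # difference satisfies the condition
            selected.append(seg)
    return selected
```

Subtracting the general model's cross-entropy corrects for segments that are merely short or common: a segment is kept only if the in-domain model likes it more than a model of the source it came from does.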
-
Publication number: 20110224971
Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
Type: Application
Filed: March 11, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventor: Robert Carter Moore
-
Publication number: 20110224983
Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon the number of times (zero or more; the actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.
Type: Application
Filed: March 11, 2010
Publication date: September 15, 2011
Applicant: Microsoft Corporation
Inventor: Robert Carter Moore
-
Publication number: 20100262575
Abstract: The claimed subject matter provides systems and/or methods for training feature weights in a statistical machine translation model. The system can include components that obtain lists of translation hypotheses and associated feature values, set a current point in the multidimensional feature weight space to an initial value, choose a line in the feature weight space that passes through the current point, and reset the current point to optimize the feature weights with respect to the line. The system can further include components that set the current point to be a best point attained, reduce the list of translation hypotheses based on a determination that a particular hypothesis has never been touched in optimizing the feature weights from at least one of an initial starting point or a randomly selected restarting point, and output the point ascertained to be the best point in the feature weight space.
Type: Application
Filed: April 14, 2009
Publication date: October 14, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Robert Carter Moore, Christopher Brian Quirk