Patents by Inventor Nicola Cancedda

Nicola Cancedda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Intelligent query system for attachments

Patent number: 11556548

Abstract: Systems and methods are provided that automatically process a message input, construct an intelligent query based on the processing of the message input, and provide at least one attachable entity according to the processing results and the intelligent query. In some example aspects, a message is received. A natural language processor to determine if the message is requesting content may then process the message. If the message is determined to be requesting content, then candidate sub-queries may be generated to serve as a training set for a query that will be sent to an external search engine to retrieve the attachable entity. The sub-queries may be ranked in order of relevance and performance score. The highest ranked sub-queries may then be used in the actual query that is fired against the external search engine. The external search engine may search local and remote repositories for the top K most relevant attachable entities and present them to a user for attachment in a reply message.

Type: Grant

Filed: August 8, 2017

Date of Patent: January 17, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Amy Huyen Phuoc Nguyen, Bhaskar Mitra, Christophe Jacky Henri Van Gysel, Grzegorz Stanislaw Kukla, Lynn Carter Ayres, Mark Rolland Knight, Matteo Venanzi, Nicola Cancedda, Rachel Elizabeth Sirkin, Robin Michael Thomas, Roy Rosemarin, Shobana Balakrishnan, Sri Ramya Mallipudi, Tariq Sharif, Yamin Wang
INTELLIGENT QUERY SYSTEM FOR ATTACHMENTS

Publication number: 20190050406

Abstract: Systems and methods are provided that automatically process a message input, construct an intelligent query based on the processing of the message input, and provide at least one attachable entity according to the processing results and the intelligent query. In some example aspects, a message is received. A natural language processor to determine if the message is requesting content may then process the message. If the message is determined to be requesting content, then candidate sub-queries may be generated to serve as a training set for a query that will be sent to an external search engine to retrieve the attachable entity. The sub-queries may be ranked in order of relevance and performance score. The highest ranked sub-queries may then be used in the actual query that is fired against the external search engine. The external search engine may search local and remote repositories for the top K most relevant attachable entities and present them to a user for attachment in a reply message.

Type: Application

Filed: August 8, 2017

Publication date: February 14, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Amy Huyen Phuoc NGUYEN, Bhaskar MITRA, Christophe Jacky Henri VAN GYSEL, Grzegorz Stanislaw KUKLA, Lynn Carter AYRES, Mark Rolland KNIGHT, Matteo VENANZI, Nicola CANCEDDA, Rachel Elizabeth SIRKIN, Robin Michael THOMAS, Roy ROSEMARIN, Shobana BALAKRISHNAN, Sri Ramya MALLIPUDI, Tariq SHARIF, Yamin WANG
Estimation of parameters for machine translation without in-domain parallel data

Patent number: 9652453

Abstract: A system and method for estimating parameters for features of a translation scoring function for scoring candidate translations in a target domain are provided. Given a source language corpus for a target domain, a similarity measure is computed between the source corpus and a target domain multi-model, which may be a phrase table derived from phrase tables of comparative domains, weighted as a function of similarity with the source corpus. The parameters of the log-linear function for these comparative domains are known. A mapping function is learned between similarity measure and parameters of the scoring function for the comparative domains. Given the mapping function and the target corpus similarity measure, the parameters of the translation scoring function for the target domain are estimated. For parameters where a mapping function with a threshold correlation is not found, another method for obtaining the target domain parameter can be used.

Type: Grant

Filed: April 14, 2014

Date of Patent: May 16, 2017

Assignee: XEROX CORPORATION

Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
Retrieval of domain relevant phrase tables

Patent number: 9582499

Abstract: A method for generating a phrase table for a target domain includes receiving a source corpus for a target domain and, for each of a set of comparative domain phrase tables, computing a measure of similarity between the source corpus and the comparative domain phrase table. Based on the computed similarity measures, a subset of the comparative domain phrase tables may be identified from the set of comparative domain phrase tables, and/or weights for combining them, and a phrase table is generated for the target domain based on the at least a subset of phrase tables.

Type: Grant

Filed: April 14, 2014

Date of Patent: February 28, 2017

Assignee: XEROX CORPORATION

Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
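
The retrieval step in the abstract of patent 9582499 can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the n-gram coverage ratio below is an assumption. The function names, the `top_k` cut-off and the linear mixture of translation probabilities are also hypothetical choices made for illustration, not the patented method.

```python
def source_ngrams(sentences, max_n=2):
    """Collect all word n-grams (up to max_n) from the source corpus."""
    grams = set()
    for sentence in sentences:
        toks = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(toks) - n + 1):
                grams.add(" ".join(toks[i:i + n]))
    return grams


def similarity(grams, table):
    """Assumed measure: share of source n-grams that the table can translate."""
    if not grams:
        return 0.0
    return sum(1 for g in grams if g in table) / len(grams)


def build_target_table(sentences, tables, top_k=2):
    """Pick the top_k most similar phrase tables and mix them by similarity."""
    grams = source_ngrams(sentences)
    sims = {name: similarity(grams, t) for name, t in tables.items()}
    chosen = sorted(sims, key=sims.get, reverse=True)[:top_k]
    total = sum(sims[name] for name in chosen)
    if total == 0:
        return {}, {}
    weights = {name: sims[name] / total for name in chosen}
    merged = {}
    for name in chosen:
        for src, targets in tables[name].items():
            entry = merged.setdefault(src, {})
            for tgt, prob in targets.items():
                # Weighted linear interpolation of translation probabilities.
                entry[tgt] = entry.get(tgt, 0.0) + weights[name] * prob
    return merged, weights


# Toy comparative-domain phrase tables: source phrase -> {target phrase: probability}.
tables = {
    "legal": {"contract": {"contrat": 1.0}},
    "news": {"election": {"scrutin": 1.0}},
    "it": {"contract": {"contrat": 0.5, "marche": 0.5}, "server": {"serveur": 1.0}},
}
merged, weights = build_target_table(["contract server"], tables)
```

In this toy run the "news" table shares no n-grams with the source corpus, so it is left out of the mixture.
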
Dynamic bi-phrases for statistical machine translation

Patent number: 9552355

Abstract: A system and a method for phrase-based translation are disclosed. The method includes receiving source language text to be translated into target language text. One or more dynamic bi-phrases are generated, based on the source text and the application of one or more rules, which may be based on user descriptions. A dynamic feature value is associated with each of the dynamic bi-phrases. For a sentence of the source text, static bi-phrases are retrieved from a bi-phrase table, each of the static bi-phrases being associated with one or more values of static features. Any of the dynamic bi-phrases which each cover at least one word of the source text are also retrieved, which together form a set of active bi-phrases. Translation hypotheses are generated using active bi-phrases from the set and scored with a translation scoring model which takes into account the static and dynamic feature values of the bi-phrases used in the respective hypothesis. A translation, based on the hypothesis scores, is then output.

Type: Grant

Filed: May 20, 2010

Date of Patent: January 24, 2017

Assignee: XEROX CORPORATION

Inventors: Marc Dymetman, Wilker Ferreira Aziz, Nicola Cancedda, Jean-Marc Coursimault, Vassilina Nikoulina, Lucia Specia
Multi-domain machine translation model adaptation

Patent number: 9235567

Abstract: A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.

Type: Grant

Filed: January 14, 2013

Date of Patent: January 12, 2016

Assignee: XEROX CORPORATION

Inventors: Markos Mylonakis, Nicola Cancedda
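
The lexical coverage feature described in the abstract of patent 9235567 can be sketched as follows. The sketch assumes that each biphrase used in a candidate translation is tagged with the single parallel corpus that contributed it and covers a half-open span of source words. The function names and the example weights are hypothetical, and the weights here are fixed by hand rather than learned.

```python
def lexical_coverage(source_len, biphrases, corpora):
    """Per-corpus share of source words covered by biphrases from that corpus.

    biphrases: list of (start, end, corpus) source spans, end exclusive.
    """
    covered = {corpus: 0 for corpus in corpora}
    for start, end, corpus in biphrases:
        covered[corpus] += end - start
    return {corpus: covered[corpus] / source_len for corpus in corpora}


def coverage_score(features, weights):
    """Weighted sum of the coverage features, one weight per corpus."""
    return sum(weights[corpus] * value for corpus, value in features.items())


# A 5-word source string translated with one "news" and one "legal" biphrase.
features = lexical_coverage(5, [(0, 2, "news"), (2, 5, "legal")], ["news", "legal"])
score = coverage_score(features, {"news": 1.0, "legal": -0.5})
```

In a full system these terms would be aggregated with the other features of the translation model; only the coverage part is shown.
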
Methods and systems for securely accessing translation resource manager

Patent number: 9213696

Abstract: The disclosed embodiments relate to systems and methods for securely accessing a phrase table. One or more records in the phrase table are encrypted using a first set of keys. The first set of keys is encrypted using a second key. A decoder module is compiled based on the second key. Thereafter, the one or more encrypted records and/or the decoder module are transmitted to the first computing device at the client side. The first set of encrypted keys is transmitted to a second computing device. The first computing device transmits a request to the second computing device to send an encrypted key. The decoder module decrypts the encrypted key to generate a key. The first computing device uses the key to decrypt one or more encrypted records.

Type: Grant

Filed: August 21, 2012

Date of Patent: December 15, 2015

Assignee: Xerox Corporation

Inventor: Nicola Cancedda
Methods and systems for predicting learning curve for statistical machine translation system

Patent number: 9164961

Abstract: The disclosed embodiments relate to a system and method for predicting the learning curve of an SMT system. A set of anchor points are selected. The set of anchor points correspond to a size of a corpus. Thereafter, a gold curve or a benchmark curve is fitted based on the set of anchor points to determine the BLEU score. Based on the BLEU score and a set of parameters associated with the first set of anchor points, a confidence score is computed.

Type: Grant

Filed: November 30, 2012

Date of Patent: October 20, 2015

Assignee: Xerox Corporation

Inventors: Prasanth Kolachina, Nicola Cancedda, Marc Dymetman, Sriram Venkatapathy
ESTIMATION OF PARAMETERS FOR MACHINE TRANSLATION WITHOUT IN-DOMAIN PARALLEL DATA

Publication number: 20150293908

Abstract: A system and method for estimating parameters for features of a translation scoring function for scoring candidate translations in a target domain are provided. Given a source language corpus for a target domain, a similarity measure is computed between the source corpus and a target domain multi-model, which may be a phrase table derived from phrase tables of comparative domains, weighted as a function of similarity with the source corpus. The parameters of the log-linear function for these comparative domains are known. A mapping function is learned between similarity measure and parameters of the scoring function for the comparative domains. Given the mapping function and the target corpus similarity measure, the parameters of the translation scoring function for the target domain are estimated. For parameters where a mapping function with a threshold correlation is not found, another method for obtaining the target domain parameter can be used.

Type: Application

Filed: April 14, 2014

Publication date: October 15, 2015

Applicant: Xerox Corporation

Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
RETRIEVAL OF DOMAIN RELEVANT PHRASE TABLES

Publication number: 20150293910

Abstract: A method for generating a phrase table for a target domain includes receiving a source corpus for a target domain and, for each of a set of comparative domain phrase tables, computing a measure of similarity between the source corpus and the comparative domain phrase table. Based on the computed similarity measures, a subset of the comparative domain phrase tables may be identified from the set of comparative domain phrase tables, and/or weights for combining them, and a phrase table is generated for the target domain based on the at least a subset of phrase tables.

Type: Application

Filed: April 14, 2014

Publication date: October 15, 2015

Applicant: Xerox Corporation

Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
Method for aligning sentences at the word level enforcing selective contiguity constraints

Patent number: 9020804

Abstract: An alignment method includes, for a source sentence in a source language, identifying whether the sentence includes at least one candidate term comprising a contiguous subsequence of words of the source sentence. A target sentence in a target language is aligned with the source sentence. This includes developing a probabilistic model which models conditional probability distributions for alignments between words of the source sentence and words of the target sentence and generating an optimal alignment based on the probabilistic model, including, where the source sentence includes the at least one candidate term, enforcing a contiguity constraint which requires that all the words of the target sentence which are aligned with an identified candidate term form a contiguous subsequence of the target sentence.

Type: Grant

Filed: June 1, 2007

Date of Patent: April 28, 2015

Assignee: Xerox Corporation

Inventors: Madalina Barbaiani, Nicola Cancedda, Christopher R. Dance, Szilárd Zsolt Fazekas, Tamás Gaál, Eric Gaussier
Method for processing optical character recognizer output

Patent number: 8983211

Abstract: A method, a system, and a computer program product for processing the output of an OCR are disclosed. The system receives a first character sequence from the OCR. A first set of characters from the first character sequence are converted to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores.

Type: Grant

Filed: May 14, 2012

Date of Patent: March 17, 2015

Assignee: Xerox Corporation

Inventors: Sriram Venkatapathy, Nicola Cancedda
Method and system for confidence-weighted learning of factored discriminative language models

Patent number: 8798984

Abstract: A system and method for building a language model for a translation system are provided. The method includes providing a first relative ranking of first and second translations in a target language of a same source string in a source language, determining a second relative ranking of the first and second translations using weights of a language model, the language model including a weight for each of a set of n-gram features, and comparing the first and second relative rankings to determine whether they are in agreement. The method further includes, when the rankings are not in agreement, updating one or more of the weights in the language model as a function of a measure of confidence in the weight, the confidence being a function of previous observations of the n-gram feature in the method.

Type: Grant

Filed: April 27, 2011

Date of Patent: August 5, 2014

Assignee: Xerox Corporation

Inventors: Nicola Cancedda, Viet Ha-Thuc
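
The ranking-driven update in the abstract of patent 8798984 can be illustrated with a simplified sketch. The abstract only states that confidence depends on previous observations of each n-gram feature. The `1 / (1 + count)` step size below is therefore an assumed stand-in for the confidence measure, and the class and method names are hypothetical.

```python
from collections import Counter


def bigrams(tokens):
    """Count the bigram features of a token sequence."""
    return Counter(zip(tokens, tokens[1:]))


class RankingLanguageModel:
    def __init__(self):
        self.weights = {}      # one weight per n-gram feature
        self.seen = Counter()  # how often each feature has been observed

    def score(self, tokens):
        return sum(self.weights.get(g, 0.0) * c for g, c in bigrams(tokens).items())

    def update(self, better, worse):
        """Compare the model ranking with the given ranking; update on disagreement."""
        fb, fw = bigrams(better), bigrams(worse)
        agree = self.score(better) > self.score(worse)
        features = set(fb) | set(fw)
        if not agree:
            for g in features:
                diff = fb[g] - fw[g]
                if diff:
                    # Assumed confidence proxy: frequently seen features move less.
                    step = 1.0 / (1 + self.seen[g])
                    self.weights[g] = self.weights.get(g, 0.0) + step * diff
        for g in features:
            self.seen[g] += 1
        return agree


lm = RankingLanguageModel()
better = "the cat sat".split()
worse = "cat the sat".split()
first = lm.update(better, worse)   # model is untrained, so the rankings disagree
second = lm.update(better, worse)  # after one update the rankings agree
```
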
MULTI-DOMAIN MACHINE TRANSLATION MODEL ADAPTATION

Publication number: 20140200878

Abstract: A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.

Type: Application

Filed: January 14, 2013

Publication date: July 17, 2014

Applicant: XEROX CORPORATION

Inventors: Markos Mylonakis, Nicola Cancedda
System and method for productive generation of compound words in statistical machine translation

Patent number: 8781810

Abstract: A method and a system for making merging decisions for a translation are disclosed which are suited to use where the target language is a productive compounding one. The method includes outputting decisions on merging of pairs of words in a translated text string with a merging system. The merging system can include a set of stored heuristics and/or a merging model. In the case of heuristics, these can include a heuristic by which two consecutive words in the string are considered for merging if the first word of the two consecutive words is recognized as a compound modifier and their observed frequency f1 as a closed compound word is larger than an observed frequency f2 of the two consecutive words as a bigram. In the case of a merging model, it can be one that is trained on features associated with pairs of consecutive tokens of text strings in a training set and predetermined merging decisions for the pairs.

Type: Grant

Filed: July 25, 2011

Date of Patent: July 15, 2014

Assignee: Xerox Corporation

Inventors: Nicola Cancedda, Sara Stymne
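
The frequency heuristic in the abstract of patent 8781810 can be sketched as follows: two consecutive words are merged when the first is a known compound modifier and the closed compound is observed more often (f1) than the two words as a bigram (f2). The greedy left-to-right pass, the plain concatenation without linking morphemes, and all names and counts are assumptions made for illustration.

```python
def merge_compounds(tokens, modifiers, compound_freq, bigram_freq):
    """Greedy left-to-right pass applying the f1 > f2 merging heuristic."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            first, second = tokens[i], tokens[i + 1]
            f1 = compound_freq.get(first + second, 0)  # frequency as a closed compound
            f2 = bigram_freq.get((first, second), 0)   # frequency as two separate words
            if first in modifiers and f1 > f2:
                merged.append(first + second)
                i += 2
                continue
        merged.append(tokens[i])
        i += 1
    return merged


# Toy counts; a real system would collect them from a target-language corpus.
result = merge_compounds(
    ["regen", "schirm", "ist", "rot"],
    modifiers={"regen"},
    compound_freq={"regenschirm": 50},
    bigram_freq={("regen", "schirm"): 2},
)
```

The abstract also mentions a trained merging model as an alternative to stored heuristics; that variant is not shown here.
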
Machine translation using overlapping biphrase alignments and sampling

Patent number: 8775155

Abstract: A system and method for machine translation are disclosed. Source sentences are received. For each source sentence, a target sentence comprising target words is generated. A plurality of translation neighbors of the target sentence is generated. Phrase alignments are computed between the source sentence and the translation neighbor. Translation neighbors are scored with a translation scoring model, based on the phrase alignment. Translation neighbors are ranked, based on the scores. In training the model, parameters of the model are updated based on an external ranking of the ranked translation neighbors. The generating of translation neighbors, scoring, ranking, and, in the case of training, updating the parameters, are iterated with one of the translation neighbors as the target sentence. In the case of decoding, one of the translation neighbors is output as a translation. The system and method may be at least partially implemented with a computer processor.

Type: Grant

Filed: October 25, 2010

Date of Patent: July 8, 2014

Assignee: Xerox Corporation

Inventors: Benjamin Roth, Andrew R. McCallum, Marc Dymetman, Nicola Cancedda
METHODS AND SYSTEMS FOR PREDICTING LEARNING CURVE FOR STATISTICAL MACHINE TRANSLATION SYSTEM

Publication number: 20140156565

Abstract: The disclosed embodiments relate to a system and method for predicting the learning curve of an SMT system. A set of anchor points are selected. The set of anchor points correspond to a size of a corpus. Thereafter, a gold curve or a benchmark curve is fitted based on the set of anchor points to determine the BLEU score. Based on the BLEU score and a set of parameters associated with the first set of anchor points, a confidence score is computed.

Type: Application

Filed: November 30, 2012

Publication date: June 5, 2014

Applicant: XEROX CORPORATION

Inventors: Prasanth Kolachina, Nicola Cancedda, Marc Dymetman, Sriram Venkatapathy
METHODS AND SYSTEMS FOR SECURELY ACCESSING TRANSLATION RESOURCE MANAGER

Publication number: 20140056428

Abstract: The disclosed embodiments relate to systems and methods for securely accessing a phrase table. One or more records in the phrase table are encrypted using a first set of keys. The first set of keys is encrypted using a second key. A decoder module is compiled based on the second key. Thereafter, the one or more encrypted records and/or the decoder module are transmitted to the first computing device at the client side. The first set of encrypted keys is transmitted to a second computing device. The first computing device transmits a request to the second computing device to send an encrypted key. The decoder module decrypts the encrypted key to generate a key. The first computing device uses the key to decrypt one or more encrypted records.

Type: Application

Filed: August 21, 2012

Publication date: February 27, 2014

Applicant: XEROX CORPORATION

Inventor: Nicola Cancedda
CROWDSOURCING TRANSLATION SERVICES

Publication number: 20140058718

Abstract: A method, system, and computer program product for translating a text file are disclosed. A text file in a source language is received and text snippets from the text file are extracted. The text snippets are distributed to a first set of remote workers for translation. The translated text snippets are validated by a second set of remote workers and the validated text snippets are used to generate a translated text file.

Type: Application

Filed: August 23, 2012

Publication date: February 27, 2014

Applicants: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY, XEROX CORPORATION

Inventors: Anoop Kunchukuttan, Shourya Roy, Mitesh Khapra, Nicola Cancedda, Pushpak Bhattacharyya
ONLINE MARKETPLACE FOR TRANSLATION SERVICES

Publication number: 20140058879

Abstract: A method, system, and computer program product for implementing an online marketplace for translation services is disclosed. A plurality of requirements is received from a client and are sent to one or more service providers. Further, service quotations from the one or more service providers are received. Based on the plurality of requirements and the service quotations, an estimate of quality of service is generated. Lastly, the service quotations and the estimate of quality of service are sent to the client.

Type: Application

Filed: August 23, 2012

Publication date: February 27, 2014

Applicant: XEROX CORPORATION

Inventor: Nicola Cancedda
