Patents by Inventor Nicola Cancedda

Nicola Cancedda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11556548
    Abstract: Systems and methods are provided that automatically process a message input, construct an intelligent query based on the processing of the message input, and provide at least one attachable entity according to the processing results and the intelligent query. In some example aspects, a message is received. A natural language processor may then process the message to determine whether it is requesting content. If the message is determined to be requesting content, then candidate sub-queries may be generated to serve as a training set for a query that will be sent to an external search engine to retrieve the attachable entity. The sub-queries may be ranked in order of relevance and performance score. The highest ranked sub-queries may then be used in the actual query that is fired against the external search engine. The external search engine may search local and remote repositories for the top K most relevant attachable entities and present them to a user for attachment in a reply message.
    Type: Grant
    Filed: August 8, 2017
    Date of Patent: January 17, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amy Huyen Phuoc Nguyen, Bhaskar Mitra, Christophe Jacky Henri Van Gysel, Grzegorz Stanislaw Kukla, Lynn Carter Ayres, Mark Rolland Knight, Matteo Venanzi, Nicola Cancedda, Rachel Elizabeth Sirkin, Robin Michael Thomas, Roy Rosemarin, Shobana Balakrishnan, Sri Ramya Mallipudi, Tariq Sharif, Yamin Wang
  • Publication number: 20190050406
    Abstract: Systems and methods are provided that automatically process a message input, construct an intelligent query based on the processing of the message input, and provide at least one attachable entity according to the processing results and the intelligent query. In some example aspects, a message is received. A natural language processor may then process the message to determine whether it is requesting content. If the message is determined to be requesting content, then candidate sub-queries may be generated to serve as a training set for a query that will be sent to an external search engine to retrieve the attachable entity. The sub-queries may be ranked in order of relevance and performance score. The highest ranked sub-queries may then be used in the actual query that is fired against the external search engine. The external search engine may search local and remote repositories for the top K most relevant attachable entities and present them to a user for attachment in a reply message.
    Type: Application
    Filed: August 8, 2017
    Publication date: February 14, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Amy Huyen Phuoc NGUYEN, Bhaskar MITRA, Christophe Jacky Henri VAN GYSEL, Grzegorz Stanislaw KUKLA, Lynn Carter AYRES, Mark Rolland KNIGHT, Matteo VENANZI, Nicola CANCEDDA, Rachel Elizabeth SIRKIN, Robin Michael THOMAS, Roy ROSEMARIN, Shobana BALAKRISHNAN, Sri Ramya MALLIPUDI, Tariq SHARIF, Yamin WANG
  • Patent number: 9652453
    Abstract: A system and method for estimating parameters for features of a translation scoring function for scoring candidate translations in a target domain are provided. Given a source language corpus for a target domain, a similarity measure is computed between the source corpus and a target domain multi-model, which may be a phrase table derived from phrase tables of comparative domains, weighted as a function of similarity with the source corpus. The parameters of the log-linear function for these comparative domains are known. A mapping function is learned between similarity measure and parameters of the scoring function for the comparative domains. Given the mapping function and the target corpus similarity measure, the parameters of the translation scoring function for the target domain are estimated. For parameters where a mapping function with a threshold correlation is not found, another method for obtaining the target domain parameter can be used.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: May 16, 2017
    Assignee: XEROX CORPORATION
    Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
  • Patent number: 9582499
    Abstract: A method for generating a phrase table for a target domain includes receiving a source corpus for a target domain and, for each of a set of comparative domain phrase tables, computing a measure of similarity between the source corpus and the comparative domain phrase table. Based on the computed similarity measures, a subset of the comparative domain phrase tables may be identified from the set of comparative domain phrase tables, and/or weights for combining them, and a phrase table is generated for the target domain based on at least a subset of the phrase tables.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: February 28, 2017
    Assignee: XEROX CORPORATION
    Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
  • Patent number: 9552355
    Abstract: A system and a method for phrase-based translation are disclosed. The method includes receiving source language text to be translated into target language text. One or more dynamic bi-phrases are generated, based on the source text and the application of one or more rules, which may be based on user descriptions. A dynamic feature value is associated with each of the dynamic bi-phrases. For a sentence of the source text, static bi-phrases are retrieved from a bi-phrase table, each of the static bi-phrases being associated with one or more values of static features. Any of the dynamic bi-phrases which each cover at least one word of the source text are also retrieved, which together form a set of active bi-phrases. Translation hypotheses are generated using active bi-phrases from the set and scored with a translation scoring model which takes into account the static and dynamic feature values of the bi-phrases used in the respective hypothesis. A translation, based on the hypothesis scores, is then output.
    Type: Grant
    Filed: May 20, 2010
    Date of Patent: January 24, 2017
    Assignee: XEROX CORPORATION
    Inventors: Marc Dymetman, Wilker Ferreira Aziz, Nicola Cancedda, Jean-Marc Coursimault, Vassilina Nikoulina, Lucia Specia
  • Patent number: 9235567
    Abstract: A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.
    Type: Grant
    Filed: January 14, 2013
    Date of Patent: January 12, 2016
    Assignee: XEROX CORPORATION
    Inventors: Markos Mylonakis, Nicola Cancedda
  • Patent number: 9213696
    Abstract: The disclosed embodiments relate to systems and methods for securely accessing a phrase table. One or more records in the phrase table are encrypted using a first set of keys. The first set of keys is encrypted using a second key. A decoder module is compiled based on the second key. Thereafter, the one or more encrypted records and/or the decoder module are transmitted to the first computing device at the client side. The first set of encrypted keys is transmitted to a second computing device. The first computing device transmits a request to the second computing device to send an encrypted key. The decoder module decrypts the encrypted key to generate a key. The first computing device uses the key to decrypt one or more encrypted records.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: December 15, 2015
    Assignee: Xerox Corporation
    Inventor: Nicola Cancedda
  • Patent number: 9164961
    Abstract: The disclosed embodiments relate to a system and method for predicting the learning curve of an SMT system. A set of anchor points is selected. The set of anchor points corresponds to a size of a corpus. Thereafter, a gold curve or a benchmark curve is fitted based on the set of anchor points to determine the BLEU score. Based on the BLEU score and a set of parameters associated with the first set of anchor points, a confidence score is computed.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: October 20, 2015
    Assignee: Xerox Corporation
    Inventors: Prasanth Kolachina, Nicola Cancedda, Marc Dymetman, Sriram Venkatapathy
  • Publication number: 20150293908
    Abstract: A system and method for estimating parameters for features of a translation scoring function for scoring candidate translations in a target domain are provided. Given a source language corpus for a target domain, a similarity measure is computed between the source corpus and a target domain multi-model, which may be a phrase table derived from phrase tables of comparative domains, weighted as a function of similarity with the source corpus. The parameters of the log-linear function for these comparative domains are known. A mapping function is learned between similarity measure and parameters of the scoring function for the comparative domains. Given the mapping function and the target corpus similarity measure, the parameters of the translation scoring function for the target domain are estimated. For parameters where a mapping function with a threshold correlation is not found, another method for obtaining the target domain parameter can be used.
    Type: Application
    Filed: April 14, 2014
    Publication date: October 15, 2015
    Applicant: Xerox Corporation
    Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
  • Publication number: 20150293910
    Abstract: A method for generating a phrase table for a target domain includes receiving a source corpus for a target domain and, for each of a set of comparative domain phrase tables, computing a measure of similarity between the source corpus and the comparative domain phrase table. Based on the computed similarity measures, a subset of the comparative domain phrase tables may be identified from the set of comparative domain phrase tables, and/or weights for combining them, and a phrase table is generated for the target domain based on at least a subset of the phrase tables.
    Type: Application
    Filed: April 14, 2014
    Publication date: October 15, 2015
    Applicant: Xerox Corporation
    Inventors: Prashant Mathur, Sriram Venkatapathy, Nicola Cancedda
  • Patent number: 9020804
    Abstract: An alignment method includes, for a source sentence in a source language, identifying whether the sentence includes at least one candidate term comprising a contiguous subsequence of words of the source sentence. A target sentence in a target language is aligned with the source sentence. This includes developing a probabilistic model which models conditional probability distributions for alignments between words of the source sentence and words of the target sentence and generating an optimal alignment based on the probabilistic model, including, where the source sentence includes the at least one candidate term, enforcing a contiguity constraint which requires that all the words of the target sentence which are aligned with an identified candidate term form a contiguous subsequence of the target sentence.
    Type: Grant
    Filed: June 1, 2007
    Date of Patent: April 28, 2015
    Assignee: Xerox Corporation
    Inventors: Madalina Barbaiani, Nicola Cancedda, Christopher R. Dance, Szilárd Zsolt Fazekas, Tamás Gaál, Eric Gaussier
  • Patent number: 8983211
    Abstract: A method, a system, and a computer program product for processing the output of an OCR are disclosed. The system receives a first character sequence from the OCR. A first set of characters from the first character sequence are converted to a corresponding second set of characters to generate a second character sequence based on a look-up table and language scores.
    Type: Grant
    Filed: May 14, 2012
    Date of Patent: March 17, 2015
    Assignee: Xerox Corporation
    Inventors: Sriram Venkatapathy, Nicola Cancedda
  • Patent number: 8798984
    Abstract: A system and method for building a language model for a translation system are provided. The method includes providing a first relative ranking of first and second translations in a target language of a same source string in a source language, determining a second relative ranking of the first and second translations using weights of a language model, the language model including a weight for each of a set of n-gram features, and comparing the first and second relative rankings to determine whether they are in agreement. The method further includes, when the rankings are not in agreement, updating one or more of the weights in the language model as a function of a measure of confidence in the weight, the confidence being a function of previous observations of the n-gram feature in the method.
    Type: Grant
    Filed: April 27, 2011
    Date of Patent: August 5, 2014
    Assignee: Xerox Corporation
    Inventors: Nicola Cancedda, Viet Ha-Thuc
  • Publication number: 20140200878
    Abstract: A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.
    Type: Application
    Filed: January 14, 2013
    Publication date: July 17, 2014
    Applicant: XEROX CORPORATION
    Inventors: Markos Mylonakis, Nicola Cancedda
  • Patent number: 8781810
    Abstract: A method and a system for making merging decisions for a translation are disclosed which are suited to use where the target language is a productive compounding one. The method includes outputting decisions on merging of pairs of words in a translated text string with a merging system. The merging system can include a set of stored heuristics and/or a merging model. In the case of heuristics, these can include a heuristic by which two consecutive words in the string are considered for merging if the first word of the two consecutive words is recognized as a compound modifier and their observed frequency f1 as a closed compound word is larger than an observed frequency f2 of the two consecutive words as a bigram. In the case of a merging model, it can be one that is trained on features associated with pairs of consecutive tokens of text strings in a training set and predetermined merging decisions for the pairs.
    Type: Grant
    Filed: July 25, 2011
    Date of Patent: July 15, 2014
    Assignee: Xerox Corporation
    Inventors: Nicola Cancedda, Sara Stymne
  • Patent number: 8775155
    Abstract: A system and method for machine translation are disclosed. Source sentences are received. For each source sentence, a target sentence comprising target words is generated. A plurality of translation neighbors of the target sentence is generated. Phrase alignments are computed between the source sentence and the translation neighbor. Translation neighbors are scored with a translation scoring model, based on the phrase alignment. Translation neighbors are ranked, based on the scores. In training the model, parameters of the model are updated based on an external ranking of the ranked translation neighbors. The generating of translation neighbors, scoring, ranking, and, in the case of training, updating the parameters, are iterated with one of the translation neighbors as the target sentence. In the case of decoding, one of the translation neighbors is output as a translation. The system and method may be at least partially implemented with a computer processor.
    Type: Grant
    Filed: October 25, 2010
    Date of Patent: July 8, 2014
    Assignee: Xerox Corporation
    Inventors: Benjamin Roth, Andrew R. McCallum, Marc Dymetman, Nicola Cancedda
  • Publication number: 20140156565
    Abstract: The disclosed embodiments relate to a system and method for predicting the learning curve of an SMT system. A set of anchor points is selected. The set of anchor points corresponds to a size of a corpus. Thereafter, a gold curve or a benchmark curve is fitted based on the set of anchor points to determine the BLEU score. Based on the BLEU score and a set of parameters associated with the first set of anchor points, a confidence score is computed.
    Type: Application
    Filed: November 30, 2012
    Publication date: June 5, 2014
    Applicant: XEROX CORPORATION
    Inventors: Prasanth Kolachina, Nicola Cancedda, Marc Dymetman, Sriram Venkatapathy
  • Publication number: 20140058718
    Abstract: A method, system, and computer program product for translating a text file are disclosed. A text file in a source language is received and text snippets from the text file are extracted. The text snippets are distributed to a first set of remote workers for translation. The translated text snippets are validated by a second set of remote workers and the validated text snippets are used to generate a translated text file.
    Type: Application
    Filed: August 23, 2012
    Publication date: February 27, 2014
    Applicants: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY, XEROX CORPORATION
    Inventors: Anoop Kunchukuttan, Shourya Roy, Mitesh Khapra, Nicola Cancedda, Pushpak Bhattacharyya
  • Publication number: 20140056428
    Abstract: The disclosed embodiments relate to systems and methods for securely accessing a phrase table. One or more records in the phrase table are encrypted using a first set of keys. The first set of keys is encrypted using a second key. A decoder module is compiled based on the second key. Thereafter, the one or more encrypted records and/or the decoder module are transmitted to the first computing device at the client side. The first set of encrypted keys is transmitted to a second computing device. The first computing device transmits a request to the second computing device to send an encrypted key. The decoder module decrypts the encrypted key to generate a key. The first computing device uses the key to decrypt one or more encrypted records.
    Type: Application
    Filed: August 21, 2012
    Publication date: February 27, 2014
    Applicant: XEROX CORPORATION
    Inventor: Nicola Cancedda
  • Publication number: 20140058879
    Abstract: A method, system, and computer program product for implementing an online marketplace for translation services are disclosed. A plurality of requirements is received from a client and sent to one or more service providers. Further, service quotations from the one or more service providers are received. Based on the plurality of requirements and the service quotations, an estimate of quality of service is generated. Lastly, the service quotations and the estimate of quality of service are sent to the client.
    Type: Application
    Filed: August 23, 2012
    Publication date: February 27, 2014
    Applicant: XEROX CORPORATION
    Inventor: Nicola Cancedda
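
Illustrative sketch: the abstract of patent 8781810 above describes a heuristic in which two consecutive words are considered for merging if the first is recognized as a compound modifier and their observed frequency f1 as a closed compound exceeds their frequency f2 as a bigram. The Python below is only a minimal sketch of that rule and is not taken from the patent. The function name `merge_compounds`, the dictionary-based frequency tables, the plain string concatenation used to form the compound, and the greedy left-to-right scan are all assumptions made for illustration, and the Swedish-looking counts are toy data.

```python
def merge_compounds(tokens, modifiers, compound_freq, bigram_freq):
    """Greedily merge adjacent tokens using the f1 > f2 heuristic.

    tokens        -- list of words in the translated string
    modifiers     -- set of words recognized as compound modifiers (assumed given)
    compound_freq -- dict: closed compound word -> observed frequency (f1)
    bigram_freq   -- dict: (word, next_word) -> observed frequency (f2)
    """
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            first, second = tokens[i], tokens[i + 1]
            # Assumption: the closed compound is the plain concatenation.
            compound = first + second
            f1 = compound_freq.get(compound, 0)
            f2 = bigram_freq.get((first, second), 0)
            if first in modifiers and f1 > f2:
                merged.append(compound)
                i += 2  # both words were consumed by the merge
                continue
        merged.append(tokens[i])
        i += 1
    return merged


# Toy data (hypothetical counts, not from any real corpus).
tokens = ["kaffe", "kopp", "är", "varm"]
modifiers = {"kaffe"}
compound_freq = {"kaffekopp": 12}
bigram_freq = {("kaffe", "kopp"): 1}

print(merge_compounds(tokens, modifiers, compound_freq, bigram_freq))
```

Because the scan consumes both words after a merge, this sketch never builds compounds of three or more parts. The abstract also mentions a trained merging model as an alternative to heuristics, which is not shown here.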