Patents by Inventor Shachar Mirkin
Shachar Mirkin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10025779
Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.
Type: Grant
Filed: August 13, 2015
Date of Patent: July 17, 2018
Assignee: XEROX Corporation
Inventors: Shachar Mirkin, Jean-Luc Meunier
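As an illustration only, the profile-updating idea in this abstract can be sketched as a small collaborative-filtering routine. Everything here is an assumption and not taken from the patent: profiles are dictionaries mapping a system name to a numeric rank (lower is better), the similarity measure is a simple inverse of the mean rank difference, and the function names `similarity`, `fill_missing` and `predict_best` are hypothetical.

```python
def similarity(p, q):
    """Similarity of two profiles over the systems both users ranked.

    Assumption: 1 / (1 + mean absolute rank difference); 0.0 if no overlap.
    """
    shared = [s for s in p if s in q]
    if not shared:
        return 0.0
    mean_abs_diff = sum(abs(p[s] - q[s]) for s in shared) / len(shared)
    return 1.0 / (1.0 + mean_abs_diff)


def fill_missing(target, others, systems):
    """Return a copy of `target` with missing rankings estimated from `others`.

    Each missing ranking is a similarity-weighted average of the rankings
    that other users gave to the same system.
    """
    updated = dict(target)
    for system in systems:
        if system in updated:
            continue
        weighted_sum = 0.0
        weight_total = 0.0
        for other in others:
            if system in other:
                w = similarity(target, other)
                weighted_sum += w * other[system]
                weight_total += w
        if weight_total > 0:
            updated[system] = weighted_sum / weight_total
    return updated


def predict_best(updated):
    """Pick the system with the lowest (best) rank in the updated profile."""
    return min(updated, key=updated.get)


# Toy data: the first user has not ranked system "C".
first_user = {"A": 2, "B": 3}
other_users = [{"A": 2, "B": 3, "C": 1}, {"A": 3, "B": 2, "C": 3}]
updated_profile = fill_missing(first_user, other_users, ["A", "B", "C"])
best_system = predict_best(updated_profile)
```

The weighting means that users whose existing rankings agree with the first user contribute more to the generated ranking; other similarity measures would slot in without changing the structure.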
-
Publication number: 20170046333
Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.
Type: Application
Filed: August 13, 2015
Publication date: February 16, 2017
Applicant: Xerox Corporation
Inventors: Shachar Mirkin, Jean-Luc Meunier
-
Patent number: 9473637
Abstract: Agent utterances are generated for implementing dialog acts recommended by a dialog manager of a call center. To this end, a set of word lattices, each represented as a weighted finite state automaton (WFSA), is constructed from training dialogs between call center agents and second parties (e.g. customers). The word lattices are assigned conditional probabilities over dialog act type. For each dialog act received from the dialog manager, the word lattices are ranked by the conditional probabilities for the dialog act type. At least one word lattice is chosen from the ranking, and is instantiated to generate a recommended agent utterance for implementing the recommended dialog act. The word lattices may be constructed by clustering agent utterances of training dialogs using context features from preceding second party utterances and grammatical dependency link features between words within agent utterances. Path variations of the word lattices may define slots or paraphrases.
Type: Grant
Filed: July 28, 2015
Date of Patent: October 18, 2016
Assignee: XEROX CORPORATION
Inventors: Sriram Venkatapathy, Shachar Mirkin, Marc Dymetman
-
Publication number: 20160299881
Abstract: The disclosed embodiments illustrate methods and systems for summarizing an electronic document. The method includes extracting, by a natural language processor, one or more sentences from said electronic document. The method further includes creating a graph, comprising one or more nodes and one or more edges connecting said one or more nodes, each node being representative of a sentence. An edge is placed between a pair of sentences based on a threshold value and a first score. The first score corresponds to a measure of an entailment between said pair of sentences. Thereafter, the method includes identifying a set of nodes from said one or more nodes by applying a minimum vertex cover algorithm on said graph. The sentences associated with said identified set of nodes are utilizable to create a summary of said electronic document. The method is performed by one or more microprocessors.
Type: Application
Filed: April 7, 2015
Publication date: October 13, 2016
Inventors: Anand Gupta, Manpreet Kaur, Shachar Mirkin
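A rough sketch of the graph-based summarization steps in this abstract follows. Several parts are assumptions made for illustration: the entailment measure is replaced by a toy word-overlap score, an edge is added when that score meets or exceeds the threshold (the abstract does not say which direction the comparison goes), and a greedy highest-degree heuristic stands in for an exact minimum vertex cover, which is NP-hard in general. All function names are hypothetical.

```python
from itertools import combinations


def toy_entailment_score(a, b):
    """Stand-in for a real entailment measure: share of b's words found in a."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_b) if words_b else 0.0


def build_graph(sentences, threshold):
    """Return edges (i, j), i < j, for sentence pairs scoring >= threshold."""
    edges = set()
    for i, j in combinations(range(len(sentences)), 2):
        score = max(
            toy_entailment_score(sentences[i], sentences[j]),
            toy_entailment_score(sentences[j], sentences[i]),
        )
        if score >= threshold:
            edges.add((i, j))
    return edges


def greedy_vertex_cover(edges):
    """Greedy heuristic: repeatedly take the node covering most open edges.

    This yields a valid vertex cover, not necessarily a minimum one.
    """
    remaining = set(edges)
    cover = set()
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        best = max(sorted(degree), key=lambda node: degree[node])
        cover.add(best)
        remaining = {edge for edge in remaining if best not in edge}
    return cover


def summarize(sentences, threshold):
    """Return the sentences whose nodes are in the cover, in document order."""
    cover = greedy_vertex_cover(build_graph(sentences, threshold))
    return [sentences[i] for i in sorted(cover)]


document = ["the cat sat on the mat", "the cat sat", "dogs bark loudly"]
summary = summarize(document, threshold=0.9)
```

Because every edge links two sentences related by entailment, a vertex cover keeps at least one sentence from each related pair, which is the intuition for using the cover as summary material.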
-
Patent number: 9442922
Abstract: A method for updating a reordering model of a statistical machine translation system includes, at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data including at least one sentence pair, each pair including a source sentence in a source language and a target sentence in a target language. Phrase pairs are extracted from the new training data and used to generate a new reordering file. A reordering model of the existing statistical machine translation system is updated, based on the new reordering file. The reordering model includes a reordering table. At a second time after the first time, new training data is received. The extracting of phrase pairs, generating of the new reordering file and the updating the reordering model is reiterated, based on the new training data received at the second time.
Type: Grant
Filed: November 18, 2014
Date of Patent: September 13, 2016
Assignee: XEROX CORPORATION
Inventor: Shachar Mirkin
-
Publication number: 20160140111
Abstract: A method for updating a reordering model of a statistical machine translation system includes, at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data including at least one sentence pair, each pair including a source sentence in a source language and a target sentence in a target language. Phrase pairs are extracted from the new training data and used to generate a new reordering file. A reordering model of the existing statistical machine translation system is updated, based on the new reordering file. The reordering model includes a reordering table. At a second time after the first time, new training data is received. The extracting of phrase pairs, generating of the new reordering file and the updating the reordering model is reiterated, based on the new training data received at the second time.
Type: Application
Filed: November 18, 2014
Publication date: May 19, 2016
Inventor: Shachar Mirkin
-
Publication number: 20150199339
Abstract: A method for cross language information retrieval includes receiving an input query which includes at least one word in a source language and translating the input query from the source language to a target language to provide a set of translated queries. A set of documents is retrieved from a document collection based on the translated queries. The retrieved documents are translated back into the source language to generate a set of translated documents. An entailment relationship between each of the translated documents and the input query is assessed. The set of translated documents is refined, based on the assessment of the entailment relationship. A subset (or all) of the refined set of translated documents, and/or the target documents to which the translated documents in the subset correspond, is output.
Type: Application
Filed: May 13, 2014
Publication date: July 16, 2015
Applicant: XEROX CORPORATION
Inventors: Shachar MIRKIN, Nikolaos LAGOS, Ioan CALAPODESCU
-
Patent number: 9047274
Abstract: An authoring method includes generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language. Initial source text entered by the user is received through the authoring interface. Source phrases are selected that each include at least one token of the initial source text as a prefix and at least one other token as a suffix. The source phrase selection is based on a translatability score and optionally on fluency and semantic relatedness scores. A set of candidate phrases is proposed for display on the authoring interface, each of the candidate phrases being the suffix of a respective one of the selected source phrases. The user may select one of the candidate phrases, which is appended to the source text following its corresponding prefix, or may enter alternative text. The process may be repeated until the user is satisfied with the source text and the SMT model can then be used for its translation.
Type: Grant
Filed: January 21, 2013
Date of Patent: June 2, 2015
Assignee: XEROX CORPORATION
Inventors: Sriram Venkatapathy, Shachar Mirkin
-
Publication number: 20150127323
Abstract: A method for computing similarity between paths includes extracting corpus statistics for triples from a corpus of text documents, each triple comprising a predicate and respective first and second arguments of the predicate. Documents in the corpus are clustered to form a set of clusters based on textual similarity and temporal similarity. An event-based path similarity is computed between first and second paths, the first path comprising a first predicate and first and second argument slots, the second path comprising a second predicate and first and second argument slots, the event-based path similarity being computed as a function of a corpus statistics-based similarity score which is a function of the corpus statistics for the extracted triples which are instances of the first and second paths, and a cluster-based similarity score which is a function of occurrences of the first and second predicates in the clusters.
Type: Application
Filed: November 4, 2013
Publication date: May 7, 2015
Applicant: Xerox Corporation
Inventors: Guillaume Jacquet, Shachar Mirkin
-
Publication number: 20140358519
Abstract: A method for rewriting source text includes receiving source text including a source text string in a first natural language. The source text string is translated with a machine translation system to generate a first target text string in a second natural language. A translation confidence for the source text string is computed, based on the first target text string. At least one alternative text string is generated, where possible, in the first natural language by automatically rewriting the source string. Each alternative string is translated to generate a second target text string in the second natural language. A translation confidence is computed for the alternative text string based on the second target string. Based on the computed translation confidences, one of the alternative text strings may be selected as a candidate replacement for the source text string and may be proposed to a user on a graphical user interface.
Type: Application
Filed: June 3, 2013
Publication date: December 4, 2014
Inventors: Shachar Mirkin, Sriram Venkatapathy, Marc Dymetman
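The selection loop described in this abstract can be sketched in a few lines. This is a hypothetical outline, not the patented implementation: the rewriter, the machine translation system and the confidence estimator are passed in as plain callables (`rewrite`, `translate`, `confidence`), and the rule that an alternative is proposed only when its confidence is strictly higher than the original's is an assumption.

```python
from typing import Callable, Iterable, Optional


def propose_rewrite(
    source: str,
    rewrite: Callable[[str], Iterable[str]],
    translate: Callable[[str], str],
    confidence: Callable[[str, str], float],
) -> Optional[str]:
    """Return the alternative with the highest translation confidence.

    Returns None when no alternative beats the confidence computed for
    the original source string (assumed selection rule).
    """
    # Confidence for the original source, based on its translation.
    best_text = None
    best_score = confidence(source, translate(source))
    # Translate and score each automatically generated alternative.
    for alternative in rewrite(source):
        score = confidence(alternative, translate(alternative))
        if score > best_score:
            best_text, best_score = alternative, score
    return best_text


# Toy stand-ins, for demonstration only: they are not real MT components.
def toy_rewrite(text):
    return [text.replace("utilize", "use")]


def toy_translate(text):
    return text.upper()


def toy_confidence(src, tgt):
    return 1.0 / (1 + len(tgt))


candidate = propose_rewrite(
    "please utilize the tool", toy_rewrite, toy_translate, toy_confidence
)
```

Passing the components as callables keeps the sketch independent of any particular translation system or confidence model; in an interactive setting the returned candidate would be shown to the user rather than applied automatically.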
-
Publication number: 20140207439
Abstract: An authoring method includes generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language. Initial source text entered by the user is received through the authoring interface. Source phrases are selected that each include at least one token of the initial source text as a prefix and at least one other token as a suffix. The source phrase selection is based on a translatability score and optionally on fluency and semantic relatedness scores. A set of candidate phrases is proposed for display on the authoring interface, each of the candidate phrases being the suffix of a respective one of the selected source phrases. The user may select one of the candidate phrases, which is appended to the source text following its corresponding prefix, or may enter alternative text. The process may be repeated until the user is satisfied with the source text and the SMT model can then be used for its translation.
Type: Application
Filed: January 21, 2013
Publication date: July 24, 2014
Applicant: XEROX CORPORATION
Inventors: Sriram Venkatapathy, Shachar Mirkin