Patents by Inventor Shachar Mirkin

Shachar Mirkin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10025779
    Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.
    Type: Grant
    Filed: August 13, 2015
    Date of Patent: July 17, 2018
    Assignee: XEROX CORPORATION
    Inventors: Shachar Mirkin, Jean-Luc Meunier
  • Publication number: 20170046333
    Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.
    Type: Application
    Filed: August 13, 2015
    Publication date: February 16, 2017
    Applicant: Xerox Corporation
    Inventors: Shachar Mirkin, Jean-Luc Meunier
  • Patent number: 9473637
    Abstract: Agent utterances are generated for implementing dialog acts recommended by a dialog manager of a call center. To this end, a set of word lattices, each represented as a weighted finite state automaton (WFSA), is constructed from training dialogs between call center agents and second parties (e.g. customers). The word lattices are assigned conditional probabilities over dialog act type. For each dialog act received from the dialog manager, the word lattices are ranked by the conditional probabilities for the dialog act type. At least one word lattice is chosen from the ranking, and is instantiated to generate a recommended agent utterance for implementing the recommended dialog act. The word lattices may be constructed by clustering agent utterances of training dialogs using context features from preceding second party utterances and grammatical dependency link features between words within agent utterances. Path variations of the word lattices may define slots or paraphrases.
    Type: Grant
    Filed: July 28, 2015
    Date of Patent: October 18, 2016
    Assignee: XEROX CORPORATION
    Inventors: Sriram Venkatapathy, Shachar Mirkin, Marc Dymetman
  • Publication number: 20160299881
    Abstract: The disclosed embodiments illustrate methods and systems for summarizing an electronic document. The method includes extracting, by a natural language processor, one or more sentences from said electronic document. The method further includes creating a graph, comprising one or more nodes and one or more edges connecting said one or more nodes, each node being representative of a sentence. An edge is placed between a pair of sentences based on a threshold value and a first score. The first score corresponds to a measure of an entailment between said pair of sentences. Thereafter, the method includes identifying a set of nodes from said one or more nodes by applying a minimum vertex cover algorithm on said graph. The sentences associated with said identified set of nodes are utilizable to create a summary of said electronic document. The method is performed by one or more microprocessors.
    Type: Application
    Filed: April 7, 2015
    Publication date: October 13, 2016
    Inventors: Anand Gupta, Manpreet Kaur, Shachar Mirkin
  • Patent number: 9442922
    Abstract: A method for updating a reordering model of a statistical machine translation system includes, at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data including at least one sentence pair, each pair including a source sentence in a source language and a target sentence in a target language. Phrase pairs are extracted from the new training data and used to generate a new reordering file. A reordering model of the existing statistical machine translation system is updated, based on the new reordering file. The reordering model includes a reordering table. At a second time after the first time, new training data is received. The extracting of phrase pairs, the generating of the new reordering file, and the updating of the reordering model are reiterated, based on the new training data received at the second time.
    Type: Grant
    Filed: November 18, 2014
    Date of Patent: September 13, 2016
    Assignee: XEROX CORPORATION
    Inventor: Shachar Mirkin
  • Publication number: 20160140111
    Abstract: A method for updating a reordering model of a statistical machine translation system includes, at a first time, receiving new training data for retraining an existing statistical machine translation system, the new training data including at least one sentence pair, each pair including a source sentence in a source language and a target sentence in a target language. Phrase pairs are extracted from the new training data and used to generate a new reordering file. A reordering model of the existing statistical machine translation system is updated, based on the new reordering file. The reordering model includes a reordering table. At a second time after the first time, new training data is received. The extracting of phrase pairs, the generating of the new reordering file, and the updating of the reordering model are reiterated, based on the new training data received at the second time.
    Type: Application
    Filed: November 18, 2014
    Publication date: May 19, 2016
    Inventor: Shachar Mirkin
  • Publication number: 20150199339
    Abstract: A method for cross language information retrieval includes receiving an input query which includes at least one word in a source language and translating the input query from the source language to a target language to provide a set of translated queries. A set of documents is retrieved from a document collection based on the translated queries. The retrieved documents are translated back into the source language to generate a set of translated documents. An entailment relationship between each of the translated documents and the input query is assessed. The set of translated documents is refined, based on the assessment of the entailment relationship. A subset (or all) of the refined set of translated documents, and/or the target documents to which the translated documents in the subset correspond, is output.
    Type: Application
    Filed: May 13, 2014
    Publication date: July 16, 2015
    Applicant: XEROX CORPORATION
    Inventors: Shachar MIRKIN, Nikolaos LAGOS, Ioan CALAPODESCU
  • Patent number: 9047274
    Abstract: An authoring method includes generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language. Initial source text entered by the user is received through the authoring interface. Source phrases are selected that each include at least one token of the initial source text as a prefix and at least one other token as a suffix. The source phrase selection is based on a translatability score and optionally on fluency and semantic relatedness scores. A set of candidate phrases is proposed for display on the authoring interface, each of the candidate phrases being the suffix of a respective one of the selected source phrases. The user may select one of the candidate phrases, which is appended to the source text following its corresponding prefix, or may enter alternative text. The process may be repeated until the user is satisfied with the source text and the SMT model can then be used for its translation.
    Type: Grant
    Filed: January 21, 2013
    Date of Patent: June 2, 2015
    Assignee: XEROX CORPORATION
    Inventors: Sriram Venkatapathy, Shachar Mirkin
  • Publication number: 20150127323
    Abstract: A method for computing similarity between paths includes extracting corpus statistics for triples from a corpus of text documents, each triple comprising a predicate and respective first and second arguments of the predicate. Documents in the corpus are clustered to form a set of clusters based on textual similarity and temporal similarity. An event-based path similarity is computed between first and second paths, the first path comprising a first predicate and first and second argument slots, the second path comprising a second predicate and first and second argument slots, the event-based path similarity being computed as a function of a corpus statistics-based similarity score which is a function of the corpus statistics for the extracted triples which are instances of the first and second paths, and a cluster-based similarity score which is a function of occurrences of the first and second predicates in the clusters.
    Type: Application
    Filed: November 4, 2013
    Publication date: May 7, 2015
    Applicant: Xerox Corporation
    Inventors: Guillaume Jacquet, Shachar Mirkin
  • Publication number: 20140358519
    Abstract: A method for rewriting source text includes receiving source text including a source text string in a first natural language. The source text string is translated with a machine translation system to generate a first target text string in a second natural language. A translation confidence for the source text string is computed, based on the first target text string. At least one alternative text string is generated, where possible, in the first natural language by automatically rewriting the source string. Each alternative string is translated to generate a second target text string in the second natural language. A translation confidence is computed for the alternative text string based on the second target string. Based on the computed translation confidences, one of the alternative text strings may be selected as a candidate replacement for the source text string and may be proposed to a user on a graphical user interface.
    Type: Application
    Filed: June 3, 2013
    Publication date: December 4, 2014
    Inventors: Shachar Mirkin, Sriram Venkatapathy, Marc Dymetman
  • Publication number: 20140207439
    Abstract: An authoring method includes generating an authoring interface configured for assisting a user to author a text string in a source language for translation to a target string in a target language. Initial source text entered by the user is received through the authoring interface. Source phrases are selected that each include at least one token of the initial source text as a prefix and at least one other token as a suffix. The source phrase selection is based on a translatability score and optionally on fluency and semantic relatedness scores. A set of candidate phrases is proposed for display on the authoring interface, each of the candidate phrases being the suffix of a respective one of the selected source phrases. The user may select one of the candidate phrases, which is appended to the source text following its corresponding prefix, or may enter alternative text. The process may be repeated until the user is satisfied with the source text and the SMT model can then be used for its translation.
    Type: Application
    Filed: January 21, 2013
    Publication date: July 24, 2014
    Applicant: XEROX CORPORATION
    Inventors: Sriram Venkatapathy, Shachar Mirkin
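
Illustrative sketch for publication 20160299881 (document summarization): the abstract describes building a graph with one node per sentence, placing an edge between two sentences when an entailment score passes a threshold, and applying a minimum vertex cover algorithm to pick the summary sentences. The Python below is only a rough illustration of that pipeline and is not taken from the application. The function names, the default threshold, and the word-overlap score are all assumptions: `overlap_score` is a crude stand-in for a real entailment measure, and because exact minimum vertex cover is NP-hard, the sketch swaps in a simple greedy approximation.

```python
from itertools import combinations


def overlap_score(a: str, b: str) -> float:
    """Hypothetical stand-in for an entailment score.

    Returns the share of the shorter sentence's distinct words that also
    occur in the other sentence. A real system would use an entailment model.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def build_graph(sentences, threshold):
    """Return the set of edges (i, j), i < j, whose score meets the threshold."""
    edges = set()
    for i, j in combinations(range(len(sentences)), 2):
        if overlap_score(sentences[i], sentences[j]) >= threshold:
            edges.add((i, j))
    return edges


def greedy_vertex_cover(edges):
    """Greedy approximation of a minimum vertex cover (not guaranteed optimal).

    Repeatedly picks the node touching the most uncovered edges, breaking
    ties in favor of the lowest node index.
    """
    remaining = set(edges)
    cover = set()
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        best = max(degree, key=lambda n: (degree[n], -n))
        cover.add(best)
        remaining = {e for e in remaining if best not in e}
    return cover


def summarize(sentences, threshold=0.5):
    """Return the sentences whose nodes form the (approximate) vertex cover."""
    cover = greedy_vertex_cover(build_graph(sentences, threshold))
    return [sentences[i] for i in sorted(cover)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat sat",
        "dogs bark loudly",
        "the cat sat on the mat today",
    ]
    for sentence in summarize(doc):
        print(sentence)
```

In this sketch a sentence with no edges never enters the cover, so it is left out of the output; how such sentences should be treated is not stated in the abstract.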