Patents by Inventor Eric Gaussier

Eric Gaussier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20060009963
    Abstract: Various methods formulated using a geometric interpretation for identifying bilingual pairs in comparable corpora using a bilingual dictionary are disclosed. The methods may be used separately or in combination to compute the similarity between bilingual pairs.
    Type: Application
    Filed: November 1, 2004
    Publication date: January 12, 2006
    Inventors: Eric Gaussier, Jean-Michel Renders, Herve Dejean, Cyril Goutte, Irina Matveeva
  • Patent number: 6983240
    Abstract: A method generates normalized representations of strings, in particular sentences. The method, which can be used for translation, receives an input string. The input string is subjected to a first operation out of a plurality of operating functions for linguistically processing the input string to generate a first normalized representation of the input string that includes linguistic information. The first normalized representation is then subjected to a second operation for replacing linguistic information in the first normalized representation by abstract variables and to generate a second normalized representation.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: January 3, 2006
    Assignee: Xerox Corporation
    Inventors: Salah Ait-Mokhtar, Jean-Pierre Chanod, Eric Gaussier
  • Publication number: 20050187892
    Abstract: A method for categorizing a set of objects includes defining a set of categories in which at least one category in the set is dependent on another category in the set; organizing the set of categories in a hierarchy that embodies any dependencies among the categories in the set; for each object, assigning to the object one or more categories l_1 . . . l_P where l_i ∈ {1 . . . L} from a set {1 . . . L} of possible categories, wherein the assigned categories represent a subset of categories for which the object is relevant; defining a new set of labels z comprising all possible combinations of any number of the categories, z ∈ {{1},{2}, . . . {L},{1,2}, . . . {1,L},{2,3}, . . . {1,2,3}, . . . {1,2, . . . L}}, such that if an object is relevant to several categories, the object must be assigned the label z corresponding to the subset of all relevant categories; and assigning to the object the several categories and the subcategories of the several categories.
    Type: Application
    Filed: February 9, 2004
    Publication date: August 25, 2005
    Inventors: Cyril Goutte, Eric Gaussier
  • Patent number: 6678677
    Abstract: An information retrieval apparatus retrieves information from databases comprising internal representations of documents. Syntactic relations between terms of the query are extracted and an internal representation of the query is created based on the terms of the query and the extracted syntactic relations. New terms are appended to a semantic lattice if the query includes terms not included in the semantic lattice. The query is projected onto the documents in the database by comparing the internal representation and terms of the query to the internal representations and terms of the documents using the semantic lattice for comparing the terms and a similarity is computed between the query and each document. The documents are ranked according to their computed similarities and are output as retrieved documents according to the established rank order.
    Type: Grant
    Filed: December 19, 2000
    Date of Patent: January 13, 2004
    Assignee: Xerox Corporation
    Inventors: Claude Roux, Denys Proux, Eric Gaussier
  • Publication number: 20030101187
    Abstract: Methods, systems, and articles of manufacture consistent with certain principles related to the present invention enable a computing system to perform hierarchical topical clustering of text data based on statistical modeling of co-occurrences of (document, word) pairs. The computing system may be configured to receive a collection of documents, each document including a plurality of words, and perform a modified deterministic annealing Expectation-Maximization (EM) process on the collection to produce a softly assigned hierarchy of nodes. The process may involve assigning documents and document fragments to multiple nodes in the hierarchy based on words included in the documents, such that a document may be assigned to any ancestor node included in the hierarchy, thus eliminating the hard assignment of documents in the hierarchy.
    Type: Application
    Filed: October 19, 2001
    Publication date: May 29, 2003
    Applicant: Xerox Corporation
    Inventors: Eric Gaussier, Francine Chen, Ashok Chhabedia Popat
  • Publication number: 20020116169
    Abstract: A method generates normalized representations of strings, in particular sentences. The method, which can be used for translation, receives an input string. The input string is subjected to a first operation out of a plurality of operating functions for linguistically processing the input string to generate a first normalized representation of the input string that includes linguistic information. The first normalized representation is then subjected to a second operation for replacing linguistic information in the first normalized representation by abstract variables and to generate a second normalized representation.
    Type: Application
    Filed: December 18, 2000
    Publication date: August 22, 2002
    Applicant: Xerox Corporation
    Inventors: Salah Ait-Mokhtar, Jean-Pierre Chanod, Eric Gaussier
  • Publication number: 20020111941
    Abstract: An information retrieval apparatus retrieves information from databases comprising internal representations of documents. Syntactic relations between terms of the query are extracted and an internal representation of the query is created based on the terms of the query and the extracted syntactic relations. New terms are appended to a semantic lattice if the query includes terms not included in the semantic lattice. The query is projected onto the documents in the database by comparing the internal representation and terms of the query to the internal representations and terms of the documents using the semantic lattice for comparing the terms and a similarity is computed between the query and each document. The documents are ranked according to their computed similarities and are output as retrieved documents according to the established rank order.
    Type: Application
    Filed: December 19, 2000
    Publication date: August 15, 2002
    Applicant: Xerox Corporation
    Inventors: Claude Roux, Denys Proux, Eric Gaussier
  • Patent number: 6430557
    Abstract: A query word is used to identify one of a number of word groups, by first determining whether the query word is in any of the word groups. If not, attempts to modify the query word are made in accordance with successive suffix relationships in a sequence until a modified query word is obtained that is in one of the word groups. The sequence of suffix relationships, which can be pairwise relationships, can be defined by a list ordered according to the frequencies of occurrence of the suffix relationships in a natural language. If a modified query word is obtained that is in one of the word groups, information identifying the word group can be provided, such as a representative of the group or a list of words in the group.
    Type: Grant
    Filed: December 16, 1998
    Date of Patent: August 6, 2002
    Assignee: Xerox Corporation
    Inventors: Eric Gaussier, Gregory Grefenstette, Jean-Pierre Chanod
  • Patent number: 6308149
    Abstract: A set of words of a natural language is grouped by automatically obtaining suffix relation data that indicate a relation value for each of a set of relationships between suffixes that occur in the natural language, and, then, by automatically clustering the words in the set using the relation values from the suffix relation data, to obtain group data indicating groups of words. Two or more words in a group have suffixes as in one of the relationships and, preceding the suffixes, equivalent substrings. The relationships can be pairwise relationships, and the relation value can indicate the number of occurrences of a suffix pair. The suffix relation data can be obtained using an inflectional lexicon. Complete link clustering can be used.
    Type: Grant
    Filed: December 16, 1998
    Date of Patent: October 23, 2001
    Assignee: Xerox Corporation
    Inventors: Eric Gaussier, Gregory Grefenstette, Jean-Pierre Chanod
  • Patent number: 6236958
    Abstract: A terminology extraction system which allows for automatic creation of bilingual terminology has a source text which comprises at least one sequence of source terms, aligned with a target text which also comprises at least one sequence of target terms. A term extractor builds a network from each source and target sequence wherein each node of the network comprises at least one term and such that each combination of source terms is included within one source node and each combination of target terms is included within one target node. The term extractor links each source node with each target node, and through a flow optimization method selects relevant links in the resulting network. Once the term extractor has been run on the entire set of aligned sequences, a term statistics circuit computes an association score for each pair of linked source/target terms, and finally the scored pairs of linked source/target terms that are considered relevant bilingual terms are stored in a bilingual terminology database.
    Type: Grant
    Filed: May 15, 1998
    Date of Patent: May 22, 2001
    Assignee: International Business Machines Corporation
    Inventors: Jean-Marc Lange, Eric Gaussier
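
Illustrative sketch (not taken from any of the patents): the query procedure summarized in the abstract of Patent number 6430557 can be pictured roughly as follows. The query word is first looked up directly in the word groups; if it is absent, suffix relationships are tried in order until a modified word is found in a group. The function name find_group, the (old suffix, new suffix) pair representation, and the sample relationships and their ordering are all assumptions made for this example; the patented method derives the ordering from frequencies of occurrence in a natural language, which is not reproduced here.

```python
from typing import List, Optional, Sequence, Set, Tuple

# Hypothetical pairwise suffix relationships (old_suffix, new_suffix).
# The order is invented for illustration; it is NOT frequency data from the patent.
SUFFIX_RELATIONS: List[Tuple[str, str]] = [
    ("s", ""),
    ("ing", ""),
    ("ed", ""),
    ("ies", "y"),
    ("ing", "e"),
]


def find_group(
    query: str,
    groups: Sequence[Set[str]],
    relations: Sequence[Tuple[str, str]] = SUFFIX_RELATIONS,
) -> Optional[int]:
    """Return the index of the word group matching `query`, or None.

    The query is tried as-is first; otherwise each suffix relationship is
    applied in sequence until the modified word appears in some group.
    """

    def lookup(word: str) -> Optional[int]:
        for index, group in enumerate(groups):
            if word in group:
                return index
        return None

    hit = lookup(query)
    if hit is not None:
        return hit

    for old, new in relations:
        # Only rewrite when the suffix is present and a non-empty stem remains.
        if query.endswith(old) and len(query) > len(old):
            candidate = query[: len(query) - len(old)] + new
            hit = lookup(candidate)
            if hit is not None:
                return hit
    return None


if __name__ == "__main__":
    word_groups = [{"walk", "walked"}, {"carry", "carried"}, {"make", "made"}]
    for word in ("walks", "carries", "making", "xyz"):
        print(word, find_group(word, word_groups))
```

Returning the group index stands in for the "information identifying the word group" mentioned in the abstract; a representative word or the full member list could be returned instead.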