Patents by Inventor Geoffrey D. Nunberg

Geoffrey D. Nunberg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7188117
    Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
    Type: Grant
    Filed: September 3, 2002
    Date of Patent: March 6, 2007
    Assignee: Xerox Corporation
    Inventors: Ayman O. Farahat, Francine R. Chen, Charles R. Mathis, Geoffrey D. Nunberg
  • Patent number: 7167871
    Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
    Type: Grant
    Filed: September 3, 2002
    Date of Patent: January 23, 2007
    Assignee: Xerox Corporation
    Inventors: Ayman O. Farahat, Francine R. Chen, Charles R. Mathis, Geoffrey D. Nunberg
  • Patent number: 6973423
    Abstract: A processor implemented method of identifying the text genre of a machine-readable, untagged text. The processor implemented method begins by generating a cue vector from the text, which represents occurrences in the text of a first set of nonstructural, surface cues, which are easily computable. Afterward, the processor determines whether the text is an instance of a first text genre using the cue vector and a weighting vector associated with the first text genre.
    Type: Grant
    Filed: June 18, 1998
    Date of Patent: December 6, 2005
    Assignee: Xerox Corporation
    Inventors: Geoffrey D. Nunberg, Hinrich Schuetze, Jan O. Pedersen, Brett L. Kessler, Gregory Grefenstette
  • Publication number: 20030225750
    Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
    Type: Application
    Filed: September 3, 2002
    Publication date: December 4, 2003
    Applicant: Xerox Corporation
    Inventors: Ayman O. Farahat, Francine R. Chen, Charles R. Mathis, Geoffrey D. Nunberg
  • Publication number: 20030226100
    Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
    Type: Application
    Filed: September 3, 2002
    Publication date: December 4, 2003
    Applicant: Xerox Corporation
    Inventors: Ayman O. Farahat, Francine R. Chen, Charles R. Mathis, Geoffrey D. Nunberg
  • Publication number: 20030221166
    Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
    Type: Application
    Filed: September 3, 2002
    Publication date: November 27, 2003
    Applicant: Xerox Corporation
    Inventors: Ayman O. Farahat, Francine R. Chen, Charles R. Mathis, Geoffrey D. Nunberg
  • Patent number: 6505150
    Abstract: A method of filtering according to text genre the results of a topic search of a heterogeneous corpus of untagged, machine-readable texts. Because each text of the corpus has a topic and a text genre, the corpus includes multiple text genres and covers multiple topics. According to the method, a processor first searches the corpus for a first multiplicity of texts that have a first topic. Next, the processor identifies a first set of texts of the first multiplicity that are instances of a first text genre and identifies a second set of texts of the first multiplicity that are instances of a second text genre. Finally, the processor identifies to a computer user the first multiplicity of texts in an order based upon the first text genre and second text genre.
    Type: Grant
    Filed: June 18, 1998
    Date of Patent: January 7, 2003
    Assignee: Xerox Corporation
    Inventors: Geoffrey D. Nunberg, Hinrich Schuetze, Jan O. Pedersen, Brett L. Kessler
  • Publication number: 20020002450
    Abstract: A method of filtering according to text genre the results of a topic search of a heterogeneous corpus of untagged, machine-readable texts. Because each text of the corpus has a topic and a text genre, the corpus includes multiple text genres and covers multiple topics. According to the method, a processor first searches the corpus for a first multiplicity of texts that have a first topic. Next, the processor identifies a first set of texts of the first multiplicity that are instances of a first text genre and identifies a second set of texts of the first multiplicity that are instances of a second text genre. Finally, the processor identifies to a computer user the first multiplicity of texts in an order based upon the first text genre and second text genre.
    Type: Application
    Filed: June 18, 1998
    Publication date: January 3, 2002
    Applicant: Xerox Corp.
    Inventors: Geoffrey D. Nunberg, Hinrich Schuetze, Jan O. Pedersen, Brett L. Kessler
  • Patent number: 5111398
    Abstract: A technique for processing natural language text uses a data structure that includes structure data in the text data. The structure data indicates an autonomous punctuational structure of the text, a punctuational structure that is independent of the lexical content of the text and therefore can be manipulated without considering the meaning of the words in the text. The data structure can be a tree in which each node has a textual type such as a paragraph, sentence, clause, phrase, or word. The data structure could alternatively be parallel data sequences, one with codes indicating the text's characters and the other with codes indicating textual types. The data structure is produced and maintained using a grammar of textual types, indicating for each textual type the textual types of units into which it can properly be divided. During editing, a text sequence is generated by applying rendering rules to the data structure, and the text is presented to the user based on the text sequence.
    Type: Grant
    Filed: November 21, 1988
    Date of Patent: May 5, 1992
    Assignee: Xerox Corporation
    Inventors: Geoffrey D. Nunberg, H. Tayloe Stansbury, Curtis Abbott, Brian C. Smith
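
The abstract of patent 5111398 describes a tree whose nodes carry textual types, a grammar stating how each type may be divided, and rendering rules that produce the text sequence. The following is only an illustrative sketch of that general idea, not the patented implementation: the `GRAMMAR` table, the `Node` class, and the `validate` and `render` functions are hypothetical names, and the specific rules (a comma between clauses, a capital letter and full stop for a sentence) are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical grammar of textual types: for each type, the types of
# units into which it may be divided. Not taken from the patent text.
GRAMMAR = {
    "paragraph": {"sentence"},
    "sentence": {"clause"},
    "clause": {"phrase", "word"},
    "phrase": {"word"},
    "word": set(),
}


@dataclass
class Node:
    kind: str                                   # textual type of this node
    children: list = field(default_factory=list)
    text: str = ""                              # only used by "word" nodes


def validate(node):
    """Return True if every child type is permitted by GRAMMAR, recursively."""
    allowed = GRAMMAR[node.kind]
    return all(c.kind in allowed and validate(c) for c in node.children)


def render(node):
    """Apply assumed rendering rules to produce the text sequence."""
    if node.kind == "word":
        return node.text
    parts = [render(c) for c in node.children]
    if node.kind == "sentence":
        # Assumed rule: clauses are joined by commas; the sentence is
        # capitalized and closed with a full stop.
        body = ", ".join(parts)
        return body[:1].upper() + body[1:] + "."
    # Assumed rule: every other type joins its units with single spaces.
    return " ".join(parts)


def word(text):
    """Convenience constructor for leaf nodes."""
    return Node("word", text=text)


example = Node("paragraph", [
    Node("sentence", [
        Node("clause", [word("the"), word("tree"), word("is"), word("valid")]),
        Node("clause", [word("so"), word("it"), word("renders")]),
    ]),
])
```

In this sketch the punctuation comes entirely from the node types, so `render(example)` yields `The tree is valid, so it renders.` without inspecting what the words mean, and `validate` rejects a tree that divides a unit into a type the grammar does not allow.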