Patents by Inventor George E. Heidorn

George E. Heidorn has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7383169
    Abstract: A lexical knowledge base is compiled automatically from a machine-readable source (such as an on-line dictionary or unstructured text). The preferred embodiment of the invention makes use of “backward linking,” by which inverse semantic relations are discerned from the text and used to augment the knowledge base. By this arrangement, on-line dictionaries and other texts can provide formidable sources of “common sense” knowledge about the world.
    Type: Grant
    Filed: April 13, 1994
    Date of Patent: June 3, 2008
    Assignee: Microsoft Corporation
    Inventors: Lucretia H. Vanderwende, Stephen D. Richardson, Karen Jensen, George E. Heidorn, William B. Dolan
  • Patent number: 6360197
    Abstract: A method and apparatus are provided that identify confused characters in a text written in a language having a large number of distinct characters. To identify the confused characters, a set of characters from the text are segmented into individual characters. A confusable character for at least one of the segmented characters is then retrieved. Lexical information is identified for both the segmented characters and the retrieved confusable characters and is used to parse the segmented characters and the confusable characters. Based on the parse, a segmented character is identified that has been confused with a confusable character.
    Type: Grant
    Filed: October 19, 1999
    Date of Patent: March 19, 2002
    Assignee: Microsoft Corporation
    Inventors: Andi Wu, George E. Heidorn
  • Patent number: 6246977
    Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an “is a” relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word.
    Type: Grant
    Filed: August 3, 1999
    Date of Patent: June 12, 2001
    Assignee: Microsoft Corporation
    Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen
  • Patent number: 6161084
    Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms.
    Type: Grant
    Filed: August 3, 1999
    Date of Patent: December 12, 2000
    Assignee: Microsoft Corporation
    Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen
  • Patent number: 6076051
    Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms.
    Type: Grant
    Filed: March 7, 1997
    Date of Patent: June 13, 2000
    Assignee: Microsoft Corporation
    Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen
  • Patent number: 5999896
    Abstract: A method and system for identifying and resolving commonly confused words in a natural language parser is provided. In a preferred embodiment, a computer system parses input text made up of two or more words using a relation that maps from potentially confused words, including one word among the words of the input text, to possibly intended words. The computer system first identifies the possible parts of speech for each word of the input text including the potentially confused word. The computer system then identifies the possible parts of speech for the possibly intended word to which the relation maps the potentially confused word. Finally, the computer system applies syntactic grammar rules to the identified parts of speech such that a complete syntax tree containing a possible part of speech for the possibly intended word is produced and no complete syntax tree containing a possible part of speech for the potentially confused word is produced.
    Type: Grant
    Filed: June 25, 1996
    Date of Patent: December 7, 1999
    Assignee: Microsoft Corporation
    Inventors: Stephen Darrow Richardson, George E. Heidorn
  • Patent number: 5963894
    Abstract: A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided. In a preferred embodiment, a statistical bootstrapping software facility optimizes the operation of a robust natural language parser that uses a set of lexicon entries to determine possible parts of speech of words from an input string and a set of rules to combine words from the input string into syntactic structures. The facility first operates the parser in a statistics compilation mode, in which, for each of many sample input strings, the parser attempts to apply all applicable rules and lexicon entries. While the parser is operating in the statistics compilation mode, the facility compiles statistics indicating the likelihood of success of each rule and lexicon entry, based on the success of each rule and lexicon entry when applied in the statistics compilation mode.
    Type: Grant
    Filed: May 20, 1997
    Date of Patent: October 5, 1999
    Assignee: Microsoft Corporation
    Inventors: Stephen Darrow Richardson, George E. Heidorn
  • Patent number: 5752052
    Abstract: A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided. In a preferred embodiment, a statistical bootstrapping software facility optimizes the operation of a robust natural language parser that uses a set of lexicon entries to determine possible parts of speech of words from an input string and a set of rules to combine words from the input string into syntactic structures. The facility first operates the parser in a statistics compilation mode, in which, for each of many sample input strings, the parser attempts to apply all applicable rules and lexicon entries. While the parser is operating in the statistics compilation mode, the facility compiles statistics indicating the likelihood of success of each rule and lexicon entry, based on the success of each rule and lexicon entry when applied in the statistics compilation mode.
    Type: Grant
    Filed: June 24, 1994
    Date of Patent: May 12, 1998
    Assignee: Microsoft Corporation
    Inventors: Stephen Darrow Richardson, George E. Heidorn
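
The hypernym-based tokenization described in the abstracts of patents 6246977, 6161084, and 6076051 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a logical form can be modeled as a simple (subject, verb, object) tuple, and the HYPERNYMS table, the function names, and the "|"-joined token format are all hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical hypernym table ("is a" relations). A real system would draw
# these from a lexical knowledge base rather than a hard-coded dict.
HYPERNYMS = {
    "dog": ["animal"],
    "chase": ["pursue"],
    "cat": ["animal"],
}


def alternative_forms(primary):
    """Yield forms in which one or more words are replaced by a hypernym."""
    # For each word, the choices are the word itself plus its hypernyms.
    choices = [[word] + HYPERNYMS.get(word, []) for word in primary]
    for candidate in product(*choices):
        if candidate != primary:  # skip the unmodified primary form
            yield candidate


def tokens(primary):
    """Return tokens for the primary form followed by all alternative forms."""
    forms = [primary] + list(alternative_forms(primary))
    return ["|".join(form) for form in forms]


# Assumed toy logical form: (subject, verb, object).
primary_form = ("dog", "chase", "cat")
result = tokens(primary_form)
```

Under these assumptions, indexing a document with both the primary and the alternative tokens would let a query phrased with a more general word (for example "animal") match text that used a more specific one ("dog").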