Patents by Inventor George E. Heidorn

George E. Heidorn has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for compiling a lexical knowledge base

Patent number: 7383169

Abstract: A lexical knowledge base is compiled automatically from a machine-readable source (such as an on-line dictionary or unstructured text). The preferred embodiment of the invention makes use of “backward linking,” by which inverse semantic relations are discerned from the text and used to augment the knowledge base. By this arrangement, on-line dictionaries and other texts can provide formidable sources of “common sense” knowledge about the world.

Type: Grant

Filed: April 13, 1994

Date of Patent: June 3, 2008

Assignee: Microsoft Corporation

Inventors: Lucretia H. Vanderwende, Stephen D. Richardson, Karen Jensen, George E. Heidorn, William B. Dolan
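
Illustration for patent 7383169: the "backward linking" described in the abstract can be pictured with a minimal Python sketch. It assumes semantic relations have already been extracted from dictionary definitions as (headword, relation, target) triples. The relation names, the INVERSE table, and the function build_knowledge_base are hypothetical names chosen for this sketch and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical relation names and their inverses (assumptions for illustration).
INVERSE = {"Hypernym": "Hyponym", "PartOf": "HasPart", "Purpose": "PurposeOf"}

def build_knowledge_base(forward_triples):
    """Return {word: set of (relation, other_word)} with backward links added."""
    kb = defaultdict(set)
    for head, relation, target in forward_triples:
        # Forward link: the relation as stated in the headword's definition.
        kb[head].add((relation, target))
        inverse = INVERSE.get(relation)
        if inverse is not None:
            # Backward link: file the inverse relation under the target word.
            kb[target].add((inverse, head))
    return dict(kb)

# Toy input: relations that might be read off two dictionary definitions.
triples = [
    ("car", "Hypernym", "vehicle"),
    ("wheel", "PartOf", "car"),
]
kb = build_knowledge_base(triples)
```

In this toy run the entry for "vehicle" gains a Hyponym link to "car" even though the definition of "vehicle" was never consulted, which is the sense in which the knowledge base is augmented.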

Method and apparatus for identifying erroneous characters in text

Patent number: 6360197

Abstract: A method and apparatus are provided that identify confused characters in a text written in a language having a large number of distinct characters. To identify the confused characters, a set of characters from the text are segmented into individual characters. A confusable character for at least one of the segmented characters is then retrieved. Lexical information is identified for both the segmented characters and the retrieved confusable characters and is used to parse the segmented characters and the confusable characters. Based on the parse, a segmented character is identified that has been confused with a confusable character.

Type: Grant

Filed: October 19, 1999

Date of Patent: March 19, 2002

Assignee: Microsoft Corporation

Inventors: Andi Wu, George E. Heidorn
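
Illustration for patent 6360197: the steps in the abstract (segment into characters, retrieve confusable characters, then let a parse decide) can be sketched as follows. This is a simplified stand-in, not the patented method: the "parse" here is only a check that the characters can be split entirely into words from a tiny lexicon, and CONFUSABLE, LEXICON, covers, and find_confused are hypothetical names and data invented for the sketch.

```python
# Hypothetical confusable-character table and word list (assumptions).
CONFUSABLE = {"己": ["已"], "已": ["己"]}
LEXICON = {"我", "已经", "自己", "来", "了"}

def covers(chars, lexicon, max_len=4):
    """Toy 'parse': True if chars can be split entirely into lexicon words."""
    n = len(chars)
    ok = [False] * (n + 1)
    ok[0] = True  # the empty prefix is always coverable
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            if ok[start] and "".join(chars[start:end]) in lexicon:
                ok[end] = True
                break
    return ok[n]

def find_confused(text):
    """Return (position, written_char, suggested_char) for each repairing swap."""
    chars = list(text)  # segment the text into individual characters
    if covers(chars, LEXICON):
        return []  # the text analyzes as written, so nothing is flagged
    findings = []
    for i, ch in enumerate(chars):
        for alt in CONFUSABLE.get(ch, []):
            candidate = chars[:i] + [alt] + chars[i + 1:]
            if covers(candidate, LEXICON):
                findings.append((i, ch, alt))
    return findings

result = find_confused("我己经来了")
```

A real system would use lexical information and a full grammar rather than a word list, so the sketch only shows the control flow: a character is flagged when the analysis succeeds with its confusable counterpart and fails without it.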

Information retrieval utilizing semantic representation of text and based on constrained expansion of query words

Patent number: 6246977

Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an “is a” relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word.

Type: Grant

Filed: August 3, 1999

Date of Patent: June 12, 2001

Assignee: Microsoft Corporation

Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen

Information retrieval utilizing semantic representation of text by identifying hypernyms and indexing multiple tokenized semantic structures to a same passage of text

Patent number: 6161084

Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms.

Type: Grant

Filed: August 3, 1999

Date of Patent: December 12, 2000

Assignee: Microsoft Corporation

Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen

Information retrieval utilizing semantic representation of text

Patent number: 6076051

Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms.

Type: Grant

Filed: March 7, 1997

Date of Patent: June 13, 2000

Assignee: Microsoft Corporation

Inventors: John J. Messerly, George E. Heidorn, Stephen D. Richardson, William B. Dolan, Karen Jensen
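
Illustration for patents 6246977, 6161084, and 6076051: the three abstracts above describe the same tokenizer flow, which can be sketched in a few lines of Python. The sketch assumes a logical form is just a (subject, verb, object) tuple and that hypernyms come from a small lookup table. HYPERNYMS, expand_logical_form, and to_tokens are hypothetical names, and the token format is an assumption made for the example.

```python
from itertools import product

# Hypothetical "is a" table: word -> hypernyms (assumption for illustration).
HYPERNYMS = {"dog": ["canine"], "cat": ["feline"]}

def expand_logical_form(primary):
    """Return the primary form followed by every alternative form.

    An alternative form replaces one or more words of the primary form
    with one of that word's hypernyms.
    """
    choices = [[word] + HYPERNYMS.get(word, []) for word in primary]
    return [tuple(combo) for combo in product(*choices)]

def to_tokens(forms):
    """Turn each logical form into one retrieval token (format is assumed)."""
    return ["_".join(form) for form in forms]

primary = ("dog", "chase", "cat")
forms = expand_logical_form(primary)
tokens = to_tokens(forms)
```

Because the same expansion can be applied to indexed passages and to queries, a query phrased with "canine" can share a token with a passage that says "dog", while word order inside the token still keeps who did what to whom.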

Method and system for identifying and resolving commonly confused words in a natural language parser

Patent number: 5999896

Abstract: A method and system for identifying and resolving commonly confused words in a natural language parser is provided. In a preferred embodiment, a computer system parses input text made up of two or more words using a relation that maps from potentially confused words, including one word among the words of the input text, to possibly intended words. The computer system first identifies the possible parts of speech for each word of the input text including the potentially confused word. The computer system then identifies the possible parts of speech for the possibly intended word to which the relation maps the potentially confused word. Finally, the computer system applies syntactic grammar rules to the identified parts of speech such that a complete syntax tree containing a possible part of speech for the possibly intended word is produced and no complete syntax tree containing a possible part of speech for the potentially confused word is produced.

Type: Grant

Filed: June 25, 1996

Date of Patent: December 7, 1999

Assignee: Microsoft Corporation

Inventors: Stephen Darrow Richardson, George E. Heidorn
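
Illustration for patent 5999896: the idea in the abstract is that a confused word is detected when the sentence parses only after the possibly intended word is substituted. The Python sketch below stands in for a real grammar with a set of accepted part-of-speech sequences. LEXICON, CONFUSED, SENTENCE_PATTERNS, parses, and suggest are hypothetical names and toy data, not details from the patent.

```python
from itertools import product

# Toy lexicon: word -> possible parts of speech (assumption for illustration).
LEXICON = {
    "their": {"DET"}, "there": {"PRON", "ADV"}, "is": {"VERB"},
    "a": {"DET"}, "dog": {"NOUN"}, "barks": {"VERB"},
}
# Relation mapping potentially confused words to possibly intended words.
CONFUSED = {"their": "there", "there": "their"}
# Stand-in for syntactic grammar rules: tag sequences that form a sentence.
SENTENCE_PATTERNS = {("PRON", "VERB", "DET", "NOUN"), ("DET", "NOUN", "VERB")}

def parses(words):
    """True if some assignment of parts of speech yields a complete sentence."""
    tag_sets = [LEXICON.get(word, set()) for word in words]
    return any(tags in SENTENCE_PATTERNS for tags in product(*tag_sets))

def suggest(words):
    """Return (position, written_word, intended_word) for each repairing swap."""
    if parses(words):
        return []  # a complete parse exists with the words as written
    suggestions = []
    for i, word in enumerate(words):
        intended = CONFUSED.get(word)
        if intended is None:
            continue
        if parses(words[:i] + [intended] + words[i + 1:]):
            suggestions.append((i, word, intended))
    return suggestions

fix = suggest(["their", "is", "a", "dog"])
```

The sentence "their dog barks" parses as written in this toy grammar, so the same word is left alone there; the substitution is proposed only when the original word admits no complete parse.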

Method and system for bootstrapping statistical processing into a rule-based natural language parser

Patent number: 5963894

Abstract: A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided. In a preferred embodiment, a statistical bootstrapping software facility optimizes the operation of a robust natural language parser that uses a set of lexicon entries to determine possible parts of speech of words from an input string and a set of rules to combine words from the input string into syntactic structures. The facility first operates the parser in a statistics compilation mode, in which, for each of many sample input strings, the parser attempts to apply all applicable rules and lexicon entries. While the parser is operating in the statistics compilation mode, the facility compiles statistics indicating the likelihood of success of each rule and lexicon entry, based on the success of each rule and lexicon entry when applied in the statistics compilation mode.

Type: Grant

Filed: May 20, 1997

Date of Patent: October 5, 1999

Assignee: Microsoft Corporation

Inventors: Stephen Darrow Richardson, George E. Heidorn

Method and system for bootstrapping statistical processing into a rule-based natural language parser

Patent number: 5752052

Abstract: A method and system for bootstrapping statistical processing into a rule-based natural language parser is provided. In a preferred embodiment, a statistical bootstrapping software facility optimizes the operation of a robust natural language parser that uses a set of lexicon entries to determine possible parts of speech of words from an input string and a set of rules to combine words from the input string into syntactic structures. The facility first operates the parser in a statistics compilation mode, in which, for each of many sample input strings, the parser attempts to apply all applicable rules and lexicon entries. While the parser is operating in the statistics compilation mode, the facility compiles statistics indicating the likelihood of success of each rule and lexicon entry, based on the success of each rule and lexicon entry when applied in the statistics compilation mode.

Type: Grant

Filed: June 24, 1994

Date of Patent: May 12, 1998

Assignee: Microsoft Corporation

Inventors: Stephen Darrow Richardson, George E. Heidorn
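
Illustration for patents 5963894 and 5752052: the statistics compilation mode described in the two abstracts above can be sketched as counting, per rule, how often it was attempted and how often it contributed to a successful parse. The log format, the rule names, and compile_statistics are assumptions made for this sketch. Ranking rules by the resulting ratio is shown only as one plausible use of the statistics, since the abstracts as listed do not say how they are applied.

```python
from collections import Counter

def compile_statistics(parse_logs):
    """Estimate each rule's likelihood of success as successes / attempts."""
    attempts, successes = Counter(), Counter()
    for log in parse_logs:
        attempts.update(log["attempted"])   # every rule the parser tried
        successes.update(log["succeeded"])  # rules used in the final parse
    return {rule: successes[rule] / attempts[rule] for rule in attempts}

# Hypothetical logs from parsing two sample input strings.
parse_logs = [
    {"attempted": ["NP->Det N", "VP->V NP", "NP->N N"],
     "succeeded": ["NP->Det N", "VP->V NP"]},
    {"attempted": ["NP->Det N", "NP->N N"],
     "succeeded": ["NP->Det N"]},
]
stats = compile_statistics(parse_logs)

# Assumed use: try the rules with the highest estimated likelihood first.
ranked = sorted(stats, key=stats.get, reverse=True)
```

The same counting applies unchanged to lexicon entries if each entry is logged under its own identifier alongside the rules.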