Patents Assigned to Houghton Mifflin Company
  • Patent number: 4868750
    Abstract: A system for the grammatical annotation of natural language receives natural language text and annotates each word with a set of tags indicative of its possible grammatical or syntactic uses. An empirical probability of collocation function defined on pairs of tags is iteratively extended to a selected set of tag sequences of increasing length so as to select a most probable tag for each word of a sequence of ambiguously-tagged words. For listed pairs of commonly confused words a substitute calculation reveals erroneous use of the wrong word. For words with tags having abnormally low frequency of occurrence, a stored table of reduced probability factors corrects the calculation. Once the text words have been annotated with their most probable tags, the tagged text is parsed by a parser which successively applies phrasal, predicate and clausal analysis to build higher structures from the disambiguated tag strings.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 19, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus, Jeffrey G. Hopkins
  • Patent number: 4864502
    Abstract: An apparatus for the grammatical analysis of digitally encoded text material receives encoded text, annotates each word of the text with a tag, and processes the annotated text to identify basic syntactic units such as noun phrases and verb groups. A clausal analyzer then operates on the identified nominal and predicate structures to identify clause boundaries and clause types. During processing, feature agreement between parts of successively larger entities--noun phrases, predicates, and clauses--is successively derived. When an error is detected, an error message identifies the error and displays a suggested correction.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 5, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus
  • Patent number: 4864501
    Abstract: A system for annotating digitally encoded text includes a dictionary of base forms. For each base form, a first set of tags represents possible grammatical and syntactic properties of the word, and may encode inflectional paradigms of the base form, or feature agreement behavior and special processing. If a text word is not found in the dictionary, an inflectional analyzer looks up one or more base forms derived from the word and, if found, annotates them with their dictionary tags. A morphological analyzer assigns tags to words not retrieved in the dictionary. The morphological analyzer recognizes words formed by prefixation and suffixation, as well as proper nouns, ordinals, idiomatic expressions, and certain classes of character strings. The tagged words of a sentence are then processed to parse the sentence.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 5, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus
  • Patent number: 4783758
    Abstract: A spelling correction system compares a correctly spelled word with an incorrectly spelled word to determine the degree of substitutability. If the system determines that the words are highly similar, the system flags the correct word as exclusively substitutable for the incorrect word. If the system determines the words are of moderate similarity, the correct word is flagged as a possible substitute for the incorrect word.
    Type: Grant
    Filed: February 5, 1985
    Date of Patent: November 8, 1988
    Assignee: Houghton Mifflin Company
    Inventor: Henry Kucera
  • Patent number: 4773009
    Abstract: An electronic text analyzer operates on an ordered block of digitally coded text by analyzing sequential strings thereof to determine paragraph and sentence boundaries. Each string is broken down into component words. Possible abbreviations are identified and checked against a table of common abbreviations to identify abbreviations which cannot end a sentence. End punctuation and the following string are analyzed to identify the terminal word of a sentence. When sentence boundaries have been determined, the text may be further processed by a grammar checker, a readability analyzer, or other higher-level text processing system. A preferred embodiment includes a readability analyzer having a syllable counter for determining the number of syllables in each word. The system includes a modified common-word table having an empirical syllable-count field. A checker first determines if a word is in the table and, if so, returns its syllable count.
    Type: Grant
    Filed: June 6, 1986
    Date of Patent: September 20, 1988
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Rachael Sokolowski, Jacqueline Russom
  • Patent number: 4771401
    Abstract: An apparatus and method for linguistic expression processing provides features for spelling verification, correction, and dictionary database storage. The system utilizes a linguistically salient word skeleton-forming process to correct both typographic and cognitive spelling errors. The system also uses a suspect expression modification sequence to recognize and correct typographical spelling errors. A linguistic expression database includes a master lexicon having expression blocks arranged in accord with respective collation ranges of skeletons of expressions contained therein. In one preferred embodiment, these linguistically salient word skeletons corresponding to the master lexicon expressions are not retained in the database.
    Type: Grant
    Filed: March 31, 1986
    Date of Patent: September 13, 1988
    Assignee: Houghton Mifflin Company
    Inventors: Ilia Kaufman, Henry Kucera
  • Patent number: 4730269
    Abstract: Automated spelling correction converts, by prescribed linguistic procedures, each word to be corrected to a skeleton, and compares that skeleton with a data base of skeletons derived by identical linguistic procedures from a dictionary of correctly spelled words. In the event of a match between the two skeletal terms, the correctly spelled word (or words) associated with the matched skeleton is presented for replacement of the misspelled word. In the event the comparison does not yield a correct match, the skeletal form of the misspelled word is repeatedly modified and each modified form is compared with the data base of skeletons.
    Type: Grant
    Filed: March 31, 1986
    Date of Patent: March 8, 1988
    Assignee: Houghton Mifflin Company
    Inventor: Henry Kucera
  • Patent number: 4724523
    Abstract: The invention relates to a system for storing, retrieving, and processing linguistic information. In one aspect, the invention provides a system for storing linguistic expressions, which system includes a main dictionary storage section and three coding sections, one for each of regular paradigm information, partially irregular paradigm information, and fully irregular paradigm information. In another aspect, the invention provides a system for evaluating linguistic expressions, e.g., words, and determining their grammatical and inflectional information. The invention has applicability in the field of word processing.
    Type: Grant
    Filed: July 1, 1985
    Date of Patent: February 9, 1988
    Assignee: Houghton Mifflin Company
    Inventor: Henry Kucera
  • Patent number: 4674066
    Abstract: An electronic database search system can identify database records having textual expressions that match, or are similar to, an operator-designated search expression. The system features a mechanism for transforming linguistic expressions, e.g., words, into linguistically salient word skeletons. Skeletal modification and suffix stripping features are employed to enhance expression-matching qualities of the word skeletons and to reduce data storage requirements.
    Type: Grant
    Filed: March 3, 1986
    Date of Patent: June 16, 1987
    Assignee: Houghton Mifflin Company
    Inventor: Henry Kucera
  • Patent number: 4580241
    Abstract: Automated spelling correction converts, by prescribed linguistic procedures, each word to be corrected to a skeleton, and compares that skeleton with a data base of skeletons derived by identical linguistic procedures from a dictionary of correctly spelled words. In the event of a match between the two skeletal terms, the correctly spelled word (or words) associated with the matched skeleton is presented for replacement of the misspelled word. In the event the comparison does not yield a correct match, the skeletal form of the misspelled word is repeatedly modified and each modified form is compared with the data base of skeletons.
    Type: Grant
    Filed: February 18, 1983
    Date of Patent: April 1, 1986
    Assignee: Houghton Mifflin Company
    Inventor: Henry Kucera
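
Several of the abstracts above (4580241, 4730269, 4771401, 4674066) describe matching words through "skeletons". The general idea can be illustrated with a minimal Python sketch. The abstracts do not state the actual skeleton-forming rules, so everything here is an assumption made for illustration: the `skeleton` rule (lowercase, collapse repeated letters, drop vowels after the first letter), the single-deletion `modified_forms` step, and all function names are hypothetical and are not taken from the patents.

```python
from collections import defaultdict

VOWELS = set("aeiou")


def skeleton(word):
    """Hypothetical skeleton rule (an assumption, not the patented one):
    lowercase, collapse runs of a repeated letter, keep the first letter,
    and drop every later vowel."""
    collapsed = []
    for ch in word.lower():
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    if not collapsed:
        return ""
    head, tail = collapsed[0], collapsed[1:]
    return head + "".join(ch for ch in tail if ch not in VOWELS)


def build_index(dictionary):
    """Map each skeleton to the correctly spelled words that produce it."""
    index = defaultdict(list)
    for word in dictionary:
        index[skeleton(word)].append(word)
    return index


def modified_forms(skel):
    """Yield each distinct skeleton obtained by deleting one character.
    This stands in for the 'repeatedly modified' step; the real
    modifications are not given in the abstracts."""
    seen = set()
    for i in range(len(skel)):
        form = skel[:i] + skel[i + 1:]
        if form not in seen:
            seen.add(form)
            yield form


def suggest(word, index):
    """Return candidate corrections: exact skeleton match first,
    then the first modified skeleton that matches."""
    skel = skeleton(word)
    if skel in index:
        return list(index[skel])
    for form in modified_forms(skel):
        if form in index:
            return list(index[form])
    return []


index = build_index(["separate", "necessary", "receive"])
print(suggest("seperate", index))   # direct skeleton match
print(suggest("receibve", index))   # match after modifying the skeleton
```

Under these assumed rules, a misspelling that differs only in vowels or doubled letters shares a skeleton with the intended word, which is why a direct lookup often suffices before any modification is tried.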