Patents by Inventor Alwin B. Carus

Alwin B. Carus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7558778
    Abstract: A semantic discovery and exploration system is disclosed that provides an environment enabling a developer or user to uncover, navigate, and organize semantic patterns and structures in a document collection with or without the aid of structured knowledge. The semantic discovery and exploration system provides techniques for searching document collections, categorizing documents, inducing lists of related concepts, and identifying clusters of related terms and documents. This system operates both without and with infusions of structured knowledge such as gazetteers, thesauruses, taxonomies and ontologies. System performance improves when structured knowledge is incorporated. The semantic discovery and exploration system may be used as a first step in developing an information extraction system, such as to categorize or cluster documents in a particular domain or to develop gazetteers, and as a part of a deployed run-time information extraction system.
    Type: Grant
    Filed: June 20, 2007
    Date of Patent: July 7, 2009
    Assignee: Information Extraction Systems, Inc.
    Inventors: Alwin B. Carus, Thomas J. DePlonty
  • Publication number: 20090070380
    Abstract: The Clinical Data Container (CDC) is a method for packaging, transporting, and viewing medical reports, their associated data elements, images, and data from medical information systems for use by physicians and patients.
    Type: Application
    Filed: September 19, 2008
    Publication date: March 12, 2009
    Applicant: Dictaphone Corporation
    Inventors: Robert G. Schwager, Alwin B. Carus, Harry J. Ogrinc, Jeffrey G. Hopkins, Susan Reggie, David E. Pearah
  • Publication number: 20080255884
    Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.
    Type: Application
    Filed: May 15, 2008
    Publication date: October 16, 2008
    Applicant: Nuance Communications, Inc.
    Inventors: Alwin B. Carus, Harry J. Ogrinc
  • Publication number: 20080126273
    Abstract: We have discovered a system and method for improving the quality of information extraction applications consisting of an ensemble of per-user, adaptive, on-line machine-learning classifiers that adapt to document content and judgments of users by continuously incorporating feedback from information extraction results and corrections that users apply to these results. The satellite classifier ensemble uses only the immediately available features for classifier improvement and it is independent of the complex cascade of earlier decisions leading to the final information extraction result. The machine-learning classifiers may also provide explanations or justifications for classification decisions in the form of rules; other machine-learning classifiers may provide feedback in the form of supporting instances or patterns.
    Type: Application
    Filed: June 21, 2007
    Publication date: May 29, 2008
    Applicant: Information Extraction Systems, Inc.
    Inventors: Alwin B. Carus, Thomas J. DePlonty
  • Patent number: 7379946
    Abstract: Methods and systems for classifying and normalizing information using a combination of traditional data input methods, natural language processing, and predetermined templates are disclosed. One method may include activating a template. Based on this template, template-specific data may also be retrieved. After receiving both an input stream of data and the template-specific data, this information may be processed to generate a report based on the input data and the template specific data. In an alternative embodiment of the invention, templates may include, for example, medical billing codes from a number of different billing code classifications for the generation of patient bills. Alternatively, a method may include receiving an input stream of data and processing the input stream of data. A determination may be made as to whether or not the input stream of data includes latent information. If the data includes latent information, a template associated with latent information may be activated.
    Type: Grant
    Filed: May 7, 2004
    Date of Patent: May 27, 2008
    Assignee: Dictaphone Corporation
    Inventors: Alwin B. Carus, Harry J. Ogrinc
  • Patent number: 7233938
    Abstract: The invention includes a medical document handling system and method and automated coding systems and methods for assigning predetermined medical codes to medical documents based on the documents' contents. The invention functions by analyzing electronic medical records and extracting medical information using natural language processing and machine learning. The system collects and amalgamates medical documentation in various formats from multiple sources and locations, normalizes the information, analyzes the information, recognizes information indicating contents corresponding to classification codes, assigns classification codes, and presents information in context correlated to medical records for billing and other purposes.
    Type: Grant
    Filed: April 15, 2003
    Date of Patent: June 19, 2007
    Assignee: Dictaphone Corporation
    Inventors: Alwin B. Carus, Stefaan Heyvaert, Harry J. Ogrinc, Robert G. Titemore, Tom Deplonty, Keith Boone, Brian Wilson, Ray Rankins, Don Fonza, David Speth, Melissa Macpherson
  • Publication number: 20040243545
    Abstract: The invention involves systems and methods for generating, manipulating, summarizing, storing, reusing, and searching electronic medical records. Structured input of medical information by medical personnel based on templates may optionally be used to facilitate analysis of records, while allowing less restrictive text input than systems of the prior art. Data extraction of relevant medical data from the input text may optionally be facilitated by the structured format of the medical records. Extracted medical data is optionally validated and linked or associated with the text from which it was extracted. The extracted medical data is normalized to allow easier searching than available in systems of the prior art. Medical and document metadata is incorporated into the extracted medical data. Particularly pertinent medical information may be extracted and summarized from a patient's medical history for use by a medical professional at the point of care.
    Type: Application
    Filed: May 29, 2003
    Publication date: December 2, 2004
    Applicant: Dictaphone Corporation
    Inventors: Keith W. Boone, Alwin B. Carus, Thomas J. DePlonty, Jeffrey G. Hopkins, Harry J. Ogrinc, Susan Reggie, Robert G. Titemore
  • Publication number: 20040220895
    Abstract: The invention includes a medical document handling system and method and automated coding systems and methods for assigning predetermined medical codes to medical documents based on the documents' contents. The invention functions by analyzing electronic medical records and extracting medical information using natural language processing and machine learning. The system collects and amalgamates medical documentation in various formats from multiple sources and locations, normalizes the information, analyzes the information, recognizes information indicating contents corresponding to classification codes, assigns classification codes, and presents information in context correlated to medical records for billing and other purposes.
    Type: Application
    Filed: April 15, 2003
    Publication date: November 4, 2004
    Applicant: Dictaphone Corporation
    Inventors: Alwin B. Carus, Stefaan Heyvaert, Harry J. Ogrinc, Robert G. Titemore, Tom Deplonty, Keith Boone, Brian Wilson, Ray Rankins, Don Fonza, David Speth, Melissa MacPherson
  • Patent number: 6185524
    Abstract: A method and device for identifying word boundaries in continuous text compares the continuous text to a set of varying length strings to identify candidate word-initial boundaries and candidate word-final boundaries in the continuous text. Each candidate word-initial boundary and candidate word-final boundary has an associated probability value. Each candidate word boundary in the continuous text is identified by calculating a word boundary score for such candidate word boundary using the probability values associated with the candidate word-initial boundaries and candidate word-final boundaries. The set of varying length strings may include words and n-grams.
    Type: Grant
    Filed: December 31, 1998
    Date of Patent: February 6, 2001
    Assignee: Lernout & Hauspie Speech Products N.V.
    Inventors: Alwin B. Carus, Kathleen Good
  • Patent number: 6035268
    Abstract: A word breaker utilizing a lexicon module and a processing module to identify word breaks in a stream of Asian (e.g. Japanese, Chinese, or Korean) language text. The lexicon module is a dictionary or database containing words native to the language of the input text. The processing module includes a plurality of analysis modules which operate on the input text. In particular, the processing module can include modules that analyze the input text using heuristic rules and statistical analysis to identify a first set of word breaks, thereby reducing the size of segments with undefined word breaks. The processing module also includes a database analysis module that identifies the remaining undefined word breaks in those smaller segments that have undergone heuristic or statistical analysis.
    Type: Grant
    Filed: August 21, 1997
    Date of Patent: March 7, 2000
    Assignee: Lernout & Hauspie Speech Products N.V.
    Inventors: Alwin B. Carus, Michael Wiesner, Deborah Krause
  • Patent number: 5890103
    Abstract: This invention improves information retrieval by providing a tokenizing apparatus and method that parses natural language text in a manner that increases the throughput of an information retrieval or natural language analysis system. The tokenizer includes a parser that extracts characters from the stream of text, an identifying element for identifying a token formed of characters in the stream of text that include lexical matter, and a filter for assigning tags to those tokens requiring further linguistic analysis. The tokenizer, in a single pass through the stream of text, determines the further linguistic processing suitable to each particular token contained in the stream of text.
    Type: Grant
    Filed: July 19, 1996
    Date of Patent: March 30, 1999
    Assignee: Lernout & Hauspie Speech Products N.V.
    Inventor: Alwin B. Carus
  • Patent number: 5794177
    Abstract: This invention improves information retrieval and the precision of language processing by providing an apparatus and method for organizing, utilizing, analyzing, and generating morphological data. The apparatus and method involve locating a stored lexical expression representative of a candidate word found in a stream of natural language text, identifying a paradigm for the candidate word based upon the stored lexical expression, and applying transforms contained within the identified paradigm to the candidate word.
    Type: Grant
    Filed: November 8, 1995
    Date of Patent: August 11, 1998
    Assignee: Inso Corporation
    Inventors: Alwin B. Carus, Michael Wiesner, Keith Boone
  • Patent number: 5680628
    Abstract: An apparatus and method for the identification of noun phrases in a stream of natural language text receives an input stream of text, identifies tokens within the stream of text, and processes the tokens to identify noun phrases. The system processes the tokens by annotating the tokens with tags identifying characteristics of the tokens and by contextually analyzing each token and its associated characteristics. During processing, the system can also disambiguate individual token characteristics and identify agreement between tokens.
    Type: Grant
    Filed: July 19, 1995
    Date of Patent: October 21, 1997
    Assignee: Inso Corporation
    Inventors: Alwin B. Carus, Michael Wiesner, Ateeque R. Haque
  • Patent number: 4868750
    Abstract: A system for the grammatical annotation of natural language receives natural language text and annotates each word with a set of tags indicative of its possible grammatical or syntactic uses. An empirical probability of collocation function defined on pairs of tags is iteratively extended to a selected set of tag sequences of increasing length so as to select a most probable tag for each word of a sequence of ambiguously-tagged words. For listed pairs of commonly confused words a substitute calculation reveals erroneous use of the wrong word. For words with tags having abnormally low frequency of occurrence, a stored table of reduced probability factors corrects the calculation. Once the text words have been annotated with their most probable tags, the tagged text is parsed by a parser which successively applies phrasal, predicate and clausal analysis to build higher structures from the disambiguated tag strings.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 19, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus, Jeffrey G. Hopkins
  • Patent number: 4864501
    Abstract: A system for annotating digitally encoded text includes a dictionary of base forms. For each base form, a first set of tags represents possible grammatical and syntactic properties of the word, and may encode inflectional paradigms of the base form, or feature agreement behavior and special processing. If a text word is not found in the dictionary, an inflectional analyzer looks up one or more base forms derived from the word and, if they are found, annotates them with their dictionary tags. A morphological analyzer assigns tags to words not retrieved in the dictionary. The morphological analyzer recognizes words formed by prefixation and suffixation, as well as proper nouns, ordinals, idiomatic expressions, and certain classes of character strings. The tagged words of a sentence are then processed to parse the sentence.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 5, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus
  • Patent number: 4864502
    Abstract: An apparatus for the grammatical analysis of digitally encoded text material receives encoded text, annotates each word of the text with a tag, and processes the annotated text to identify basic syntactic units such as noun phrases and verb groups. A clausal analyzer then operates on the identified nominal and predicate structures to identify clause boundaries and clause types. During processing, feature agreement between parts of successively larger entities--noun phrases, predicates, and clauses--is successively derived. When an error is detected, an error message identifies the error and displays a suggested correction.
    Type: Grant
    Filed: October 7, 1987
    Date of Patent: September 5, 1989
    Assignee: Houghton Mifflin Company
    Inventors: Henry Kucera, Alwin B. Carus
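
As a rough illustration of the idea summarized in the abstract of patent 6185524 (word-boundary identification in continuous text), the following is a minimal sketch and not the patented method. The toy lexicon, its probability values, the use of a product to combine the word-initial and word-final probabilities, and the 0.5 threshold are all assumptions made for this example; the abstract does not specify them. The names `LEXICON`, `boundary_scores`, and `segment` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon of varying-length strings (words or n-grams).
# Each entry maps to (P(word-initial boundary), P(word-final boundary)).
# The numbers are invented for illustration only.
LEXICON = {
    "the": (0.9, 0.9),
    "cat": (0.8, 0.8),
    "sat": (0.8, 0.8),
    "he": (0.3, 0.3),
    "at": (0.4, 0.4),
}


def boundary_scores(text, lexicon):
    """Score interior positions of `text` as candidate word boundaries.

    A lexicon string matching at text[start:end] proposes a word-initial
    boundary at `start` and a word-final boundary at `end`. Combining the
    two probabilities by multiplication is an assumption of this sketch.
    """
    initial = defaultdict(float)  # position -> best word-initial probability
    final = defaultdict(float)    # position -> best word-final probability
    max_len = max(len(s) for s in lexicon)
    for start in range(len(text)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(text):
                break
            piece = text[start:end]
            if piece in lexicon:
                p_init, p_final = lexicon[piece]
                initial[start] = max(initial[start], p_init)
                final[end] = max(final[end], p_final)
    # A position is scored only if some string ends there and another begins there.
    return {
        i: initial[i] * final[i]
        for i in range(1, len(text))
        if i in initial and i in final
    }


def segment(text, lexicon, threshold=0.5):
    """Split `text` at every position whose boundary score reaches `threshold`."""
    scores = boundary_scores(text, lexicon)
    cuts = [i for i in sorted(scores) if scores[i] >= threshold]
    pieces, prev = [], 0
    for cut in cuts:
        pieces.append(text[prev:cut])
        prev = cut
    pieces.append(text[prev:])
    return pieces


print(segment("thecatsat", LEXICON))
```

In this toy run, the shorter strings "he" and "at" also match inside the text, but they never produce a position where both a word-final and a word-initial candidate meet, so they do not create extra boundaries.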