Patents by Inventor Alwin B. Carus

Alwin B. Carus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Semantic exploration and discovery

Patent number: 7558778

Abstract: A semantic discovery and exploration system is disclosed that provides an environment enabling a developer or user to uncover, navigate, and organize semantic patterns and structures in a document collection with or without the aid of structured knowledge. The semantic discovery and exploration system provides techniques for searching document collections, categorizing documents, inducing lists of related concepts, and identifying clusters of related terms and documents. This system operates both without and with infusions of structured knowledge such as gazetteers, thesauruses, taxonomies and ontologies. System performance improves when structured knowledge is incorporated. The semantic discovery and exploration system may be used as a first step in developing an information extraction system, such as to categorize or cluster documents in a particular domain or to develop gazetteers, and as a part of a deployed run-time information extraction system.

Type: Grant

Filed: June 20, 2007

Date of Patent: July 7, 2009

Assignee: Information Extraction Systems, Inc.

Inventors: Alwin B. Carus, Thomas J. DePlonty

Method, system, and apparatus for assembly, transport and display of clinical data

Publication number: 20090070380

Abstract: The Clinical Data Container (CDC) is a method for packaging, transporting, and viewing medical reports, their associated data elements, images, and data from medical information systems for use by physicians and patients.

Type: Application

Filed: September 19, 2008

Publication date: March 12, 2009

Applicant: Dictaphone Corporation

Inventors: Robert G. Schwager, Alwin B. Carus, Harry J. Ogrinc, Jeffrey G. Hopkins, Susan Reggie, David E. Pearah

Categorization of Information Using Natural Language Processing and Predefined Templates

Publication number: 20080255884

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Application

Filed: May 15, 2008

Publication date: October 16, 2008

Applicant: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Harry J. Ogrinc

Satellite classifier ensemble

Publication number: 20080126273

Abstract: We have discovered a system and method for improving the quality of information extraction applications consisting of an ensemble of per-user, adaptive, on-line machine-learning classifiers that adapt to document content and judgments of users by continuously incorporating feedback from information extraction results and corrections that users apply to these results. The satellite classifier ensemble uses only the immediately available features for classifier improvement and it is independent of the complex cascade of earlier decisions leading to the final information extraction result. The machine-learning classifiers may also provide explanations or justifications for classification decisions in the form of rules; other machine-learning classifiers may provide feedback in the form of supporting instances or patterns.

Type: Application

Filed: June 21, 2007

Publication date: May 29, 2008

Applicant: Information Extraction Systems, Inc.

Inventors: Alwin B. Carus, Thomas J. DePlonty
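
The per-user, on-line adaptation described in the satellite classifier ensemble abstract above can be illustrated with a minimal sketch. It assumes a simple perceptron as the on-line learner and binary accept/reject decisions; the class names, feature strings, and update rule are hypothetical choices for illustration and are not taken from the patent application.

```python
from collections import defaultdict
from typing import Dict, Iterable


class OnlinePerceptron:
    """A tiny on-line binary classifier over string features (assumed learner)."""

    def __init__(self) -> None:
        self.weights: Dict[str, float] = {}

    def predict(self, features: Iterable[str]) -> bool:
        # Positive total weight means "accept the extraction result".
        return sum(self.weights.get(f, 0.0) for f in features) > 0

    def update(self, features: Iterable[str], correct_label: bool) -> None:
        # Learn only from mistakes, one example at a time.
        features = list(features)
        if self.predict(features) != correct_label:
            delta = 1.0 if correct_label else -1.0
            for f in features:
                self.weights[f] = self.weights.get(f, 0.0) + delta


class SatelliteEnsemble:
    """Keeps one independent classifier per user (hypothetical structure)."""

    def __init__(self) -> None:
        self.per_user = defaultdict(OnlinePerceptron)

    def predict(self, user: str, features: Iterable[str]) -> bool:
        return self.per_user[user].predict(features)

    def correct(self, user: str, features: Iterable[str], label: bool) -> None:
        # A user correction is fed back immediately to that user's classifier.
        self.per_user[user].update(features, label)


ensemble = SatelliteEnsemble()
features = ["capitalized", "follows:Dr."]
before = ensemble.predict("alice", features)
ensemble.correct("alice", features, True)
after = ensemble.predict("alice", features)
```

In this sketch a correction from one user changes only that user's classifier, so `after` differs from `before` for "alice" while any other user still sees the untrained default.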

Categorization of information using natural language processing and predefined templates

Patent number: 7379946

Abstract: Methods and systems for classifying and normalizing information using a combination of traditional data input methods, natural language processing, and predetermined templates are disclosed. One method may include activating a template. Based on this template, template-specific data may also be retrieved. After receiving both an input stream of data and the template-specific data, this information may be processed to generate a report based on the input data and the template specific data. In an alternative embodiment of the invention, templates may include, for example, medical billing codes from a number of different billing code classifications for the generation of patient bills. Alternatively, a method may include receiving an input stream of data and processing the input stream of data. A determination may be made as to whether or not the input stream of data includes latent information. If the data includes latent information, a template associated with latent information may be activated.

Type: Grant

Filed: May 7, 2004

Date of Patent: May 27, 2008

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Harry J. Ogrinc

Systems and methods for coding information

Patent number: 7233938

Abstract: The invention includes a medical document handling system and method and automated coding systems and methods for assigning predetermined medical codes to medical documents based on the documents' contents. The invention functions by analyzing electronic medical records and extracting medical information using natural language processing and machine learning. The system collects and amalgamates medical documentation in various formats from multiple sources and locations, normalizes the information, analyzes the information, recognizes information indicating contents corresponding to classification codes, assigns classification codes, and presents information in context correlated to medical records for billing and other purposes.

Type: Grant

Filed: April 15, 2003

Date of Patent: June 19, 2007

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Stefaan Heyvaert, Harry J. Ogrinc, Robert G. Titemore, Tom Deplonty, Keith Boone, Brian Wilson, Ray Rankins, Don Fonza, David Speth, Melissa Macpherson

Systems and methods utilizing natural language medical records

Publication number: 20040243545

Abstract: The invention involves systems and methods for generating, manipulating, summarizing, storing, reusing, and searching electronic medical records. Structured input of medical information by medical personnel based on templates may optionally be used to facilitate analysis of records, while allowing less restrictive text input than systems of the prior art. Data extraction of relevant medical data from the input text may optionally be facilitated by the structured format of the medical records. Extracted medical data is optionally validated and linked or associated with the text from which it was extracted. The extracted medical data is normalized to allow easier searching than available in systems of the prior art. Medical and document metadata is incorporated into the extracted medical data. Particularly pertinent medical information may be extracted and summarized from a patient's medical history for use by a medical professional at the point of care.

Type: Application

Filed: May 29, 2003

Publication date: December 2, 2004

Applicant: Dictaphone Corporation

Inventors: Keith W. Boone, Alwin B. Carus, Thomas J. DePlonty, Jeffrey G. Hopkins, Harry J. Ogrinc, Susan Reggie, Robert G. Titemore

Systems and methods for coding information

Publication number: 20040220895

Abstract: The invention includes a medical document handling system and method and automated coding systems and methods for assigning predetermined medical codes to medical documents based on the documents' contents. The invention functions by analyzing electronic medical records and extracting medical information using natural language processing and machine learning. The system collects and amalgamates medical documentation in various formats from multiple sources and locations, normalizes the information, analyzes the information, recognizes information indicating contents corresponding to classification codes, assigns classification codes, and presents information in context correlated to medical records for billing and other purposes.

Type: Application

Filed: April 15, 2003

Publication date: November 4, 2004

Applicant: Dictaphone Corporation

Inventors: Alwin B. Carus, Stefaan Heyvaert, Harry J. Ogrinc, Robert G. Titemore, Tom Deplonty, Keith Boone, Brian Wilson, Ray Rankins, Don Fonza, David Speth, Melissa MacPherson

Method and apparatus for automatic identification of word boundaries in continuous text and computation of word boundary scores

Patent number: 6185524

Abstract: A method and device for identifying word boundaries in continuous text compares the continuous text to a set of varying length strings to identify candidate word-initial boundaries and candidate word-final boundaries in the continuous text. Each candidate word-initial boundary and candidate word-final boundary has an associated probability value. Each candidate word boundary in the continuous text is identified by calculating a word boundary score for such candidate word boundary using the probability values associated with the candidate word-initial boundaries and candidate word-final boundaries. The set of varying length strings may include words and n-grams.

Type: Grant

Filed: December 31, 1998

Date of Patent: February 6, 2001

Assignee: Lernout & Hauspie Speech Products N.V.

Inventors: Alwin B. Carus, Kathleen Good
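
The word boundary scoring described in the abstract of patent 6185524 above can be sketched as follows. This is only an illustration under stated assumptions: the probability tables, the maximum string length, and the rule that multiplies the best word-initial and word-final probabilities are all hypothetical, since the abstract does not specify how the score is calculated.

```python
from typing import Dict, List

# Hypothetical tables: probability that a word boundary falls before
# (word-initial) or after (word-final) each known string.
INITIAL_PROB: Dict[str, float] = {"the": 0.9, "cat": 0.8, "he": 0.3}
FINAL_PROB: Dict[str, float] = {"the": 0.9, "cat": 0.7, "he": 0.2}
MAX_LEN = 3  # longest string (word or n-gram) in the tables


def boundary_scores(text: str) -> List[float]:
    """Return one score per position 0..len(text); position i lies before text[i]."""
    scores = []
    for i in range(len(text) + 1):
        # Best evidence that some known string starts at position i.
        initial = max(
            (INITIAL_PROB.get(text[i:i + n], 0.0)
             for n in range(1, MAX_LEN + 1) if i + n <= len(text)),
            default=0.0,
        )
        # Best evidence that some known string ends at position i.
        final = max(
            (FINAL_PROB.get(text[i - n:i], 0.0)
             for n in range(1, MAX_LEN + 1) if i - n >= 0),
            default=0.0,
        )
        # Assumption: the edges of the text are certain boundaries on one side.
        if i == 0:
            final = 1.0
        if i == len(text):
            initial = 1.0
        # Assumed combination rule: product of the two probabilities.
        scores.append(initial * final)
    return scores


scores = boundary_scores("thecat")
best_interior = max(range(1, len("thecat")), key=lambda i: scores[i])
```

For the unsegmented input "thecat", the highest interior score falls at position 3, between "the" and "cat".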

Method and apparatus for breaking words in a stream of text

Patent number: 6035268

Abstract: A word breaker utilizing a lexicon module and a processing module to identify word breaks in a stream of Asian (e.g. Japanese, Chinese, or Korean) language text. The lexicon module is a dictionary or database containing words native to the language of the input text. The processing module includes a plurality of analysis modules which operate on the input text. In particular, the processing module can include modules that analyze the input text using heuristic rules and statistical analysis to identify a first set of word breaks, thereby reducing the size of segments with undefined word breaks. The processing module also includes a database analysis module that identifies the remaining undefined word breaks in those smaller segments that have undergone heuristic or statistical analysis.

Type: Grant

Filed: August 21, 1997

Date of Patent: March 7, 2000

Assignee: Lernout & Hauspie Speech Products N.V.

Inventors: Alwin B. Carus, Michael Wiesner, Deborah Krause

Method and apparatus for improved tokenization of natural language text

Patent number: 5890103

Abstract: This invention improves information retrieval by providing a tokenizing apparatus and method that parses natural language text in a manner that increases the throughput of an information retrieval or natural language analysis system. The tokenizer includes a parser that extracts characters from the stream of text, an identifying element for identifying a token formed of characters in the stream of text that include lexical matter, and a filter for assigning tags to those tokens requiring further linguistic analysis. The tokenizer, in a single pass through the stream of text, determines the further linguistic processing suitable to each particular token contained in the stream of text.

Type: Grant

Filed: July 19, 1996

Date of Patent: March 30, 1999

Assignee: Lernout & Hauspie Speech Products N.V.

Inventor: Alwin B. Carus
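
The single-pass tokenization described in the abstract of patent 5890103 above can be illustrated with a short sketch. The tag names and the character classes that trigger them are hypothetical; the abstract says only that tags mark tokens requiring further linguistic analysis.

```python
from typing import List, Optional, Tuple


def char_tag(ch: str) -> Optional[str]:
    """Map a character to a hypothetical tag requesting further analysis."""
    if ch.isdigit():
        return "HAS_DIGIT"
    if ch.isupper():
        return "HAS_UPPERCASE"
    if ch == "-":
        return "HYPHENATED"
    if ch == "'":
        return "APOSTROPHE"
    return None


def tokenize(text: str) -> List[Tuple[str, List[str]]]:
    """Scan the text once, emitting (token, tags) pairs."""
    tokens: List[Tuple[str, List[str]]] = []
    chars: List[str] = []
    tags: List[str] = []
    for ch in text + " ":  # the trailing space flushes the final token
        if ch.isalnum() or ch in "-'":
            # Lexical matter: extend the current token and record any tag.
            chars.append(ch)
            tag = char_tag(ch)
            if tag and tag not in tags:
                tags.append(tag)
        elif chars:
            # A delimiter ends the token; tags were gathered in the same pass.
            tokens.append(("".join(chars), tags))
            chars, tags = [], []
    return tokens


result = tokenize("Dr. Smith's 2nd co-op")
```

Because tags are accumulated while the characters are read, no second pass over the text is needed to decide which tokens need later processing.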

Method and apparatus for morphological analysis and generation of natural language text

Patent number: 5794177

Abstract: This invention improves information retrieval and the precision of language processing by providing an apparatus and method for organizing, utilizing, analyzing, and generating morphological data. The apparatus and method involve locating a stored lexical expression representative of a candidate word found in a stream of natural language text, identifying a paradigm for the candidate word based upon the stored lexical expression, and applying transforms contained within the identified paradigm to the candidate word.

Type: Grant

Filed: November 8, 1995

Date of Patent: August 11, 1998

Assignee: Inso Corporation

Inventors: Alwin B. Carus, Michael Wiesner, Keith Boone
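
The lookup-then-transform flow described in the abstract of patent 5794177 above can be sketched roughly as follows. The lexicon entries, paradigm names, and the (strip, add, label) transform format are hypothetical and chosen only to make the idea concrete.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: stored lexical expression -> paradigm name.
LEXICON: Dict[str, str] = {"walk": "regular_verb", "try": "y_verb"}

# Hypothetical paradigms: each transform strips a suffix, adds a suffix,
# and labels the resulting form.
PARADIGMS: Dict[str, List[Tuple[str, str, str]]] = {
    "regular_verb": [("", "s", "3sg"), ("", "ed", "past"), ("", "ing", "gerund")],
    "y_verb": [("y", "ies", "3sg"), ("y", "ied", "past"), ("", "ing", "gerund")],
}


def generate_forms(word: str) -> Dict[str, str]:
    """Look up the word, identify its paradigm, and apply each transform."""
    paradigm = LEXICON.get(word)
    if paradigm is None:
        return {}  # no stored lexical expression for this candidate word
    forms: Dict[str, str] = {}
    for strip, add, label in PARADIGMS[paradigm]:
        if not word.endswith(strip):
            continue  # transform does not apply to this word
        stem = word[: len(word) - len(strip)]
        forms[label] = stem + add
    return forms


forms = generate_forms("try")
```

Here `generate_forms("try")` yields "tries", "tried", and "trying" under the labels "3sg", "past", and "gerund".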

Method and apparatus for automated search and retrieval process

Patent number: 5680628

Abstract: An apparatus and method for the identification of noun phrases in a stream of natural language text receives an input stream of text, identifies tokens within the stream of text, and processes the tokens to identify noun phrases. The system processes the tokens by annotating the tokens with tags identifying characteristics of the tokens and by contextually analyzing each token and its associated characteristics. During processing, the system can also disambiguate individual token characteristics and identify agreement between tokens.

Type: Grant

Filed: July 19, 1995

Date of Patent: October 21, 1997

Assignee: Inso Corporation

Inventors: Alwin B. Carus, Michael Wiesner, Ateeque R. Haque

Collocational grammar system

Patent number: 4868750

Abstract: A system for the grammatical annotation of natural language receives natural language text and annotates each word with a set of tags indicative of its possible grammatical or syntactic uses. An empirical probability of collocation function defined on pairs of tags is iteratively extended to a selected set of tag sequences of increasing length so as to select a most probable tag for each word of a sequence of ambiguously-tagged words. For listed pairs of commonly confused words a substitute calculation reveals erroneous use of the wrong word. For words with tags having abnormally low frequency of occurrence, a stored table of reduced probability factors corrects the calculation. Once the text words have been annotated with their most probable tags, the tagged text is parsed by a parser which successively applies phrasal, predicate and clausal analysis to build higher structures from the disambiguated tag strings.

Type: Grant

Filed: October 7, 1987

Date of Patent: September 19, 1989

Assignee: Houghton Mifflin Company

Inventors: Henry Kucera, Alwin B. Carus, Jeffrey G. Hopkins
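
The tag disambiguation step described in the collocational grammar abstract above can be illustrated with a small sketch. It assumes a dynamic-programming search that extends tag-pair probabilities to longer sequences; the probability values, tag names, and default probability are hypothetical, and the patent's own procedure may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical empirical collocation probabilities for adjacent tag pairs.
PAIR_PROB: Dict[Tuple[str, str], float] = {
    ("DET", "NOUN"): 0.6,
    ("DET", "VERB"): 0.01,
    ("NOUN", "VERB"): 0.4,
    ("NOUN", "NOUN"): 0.2,
    ("VERB", "NOUN"): 0.3,
    ("VERB", "VERB"): 0.05,
}
DEFAULT_PROB = 0.001  # assumed value for tag pairs missing from the table


def disambiguate(candidates: List[List[str]]) -> List[str]:
    """Pick one tag per word so the product of pair probabilities is maximal."""
    if not candidates:
        return []
    # best[tag] = (probability, sequence) of the best path ending in that tag.
    best = {tag: (1.0, [tag]) for tag in candidates[0]}
    for tags in candidates[1:]:
        extended = {}
        for tag in tags:
            # Extend every surviving sequence by one tag and keep the best.
            extended[tag] = max(
                ((p * PAIR_PROB.get((prev, tag), DEFAULT_PROB), seq + [tag])
                 for prev, (p, seq) in best.items()),
                key=lambda item: item[0],
            )
        best = extended
    return max(best.values(), key=lambda item: item[0])[1]


# "the dog barks": "dog" and "barks" are each ambiguous between NOUN and VERB.
tags = disambiguate([["DET"], ["NOUN", "VERB"], ["NOUN", "VERB"]])
```

With these assumed probabilities the selected sequence is DET, NOUN, VERB.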

Word annotation system

Patent number: 4864501

Abstract: A system for annotating digitally encoded text includes a dictionary of base forms. For each base form, a first set of tags represents possible grammatical and syntactic properties of the word, and may encode inflectional paradigms of the base form, or feature agreement behavior and special processing. If a text word is not found in the dictionary, an inflectional analyzer looks up one or more base forms derived from the word and, if found, annotates them with their dictionary tags. A morphological analyzer assigns tags to words not retrieved in the dictionary. The morphological analyzer recognizes words formed by prefixation and suffixation, as well as proper nouns, ordinals, idiomatic expressions, and certain classes of character strings. The tagged words of a sentence are then processed to parse the sentence.

Type: Grant

Filed: October 7, 1987

Date of Patent: September 5, 1989

Assignee: Houghton Mifflin Company

Inventors: Henry Kucera, Alwin B. Carus

Sentence analyzer

Patent number: 4864502

Abstract: An apparatus for the grammatical analysis of digitally encoded text material receives encoded text, annotates each word of the text with a tag, and processes the annotated text to identify basic syntactic units such as noun phrases and verb groups. A clausal analyzer then operates on the identified nominal and predicate structures to identify clause boundaries and clause types. During processing, feature agreement between parts of successively larger entities--noun phrases, predicates, and clauses--are successively derived. When an error is detected, an error message identifies the error and displays a suggested correction.

Type: Grant

Filed: October 7, 1987

Date of Patent: September 5, 1989

Assignee: Houghton Mifflin Company

Inventors: Henry Kucera, Alwin B. Carus
