Patents by Inventor Alwin Carus

Alwin Carus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for document section segmentation

Publication number: 20080059498

Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.

Type: Application

Filed: September 7, 2007

Publication date: March 6, 2008

Applicant: Nuance Communications, Inc.

Inventors: Alwin Carus, Melissa MacPherson, Stefaan Heyvaert, Cornelia Parkes
Semantic exploration and discovery

Publication number: 20080010274

Abstract: A semantic discovery and exploration system is disclosed that provides an environment enabling a developer or user to uncover, navigate, and organize semantic patterns and structures in a document collection with or without the aid of structured knowledge. The semantic discovery and exploration system provides techniques for searching document collections, categorizing documents, inducing lists of related concepts, and identifying clusters of related terms and documents. This system operates both with and without infusions of structured knowledge such as gazetteers, thesauruses, taxonomies, and ontologies. System performance improves when structured knowledge is incorporated. The semantic discovery and exploration system may be used as a first step in developing an information extraction system, for example to categorize or cluster documents in a particular domain or to develop gazetteers, and as part of a deployed run-time information extraction system.

Type: Application

Filed: June 20, 2007

Publication date: January 10, 2008

Applicant: Information Extraction Systems, Inc.

Inventors: Alwin Carus, Thomas DePlonty
System and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy

Publication number: 20070233488

Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
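
The fallback behavior described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: string similarity from Python's difflib stands in for a recognizer's match score, and the names `best_match`, `recognize`, and the 0.8 threshold are assumptions made for illustration only.

```python
import difflib


def best_match(heard, candidates):
    """Return (candidate, score) for the most similar candidate, or (None, 0.0)."""
    scored = [
        (c, difflib.SequenceMatcher(None, heard.lower(), c.lower()).ratio())
        for c in candidates
    ]
    return max(scored, key=lambda pair: pair[1], default=(None, 0.0))


def recognize(heard, section_values, general_vocabulary, threshold=0.8):
    """Try the section-constrained values first, then fall back.

    `threshold` is a hypothetical cutoff for what counts as a poor match.
    """
    candidate, score = best_match(heard, section_values)
    if candidate is not None and score >= threshold:
        return candidate, "section"
    # Poor or failed match: retry against the larger, unconstrained vocabulary.
    candidate, _ = best_match(heard, general_vocabulary)
    return candidate, "general"


section_values = ["penicillin", "no known drug allergies"]
general_vocabulary = ["penicillin", "patient", "pencil", "peanuts"]
print(recognize("penicilin", section_values, general_vocabulary))
print(recognize("peanuts", section_values, general_vocabulary))
```

In this sketch the section values would be collected in advance or incrementally from earlier documents of the same domain, document type, and speaker, as the abstract describes.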

Type: Application

Filed: March 29, 2006

Publication date: October 4, 2007

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Larissa Lapshina, Raghu Vemula
System and method for document filtering

Publication number: 20070203707

Abstract: A system and method for filtering documents to determine section boundaries between dictated and non-dictated text. The system and method identify portions of a text report that correspond to an original dictation and, correspondingly, those portions that are not part of the original dictation. The system and method include comparing tokenized and normalized forms of the original dictation and the final report, determining mismatches between the two forms, and applying machine-learning techniques to identify document headers, footers, page turns, macros, and lists automatically and accurately.
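
The comparison step in this abstract can be sketched as a token-level alignment. The example below covers only that step and omits the machine-learning stage; the tokenization rule and the function names `tokens` and `non_dictated_spans` are illustrative assumptions, not details taken from the application.

```python
import difflib
import re


def tokens(text):
    """Tokenize and normalize: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def non_dictated_spans(dictation, report):
    """Return report token runs that have no counterpart in the dictation."""
    d, r = tokens(dictation), tokens(report)
    matcher = difflib.SequenceMatcher(None, d, r, autojunk=False)
    spans = []
    for tag, _, _, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" mark report tokens that mismatch the dictation.
        if tag in ("insert", "replace"):
            spans.append(" ".join(r[j1:j2]))
    return spans


dictation = "patient presents with chest pain for two days"
report = ("Mercy Hospital Radiology Report "
          "Patient presents with chest pain for two days. Page 1 of 2")
print(non_dictated_spans(dictation, report))
```

The mismatched spans found this way would be the candidates that a trained classifier then labels as headers, footers, page turns, macros, or lists.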

Type: Application

Filed: February 27, 2006

Publication date: August 30, 2007

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Larissa Lapshina, Bernardo Rechea
System and method for adaptive automatic error correction

Publication number: 20060235687

Abstract: A method for adaptive automatic error and mismatch correction is disclosed for use with a system having an automatic error and mismatch correction learning module, an automatic error and mismatch correction model, and a classifier module. The learning module operates by receiving pairs of documents, identifying and selecting effective candidate errors and mismatches, and generating classifiers corresponding to these selected errors and mismatches. The correction model operates by receiving a string of interpreted speech into the automatic error and mismatch correction module, identifying target tokens in the string of interpreted speech, creating a set of classifier features according to requirements of the automatic error and mismatch correction model, comparing the target tokens against the classifier features to detect errors and mismatches in the string of interpreted speech, and modifying the string of interpreted speech based upon the classifier features.

Type: Application

Filed: April 14, 2005

Publication date: October 19, 2006

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Larissa Lapshina, Bernardo Rechea, Amy Uhrbach
System and method for tokenization of text

Publication number: 20060116862

Abstract: The present invention pertains to a system and method for the tokenization of text. The featurizer may be configured to receive input text and convert the input text into tokens. According to one aspect of the invention, the tokens may include only one type of character, the characters selected from the group consisting of letters, numbers, and punctuation. The tokenizer may also include a classifier. The classifier may be configured to receive the tokens from the featurizer. Furthermore, the classifier may be configured to analyze the tokens received from the featurizer to determine if the tokens may be input into a predetermined classification model using a preclassifier. If one of the tokens passes the preclassifier, then the token is classified using the predetermined classification model. Additionally, according to a first aspect of the invention, the tokenizer may also include a finalizer. The finalizer may be configured to receive the tokens and may be configured to produce a final output.
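
As a rough illustration of the first step in this abstract, the sketch below splits text into tokens that each contain only one type of character (letters, numbers, or punctuation). It covers only that splitting step, not the preclassifier, classification model, or finalizer, and the function name `featurize` and the ASCII character classes are assumptions.

```python
import re

# Each alternative matches a run of exactly one character type:
# letters, digits, or anything that is neither whitespace nor alphanumeric.
_TOKEN = re.compile(r"[A-Za-z]+|[0-9]+|[^\sA-Za-z0-9]+")


def featurize(text):
    """Split text into single-character-type tokens, dropping whitespace."""
    return _TOKEN.findall(text)


print(featurize("Dr. Smith, 120/80"))
```

Keeping the initial tokens this fine-grained leaves decisions such as whether "120/80" is one unit to the later classification stage.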

Type: Application

Filed: December 1, 2004

Publication date: June 1, 2006

Applicant: Dictaphone Corporation

Inventors: Jill Carrier, Alwin Carus, William Cote, John Dowd, Kathryn Femina, Alan Frankel, Wensheng Han, Larissa Lapshina, Bernardo Rechea, Ana Santisteban, Amy Uhrbach
System and method for report level confidence

Publication number: 20060026003

Abstract: A system and method are disclosed for Report Confidence Modeling (RCM), including automatic adaptive classification of ASR output documents to determine the most efficient document edit workflow to convert dictation into finished output. The RCM according to the present invention may include a mechanism to predict recognition accuracy of a document generated by an ASR engine. Predicted accuracy of the document allows an ASR application to sort recognized documents based on their estimated accuracy or quality and route them appropriately for further processing, editing and/or formatting.
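
The routing idea can be shown with a small sketch. How the application actually predicts accuracy is not stated here, so the example assumes the mean of per-word confidence scores as a stand-in predictor; the workflow names and both thresholds are hypothetical.

```python
def predicted_accuracy(word_confidences):
    """Assumed predictor: mean per-word confidence (0.0 for an empty document)."""
    if not word_confidences:
        return 0.0
    return sum(word_confidences) / len(word_confidences)


def route(word_confidences, edit_threshold=0.95, retype_threshold=0.80):
    """Pick an editing workflow from the predicted document accuracy."""
    accuracy = predicted_accuracy(word_confidences)
    if accuracy >= edit_threshold:
        return "light_edit"
    if accuracy >= retype_threshold:
        return "full_edit"
    # Predicted quality too low to be worth correcting the ASR draft.
    return "transcribe_from_audio"


print(route([0.99, 0.98, 0.97]))
print(route([0.90, 0.85, 0.90]))
print(route([0.50, 0.60]))
```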

Type: Application

Filed: July 28, 2005

Publication date: February 2, 2006

Inventors: Alwin Carus, Larissa Lapshina, Elizabeth Lovance
Categorization of information using natural language processing and predefined templates

Publication number: 20050228815

Abstract: Methods and systems for classifying and normalizing information using a combination of traditional data input methods, natural language processing, and predetermined templates are disclosed. One method may include activating a template. Based on this template, template-specific data may also be retrieved. After receiving both an input stream of data and the template-specific data, this information may be processed to generate a report based on the input data and the template specific data. In an alternative embodiment of the invention, templates may include, for example, medical billing codes from a number of different billing code classifications for the generation of patient bills. Alternatively, a method may include receiving an input stream of data and processing the input stream of data. A determination may be made as to whether or not the input stream of data includes latent information. If the data includes latent information, a template associated with latent information may be activated.

Type: Application

Filed: May 7, 2004

Publication date: October 13, 2005

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Harry Ogrinc
System and method for normalization of a string of words

Publication number: 20050192792

Abstract: The present invention relates generally to a system and method for categorization of strings of words. More specifically, the present invention relates to a system and method for normalizing a string of words for use in a system for categorization of words in a predetermined categorization scheme. A method for adaptive categorization of words in a predetermined categorization scheme may include receiving a string of text, tagging the string of text, and normalizing the string of text. Normalization may be performed with a three-stage algorithm including a literal match processing stage, an approximation match processing stage, and a nearest neighbor match processing stage. The normalized string of text can be compared to a number of sequences of text in the predetermined categorization scheme.
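
The three-stage cascade named in this abstract can be sketched as follows. The stage order comes from the abstract, but the concrete techniques are assumptions chosen for illustration: character-level similarity from difflib for the approximation stage and token-overlap (Jaccard) similarity for the nearest neighbor stage. The function name `normalize` and the 0.85 cutoff are also hypothetical.

```python
import difflib


def normalize(text, scheme, cutoff=0.85):
    """Map text onto an entry of the categorization scheme.

    Returns (entry, stage), where stage names the stage that matched.
    """
    key = " ".join(text.lower().split())
    entries = {" ".join(s.lower().split()): s for s in scheme}
    if not entries:
        return None, "none"

    # Stage 1: literal match on the case- and whitespace-normalized string.
    if key in entries:
        return entries[key], "literal"

    # Stage 2: approximation match (tolerates small spelling differences).
    close = difflib.get_close_matches(key, list(entries), n=1, cutoff=cutoff)
    if close:
        return entries[close[0]], "approximate"

    # Stage 3: nearest neighbor by token overlap (tolerates reordering).
    words = set(key.split())

    def jaccard(entry):
        other = set(entry.split())
        union = words | other
        return len(words & other) / len(union) if union else 0.0

    best = max(entries, key=jaccard)
    if jaccard(best) > 0:
        return entries[best], "nearest"
    return None, "none"


scheme = ["chest pain", "shortness of breath", "history of present illness"]
for phrase in ["Chest  Pain", "chest pian", "pain in the chest", "zzz"]:
    print(phrase, "->", normalize(phrase, scheme))
```

Ordering the stages from cheapest and strictest to most permissive means the looser comparisons run only when the stricter ones fail.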

Type: Application

Filed: February 28, 2005

Publication date: September 1, 2005

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Thomas DePlonty
System and method for document section segmentation

Publication number: 20050144184

Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.

Type: Application

Filed: September 30, 2004

Publication date: June 30, 2005

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Melissa MacPherson, Stefaan Heyvaert, Cornelia Parkes
Method, system, and apparatus for assembly, transport and display of clinical data

Publication number: 20050120300

Abstract: The Clinical Data Container (CDC) is a method for packaging, transporting, and viewing medical reports, their associated data elements, images, and data from medical information systems for use by physicians and patients.

Type: Application

Filed: September 23, 2004

Publication date: June 2, 2005

Applicant: Dictaphone Corporation

Inventors: Robert Schwager, Alwin Carus, Harry Ogrinc, Jeffrey Hopkins, Susan Reggie, David Pearah
System, method and apparatus for prediction using minimal affix patterns

Publication number: 20050120020

Abstract: One embodiment generally pertains to a method of prediction. The method includes generating a set of affixes from a selected input sequence and comparing the set of affixes with a predictive set of affixes. The method also includes selecting an affix from the predictive set of affixes. The invention accepts various input data sets, can reproduce the original data set exactly, and keeps the predictive set of affixes minimal in size.

Type: Application

Filed: February 27, 2004

Publication date: June 2, 2005

Applicant: Dictaphone Corporation

Inventors: Alwin Carus, Thomas DePlonty