Patents by Inventor Alwin Carus

Alwin Carus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080059498
    Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.
    Type: Application
    Filed: September 7, 2007
    Publication date: March 6, 2008
    Applicant: Nuance Communications, Inc.
    Inventors: Alwin Carus, Melissa MacPherson, Stefaan Heyvaert, Cornelia Parkes
  • Publication number: 20080010274
    Abstract: A semantic discovery and exploration system is disclosed that provides an environment enabling a developer or user to uncover, navigate, and organize semantic patterns and structures in a document collection with or without the aid of structured knowledge. The semantic discovery and exploration system provides techniques for searching document collections, categorizing documents, inducing lists of related concepts, and identifying clusters of related terms and documents. This system operates both without and with infusions of structured knowledge such as gazetteers, thesauruses, taxonomies, and ontologies. System performance improves when structured knowledge is incorporated. The semantic discovery and exploration system may be used as a first step in developing an information extraction system, for example to categorize or cluster documents in a particular domain or to develop gazetteers, and as a part of a deployed run-time information extraction system.
    Type: Application
    Filed: June 20, 2007
    Publication date: January 10, 2008
    Applicant: Information Extraction Systems, Inc.
    Inventors: Alwin Carus, Thomas DePlonty
  • Publication number: 20070233488
    Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
    Type: Application
    Filed: March 29, 2006
    Publication date: October 4, 2007
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Larissa Lapshina, Raghu Vemula
  • Publication number: 20070203707
    Abstract: A system and method for filtering documents to determine section boundaries between dictated and non-dictated text. The system and method identifies portions of a text report that correspond to an original dictation and, correspondingly, those portions that are not part of the original dictation. The system and method include comparing tokenized and normalized forms of the original dictation and the final report, determining mismatches between the two forms, and applying machine-learning techniques to identify document headers, footers, page turns, macros, and lists automatically and accurately.
    Type: Application
    Filed: February 27, 2006
    Publication date: August 30, 2007
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Larissa Lapshina, Bernardo Rechea
  • Publication number: 20060235687
    Abstract: A method for adaptive automatic error and mismatch correction is disclosed for use with a system having an automatic error and mismatch correction learning module, an automatic error and mismatch correction model, and a classifier module. The learning module operates by receiving pairs of documents, identifying and selecting effective candidate errors and mismatches, and generating classifiers corresponding to these selected errors and mismatches. The correction model operates by receiving a string of interpreted speech into the automatic error and mismatch correction module, identifying target tokens in the string of interpreted speech, creating a set of classifier features according to requirements of the automatic error and mismatch correction model, comparing the target tokens against the classifier features to detect errors and mismatches in the string of interpreted speech, and modifying the string of interpreted speech based upon the classifier features.
    Type: Application
    Filed: April 14, 2005
    Publication date: October 19, 2006
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Larissa Lapshina, Bernardo Rechea, Amy Uhrbach
  • Publication number: 20060116862
    Abstract: The present invention pertains to a system and method for the tokenization of text. The featurizer may be configured to receive input text and convert the input text into tokens. According to one aspect of the invention, the tokens may include only one type of character, the characters selected from the group consisting of letters, numbers, and punctuation. The tokenizer may also include a classifier. The classifier may be configured to receive the tokens from the featurizer. Furthermore, the classifier may be configured to analyze the tokens received from the featurizer to determine if the tokens may be input into a predetermined classification model using a preclassifier. If one of the tokens passes the preclassifier, then the token is classified using the predetermined classification model. Additionally, according to a first aspect of the invention, the tokenizer may also include a finalizer. The finalizer may be configured to receive the tokens and may be configured to produce a final output.
    Type: Application
    Filed: December 1, 2004
    Publication date: June 1, 2006
    Applicant: Dictaphone Corporation
    Inventors: Jill Carrier, Alwin Carus, William Cote, John Dowd, Kathryn Femina, Alan Frankel, Wensheng Han, Larissa Lapshina, Bernardo Rechea, Ana Santisteban, Amy Uhrbach
  • Publication number: 20060026003
    Abstract: A system and method are disclosed for Report Confidence Modeling (RCM), including automatic adaptive classification of ASR output documents to determine the most efficient document edit workflow to convert dictation into finished output. The RCM according to the present invention may include a mechanism to predict recognition accuracy of a document generated by an ASR engine. Predicted accuracy of the document allows an ASR application to sort recognized documents based on their estimated accuracy or quality and route them appropriately for further processing, editing, and/or formatting.
    Type: Application
    Filed: July 28, 2005
    Publication date: February 2, 2006
    Inventors: Alwin Carus, Larissa Lapshina, Elizabeth Lovance
  • Publication number: 20050228815
    Abstract: Methods and systems for classifying and normalizing information using a combination of traditional data input methods, natural language processing, and predetermined templates are disclosed. One method may include activating a template. Based on this template, template-specific data may also be retrieved. After receiving both an input stream of data and the template-specific data, this information may be processed to generate a report based on the input data and the template-specific data. In an alternative embodiment of the invention, templates may include, for example, medical billing codes from a number of different billing code classifications for the generation of patient bills. Alternatively, a method may include receiving an input stream of data and processing the input stream of data. A determination may be made as to whether or not the input stream of data includes latent information. If the data includes latent information, a template associated with latent information may be activated.
    Type: Application
    Filed: May 7, 2004
    Publication date: October 13, 2005
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Harry Ogrinc
  • Publication number: 20050192792
    Abstract: The present invention relates generally to a system and method for categorization of strings of words. More specifically, the present invention relates to a system and method for normalizing a string of words for use in a system for categorization of words in a predetermined categorization scheme. A method for adaptive categorization of words in a predetermined categorization scheme may include receiving a string of text, tagging the string of text, and normalizing the string of text. Normalization may be performed with a three-stage algorithm including a literal match processing stage, an approximation match processing stage, and a nearest neighbor match processing stage. The normalized string of text can be compared to a number of sequences of text in the predetermined categorization scheme.
    Type: Application
    Filed: February 28, 2005
    Publication date: September 1, 2005
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Thomas DePlonty
  • Publication number: 20050144184
    Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.
    Type: Application
    Filed: September 30, 2004
    Publication date: June 30, 2005
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Melissa MacPherson, Stefaan Heyvaert, Cornelia Parkes
  • Publication number: 20050120020
    Abstract: One embodiment generally pertains to a method of prediction. The method includes generating a set of affixes from a selected input sequence and comparing the set of affixes with a predictive set of affixes. The method also includes selecting an affix from the predictive set of affixes. The invention accommodates various input data sets and makes it possible to reproduce the original data set exactly while keeping the predictive set of affixes minimal in size.
    Type: Application
    Filed: February 27, 2004
    Publication date: June 2, 2005
    Applicant: Dictaphone Corporation
    Inventors: Alwin Carus, Thomas DePlonty
  • Publication number: 20050120300
    Abstract: The Clinical Data Container (CDC) is a method for packaging, transporting, and viewing medical reports, their associated data elements, images, and data from medical information systems for use by physicians and patients.
    Type: Application
    Filed: September 23, 2004
    Publication date: June 2, 2005
    Applicant: Dictaphone Corporation
    Inventors: Robert Schwager, Alwin Carus, Harry Ogrinc, Jeffrey Hopkins, Susan Reggie, David Pearah
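
Illustrative sketch: the abstract of publication 20060116862 above describes tokens that each contain only one type of character (letters, numbers, or punctuation). The short Python example below shows one way such a split could look. It is not taken from the application. The regular expression, the ASCII-only character classes, and the function name tokenize are all assumptions made for illustration, and the classifier and finalizer stages mentioned in the abstract are not modeled.

```python
import re

# Hypothetical pattern: each alternative matches a run of exactly one
# character type. Character classes are ASCII-only for simplicity, so
# accented letters would fall into the "punctuation" group here.
TOKEN_PATTERN = re.compile(
    r"(?P<letters>[A-Za-z]+)"
    r"|(?P<numbers>[0-9]+)"
    r"|(?P<punctuation>[^A-Za-z0-9\s]+)"
)


def tokenize(text):
    """Split text into (token, character_type) pairs.

    Whitespace separates tokens and is dropped. A token never mixes
    letters, digits, and punctuation, so "120/80" becomes three tokens.
    """
    return [(m.group(), m.lastgroup) for m in TOKEN_PATTERN.finditer(text)]


if __name__ == "__main__":
    for token, kind in tokenize("BP 120/80, stable."):
        print(f"{token!r}: {kind}")
```

Keeping each token to a single character type means a later stage can decide how to recombine pieces such as "120", "/", and "80", rather than having that decision baked into the first split.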