Patents by Inventor Alwin B. Carus

Alwin B. Carus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

CATEGORIZATION OF INFORMATION USING NATURAL LANGUAGE PROCESSING AND PREDEFINED TEMPLATES

Publication number: 20140288973

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Application

Filed: June 6, 2014

Publication date: September 25, 2014

Applicant: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Harry J. Ogrinc
Automated Speech Recognition Proxy System for Natural Language Understanding

Publication number: 20140288932

Abstract: An interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.

Type: Application

Filed: July 8, 2013

Publication date: September 25, 2014

Inventors: Yoryos Yeracaris, Alwin B. Carus, Larissa Lapshina
Categorization of information using natural language processing and predefined templates

Patent number: 8782088

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Grant

Filed: April 19, 2012

Date of Patent: July 15, 2014

Assignee: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Harry J. Ogrinc
Categorization of information using natural language processing and predefined templates

Patent number: 8510340

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Grant

Filed: May 18, 2012

Date of Patent: August 13, 2013

Assignee: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Harry J. Ogrinc
SYSTEM AND METHOD FOR APPLYING DYNAMIC CONTEXTUAL GRAMMARS AND LANGUAGE MODELS TO IMPROVE AUTOMATIC SPEECH RECOGNITION ACCURACY

Publication number: 20130006632

Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.

Type: Application

Filed: September 12, 2012

Publication date: January 3, 2013

Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
System and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy

Patent number: 8301448

Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.

Type: Grant

Filed: March 29, 2006

Date of Patent: October 30, 2012

Assignee: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
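
The fallback behavior described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: real recognition works on audio, whereas this example assumes a hypothetical recognizer has already produced an n-best list of `(text, score)` hypotheses. The names `SectionGrammar`, `recognize`, and `min_score` are invented for illustration.

```python
class SectionGrammar:
    """Hypothetical container for the dynamic values of one document section."""

    def __init__(self, values=()):
        self.values = set(values)

    def add(self, value):
        # Values may be added incrementally as new documents are seen.
        self.values.add(value)


def recognize(nbest, grammar, min_score=0.5):
    """Pick a hypothesis, preferring the section grammar.

    nbest is a list of (text, score) pairs from an assumed recognizer.
    Returns (text, source), where source is "section", "general", or "none".
    """
    # First pass: constrain recognition to the section's dynamic values.
    constrained = [(t, s) for t, s in nbest
                   if t in grammar.values and s >= min_score]
    if constrained:
        return max(constrained, key=lambda pair: pair[1])[0], "section"
    # Fallback: no acceptable match in the grammar, so use the general result.
    if not nbest:
        return None, "none"
    return max(nbest, key=lambda pair: pair[1])[0], "general"


allergies = SectionGrammar(["no known allergies"])
hypotheses = [("know known allergies", 0.85), ("no known allergies", 0.80)]
print(recognize(hypotheses, allergies))
print(recognize([("penicillin", 0.9)], allergies))
```

In this sketch the constrained pass wins even when an unconstrained hypothesis scores slightly higher, which is the point of restricting recognition to section values; the `min_score` cutoff stands in for the "poor match" condition that triggers the general vocabulary.
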
CATEGORIZATION OF INFORMATION USING NATURAL LANGUAGE PROCESSING AND PREDEFINED TEMPLATES

Publication number: 20120232923

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Application

Filed: May 18, 2012

Publication date: September 13, 2012

Applicant: Dictaphone Corporation

Inventors: Alwin B. Carus, Harry J. Ogrinc
CATEGORIZATION OF INFORMATION USING NATURAL LANGUAGE PROCESSING AND PREDEFINED TEMPLATES

Publication number: 20120209626

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Application

Filed: April 19, 2012

Publication date: August 16, 2012

Applicant: Dictaphone Corporation

Inventors: Alwin B. Carus, Harry J. Ogrinc
Categorization of information using natural language processing and predefined templates

Patent number: 8185553

Abstract: A computer-implemented method for generating a report that includes latent information, comprising receiving an input data stream that includes latent information, performing one of normalization, validation, and extraction of the input data stream, processing the input data stream to identify latent information within the data stream that is required for generation of a particular report, wherein said processing of the input data stream to identify latent information comprises identifying a relevant portion of the input data stream, bounding the relevant portion of the input data stream, classifying and normalizing the bounded data, activating a relevant report template based on said identified latent information, populating said template with template-specified data, and processing the template-specified data to generate a report.

Type: Grant

Filed: May 15, 2008

Date of Patent: May 22, 2012

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Harry J. Ogrinc
Apparatus, system and method for developing tools to process natural language text

Patent number: 8131756

Abstract: The disclosed invention includes an apparatus, system and method for developing tools to explore, organize, structure, extract, and mine natural language text. The system contains three sub-systems: a run-time engine, a development environment, and a feedback system. The invention also includes a system and method for improving the quality of information extraction applications consisting of an ensemble of per-user, adaptive, on-line machine-learning classifiers that adapt to document content and judgments of users by continuously incorporating feedback from information extraction results and corrections that users apply to these results. At least one of the machine-learning classifiers also provides explanations or justifications for classification decisions in the form of rules; other machine-learning classifiers may provide feedback in the form of supporting instances or patterns.

Type: Grant

Filed: June 5, 2007

Date of Patent: March 6, 2012

Inventors: Alwin B. Carus, Thomas J. DePlonty
SYSTEMS AND METHODS FOR FILTERING DICTATED AND NON-DICTATED SECTIONS OF DOCUMENTS

Publication number: 20110320189

Abstract: A system and method for filtering documents to determine section boundaries between dictated and non-dictated text. The system and method identifies portions of a text report that correspond to an original dictation and, correspondingly, those portions that are not part of the original dictation. The system and method include comparing tokenized and normalized forms of the original dictation and the final report, determining mismatches between the two forms, and applying machine-learning techniques to identify document headers, footers, page turns, macros, and lists automatically and accurately.

Type: Application

Filed: September 9, 2011

Publication date: December 29, 2011

Applicant: Dictaphone Corporation

Inventors: Alwin B. Carus, Larissa Lapshina, Bernardo Rechea
Systems and methods for filtering dictated and non-dictated sections of documents

Patent number: 8036889

Abstract: A system and method for filtering documents to determine section boundaries between dictated and non-dictated text. The system and method identifies portions of a text report that correspond to an original dictation and, correspondingly, those portions that are not part of the original dictation. The system and method include comparing tokenized and normalized forms of the original dictation and the final report, determining mismatches between the two forms, and applying machine-learning techniques to identify document headers, footers, page turns, macros, and lists automatically and accurately.

Type: Grant

Filed: February 27, 2006

Date of Patent: October 11, 2011

Assignee: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Larissa Lapshina, Bernardo Rechea
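
The comparison step in this abstract (aligning tokenized, normalized forms of the dictation and the final report and collecting the mismatches) can be sketched as follows. This is an illustrative approximation only: it uses Python's standard `difflib.SequenceMatcher` for the alignment, which the patent does not name, and it omits the machine-learning stage that classifies mismatches as headers, footers, macros, and so on. The function names are hypothetical.

```python
import difflib
import re


def tokens(text):
    """Tokenize and normalize: lowercase runs of ASCII letters or digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def non_dictated_spans(dictation, report):
    """Return the token spans of the report with no counterpart in the dictation."""
    d, r = tokens(dictation), tokens(report)
    matcher = difflib.SequenceMatcher(a=d, b=r, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" mark report tokens that the dictation lacks.
        if tag in ("insert", "replace"):
            spans.append(" ".join(r[j1:j2]))
    return spans


dictation = "patient denies chest pain"
report = "Memorial Hospital\nPatient denies chest pain.\nPage 1"
print(non_dictated_spans(dictation, report))
```

Here the leading letterhead and the trailing page marker surface as candidate non-dictated spans; a downstream classifier would then decide what kind of section each one is.
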
System, method and apparatus for prediction using minimal affix patterns

Patent number: 8024176

Abstract: One embodiment generally pertains to a method of prediction. The method includes generating a set of affixes from a selected input sequence and comparing the set of affixes with a predictive set of affixes. The method also includes selecting an affix from the predictive set of affixes. The invention uses various input data sets and allows the ability to perfectly render the original data set and the minimal size of the predictive set of affixes.

Type: Grant

Filed: February 27, 2004

Date of Patent: September 20, 2011

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Thomas J. DePlonty, III
System and method for tokenization of text using classifier models

Patent number: 7937263

Abstract: The present invention pertains to a system and method for the tokenization of text. The featurizer may be configured to receive input text and convert the input text into tokens. According to one aspect of the invention, the tokens may include only one type of character, the characters selected from the group consisting of letters, numbers, and punctuation. The tokenizer may also include a classifier. The classifier may be configured to receive the tokens from the featurizer. Furthermore, the classifier may be configured to analyze the tokens received from the featurizer to determine if the tokens may be input into a predetermined classification model using a preclassifier. If one of the tokens passes the preclassifier, then the token is classified using the predetermined classification model. Additionally, according to a first aspect of the invention, the tokenizer may also include a finalizer. The finalizer may be configured to receive the tokens and may be configured to produce a final output.

Type: Grant

Filed: December 1, 2004

Date of Patent: May 3, 2011

Assignee: Dictaphone Corporation

Inventors: Jill Carrier, Alwin B. Carus, William F. Cote, John Dowd, Kathryn Del La Femina, Alan Frankel, Wensheng (Vincent) Han, Larissa Lapshina, Bernardo Rechea, Ana Santisteban, Amy J. Uhrbach
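
As a rough illustration of the featurizer step in this abstract, where each token contains only one type of character (letters, numbers, or punctuation), the sketch below splits text with a regular expression. It covers only that first step; the preclassifier, classification model, and finalizer are not modeled. The function name `featurize`, the ASCII-only character classes, and the treatment of every other non-space character as punctuation are assumptions made for the example.

```python
import re

# One alternative per character type; the group name labels the token.
# Assumption: ASCII letters and digits only; anything else that is not
# whitespace is treated as punctuation.
_TOKEN_RE = re.compile(
    r"(?P<letters>[A-Za-z]+)|(?P<numbers>[0-9]+)|(?P<punctuation>[^\sA-Za-z0-9]+)"
)


def featurize(text):
    """Split text into (token, type) pairs, one character type per token."""
    return [(m.group(), m.lastgroup) for m in _TOKEN_RE.finditer(text)]


for token, kind in featurize("Dr. Smith, 5mg"):
    print(f"{token!r}: {kind}")
```

Splitting this finely leaves decisions such as whether "Dr." is one token or two to the later classifier stage, which is consistent with the division of labor the abstract describes.
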
AN APPARATUS, SYSTEM AND METHOD FOR DEVELOPING TOOLS TO PROCESS NATURAL LANGUAGE TEXT

Publication number: 20100293451

Abstract: The disclosed invention includes an apparatus, system and method for developing tools to explore, organize, structure, extract, and mine natural language text. The system contains three sub-systems: a run-time engine, a development environment, and a feedback system. The invention also includes a system and method for improving the quality of information extraction applications consisting of an ensemble of per-user, adaptive, on-line machine-learning classifiers that adapt to document content and judgments of users by continuously incorporating feedback from information extraction results and corrections that users apply to these results. At least one of the machine-learning classifiers also provides explanations or justifications for classification decisions in the form of rules; other machine-learning classifiers may provide feedback in the form of supporting instances or patterns.

Type: Application

Filed: June 5, 2007

Publication date: November 18, 2010

Inventor: Alwin B. Carus
System and method for normalization of a string of words

Patent number: 7822598

Abstract: The present invention relates generally to a system and method for categorization of strings of words. More specifically, the present invention relates to a system and method for normalizing a string of words for use in a system for categorization of words in a predetermined categorization scheme. A method for adaptive categorization of words in a predetermined categorization scheme may include receiving a string of text, tagging the string of text, and normalizing the string of text. Normalization may be performed with a three-stage algorithm including a literal match processing stage, an approximation match processing stage, and a nearest neighbor match processing stage. The normalized string of text can be compared to a number of sequences of text in the predetermined categorization scheme.

Type: Grant

Filed: February 28, 2005

Date of Patent: October 26, 2010

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Thomas J. DePlonty, III
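
The three-stage normalization named in this abstract (literal match, approximation match, nearest neighbor match) can be sketched as a cascade that stops at the first stage that succeeds. The abstract does not define the stages, so the choices here are assumptions: the approximate stage ignores case, punctuation, and word order, and the nearest-neighbor stage uses the similarity ratio from Python's standard `difflib`. The names `normalize`, `_canon`, and `cutoff` are hypothetical.

```python
import difflib
import re


def _canon(s):
    """Lowercase and keep only ASCII letter or digit runs, space-separated."""
    return " ".join(re.findall(r"[a-z0-9]+", s.lower()))


def normalize(text, scheme, cutoff=0.6):
    """Map text to an entry of scheme; returns (entry, stage)."""
    # Stage 1: literal match against the categorization scheme.
    if text in scheme:
        return text, "literal"
    # Stage 2 (assumed): same words, ignoring case, punctuation, and order.
    key = sorted(_canon(text).split())
    for entry in scheme:
        if sorted(_canon(entry).split()) == key:
            return entry, "approximate"
    # Stage 3 (assumed): closest entry by difflib similarity ratio.
    by_canon = {_canon(entry): entry for entry in scheme}
    close = difflib.get_close_matches(_canon(text), list(by_canon), n=1, cutoff=cutoff)
    if close:
        return by_canon[close[0]], "nearest"
    return None, "none"


scheme = ["chest pain", "shortness of breath"]
for phrase in ["chest pain", "Pain, chest", "chest pains", "xyz"]:
    print(phrase, "->", normalize(phrase, scheme))
```

Ordering the stages from cheapest and most precise to most permissive means the looser matching is only consulted when the stricter stages fail.
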
System and method for document section segmentation

Patent number: 7818308

Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.

Type: Grant

Filed: September 7, 2007

Date of Patent: October 19, 2010

Assignee: Nuance Communications, Inc.

Inventors: Alwin B. Carus, Melissa MacPherson, Stefaan Heyvaert, Cornelia Parkes
System and method for report level confidence

Patent number: 7818175

Abstract: A system and method is disclosed for Report Confidence Modeling (RCM) including automatic adaptive classification of ASR output documents to determine the most efficient document edit workflow to convert dictation into finished output. The RCM according to the present invention may include a mechanism to predict recognition accuracy of a document generated by an ASR engine. Predicted accuracy of the document allows an ASR application to sort recognized documents based on their estimated accuracy or quality and route them appropriately for further processing, editing and/or formatting.

Type: Grant

Filed: July 28, 2005

Date of Patent: October 19, 2010

Inventors: Alwin B. Carus, Larissa Lapshina, Elizabeth M. Lovance
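
The routing idea in this abstract, predicting a document's recognition accuracy and then sorting and routing documents by that estimate, can be sketched as below. The predictor here is a stand-in: it simply averages per-word confidence scores, which is an assumption and not the model the patent describes. The workflow names and thresholds are likewise hypothetical.

```python
def predict_accuracy(word_confidences):
    """Stand-in predictor: mean per-word confidence (0.0 if there are no words)."""
    if not word_confidences:
        return 0.0
    return sum(word_confidences) / len(word_confidences)


def route(documents, light_edit_min=0.9, full_edit_min=0.7):
    """Sort documents by estimated accuracy and assign an edit workflow.

    documents maps a document id to its list of per-word confidences.
    Returns a list of (doc_id, workflow), best estimated accuracy first.
    """
    scored = sorted(
        ((predict_accuracy(conf), doc_id) for doc_id, conf in documents.items()),
        reverse=True,
    )
    routed = []
    for accuracy, doc_id in scored:
        if accuracy >= light_edit_min:
            workflow = "light_edit"
        elif accuracy >= full_edit_min:
            workflow = "full_edit"
        else:
            workflow = "retranscribe"
        routed.append((doc_id, workflow))
    return routed


docs = {"a": [0.8, 0.7], "b": [0.98, 0.96], "c": [0.3, 0.5]}
print(route(docs))
```
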
Satellite classifier ensemble

Patent number: 7769701

Abstract: We have discovered a system and method for improving the quality of information extraction applications consisting of an ensemble of per-user, adaptive, on-line machine-learning classifiers that adapt to document content and judgments of users by continuously incorporating feedback from information extraction results and corrections that users apply to these results. The satellite classifier ensemble uses only the immediately available features for classifier improvement and it is independent of the complex cascade of earlier decisions leading to the final information extraction result. The machine-learning classifiers may also provide explanations or justifications for classification decisions in the form of rules; other machine-learning classifiers may provide feedback in the form of supporting instances or patterns.

Type: Grant

Filed: June 21, 2007

Date of Patent: August 3, 2010

Assignee: Information Extraction Systems, Inc.

Inventors: Alwin B. Carus, Thomas J. DePlonty
System and method for adaptive automatic error correction

Patent number: 7565282

Abstract: A method for adaptive automatic error and mismatch correction is disclosed for use with a system having an automatic error and mismatch correction learning module, an automatic error and mismatch correction model, and a classifier module. The learning module operates by receiving pairs of documents, identifying and selecting effective candidate errors and mismatches, and generating classifiers corresponding to these selected errors and mismatches. The correction model operates by receiving a string of interpreted speech into the automatic error and mismatch correction module, identifying target tokens in the string of interpreted speech, creating a set of classifier features according to requirements of the automatic error and mismatch correction model, comparing the target tokens against the classifier features to detect errors and mismatches in the string of interpreted speech, and modifying the string of interpreted speech based upon the classifier features.

Type: Grant

Filed: April 14, 2005

Date of Patent: July 21, 2009

Assignee: Dictaphone Corporation

Inventors: Alwin B. Carus, Larissa Lapshina, Bernardo Rechea, Amy J. Uhrbach
