Patents by Inventor Branimir K. Boguraev

Branimir K. Boguraev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

EXTRACTION OF LEXICAL KERNEL UNITS FROM A DOMAIN-SPECIFIC LEXICON

Publication number: 20160203120

Abstract: According to an aspect, a candidate lexical kernel unit that includes a word token sequence having two or more words is received. Domain terms that contain the two or more words are retrieved from a terminology resource file of domain terms associated with a domain. The candidate lexical kernel unit and the retrieved domain terms are analyzed to determine whether the candidate lexical kernel unit satisfies specified criteria for use as a building block by a natural-language processing (NLP) tool for building larger lexical units in the domain. Each of the larger lexical units includes a greater number of words than the candidate lexical kernel unit. The candidate lexical kernel unit is identified as a lexical kernel unit based on determining that the candidate lexical kernel unit satisfies the specified criteria. The lexical kernel unit is output to a domain-specific lexical kernel unit file for input to the NLP tool.

Type: Application

Filed: March 11, 2015

Publication date: July 14, 2016

Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
EXTRACTION OF LEXICAL KERNEL UNITS FROM A DOMAIN-SPECIFIC LEXICON

Publication number: 20160203119

Abstract: According to an aspect, a candidate lexical kernel unit that includes a word token sequence having two or more words is received. Domain terms that contain the two or more words are retrieved from a terminology resource file of domain terms associated with a domain. The candidate lexical kernel unit and the retrieved domain terms are analyzed to determine whether the candidate lexical kernel unit satisfies specified criteria for use as a building block by a natural-language processing (NLP) tool for building larger lexical units in the domain. Each of the larger lexical units includes a greater number of words than the candidate lexical kernel unit. The candidate lexical kernel unit is identified as a lexical kernel unit based on determining that the candidate lexical kernel unit satisfies the specified criteria. The lexical kernel unit is output to a domain-specific lexical kernel unit file for input to the NLP tool.

Type: Application

Filed: January 9, 2015

Publication date: July 14, 2016

Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
DOMAIN-SPECIFIC COMPUTATIONAL LEXICON FORMATION

Publication number: 20160179783

Abstract: According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence, and each word token in the candidate token sequence found by the look-up operation is annotated with corresponding retrieved language data to form an annotated sequence. A pattern match of the annotated sequence is performed relative to a repository of patterns, and a best matching pattern from the repository of patterns to the annotated sequence is identified based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern as a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.

Type: Application

Filed: March 5, 2015

Publication date: June 23, 2016

Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
DOMAIN-SPECIFIC COMPUTATIONAL LEXICON FORMATION

Publication number: 20160179782

Abstract: According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence, and each word token in the candidate token sequence found by the look-up operation is annotated with corresponding retrieved language data to form an annotated sequence. A pattern match of the annotated sequence is performed relative to a repository of patterns, and a best matching pattern from the repository of patterns to the annotated sequence is identified based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern as a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.

Type: Application

Filed: December 23, 2014

Publication date: June 23, 2016

Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
SYSTEM FOR RETRIEVING, VISUALIZING AND EDITING SEMANTIC ANNOTATIONS

Publication number: 20160147840

Abstract: Annotations can be handled by a computer system that receives a query that specifies parameters for extraction of particular annotations from a set of annotations. Annotations include metadata that describes properties of the associated text fragment. A first entity subset, a second entity subset and a relations subset of annotations are extracted from an annotated text corpus. Contextual information relative to the extracted annotations is also extracted from the corpus. A user interface is generated to display frame elements that include the extracted annotations subsets and the extracted contextual information. In response to selections to the frame elements, the system receives input that specifies modifications to the annotations. Based on the input received, the set of annotations is modified in the annotated text corpus.

Type: Application

Filed: November 21, 2014

Publication date: May 26, 2016

Inventors: Branimir K. Boguraev, Anthony T. Levas
Question Correction and Evaluation Mechanism for a Question Answering System

Publication number: 20160062980

Abstract: Mechanisms are provided in a question answering (QA) system comprising a QA system pipeline that analyzes an input question and generates an answer to the input question, for pre-processing the input question. The mechanisms receive an input question and input the input question to a pre-processor flow path having one or more pre-processors. The one or more pre-processors transform the input question into a transformed question by correcting errors in a formulation of the input question that are determined to be detrimental to efficient and accurate processing of the input question by a QA system pipeline of the QA system. The transformed question is then input to the QA system pipeline of the QA system which processes the transformed question to generate and output an answer to the input question.

Type: Application

Filed: August 29, 2014

Publication date: March 3, 2016

Inventors: Branimir K. Boguraev, John P. Bufe, III, Matthew T. Hatem, Jared M.D. Smythe
Context-based disambiguation of acronyms and abbreviations

Patent number: 9031832

Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing an index of one or more pseudo documents, one or more associated abbreviations, and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.

Type: Grant

Filed: September 6, 2012

Date of Patent: May 12, 2015

Assignee: International Business Machines Corporation

Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
Context-based disambiguation of acronyms and abbreviations

Patent number: 9020805

Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing an index of one or more pseudo documents, one or more associated abbreviations, and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.

Type: Grant

Filed: September 23, 2011

Date of Patent: April 28, 2015

Assignee: International Business Machines Corporation

Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
GENERATING SECONDARY QUESTIONS IN AN INTROSPECTIVE QUESTION ANSWERING SYSTEM

Publication number: 20140072948

Abstract: A method of generating secondary questions in a question-answer system. Missing information is identified from a corpus of data using a computerized device. The missing information comprises any information that improves confidence scores for candidate answers to a question. The computerized device automatically generates a plurality of hypotheses concerning the missing information. The computerized device automatically generates at least one secondary question based on each of the plurality of hypotheses. The hypotheses are ranked based on relative utility to determine an order in which the computerized device outputs the at least one secondary question to external sources to obtain responses.

Type: Application

Filed: September 11, 2012

Publication date: March 13, 2014

Applicant: International Business Machines Corporation

Inventors: Branimir K. Boguraev, David W. Buchanan, Jennifer Chu-Carroll, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Siddharth A. Patwardhan
GENERATING SECONDARY QUESTIONS IN AN INTROSPECTIVE QUESTION ANSWERING SYSTEM

Publication number: 20140072947

Abstract: A method of generating secondary questions in a question-answer system. Missing information is identified from a corpus of data using a computerized device. The missing information comprises any information that improves confidence scores for candidate answers to a question. The computerized device automatically generates a plurality of hypotheses concerning the missing information. The computerized device automatically generates at least one secondary question based on each of the plurality of hypotheses. The hypotheses are ranked based on relative utility to determine an order in which the computerized device outputs the at least one secondary question to external sources to obtain responses.

Type: Application

Filed: September 11, 2012

Publication date: March 13, 2014

Applicant: International Business Machines Corporation

Inventors: Branimir K. Boguraev, David W. Buchanan, Jennifer Chu-Carroll, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Siddharth A. Patwardhan
PROVIDING COMMUNITY FOR CUSTOMER QUESTIONS

Publication number: 20130013546

Abstract: A system for providing community for customer questions receives a customer question. The customer question may be classified into a classification from a plurality of classifications categorizing whether a question is answerable, needs expert assistance, needs more information, or is not answerable. Based on the classification and one or more incentives, the question may be further routed to an appropriate community. The interactions with a customer in receiving and answering the customer question may be recorded.

Type: Application

Filed: September 14, 2012

Publication date: January 10, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sugato Bagchi, Branimir K. Boguraev, Anthony T. Levas, Roberto Sicconi, Wlodek W. Zadrozny
CONTEXT-BASED DISAMBIGUATION OF ACRONYMS AND ABBREVIATIONS

Publication number: 20120330648

Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing an index of one or more pseudo documents, one or more associated abbreviations, and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.

Type: Application

Filed: September 6, 2012

Publication date: December 27, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
PROVIDING COMMUNITY FOR CUSTOMER QUESTIONS

Publication number: 20120084112

Abstract: A system for providing community for customer questions receives a customer question. The customer question may be classified into a classification from a plurality of classifications categorizing whether a question is answerable, needs expert assistance, needs more information, or is not answerable. Based on the classification and one or more incentives, the question may be further routed to an appropriate community. The interactions with a customer in receiving and answering the customer question may be recorded.

Type: Application

Filed: September 24, 2011

Publication date: April 5, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sugato Bagchi, Branimir K. Boguraev, Anthony T. Levas, Roberto Sicconi, Wlodek W. Zadrozny
CONTEXT-BASED DISAMBIGUATION OF ACRONYMS AND ABBREVIATIONS

Publication number: 20120084076

Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing an index of one or more pseudo documents, one or more associated abbreviations, and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.

Type: Application

Filed: September 23, 2011

Publication date: April 5, 2012

Applicant: International Business Machines Corporation

Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
SEMANTICALLY AWARE, DYNAMIC, MULTI-MODAL CONCORDANCE FOR UNSTRUCTURED INFORMATION ANALYSIS

Publication number: 20120036478

Abstract: An apparatus includes a data processing system for generating and displaying a semantic type concordance. The data processing system includes memory storing a computer program, a display to display data of a concordance generated by the program, and a processor configured to execute the computer program. The computer program includes instructions for displaying a user interface configured to enable a user to select semantic types and specify at least one text document, generating a concordance of the at least one document based on the semantic types, and displaying data of the generated concordance on the display.

Type: Application

Filed: August 6, 2010

Publication date: February 9, 2012

Applicant: International Business Machines Corporation

Inventors: Branimir K. Boguraev, Youssef Drissi, David A. Ferrucci, Paul T. Keyser, Anthony T. Levas
System and method for identifying document structure and associated metainformation

Patent number: 7937338

Abstract: A system and method for processing documents by utilizing the textual content and layout of the documents, including visual indicators, to more efficiently and reliably process the documents across various document types. The system and method identifies visually distinguishable elements within the document, such as section and sub-section boundary indicators, to mark, divide and label the boundaries and content type such that the sections are more clearly identifiable and easily processed. The system and method uses known elements, including section heading types, keywords, section type classifiers, sub-section heading constructs, stop words, and the like to adaptively identify and process a broad range of document types. The system and method continually refines and updates these known elements and allows users to discover and define new elements for further refinement and updating.

Type: Grant

Filed: April 30, 2008

Date of Patent: May 3, 2011

Assignee: International Business Machines Corporation

Inventors: Branimir K. Boguraev, Roy J. Byrd, Keh-Shin F. Cheng, Anni R. Coden, Michael A. Tanenblatt, Wilfried Teiken
System and Method for Identifying Document Structure and Associated Metainformation and Facilitating Appropriate Processing

Publication number: 20090276378

Abstract: A system and method for processing documents by utilizing the textual content and layout of the documents, including visual indicators, to more efficiently and reliably process the documents across various document types. The system and method identifies visually distinguishable elements within the document, such as section and sub-section boundary indicators, to mark, divide and label the boundaries and content type such that the sections are more clearly identifiable and easily processed. The system and method uses known elements, including section heading types, keywords, section type classifiers, sub-section heading constructs, stop words, and the like to adaptively identify and process a broad range of document types. The system and method continually refines and updates these known elements and allows users to discover and define new elements for further refinement and updating.

Type: Application

Filed: April 30, 2008

Publication date: November 5, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Branimir K. Boguraev, Roy J. Byrd, Keh-Shin F. Cheng, Anni R. Coden, Michael A. Tanenblatt, Wilfried Teiken
Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like

Patent number: 6212494

Abstract: A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating an online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation, linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts-of-speech for each word, and syntactically analyzing and labeling each word.

Type: Grant

Filed: July 20, 1998

Date of Patent: April 3, 2001

Assignee: Apple Computer, Inc.

Inventor: Branimir K. Boguraev
Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like

Patent number: 5799268

Abstract: A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating an online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation, linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts-of-speech for each word, and syntactically analyzing and labeling each word.

Type: Grant

Filed: September 28, 1994

Date of Patent: August 25, 1998

Assignee: Apple Computer, Inc.

Inventor: Branimir K. Boguraev
