Patents by Inventor Branimir K. Boguraev
Branimir K. Boguraev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20160203120
Abstract: According to an aspect, a candidate lexical kernel unit that includes a word token sequence having two or more words is received. Domain terms that contain the two or more words are retrieved from a terminology resource file of domain terms associated with a domain. The candidate lexical kernel unit and the retrieved domain terms are analyzed to determine whether the candidate lexical kernel unit satisfies specified criteria for use as a building block by a natural-language processing (NLP) tool for building larger lexical units in the domain. Each of the larger lexical units includes a greater number of words than the candidate lexical kernel unit. The candidate lexical kernel unit is identified as a lexical kernel unit based on determining that the candidate lexical kernel unit satisfies the specified criteria. The lexical kernel unit is output to a domain-specific lexical kernel unit file for input to the NLP tool.
Type: Application
Filed: March 11, 2015
Publication date: July 14, 2016
Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
-
Publication number: 20160203119
Abstract: According to an aspect, a candidate lexical kernel unit that includes a word token sequence having two or more words is received. Domain terms that contain the two or more words are retrieved from a terminology resource file of domain terms associated with a domain. The candidate lexical kernel unit and the retrieved domain terms are analyzed to determine whether the candidate lexical kernel unit satisfies specified criteria for use as a building block by a natural-language processing (NLP) tool for building larger lexical units in the domain. Each of the larger lexical units includes a greater number of words than the candidate lexical kernel unit. The candidate lexical kernel unit is identified as a lexical kernel unit based on determining that the candidate lexical kernel unit satisfies the specified criteria. The lexical kernel unit is output to a domain-specific lexical kernel unit file for input to the NLP tool.
Type: Application
Filed: January 9, 2015
Publication date: July 14, 2016
Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
-
Publication number: 20160179783
Abstract: According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence and annotates each word token in the candidate token sequence found by the look-up operation with corresponding retrieved language data to form an annotated sequence. A pattern match of the annotated sequence is performed relative to a repository of patterns and identifies a best matching pattern from the repository of patterns to the annotated sequence based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern as a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.
Type: Application
Filed: March 5, 2015
Publication date: June 23, 2016
Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
-
Publication number: 20160179782
Abstract: According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence and annotates each word token in the candidate token sequence found by the look-up operation with corresponding retrieved language data to form an annotated sequence. A pattern match of the annotated sequence is performed relative to a repository of patterns and identifies a best matching pattern from the repository of patterns to the annotated sequence based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern as a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.
Type: Application
Filed: December 23, 2014
Publication date: June 23, 2016
Inventors: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
-
Publication number: 20160147840
Abstract: Annotations can be handled by a computer system that receives a query that specifies parameters for extraction of particular annotations from a set of annotations. Annotations include metadata that describes properties of the associated text fragment. A first entity subset, a second entity subset and a relations subset of annotations are extracted from an annotated text corpus. Contextual information relative to the extracted annotations is also extracted from the corpus. A user interface is generated to display frame elements that include the extracted annotations subsets and the extracted contextual information. In response to selections to the frame elements, the system receives input that specifies modifications to the annotations. Based on the input received, the set of annotations is modified in the annotated text corpus.
Type: Application
Filed: November 21, 2014
Publication date: May 26, 2016
Inventors: Branimir K. Boguraev, Anthony T. Levas
-
Publication number: 20160062980
Abstract: Mechanisms are provided in a question answering (QA) system comprising a QA system pipeline that analyzes an input question and generates an answer to the input question, for pre-processing the input question. The mechanisms receive an input question and input the input question to a pre-processor flow path having one or more pre-processors. The one or more pre-processors transform the input question into a transformed question by correcting errors in a formulation of the input question that are determined to be detrimental to efficient and accurate processing of the input question by a QA system pipeline of the QA system. The transformed question is then input to the QA system pipeline of the QA system which processes the transformed question to generate and output an answer to the input question.
Type: Application
Filed: August 29, 2014
Publication date: March 3, 2016
Inventors: Branimir K. Boguraev, John P. Bufe, III, Matthew T. Hatem, Jared M.D. Smythe
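Illustrative sketch (not taken from the application): the pre-processor flow path described in this abstract can be pictured as an ordered chain of functions, each of which returns a corrected form of the question before it reaches the QA pipeline. The two pre-processors and the `answer_pipeline` placeholder below are hypothetical; the abstract does not say which errors are corrected or how.

```python
import re
from typing import Callable, List

# A pre-processor takes a question string and returns a transformed one.
Preprocessor = Callable[[str], str]


def collapse_whitespace(question: str) -> str:
    """Hypothetical pre-processor: squeeze runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", question).strip()


def ensure_question_mark(question: str) -> str:
    """Hypothetical pre-processor: replace trailing '.', '!' or spaces with a '?'."""
    return question if question.endswith("?") else question.rstrip(".! ") + "?"


def run_flow_path(question: str, preprocessors: List[Preprocessor]) -> str:
    """Apply each pre-processor in order and return the transformed question."""
    for step in preprocessors:
        question = step(question)
    return question


def answer_pipeline(question: str) -> str:
    """Placeholder for the QA system pipeline; it only echoes its input."""
    return f"[answer to: {question}]"


raw = "  who   wrote Hamlet. "
transformed = run_flow_path(raw, [collapse_whitespace, ensure_question_mark])
print(transformed)
print(answer_pipeline(transformed))
```

Keeping the pre-processors as a plain list makes the order explicit and lets steps be added or dropped without touching the pipeline itself.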
-
Patent number: 9031832
Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more word. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents, associated one or more abbreviations and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.
Type: Grant
Filed: September 6, 2012
Date of Patent: May 12, 2015
Assignee: International Business Machines Corporation
Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
-
Patent number: 9020805
Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more word. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents, associated one or more abbreviations and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.
Type: Grant
Filed: September 23, 2011
Date of Patent: April 28, 2015
Assignee: International Business Machines Corporation
Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
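Illustrative sketch (not taken from the patent): the contextual search over a pseudo document index can be approximated with a toy in-memory index in which each pseudo document pairs one expansion of an abbreviation with a set of context keywords. The `PseudoDocument` class, the overlap-count ranking, and the sample data are all assumptions made for illustration; the abstract does not specify the index structure or the scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PseudoDocument:
    """Hypothetical index entry: one expansion of an abbreviation plus its context keywords."""
    abbreviation: str
    expansion: str
    context_keywords: frozenset


def build_index(docs):
    """Group pseudo documents by upper-cased abbreviation (a stand-in for a real search index)."""
    index = {}
    for doc in docs:
        index.setdefault(doc.abbreviation.upper(), []).append(doc)
    return index


def search(index, target_abbreviation, keywords):
    """Run a contextual query: rank the target's pseudo documents by keyword overlap."""
    query = {k.lower() for k in keywords}
    candidates = index.get(target_abbreviation.upper(), [])
    scored = [(len(query & doc.context_keywords), doc) for doc in candidates]
    # Highest overlap first; ties are broken alphabetically by expansion.
    scored.sort(key=lambda pair: (-pair[0], pair[1].expansion))
    return [doc for _, doc in scored]


docs = [
    PseudoDocument("MS", "multiple sclerosis", frozenset({"patient", "nerve", "diagnosis"})),
    PseudoDocument("MS", "mass spectrometry", frozenset({"ion", "sample", "spectrum"})),
]
index = build_index(docs)
results = search(index, "MS", ["patient", "diagnosis"])
print(results[0].expansion)
```

A production system would presumably use a full-text search engine in place of the dictionary; the overlap count here only illustrates how the context keywords steer the choice among expansions.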
-
Publication number: 20140072948
Abstract: A method of generating secondary questions in a question-answer system. Missing information is identified from a corpus of data using a computerized device. The missing information comprises any information that improves confidence scores for candidate answers to a question. The computerized device automatically generates a plurality of hypotheses concerning the missing information. The computerized device automatically generates at least one secondary question based on each of the plurality of hypotheses. The hypotheses are ranked based on relative utility to determine an order in which the computerized device outputs the at least one secondary question to external sources to obtain responses.
Type: Application
Filed: September 11, 2012
Publication date: March 13, 2014
Applicant: International Business Machines Corporation
Inventors: Branimir K. Boguraev, David W. Buchanan, Jennifer Chu-Carroll, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Siddharth A. Patwardhan
-
Publication number: 20140072947
Abstract: A method of generating secondary questions in a question-answer system. Missing information is identified from a corpus of data using a computerized device. The missing information comprises any information that improves confidence scores for candidate answers to a question. The computerized device automatically generates a plurality of hypotheses concerning the missing information. The computerized device automatically generates at least one secondary question based on each of the plurality of hypotheses. The hypotheses are ranked based on relative utility to determine an order in which the computerized device outputs the at least one secondary question to external sources to obtain responses.
Type: Application
Filed: September 11, 2012
Publication date: March 13, 2014
Applicant: International Business Machines Corporation
Inventors: Branimir K. Boguraev, David W. Buchanan, Jennifer Chu-Carroll, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Siddharth A. Patwardhan
-
Publication number: 20130013546
Abstract: A system for providing community for customer questions receives a customer question. The customer question may be classified into a classification from a plurality of classifications categorizing whether a question is answerable, needs expert assistance, needs more information, or is not answerable. Based on the classification and one or more incentives, the question may be further routed to an appropriate community. The interactions with a customer in receiving and answering the customer question may be recorded.
Type: Application
Filed: September 14, 2012
Publication date: January 10, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sugato Bagchi, Branimir K. Boguraev, Anthony T. Levas, Roberto Sicconi, Wlodek W. Zadrozny
-
Publication number: 20120330648
Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more word. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents, associated one or more abbreviations and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.
Type: Application
Filed: September 6, 2012
Publication date: December 27, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
-
Publication number: 20120084112
Abstract: A system for providing community for customer questions receives a customer question. The customer question may be classified into a classification from a plurality of classifications categorizing whether a question is answerable, needs expert assistance, needs more information, or is not answerable. Based on the classification and one or more incentives, the question may be further routed to an appropriate community. The interactions with a customer in receiving and answering the customer question may be recorded.
Type: Application
Filed: September 24, 2011
Publication date: April 5, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sugato Bagchi, Branimir K. Boguraev, Anthony T. Levas, Roberto Sicconi, Wlodek W. Zadrozny
-
Publication number: 20120084076
Abstract: Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more word. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents, associated one or more abbreviations and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.
Type: Application
Filed: September 23, 2011
Publication date: April 5, 2012
Applicant: International Business Machines Corporation
Inventors: Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager
-
Publication number: 20120036478
Abstract: An apparatus includes a data processing system for generating and displaying a semantic type concordance. The data processing system includes memory storing a computer program, a display to display data of a concordance generated by the program, and a processor configured to execute the computer program. The computer program includes instructions for displaying a user interface configured to enable a user to select semantic types and specify at least one text document, generating a concordance of the at least one document based on the semantic types, and displaying data of the generated concordance on the display.
Type: Application
Filed: August 6, 2010
Publication date: February 9, 2012
Applicant: International Business Machines Corporation
Inventors: Branimir K. Boguraev, Youssef Drissi, David A. Ferrucci, Paul T. Keyser, Anthony T. Levas
-
Patent number: 7937338
Abstract: A system and method for processing documents by utilizing the textual content and layout of the documents, including visual indicators, to more efficiently and reliably process the documents across various document types. The system and method identifies visually distinguishable elements within the document, such as section and sub-section boundary indicators, to mark, divide and label the boundaries and content type such that the sections are more clearly identifiable and easily processed. The system and method uses known elements, including section heading types, keywords, section type classifiers, sub-section heading constructs, stop words, and the like to adaptively identify and process a broad range of document types. The system and method continually refines and updates these known elements and allows users to discover and define new elements for further refinement and updating.
Type: Grant
Filed: April 30, 2008
Date of Patent: May 3, 2011
Assignee: International Business Machines Corporation
Inventors: Branimir K. Boguraev, Roy J. Byrd, Keh-Shin F. Cheng, Anni R. Coden, Michael A. Tanenblatt, Wilfried Teiken
-
Publication number: 20090276378
Abstract: A system and method for processing documents by utilizing the textual content and layout of the documents, including visual indicators, to more efficiently and reliably process the documents across various document types. The system and method identifies visually distinguishable elements within the document, such as section and sub-section boundary indicators, to mark, divide and label the boundaries and content type such that the sections are more clearly identifiable and easily processed. The system and method uses known elements, including section heading types, keywords, section type classifiers, sub-section heading constructs, stop words, and the like to adaptively identify and process a broad range of document types. The system and method continually refines and updates these known elements and allows users to discover and define new elements for further refinement and updating.
Type: Application
Filed: April 30, 2008
Publication date: November 5, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Branimir K. Boguraev, Roy J. Byrd, Keh-Shin F. Cheng, Anni R. Coden, Michael A. Tanenblatt, Wilfried Teiken
-
Patent number: 6212494
Abstract: A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating a online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation, linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts-of-speech for each word, and syntactically analyzing and labeling each word.
Type: Grant
Filed: July 20, 1998
Date of Patent: April 3, 2001
Assignee: Apple Computer, Inc.
Inventor: Branimir K. Boguraev
-
Patent number: 5799268
Abstract: A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating a online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation, linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts-of-speech for each word, and syntactically analyzing and labeling each word.
Type: Grant
Filed: September 28, 1994
Date of Patent: August 25, 1998
Assignee: Apple Computer, Inc.
Inventor: Branimir K. Boguraev
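Illustrative sketch (not taken from the patent): the first step named in this abstract, stripping markup tags before linguistic analysis, can be shown with Python's standard `html.parser` module, followed by a simple regular-expression tokenizer. Treating the documentation as HTML-like markup is an assumption, and `tokenize` is only a hypothetical stand-in for the morphological, lexical, and syntactic analysis the abstract describes.

```python
import re
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text content of a document, discarding markup tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(text):
    """Return the document text with tags removed and whitespace normalized."""
    parser = TagStripper()
    parser.feed(text)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


def tokenize(text):
    """Split text into words, numbers, and punctuation marks (a hypothetical simplification)."""
    return re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*|\d+|[^\w\s]", text)


doc = "<p>Select the <b>Print</b> command.</p>"
plain = strip_markup(doc)
tokens = tokenize(plain)
print(plain)
print(tokens)
```

The later steps (part-of-speech disambiguation and syntactic labeling) would operate on the token list; they are omitted here because the abstract gives no detail about how they are carried out.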