Patents by Inventor Brendan Bull

Brendan Bull has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROCESSING SCANNED DOCUMENTS

Publication number: 20210034857

Abstract: Embodiments include methods, systems, and computer program products for processing a scanned document. Aspects include obtaining an image of the scanned document and identifying a boundary of a portion of the scanned document, wherein the portion includes at least partially obscured text. Aspects also include performing optical character recognition on the image of the scanned document to extract text from the document. Aspects further include performing additional processing on the text extracted from inside the portion of the document.

Type: Application

Filed: August 1, 2019

Publication date: February 4, 2021

Inventors: BRENDAN BULL, SCOTT CARRIER, PAUL LEWIS FELT

SEMANTIC RELATIONSHIP SEARCH AGAINST CORPUS

Publication number: 20210034676

Abstract: Methods, systems, and computer program products for semantic search are provided. Aspects include receiving a query, the query comprising one or more search concepts, determining a semantic type from a plurality of semantic types for each of the one or more search concepts, analyzing the one or more search concepts to determine one or more relationships associated with the one or more search concepts, and determining one or more search results from a corpus based at least in part on the one or more relationships and the one or more search concepts.

Type: Application

Filed: July 30, 2019

Publication date: February 4, 2021

Inventors: Scott Carrier, Brendan Bull, Dwi Sianto Mansjur, Paul Lewis Felt

Ontology-based document analysis and annotation generation

Patent number: 10909320

Abstract: Techniques for cognitive annotation are provided. An electronic document including textual data is received. A plurality of importance scores are generated for a plurality of words included in the electronic document by processing the electronic document using a trained passage encoder. Important words are identified based on the plurality of importance scores. One or more clusters of words are generated, where each of the one or more clusters of words includes at least one of the plurality of important words. A representative word is selected for a first cluster, and the representative word is mapped to one or more concepts from a predefined list of concepts. The one or more concepts are disambiguated to identify a set of relevant concepts for the electronic document. An annotated version of the electronic document is generated based at least in part on the set of relevant concepts.

Type: Grant

Filed: February 7, 2019

Date of Patent: February 2, 2021

Assignee: International Business Machines Corporation

Inventors: Brendan Bull, Paul Lewis Felt, Andrew Hicks

Breaking down a high-level business problem statement in a natural language and generating a solution from a catalog of assets

Patent number: 10902046

Abstract: The present invention includes a computing device that may receive a business problem in a natural language. The computing device may determine a domain classification from the business problem, where the domain classification is a list of domains determined from an application programming interface (API) catalog. The computing device may generate a problem graph from the business problem, where the problem graph is a parsed tree of natural language elements extracted from the natural language and stored as a database. The computing device may retrieve one or more assets from the plurality of assets based on the domain classification and the problem graph. The computing device may generate a problem-solution graph from the one or more assets and generate a solution API pipeline graph for evaluation by a user and compilation by a pipeline assembler.

Type: Grant

Filed: November 29, 2018

Date of Patent: January 26, 2021

Assignee: International Business Machines Corporation

Inventors: Scott R. Carrier, Brendan Bull, Aysu Ezen Can

USE OF MACHINE LEARNING TO CHARACTERIZE REFERENCE RELATIONSHIP APPLIED OVER A CITATION GRAPH

Publication number: 20200257709

Abstract: Techniques for document analysis using machine learning are provided. A selection of an index document is received, and a plurality of documents that refer to the index document is identified. For each respective document in the plurality of documents, a respective portion of the respective document is extracted, where the respective portion refers to the index document, and a respective vector representation is generated for the respective portion. A plurality of groupings is generated for the plurality of documents based on how each of the plurality of documents relate to the index document, by processing the vector representations using a trained classifier. Finally, at least an indication of the plurality of groupings is provided, along with the index document.

Type: Application

Filed: February 11, 2019

Publication date: August 13, 2020

Inventors: BRENDAN BULL, ANDREW HICKS, Scott Robert Carrier, Dwi Sianto Mansjur

ONTOLOGY-BASED DOCUMENT ANALYSIS AND ANNOTATION GENERATION

Publication number: 20200257761

Abstract: Techniques for cognitive annotation are provided. An electronic document including textual data is received. A plurality of importance scores are generated for a plurality of words included in the electronic document by processing the electronic document using a trained passage encoder. Important words are identified based on the plurality of importance scores. One or more clusters of words are generated, where each of the one or more clusters of words includes at least one of the plurality of important words. A representative word is selected for a first cluster, and the representative word is mapped to one or more concepts from a predefined list of concepts. The one or more concepts are disambiguated to identify a set of relevant concepts for the electronic document. An annotated version of the electronic document is generated based at least in part on the set of relevant concepts.

Type: Application

Filed: February 7, 2019

Publication date: August 13, 2020

Inventors: BRENDAN BULL, PAUL LEWIS FELT, ANDREW HICKS

GENERATING A DOMAIN-SPECIFIC PHRASAL DICTIONARY

Publication number: 20200250216

Abstract: Embodiments generally relate to the generation of a domain-specific phrasal dictionary. In some embodiments, a method includes receiving text from a user, wherein the text includes unstructured text of a natural language. The method further includes parsing the text into text chunks. The method further includes sending the text chunks to the user. The method further includes receiving one or more phrase categories and one or more predetermined phrases from the user, wherein each predetermined phrase of the one or more predetermined phrases corresponds to at least one phrase category of the one or more phrase categories. The method further includes comparing the predetermined phrases with the text chunks. The method further includes assigning at least one phrase category of the one or more phrase categories to at least one text chunk. The method further includes sending at least one text chunk and the at least one phrase category that is assigned to the at least one text chunk to the user.

Type: Application

Filed: February 4, 2019

Publication date: August 6, 2020

Inventors: Dwi Sianto MANSJUR, Scott Robert CARRIER, Brendan BULL, Andrew HICKS

IDENTIFYING AND PRIORITIZING CANDIDATE ANSWER GAPS WITHIN A CORPUS

Publication number: 20200183962

Abstract: Methods and apparatus, including computer program products, implementing and using techniques for identifying candidate answer gaps within a corpus of a question and answer system. An original question posed to the question and answer system is analyzed to identify an object and a semantic type for the question. Concepts having a same or similar semantic type are retrieved from an ontology or dictionary. For at least one retrieved concept, one or more altered questions are created by replacing the object of the original question with a preferred term of the retrieved concept. The one or more altered questions are submitted to the question and answer system. The answers to the altered questions are analyzed to identify gaps within the corpus of the question and answer system.

Type: Application

Filed: December 6, 2018

Publication date: June 11, 2020

Inventors: Scott R. Carrier, Aysu Ezen Can, BRENDAN BULL, Dwi Sianto Mansjur

BREAKING DOWN A HIGH-LEVEL BUSINESS PROBLEM STATEMENT IN A NATURAL LANGUAGE AND GENERATING A SOLUTION FROM A CATALOG OF ASSETS

Publication number: 20200175051

Abstract: The present invention includes a computing device that may receive a business problem in a natural language. The computing device may determine a domain classification from the business problem, where the domain classification is a list of domains determined from an application programming interface (API) catalog. The computing device may generate a problem graph from the business problem, where the problem graph is a parsed tree of natural language elements extracted from the natural language and stored as a database. The computing device may retrieve one or more assets from the plurality of assets based on the domain classification and the problem graph. The computing device may generate a problem-solution graph from the one or more assets and generate a solution API pipeline graph for evaluation by a user and compilation by a pipeline assembler.

Type: Application

Filed: November 29, 2018

Publication date: June 4, 2020

Inventors: SCOTT R. CARRIER, BRENDAN BULL, AYSU EZEN CAN

Accurate relationship extraction with word embeddings using minimal training data

Patent number: 10642875

Abstract: A processor-implemented method generates a plurality of smoothed transition vectors from a plurality of training data. The method receives a plurality of text and a query. The method converts the plurality of received text to a word embedding space. The method converts the received query to a set of coordinates from the word embedding space and a set of the plurality of determined smoothed transition vectors. The method determines a plurality of candidate answers based on adding the set of the smoothed transition vectors to the set of coordinates in the word embedding space. The method determines an answer to the received query, based on applying a filter, wherein the filter is selected from a group consisting of a type filtering, a conflicting type filtering, and an equivalence filtering, and the method displays the determined answer.

Type: Grant

Filed: April 28, 2017

Date of Patent: May 5, 2020

Assignee: International Business Machines Corporation

Inventors: Brendan Bull, Paul Lewis Felt
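
The abstract of patent 10642875 above describes adding transition vectors to a query's coordinates in a word embedding space and then filtering the candidates. The following is only an illustrative sketch of that general idea, not the patented method: the toy embedding, the `types` table, the function names, and the use of a plain average as a stand-in for "smoothing" are all assumptions made for this example.

```python
from math import sqrt

# Hypothetical toy embedding and type table, invented for illustration only.
emb = {
    "france": [1.0, 0.0, 0.0], "paris": [1.0, 1.0, 0.0],
    "japan": [0.0, 0.0, 1.0], "tokyo": [0.0, 1.0, 1.0],
    "italy": [1.0, 0.0, 1.0], "rome": [1.0, 1.0, 1.0],
}
types = {
    "france": "country", "japan": "country", "italy": "country",
    "paris": "city", "tokyo": "city", "rome": "city",
}


def mean_transition(pairs, emb):
    """Average the (target - source) offsets over the training pairs.

    Averaging is an assumed stand-in for the 'smoothing' the abstract mentions.
    """
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for src, dst in pairs:
        for i in range(dim):
            total[i] += emb[dst][i] - emb[src][i]
    return [t / len(pairs) for t in total]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def candidate_answers(query, transition, emb, types, allowed_types, k=3):
    """Shift the query by the transition vector, then rank type-filtered words."""
    point = [q + t for q, t in zip(emb[query], transition)]
    scored = [
        (cosine(point, vec), word)
        for word, vec in emb.items()
        if word != query and types.get(word) in allowed_types  # simple type filter
    ]
    return [word for _, word in sorted(scored, reverse=True)[:k]]


# Two training pairs are enough to estimate the "capital of" offset here.
capital_of = mean_transition([("france", "paris"), ("japan", "tokyo")], emb)
best = candidate_answers("italy", capital_of, emb, types, {"city"}, k=1)
```

With this toy data the single best candidate for "italy" is "rome"; a real system would use learned embeddings and the filters named in the abstract.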

Disambiguating concepts in natural language

Patent number: 10565314

Abstract: A computer receives a plurality of text and determines a concept is present in the plurality of text. The computer determines a set of hypotheses for the determined concept, wherein the set of hypotheses is a plurality of natural language representations of the determined concept. The computer substitutes the determined concept in the plurality of text with a hypothesis from the determined set of hypotheses. The computer determines the hypothesis is valid based on analyzing the plurality of text with a neural network, wherein the neural network is trained for hypothesis validation. Based on determining that the hypothesis is valid, the computer stores the plurality of text with the determined hypothesis in place of the substituted concept and displays the stored plurality of text.

Type: Grant

Filed: April 26, 2019

Date of Patent: February 18, 2020

Assignee: International Business Machines Corporation

Inventors: Brendan Bull, Paul Lewis Felt

HEURISTIC Q&A SYSTEM

Publication number: 20200042643

Abstract: Embodiments of the present invention disclose a method, a computer program product, and a computer system for providing heuristic answers to a question that cannot be answered with sufficient confidence. A computer receives a question and the computer identifies one or more answers to the question. In addition, the computer determines that a confidence level corresponding to the one or more answers does not exceed a threshold and, based on determining that the confidence level corresponding to the one or more answers does not exceed the threshold, the computer identifies a primary concept of the question. Moreover, the computer identifies one or more related concepts to the primary concept and reformulates the received question by replacing the primary concept with the one or more related concepts. Lastly, the computer identifies and presents to a user one or more reformulated answers to the reformulated question.

Type: Application

Filed: August 6, 2018

Publication date: February 6, 2020

Inventors: Scott R. Carrier, Brendan Bull, Aysu Ezen Can, Dwi Sianto Mansjur
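
The fallback flow in the abstract of publication 20200042643 above (answer, check confidence against a threshold, swap the primary concept for related concepts, re-ask) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed system: `toy_qa`, the `RELATED` table, the threshold value, and the idea of passing the primary concept in as an argument are all hypothetical.

```python
# Hypothetical stand-ins for a real question answering backend and an ontology.
KNOWN_ANSWERS = {"Where do cougars live?": ("the Americas", 0.9)}
RELATED = {"pumas": ["cougars"]}


def toy_qa(question):
    """Return (answer, confidence); unknown questions get zero confidence."""
    return KNOWN_ANSWERS.get(question, ("unknown", 0.0))


def answer_with_fallback(question, primary_concept, answer_fn, related, threshold=0.5):
    """Answer directly when confident; otherwise reformulate and re-ask.

    Returns a list of (question_asked, answer, confidence) tuples.
    """
    answer, confidence = answer_fn(question)
    if confidence > threshold:
        return [(question, answer, confidence)]

    results = []
    for concept in related.get(primary_concept, []):
        # Reformulate by replacing the primary concept with a related one.
        reformulated = question.replace(primary_concept, concept)
        alt_answer, alt_confidence = answer_fn(reformulated)
        results.append((reformulated, alt_answer, alt_confidence))
    return results


results = answer_with_fallback("Where do pumas live?", "pumas", toy_qa, RELATED)
```

Here the original question has no confident answer, so the sketch returns the answer to the reformulated question about cougars. Identifying the primary concept automatically is left out of the sketch.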

IDENTIFICATION OF CO-LOCATED ARTIFACTS IN COGNITIVELY ANALYZED CORPORA

Publication number: 20200027566

Abstract: Techniques for cognitive corpora analysis are provided. Vector representations are generated by processing documents in a corpus using a passage encoder. One or more concepts are identified in the documents by processing the documents with the passage encoder, where the concepts are assigned respective importance scores by the passage encoder. Further, a selection of a document is received, and a sub-corpus of documents is generated by computing a similarity measure between the vector representation of the first document and the vector representation of at least one other document in the corpus. An overall importance score is generated for a first concept, with respect to the generated sub-corpus, by identifying a respective importance score of the first concept in at least two respective documents in the sub-corpus, and aggregating the respective importance scores. Finally, an indication of the generated overall importance score is provided.

Type: Application

Filed: July 20, 2018

Publication date: January 23, 2020

Inventors: Brendan BULL, Paul Lewis FELT, Andrew HICKS
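
The abstract of publication 20200027566 above combines two steps: building a sub-corpus by vector similarity to a selected document, and aggregating a concept's per-document importance scores across that sub-corpus. A rough sketch of those two steps is shown here. The vectors, scores, similarity cutoff, and the choice of cosine similarity and a mean as the aggregate are assumptions for illustration; the abstract does not specify them, and a real system would obtain vectors and scores from a passage encoder.

```python
from math import sqrt

# Hypothetical document vectors and per-document concept importance scores.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
scores = {"a": {"x": 0.5}, "b": {"x": 1.0}, "c": {"x": 0.1}}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sub_corpus(selected, vectors, min_similarity=0.8):
    """Documents whose vectors are close enough to the selected document."""
    anchor = vectors[selected]
    return [doc for doc, vec in vectors.items() if cosine(anchor, vec) >= min_similarity]


def overall_importance(concept, docs, scores):
    """Mean importance of a concept over the sub-corpus documents that contain it.

    Returns None unless at least two documents carry a score for the concept.
    """
    values = [scores[d][concept] for d in docs if concept in scores[d]]
    if len(values) < 2:
        return None
    return sum(values) / len(values)


docs = sub_corpus("a", vectors)
importance = overall_importance("x", docs, scores)
```

With this toy data, document "c" falls outside the sub-corpus, so the aggregate for concept "x" is computed from documents "a" and "b" only.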

VALIDATING BELIEF STATES OF AN AI SYSTEM BY SENTIMENT ANALYSIS AND CONTROVERSY DETECTION

Publication number: 20190370391

Abstract: Validating belief states of an artificial intelligence system includes providing a question answering service; detecting a negative sentiment of a user to an answer transmitted to a device associated with the user; and responsive to detecting the negative sentiment, detecting that the answer relates to a topic on which there is controversy. Next, a new belief state is added to the question answering service based on the controversy, and an updated answer is transmitted to the device, wherein the updated answer is based on the new belief state.

Type: Application

Filed: June 5, 2018

Publication date: December 5, 2019

Inventors: Aysu Ezen Can, Brendan Bull, Scott R. Carrier, Dwi Sianto Mansjur

DISAMBIGUATING CONCEPTS IN NATURAL LANGUAGE

Publication number: 20190251173

Abstract: A computer receives a plurality of text and determines a concept is present in the plurality of text. The computer determines a set of hypotheses for the determined concept, wherein the set of hypotheses is a plurality of natural language representations of the determined concept. The computer substitutes the determined concept in the plurality of text with a hypothesis from the determined set of hypotheses. The computer determines the hypothesis is valid based on analyzing the plurality of text with a neural network, wherein the neural network is trained for hypothesis validation. Based on determining that the hypothesis is valid, the computer stores the plurality of text with the determined hypothesis in place of the substituted concept and displays the stored plurality of text.

Type: Application

Filed: April 26, 2019

Publication date: August 15, 2019

Inventors: BRENDAN BULL, PAUL LEWIS FELT

Disambiguating concepts in natural language

Patent number: 10372824

Abstract: A computer receives a plurality of text and determines a concept is present in the plurality of text. The computer determines a set of hypotheses for the determined concept, wherein the set of hypotheses is a plurality of natural language representations of the determined concept. The computer substitutes the determined concept in the plurality of text with a hypothesis from the determined set of hypotheses. The computer determines the hypothesis is valid based on analyzing the plurality of text with a neural network, wherein the neural network is trained for hypothesis validation. Based on determining that the hypothesis is valid, the computer stores the plurality of text with the determined hypothesis in place of the substituted concept and displays the stored plurality of text.

Type: Grant

Filed: May 15, 2017

Date of Patent: August 6, 2019

Assignee: International Business Machines Corporation

Inventors: Brendan Bull, Paul Lewis Felt

HIERARCHICAL QUESTION ANSWERING SYSTEM

Publication number: 20190163756

Abstract: A method for providing a hierarchical question answering system for presenting structured answers to a query is provided. The method may include receiving a query for a question answering system. The method may further include generating first queries based on the query. The method may further include generating second queries based on the first queries. The method may further include clustering the query, the first queries, and the second queries to form a hierarchy of queries. The method may also include processing the hierarchy of queries to generate answers. The method may further include clustering the answers to form a hierarchy of answers. The method may also include ranking the hierarchy of answers. The method may also include aggregating the hierarchy of answers to generate an optimal answer. The method may further include presenting the hierarchy of queries, the hierarchy of answers, and the optimal answer.

Type: Application

Filed: November 29, 2017

Publication date: May 30, 2019

Inventors: Brendan Bull, Scott R. Carrier, Aysu Ezen Can, Dwi Sianto Mansjur

Accurate relationship extraction with word embeddings using minimal training data

Patent number: 10216834

Abstract: A processor-implemented method generates a plurality of smoothed transition vectors from a plurality of training data. The method receives a plurality of text and a query. The method converts the plurality of received text to a word embedding space. The method converts the received query to a set of coordinates from the word embedding space and a set of the plurality of determined smoothed transition vectors. The method determines a plurality of candidate answers based on adding the set of the smoothed transition vectors to the set of coordinates in the word embedding space. The method determines an answer to the received query, based on applying a filter, wherein the filter is selected from a group consisting of a type filtering, a conflicting type filtering, and an equivalence filtering, and the method displays the determined answer.

Type: Grant

Filed: March 6, 2018

Date of Patent: February 26, 2019

Assignee: International Business Machines Corporation

Inventors: Brendan Bull, Paul Lewis Felt

DISAMBIGUATING CONCEPTS IN NATURAL LANGUAGE

Publication number: 20180329887

Abstract: A computer receives a plurality of text and determines a concept is present in the plurality of text. The computer determines a set of hypotheses for the determined concept, wherein the set of hypotheses is a plurality of natural language representations of the determined concept. The computer substitutes the determined concept in the plurality of text with a hypothesis from the determined set of hypotheses. The computer determines the hypothesis is valid based on analyzing the plurality of text with a neural network, wherein the neural network is trained for hypothesis validation. Based on determining that the hypothesis is valid, the computer stores the plurality of text with the determined hypothesis in place of the substituted concept and displays the stored plurality of text.

Type: Application

Filed: March 6, 2018

Publication date: November 15, 2018

Inventors: BRENDAN BULL, PAUL LEWIS FELT

DISAMBIGUATING CONCEPTS IN NATURAL LANGUAGE

Publication number: 20180329885

Abstract: A computer receives a plurality of text and determines a concept is present in the plurality of text. The computer determines a set of hypotheses for the determined concept, wherein the set of hypotheses is a plurality of natural language representations of the determined concept. The computer substitutes the determined concept in the plurality of text with a hypothesis from the determined set of hypotheses. The computer determines the hypothesis is valid based on analyzing the plurality of text with a neural network, wherein the neural network is trained for hypothesis validation. Based on determining that the hypothesis is valid, the computer stores the plurality of text with the determined hypothesis in place of the substituted concept and displays the stored plurality of text.

Type: Application

Filed: May 15, 2017

Publication date: November 15, 2018

Inventors: BRENDAN BULL, PAUL LEWIS FELT
