Patents by Inventor David Contreras

David Contreras has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Document type-specific quality model

Patent number: 11687796

Abstract: An approach is provided that receives a document and a document type of the document. The document type identifies a document category to which the received document belongs. A set of linguistic metrics are retrieved that correspond to the document type. A quality of the received document is automatically determined based on a set of linguistic features found in the document as compared to the retrieved set of linguistic metrics. The document is then ingested into a corpus that is utilized by a question-answering (QA) system. The ingestion of the document is based on the determined quality.

Type: Grant

Filed: April 17, 2019

Date of Patent: June 27, 2023

Assignee: International Business Machines Corporation

Inventors: Brien H. Muschett, Andrew R. Freed, Roberto Delima, David Contreras, Krishna Mahajan
Contextual span framework

Patent number: 11593561

Abstract: A phrase that includes a trigger word that modifies a meaning within the phrase is received. The trigger word is identified. The words of the phrase that are modified by the trigger word are identified by analyzing features of the phrase that link the trigger word to other words. The phrase is interpreted by modifying the second subset of words according to the modification of the trigger word.

Type: Grant

Filed: September 11, 2019

Date of Patent: February 28, 2023

Assignee: International Business Machines Corporation

Inventors: David Contreras, Krishna Mahajan, Roberto Delima, Kandhan Sekar, Corville O. Allen, Chris Mwarabu
Classifying text to determine a goal type used to select machine learning algorithm outcomes

Patent number: 11397851

Abstract: Provided are a computer program product, system, and method for classifying text to determine a goal type used to select machine learning algorithm outcomes. Natural language processing of text is performed to determine features in the text and their relationships. A classifier classifies the text based on the relationships and features to determine a goal type. The determined features and relationships from the text are inputted into a plurality of different machine learning algorithms to generate outcomes. For each of the machine learning algorithms, a determination is made of performance measurements resulting from the machine learning algorithms generating the outcomes. A determination is made of at least one machine learning algorithm having performance measurements that are highly correlated to the determined goal type. An outcome is determined from at least one of the outcomes.

Type: Grant

Filed: April 13, 2018

Date of Patent: July 26, 2022

Assignee: International Business Machines Corporation

Inventors: Aysu Ezen Can, David Contreras, Bob Delima, Corville O. Allen
Classifying text to determine a goal type used to select machine learning algorithm outcomes

Patent number: 11392764

Abstract: Provided are a computer program product, system, and method for classifying text to determine a goal type used to select machine learning algorithm outcomes. Natural language processing of text is performed to determine features in the text and their relationships. A classifier classifies the text based on the relationships and features to determine a goal type. The determined features and relationships from the text are inputted into a plurality of different machine learning algorithms to generate outcomes. For each of the machine learning algorithms, a determination is made of performance measurements resulting from the machine learning algorithms generating the outcomes. A determination is made of at least one machine learning algorithm having performance measurements that are highly correlated to the determined goal type. An outcome is determined from at least one of the outcomes.

Type: Grant

Filed: June 26, 2019

Date of Patent: July 19, 2022

Assignee: International Business Machines Corporation

Inventors: Aysu Ezen Can, David Contreras, Bob Delima, Corville O. Allen
Automatic detection of context switch triggers

Patent number: 11295080

Abstract: A method, system, and computer program product include providing a list of triggers, training the natural language processor with the list of triggers, providing to the natural language processor a text including one trigger, selecting nodes in the text to create an original potential span, predicting whether the original potential span includes another trigger, and adjusting, in response to predicting that the original potential span includes another trigger, the original potential span to exclude the another trigger to create a new potential span.

Type: Grant

Filed: June 4, 2019

Date of Patent: April 5, 2022

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto Delima, David Contreras, Krishna Mahajan
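
Illustration: the span adjustment described in the abstract above can be sketched as follows. This is a minimal sketch and not the patented method. The abstract describes a trained natural language processor that predicts whether a span contains another trigger; here a plain set lookup stands in for that prediction. The function name `adjust_span`, the trigger list, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def adjust_span(tokens: List[str], triggers: Set[str],
                start: int, end: int) -> Tuple[int, int]:
    """Shrink the span tokens[start:end] so it excludes any later trigger.

    tokens[start] is assumed to be the trigger that opened the span.
    A set lookup stands in for the trained predictor (an assumption).
    """
    for i in range(start + 1, end):
        if tokens[i].lower() in triggers:
            # Another trigger was found: end the span just before it.
            return start, i
    return start, end


# Hypothetical trigger list and sentence.
triggers = {"no", "but"}
tokens = ["no", "fever", "or", "cough", "but", "reports", "fatigue"]

# The original potential span covers the whole sentence.
new_start, new_end = adjust_span(tokens, triggers, 0, len(tokens))
new_span = tokens[new_start:new_end]
```

In this example the original span runs to the end of the sentence, and the adjusted span stops before "but", so the negation opened by "no" no longer reaches "fatigue".
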
Semantic evaluation of tentative triggers based on contextual triggers

Patent number: 11205053

Abstract: Methods, systems and computer readable media are provided for semantic evaluation of tentative triggers based on contextual triggers. Contextual triggers are identified within text. A parse tree comprising a plurality of nodes is generated corresponding to the text. Tentative triggers are identified within the text. A determination is made as to whether one or more nodes of the parse tree corresponding to the tentative trigger is within a context of one or more nodes of the parse tree corresponding to the contextual triggers. Based on the determination, the tentative trigger type is assigned to a contextual trigger type.

Type: Grant

Filed: March 26, 2020

Date of Patent: December 21, 2021

Assignee: International Business Machines Corporation

Inventors: David Contreras, Kandhan Sekar, Thomas Hay Rogers
Identifying semantic relationships using visual recognition

Patent number: 11138380

Abstract: Aspects of the present disclosure relate to identifying semantic relationships. Natural language content is received. A part of speech is determined for respective terms within the natural language content. A semantic type is determined for each of two or more terms within the natural language content. A parse tree representation containing a plurality of nodes is then generated based on the natural language content, each of the plurality of nodes corresponding to at least one term within the natural language content, wherein visual characteristics of respective nodes of the plurality of nodes within the parse tree representation depend on the part of speech and semantic type of the respective terms. A bounding box identifying a semantic relationship is then generated around a set of nodes on the parse tree representation, the set of nodes including the two or more terms.

Type: Grant

Filed: June 11, 2019

Date of Patent: October 5, 2021

Assignee: International Business Machines Corporation

Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
SEMANTIC EVALUATION OF TENTATIVE TRIGGERS BASED ON CONTEXTUAL TRIGGERS

Publication number: 20210303794

Abstract: Methods, systems and computer readable media are provided for semantic evaluation of tentative triggers based on contextual triggers. Contextual triggers are identified within text. A parse tree comprising a plurality of nodes is generated corresponding to the text. Tentative triggers are identified within the text. A determination is made as to whether one or more nodes of the parse tree corresponding to the tentative trigger is within a context of one or more nodes of the parse tree corresponding to the contextual triggers. Based on the determination, the tentative trigger type is assigned to a contextual trigger type.

Type: Application

Filed: March 26, 2020

Publication date: September 30, 2021

Inventors: David Contreras, Kandhan Sekar, Thomas Hay Rogers
Identifying spans using visual recognition

Patent number: 11120215

Abstract: Aspects of the present disclosure relate to identifying spans within unstructured electronic text. Natural language content is received. A part of speech and slot name of each word within the natural language content is identified. A parse tree representation is then generated based on the natural language content, wherein visual characteristics of each node of a plurality of nodes within the parse tree representation depend on the part of speech and slot name of each word. A bounding box identifying a span category is then generated around a set of nodes on the parse tree representation by a machine learning model.

Type: Grant

Filed: April 24, 2019

Date of Patent: September 14, 2021

Assignee: International Business Machines Corporation

Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
Natural language processing matrices

Patent number: 11113469

Abstract: A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

Type: Grant

Filed: March 27, 2019

Date of Patent: September 7, 2021

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto Delima, Chris Mwarabu, David Contreras, Kandhan Sekar, Krishna Mahajan
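
Illustration: one way to picture the matrix structure described in the abstract above is a grid whose columns follow token order and whose rows follow dependency level. This is a minimal sketch under stated assumptions, not the patented structure: the dependency heads are supplied by hand rather than produced by a parser, "level" is taken to mean depth in the dependency tree, and the names `dependency_levels` and `phrase_matrix` are hypothetical.

```python
from typing import List, Optional


def dependency_levels(heads: List[Optional[int]]) -> List[int]:
    """Return the depth of each token in the dependency tree (root = 0).

    heads[i] is the index of token i's head, or None for the root.
    """
    levels = []
    for i in range(len(heads)):
        depth, node = 0, i
        while heads[node] is not None:
            node = heads[node]
            depth += 1
        levels.append(depth)
    return levels


def phrase_matrix(tokens: List[str],
                  heads: List[Optional[int]]) -> List[List[str]]:
    """Build a matrix with one row per level and one column per token."""
    levels = dependency_levels(heads)
    matrix = [["" for _ in tokens] for _ in range(max(levels) + 1)]
    for col, (token, level) in enumerate(zip(tokens, levels)):
        matrix[level][col] = token
    return matrix


# Hand-written example parse (an assumption, not parser output):
# "denies" is the root; "patient" and "pain" depend on it;
# "chest" depends on "pain".
tokens = ["patient", "denies", "chest", "pain"]
heads = [1, None, 3, 1]
matrix = phrase_matrix(tokens, heads)
```

Reading across a row gives the tokens at one dependency level in their original order; reading down a column gives the level of a single token.
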
Relevancy as an indicator for determining document quality

Patent number: 11017171

Abstract: A method, computer system, and a computer program product for relevancy-based document quality assessment is provided. The present invention may include computing a document quality score based on at least one container relevancy score determined based on at least one domain link to a domain knowledge base.

Type: Grant

Filed: June 6, 2019

Date of Patent: May 25, 2021

Assignee: International Business Machines Corporation

Inventors: Roberto Delima, Andrew R. Freed, Brien Muschett, Krishna Mahajan, David Contreras
List manipulation in natural language processing

Patent number: 10956662

Abstract: First content containing a plurality of list items in one or more lists can be parsed for conjunctions and implied list indicators. One or more modifications can occur at one or more conjunctions or implied list indicators. The one or more modifications can comprise one or more of expanding text, contracting text, and replacing text. The one or more modifications can generate second content conducive to natural language processing operations.

Type: Grant

Filed: September 12, 2018

Date of Patent: March 23, 2021

Assignee: International Business Machines Corporation

Inventors: Keith P. Biegert, Brendan C. Bull, David Contreras, Robert C. Sizemore, Sterling R. Smith
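
Illustration: the "expanding text" modification mentioned in the abstract above might look like the following for a two-item list joined by a conjunction. This is a minimal sketch, not the patented technique. It handles only the pattern "A and/or B HEAD" with single-word items, ignores implied list indicators, and requires the caller to name the shared head word; the function name `expand_conjunction` is an assumption.

```python
import re


def expand_conjunction(text: str, head: str) -> str:
    """Rewrite 'A and B HEAD' as 'A HEAD and B HEAD'.

    Only single-word list items joined by 'and' or 'or' are handled.
    """
    pattern = re.compile(rf"\b(\w+) (and|or) (\w+) {re.escape(head)}\b")

    def repl(match: re.Match) -> str:
        first, conj, second = match.groups()
        return f"{first} {head} {conj} {second} {head}"

    return pattern.sub(repl, text)


expanded = expand_conjunction(
    "The patient reports chest and back pain.", "pain"
)
```

After expansion each list item carries its own head word, which can make the sentence easier for downstream natural language processing steps to annotate item by item.
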
Cognitive document quality determination with automated heuristic generation

Patent number: 10902044

Abstract: Techniques for cognitive document quality determination and automated heuristic generation are provided. A plurality of documents is received, where each of the plurality of documents contains natural language text. A plurality of values is determined for a first plurality of predefined attributes of the plurality of documents. A plurality of quality scores is generated for the plurality of documents by processing the plurality of values using a machine learning model, where the plurality of quality scores indicate a suitability of each of the plurality of documents to be processed using a target processing operation. A subset of documents is identified from the plurality of documents having respective quality scores below a predefined threshold. The subset of documents is flagged for further processing. At least one document of the plurality of documents that is not flagged is selectively processed using the target processing operation.

Type: Grant

Filed: November 2, 2018

Date of Patent: January 26, 2021

Assignee: International Business Machines Corporation

Inventors: David Contreras, Aysu Ezen Can
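
Illustration: the score-and-flag flow in the abstract above can be sketched as below. This is a minimal sketch under stated assumptions. The abstract describes a machine learning model producing the quality scores; here a hand-written formula (`stand_in_model`) takes its place, and the two attributes, the threshold value, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def attribute_values(text: str) -> Dict[str, float]:
    """Compute two illustrative attributes (assumptions, not from the patent)."""
    alpha = sum(ch.isalpha() for ch in text)
    return {
        "word_count": float(len(text.split())),
        "alpha_ratio": alpha / len(text) if text else 0.0,
    }


def stand_in_model(values: Dict[str, float]) -> float:
    """Placeholder for the machine learning model: returns a score in [0, 1]."""
    length_term = min(values["word_count"] / 5.0, 1.0)
    return 0.5 * values["alpha_ratio"] + 0.5 * length_term


def triage(documents: List[str],
           score: Callable[[Dict[str, float]], float],
           threshold: float) -> Tuple[List[str], List[str]]:
    """Split documents into (flagged, accepted) by quality score."""
    flagged, accepted = [], []
    for doc in documents:
        if score(attribute_values(doc)) < threshold:
            flagged.append(doc)   # held back for further processing
        else:
            accepted.append(doc)  # passed to the target operation
    return flagged, accepted


docs = [
    "The patient was discharged in stable condition today.",
    "@@## 1234 %%",
]
flagged, accepted = triage(docs, stand_in_model, threshold=0.6)
```

Documents scoring below the threshold are set aside, and only the remainder would be passed to the target processing operation.
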
IDENTIFYING SEMANTIC RELATIONSHIPS USING VISUAL RECOGNITION

Publication number: 20200394267

Abstract: Aspects of the present disclosure relate to identifying semantic relationships. Natural language content is received. A part of speech is determined for respective terms within the natural language content. A semantic type is determined for each of two or more terms within the natural language content. A parse tree representation containing a plurality of nodes is then generated based on the natural language content, each of the plurality of nodes corresponding to at least one term within the natural language content, wherein visual characteristics of respective nodes of the plurality of nodes within the parse tree representation depend on the part of speech and semantic type of the respective terms. A bounding box identifying a semantic relationship is then generated around a set of nodes on the parse tree representation, the set of nodes including the two or more terms.

Type: Application

Filed: June 11, 2019

Publication date: December 17, 2020

Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
RELEVANCY AS AN INDICATOR FOR DETERMINING DOCUMENT QUALITY

Publication number: 20200387571

Abstract: A method, computer system, and a computer program product for relevancy-based document quality assessment is provided. The present invention may include computing a document quality score based on at least one container relevancy score determined based on at least one domain link to a domain knowledge base.

Type: Application

Filed: June 6, 2019

Publication date: December 10, 2020

Inventors: Roberto Delima, Andrew R. Freed, Brien Muschett, Krishna Mahajan, David Contreras
AUTOMATIC DETECTION OF CONTEXT SWITCH TRIGGERS

Publication number: 20200387572

Abstract: A method, system, and computer program product include providing a list of triggers, training the natural language processor with the list of triggers, providing to the natural language processor a text including one trigger, selecting nodes in the text to create an original potential span, predicting whether the original potential span includes another trigger, and adjusting, in response to predicting that the original potential span includes another trigger, the original potential span to exclude the another trigger to create a new potential span.

Type: Application

Filed: June 4, 2019

Publication date: December 10, 2020

Inventors: Corville O. Allen, Roberto Delima, David Contreras, Krishna Mahajan
IDENTIFYING SPANS USING VISUAL RECOGNITION

Publication number: 20200342053

Abstract: Aspects of the present disclosure relate to identifying spans within unstructured electronic text. Natural language content is received. A part of speech and slot name of each word within the natural language content is identified. A parse tree representation is then generated based on the natural language content, wherein visual characteristics of each node of a plurality of nodes within the parse tree representation depend on the part of speech and slot name of each word. A bounding box identifying a span category is then generated around a set of nodes on the parse tree representation by a machine learning model.

Type: Application

Filed: April 24, 2019

Publication date: October 29, 2020

Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
DOCUMENT TYPE-SPECIFIC QUALITY MODEL

Publication number: 20200334546

Abstract: An approach is provided that receives a document and a document type of the document. The document type identifies a document category to which the received document belongs. A set of linguistic metrics are retrieved that correspond to the document type. A quality of the received document is automatically determined based on a set of linguistic features found in the document as compared to the retrieved set of linguistic metrics. The document is then ingested into a corpus that is utilized by a question-answering (QA) system. The ingestion of the document is based on the determined quality.

Type: Application

Filed: April 17, 2019

Publication date: October 22, 2020

Inventors: Brien H. Muschett, Andrew R. Freed, Roberto Delima, David Contreras, Krishna Mahajan
NATURAL LANGUAGE PROCESSING MATRICES

Publication number: 20200311197

Abstract: A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.

Type: Application

Filed: March 27, 2019

Publication date: October 1, 2020

Inventors: Corville O. Allen, Roberto Delima, Chris Mwarabu, David Contreras, Kandhan Sekar, Krishna Mahajan
CLIENT-SPECIFIC DOCUMENT QUALITY MODEL

Publication number: 20200302332

Abstract: A computer-implemented method, system and computer program product for generating a client-specific document quality model, by: analyzing data using existing quality heuristics to identify new, unexpected or problem patterns in the data; forming the quality heuristics into one or more clusters for each container level of the data; exploring each of the clusters to identify sources of the patterns; and developing new quality heuristics based on the sources of the patterns, wherein the new quality heuristics are used to generate the client-specific document quality model.

Type: Application

Filed: March 20, 2019

Publication date: September 24, 2020

Inventors: David Contreras, Krishna Mahajan, Roberto Delima, Andrew R. Freed, Brien Muschett
