Patents by Inventor Roberto DeLima

Roberto DeLima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Deep learning approach to grammatical correction for incomplete parses

Patent number: 10740555

Abstract: Performing an operation comprising determining that a parse of an input string comprising a plurality of tokens is incomplete, generating, based on a machine learning (ML) model: (i) a plurality of candidate addition tokens for adding to the input string, and (ii) a plurality of candidate removal tokens for removing from the input string, selecting, from the plurality of candidate addition tokens and the plurality of candidate removal tokens, a first candidate token, and modifying the input string based on the first candidate token to facilitate a complete parse of the modified input string by a parser.

Type: Grant

Filed: December 7, 2017

Date of Patent: August 11, 2020

Assignee: International Business Machines Corporation

Inventors: Aysu Ezen Can, Roberto Delima, David Contreras, Corville O. Allen
GENERATING RULES FOR AUTOMATED TEXT ANNOTATION

Publication number: 20200175104

Abstract: Natural language text and annotated text can be received. The annotated text can specify at least one anchor and at least one trigger contained in the natural language text and indicate a correspondence between the anchor and the trigger. The natural language text, the annotated text and at least one parse tree generated from the natural language text can be processed. Based on the processing, at least one natural language processing rule can be generated and output. The natural language processing rule can be configured to be executed by a processor to process other natural language text.

Type: Application

Filed: November 29, 2018

Publication date: June 4, 2020

Inventors: Yurdaer N. Doganata, Roberto Delima, Aysu Ezen Can
DE-IDENTIFICATION OF ELECTRONIC MEDICAL RECORDS FOR CONTINUOUS DATA DEVELOPMENT

Publication number: 20200175203

Abstract: A method for de-identifying protected health information (PHI) associated with electronic medical records (EMRs) based on a common analysis structure (CAS) is provided. The method may include detecting a system event associated with a system comprising the EMRs. The method may further include in response to detecting the system event, detecting a first CAS associated with the EMRs. The method may further include extracting first CAS data associated with the first CAS, wherein the first CAS data comprises unstructured data associated with the EMRs and normalized annotations based on CAS objects that are associated with the unstructured data. The method may further include obfuscating the unstructured data associated with the first CAS. The method may also include generating a second CAS comprising the obfuscated unstructured data and a copied version of the normalized annotations, wherein the copied version of normalized annotations are correlated with the obfuscated unstructured data.

Type: Application

Filed: November 30, 2018

Publication date: June 4, 2020

Inventors: Corville O. Allen, Aysu Ezen Can, Roberto Delima, Robert C. Sizemore
CONTEXTUAL SPAN FRAMEWORK

Publication number: 20200175116

Abstract: A phrase that includes a trigger word that modifies a meaning within the phrase is received. The trigger word is identified. The words of the phrase that are modified by the trigger word are identified by analyzing features of the phrase that link the trigger word to other words. The phrase is interpreted by modifying the second subset of words according to the modification of the trigger word.

Type: Application

Filed: September 11, 2019

Publication date: June 4, 2020

Inventors: David Contreras, Krishna Mahajan, Roberto Delima, Kandhan Sekar, Corville O. Allen, Chris Mwarabu
ACTIVE LEARNING FOR CONCEPT DISAMBIGUATION

Publication number: 20190370696

Abstract: A method, computer system, and a computer program product for active machine learning is provided. The present invention may include annotating a plurality of data entries. The present invention may also include building a first dataset based on the annotated plurality of data entries. The present invention may then include receiving user feedback based on the built first dataset. The present invention may further include assigning a plurality of weights to a plurality of data entry subsets. The present invention may also include generating a second weighted dataset based on the received user feedback.

Type: Application

Filed: June 3, 2018

Publication date: December 5, 2019

Inventors: Aysu Ezen Can, Corville O. Allen, Roberto Delima
Scoring attributes in a deep question answering system based on syntactic or semantic guidelines

Patent number: 10437835

Abstract: Systems and computer program products to, responsive to receiving a case by a deep question answering (deep QA) system, identify, in a corpus of information, a first variable for which a value was not specified in the case, compute an importance score for the first variable based on a concept in the corpus, wherein the concept is associated with the first variable, and upon determining that the importance score exceeds an importance threshold, determine that specifying a value for the first variable increases a confidence score of a response returned by the deep QA system beyond a confidence threshold.

Type: Grant

Filed: December 18, 2014

Date of Patent: October 8, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto Delima, Thomas J. Eggebraaten, Marie L. Setnes
Scoring attributes in a deep question answering system based on syntactic or semantic guidelines

Patent number: 10437836

Abstract: Methods to, responsive to receiving a case by a deep question answering (deep QA) system, identify, in a corpus of information, a first variable for which a value was not specified in the case, compute an importance score for the first variable based on a concept in the corpus, wherein the concept is associated with the first variable, and upon determining that the importance score exceeds an importance threshold, determine that specifying a value for the first variable increases a confidence score of a response returned by the deep QA system beyond a confidence threshold.

Type: Grant

Filed: September 23, 2015

Date of Patent: October 8, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto Delima, Thomas J. Eggebraaten, Marie L. Setnes
Mining new negation triggers dynamically based on structured and unstructured knowledge

Patent number: 10380251

Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement a cognitive natural language processing system. The cognitive natural language processing (NLP) system analyzes a portion of natural language text to identify an attribute specified in the natural language text. The cognitive NLP system analyzes the portion of natural language text to determine whether a known negation trigger is present in the natural language text in association with the attribute. In response to determining that the natural language text does not contain a known negation trigger in association with the attribute, the cognitive NLP system determines whether the attribute is negated based on instances of the attribute in other natural language content similar to the natural language text.

Type: Grant

Filed: September 9, 2016

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
LINGUISTIC BASED DETERMINATION OF TEXT LOCATION ORIGIN

Publication number: 20190236131

Abstract: A method and system for determining a location of origin and a time period in which a document was written is disclosed. A text is received and a set of linguistic characteristics for the text are identified. A set of possible locations and time periods for the text are determined based on the set of linguistic characteristics. A set of reference documents are used to determine a proximity rating for the text based upon a determination of how close the text is to the reference documents. The potential locations and time periods are ranked and returned for presentation.

Type: Application

Filed: April 8, 2019

Publication date: August 1, 2019

Inventors: Corville O. Allen, Roberto DeLima, Andrew R. Freed, Robert L. Nielsen
Anaphora resolution for medical text with machine learning and relevance feedback

Patent number: 10366161

Abstract: The program directs a computer processor to resolve an anaphor in electronic natural language text. The program detects a plurality of entities and an anaphor in a span of parsed natural language text comprising one or more sentences, and extracts pairs of related entities based on domain knowledge. The program constructs a set of tuples, wherein each tuple is a data type comprising an anaphor, an antecedent entity (AE) appearing before the anaphor in the span of parsed natural language text, and an entity (E) appearing after the anaphor in the span of parsed natural language text, wherein the anaphor refers to the AE and relates the AE to the E. The program resolves the anaphor by determining which entity in the plurality of entities the anaphor references, using the constructed set of tuples, and selecting an AE among one or more candidate AEs.

Type: Grant

Filed: August 2, 2017

Date of Patent: July 30, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
Personalized approach to handling hypotheticals in text

Patent number: 10360301

Abstract: Mechanisms receive natural language content and analyze the natural language content to generate a parse tree data structure. The mechanisms process the parse tree data structure to identify one or more instances of candidate hypothetical spans in the natural language content. Hypothetical spans are terms or phrases indicative of a hypothetical statement. The mechanisms calculate, for each candidate hypothetical span, a confidence score value indicative of a confidence that the candidate hypothetical span is an actual hypothetical span based on a personalized hypothetical dictionary data structure associated with a source of the natural language content. The mechanisms perform an operation based on the natural language content.

Type: Grant

Filed: October 10, 2016

Date of Patent: July 23, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
DEEP LEARNING APPROACH TO GRAMMATICAL CORRECTION FOR INCOMPLETE PARSES

Publication number: 20190179887

Abstract: Performing an operation comprising determining that a parse of an input string comprising a plurality of tokens is incomplete, generating, based on a machine learning (ML) model: (i) a plurality of candidate addition tokens for adding to the input string, and (ii) a plurality of candidate removal tokens for removing from the input string, selecting, from the plurality of candidate addition tokens and the plurality of candidate removal tokens, a first candidate token, and modifying the input string based on the first candidate token to facilitate a complete parse of the modified input string by a parser.

Type: Application

Filed: December 7, 2017

Publication date: June 13, 2019

Inventors: Aysu Ezen Can, Roberto DeLima, David Contreras, Corville O. Allen
Medical Concept Sorting Based on Machine Learning of Attribute Value Differentiation

Publication number: 20190163875

Abstract: Mechanisms are provided for performing entity differentiation. A cognitive medical system ingests a corpus of medical content having references to medical entities, and performs entity recognition on the medical content to identify the medical entities. Responsive to the cognitive medical system identifying a medical entity having a plurality of annotations for a same medical entity attribute, an entity differentiation component executes an ordered set of entity differentiation algorithms, corresponding to the medical entity, for differentiating medical entity attribute values. The entity differentiation component runs the ordered set of entity differentiation algorithms, in order, on the plurality of annotations for the attribute to generate a ranked list of medical entity attribute values corresponding to the annotations in the plurality of annotations. The cognitive medical system performs a cognitive operation on the medical entity based on the ranked list of medical entity attribute values.

Type: Application

Filed: November 27, 2017

Publication date: May 30, 2019

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
Determining context using weighted parsing scoring

Patent number: 10275456

Abstract: According to one embodiment, a method, computer system, and computer program product for natural language processing is provided. The present invention may include detecting natural language entities, and running parsing algorithms on the natural language entities to determine the relationship between said natural language entities. The present invention may further comprise assigning, by the parsing algorithms, initial scores to detected natural language entities based on the relationship between said natural language entities; choosing a final score for a plurality of natural language entities; and comparing the final score against a threshold to determine whether the natural language entities are within the same context.

Type: Grant

Filed: June 15, 2017

Date of Patent: April 30, 2019

Assignee: International Business Machines Corporation

Inventors: Aysu Ezen Can, Roberto DeLima, Corville Allen
Linguistic based determination of text location origin

Patent number: 10275446

Abstract: A method and system for determining a location of origin and a time period in which a document was written is disclosed. A text is received and a set of linguistic characteristics for the text are identified. A set of possible locations and time periods for the text are determined based on the set of linguistic characteristics. A set of reference documents are used to determine a proximity rating for the text based upon a determination of how close the text is to the reference documents. The potential locations and time periods are ranked and returned for presentation.

Type: Grant

Filed: September 6, 2016

Date of Patent: April 30, 2019

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto DeLima, Andrew R. Freed, Robert L. Nielsen
ANAPHORA RESOLUTION FOR MEDICAL TEXT WITH MACHINE LEARNING AND RELEVANCE FEEDBACK

Publication number: 20190042559

Abstract: The program directs a computer processor to resolve an anaphor in electronic natural language text. The program detects a plurality of entities and an anaphor in a span of parsed natural language text comprising one or more sentences, and extracts pairs of related entities based on domain knowledge. The program constructs a set of tuples, wherein each tuple is a data type comprising an anaphor, an antecedent entity (AE) appearing before the anaphor in the span of parsed natural language text, and an entity (E) appearing after the anaphor in the span of parsed natural language text, wherein the anaphor refers to the AE and relates the AE to the E. The program resolves the anaphor by determining which entity in the plurality of entities the anaphor references, using the constructed set of tuples, and selecting an AE among one or more candidate AEs.

Type: Application

Filed: August 2, 2017

Publication date: February 7, 2019

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
DETERMINING CONTEXT USING WEIGHTED PARSING SCORING

Publication number: 20180365226

Abstract: According to one embodiment, a method, computer system, and computer program product for natural language processing is provided. The present invention may include detecting natural language entities, and running parsing algorithms on the natural language entities to determine the relationship between said natural language entities. The present invention may further comprise assigning, by the parsing algorithms, initial scores to detected natural language entities based on the relationship between said natural language entities; choosing a final score for a plurality of natural language entities; and comparing the final score against a threshold to determine whether the natural language entities are within the same context.

Type: Application

Filed: February 22, 2018

Publication date: December 20, 2018

Inventors: Aysu Ezen Can, Roberto DeLima, Corville Allen
DETERMINING CONTEXT USING WEIGHTED PARSING SCORING

Publication number: 20180365219

Abstract: According to one embodiment, a method, computer system, and computer program product for natural language processing is provided. The present invention may include detecting natural language entities, and running parsing algorithms on the natural language entities to determine the relationship between said natural language entities. The present invention may further comprise assigning, by the parsing algorithms, initial scores to detected natural language entities based on the relationship between said natural language entities; choosing a final score for a plurality of natural language entities; and comparing the final score against a threshold to determine whether the natural language entities are within the same context.

Type: Application

Filed: June 15, 2017

Publication date: December 20, 2018

Inventors: Aysu Ezen Can, Roberto DeLima, Corville Allen
Sorting Medical Concepts According to Priority

Publication number: 20180357383

Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement a cognitive medical system. The data processing system determines a cognitive operation to be performed by the cognitive medical system. The cognitive medical system ingests a corpus of medical content. The medical content comprises references to medical entities and ingesting the corpus comprises performing entity recognition on the medical content to identify the medical entities. Responsive to identifying a given medical entity having a plurality of annotations for an attribute.

Type: Application

Filed: June 7, 2017

Publication date: December 13, 2018

Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
Scoring attributes in deep question answering systems based on algorithmic source code influences

Patent number: 10127284

Abstract: Systems and computer program products to perform an operation comprising: identifying a first attribute of a source code in a deep question answering system, computing an influence score for the first attribute based on a rule in the source code used to compute a confidence score for each of a plurality of candidate answers generated by the deep question answering system, computing an importance score for the first attribute based at least in part on the computed influence score, and upon determining that the importance score exceeds a predefined threshold, storing an indication that the first attribute is an important attribute relative to other attributes specified in the source code.

Type: Grant

Filed: December 18, 2014

Date of Patent: November 13, 2018

Assignee: International Business Machines Corporation

Inventors: Corville O. Allen, Roberto Delima, Thomas J. Eggebraaten, Marie L. Setnes
