Patents by Inventor Roberto DeLima

Roberto DeLima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11823798
    Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement a clinical decision support system. The mechanism receives a plurality of patient electronic medical records (EMRs) for a patient from a plurality of different sources. For a portion of a patient EMR of the plurality of patient EMRs, the mechanism detects entities and analyzes a document structure of the portion of the patient EMR to identify a hierarchical structure of the portion of the patient EMR. The mechanism generates a container representation of the portion of the patient EMR based on the hierarchical structure. The mechanism places each of one or more sentences within the container representation based on relative position within the hierarchical structure.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: November 21, 2023
    Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
  • Patent number: 11687796
    Abstract: An approach is provided that receives a document and a document type of the document. The document type identifies a document category to which the received document belongs. A set of linguistic metrics are retrieved that correspond to the document type. A quality of the received document is automatically determined based on a set of linguistic features found in the document as compared to the retrieved set of linguistic metrics. The document is then ingested into a corpus that is utilized by a question-answering (QA) system. The ingestion of the document is based on the determined quality.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: June 27, 2023
    Assignee: International Business Machines Corporation
    Inventors: Brien H. Muschett, Andrew R. Freed, Roberto Delima, David Contreras, Krishna Mahajan
  • Patent number: 11636376
    Abstract: A method, computer system, and a computer program product for active machine learning is provided. The present invention may include annotating a plurality of data entries. The present invention may also include building a first dataset based on the annotated plurality of data entries. The present invention may then include receiving user feedback based on the built first dataset. The present invention may further include assigning a plurality of weights to a plurality of data entry subsets. The present invention may also include generating a second weighted dataset based on the received user feedback.
    Type: Grant
    Filed: June 3, 2018
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Aysu Ezen Can, Corville O. Allen, Roberto Delima
  • Patent number: 11593561
    Abstract: A phrase that includes a trigger word that modifies a meaning within the phrase is received. The trigger word is identified. The words of the phrase that are modified by the trigger word are identified by analyzing features of the phrase that link the trigger word to other words. The phrase is interpreted by modifying those words according to the modification of the trigger word.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: February 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: David Contreras, Krishna Mahajan, Roberto Delima, Kandhan Sekar, Corville O. Allen, Chris Mwarabu
  • Patent number: 11295080
    Abstract: A method, system, and computer program product include providing a list of triggers, training a natural language processor with the list of triggers, providing to the natural language processor a text including one trigger, selecting nodes in the text to create an original potential span, predicting whether the original potential span includes another trigger, and adjusting, in response to predicting that the original potential span includes another trigger, the original potential span to exclude the other trigger to create a new potential span.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: April 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Roberto Delima, David Contreras, Krishna Mahajan
  • Patent number: 11138373
    Abstract: A method and system for determining a location of origin and a time period in which a document was written is disclosed. A text is received and a set of linguistic characteristics for the text are identified. A set of possible locations and time periods for the text are determined based on the set of linguistic characteristics. A set of reference documents are used to determine a proximity rating for the text based upon a determination of how close the text is to the reference documents. The potential locations and time periods are ranked and returned for presentation.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: October 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Roberto DeLima, Andrew R. Freed, Robert L. Nielsen
  • Patent number: 11138380
    Abstract: Aspects of the present disclosure relate to identifying semantic relationships. Natural language content is received. A part of speech is determined for respective terms within the natural language content. A semantic type is determined for each of two or more terms within the natural language content. A parse tree representation containing a plurality of nodes is then generated based on the natural language content, each of the plurality of nodes corresponding to at least one term within the natural language content, wherein visual characteristics of respective nodes of the plurality of nodes within the parse tree representation depend on the part of speech and semantic type of the respective terms. A bounding box identifying a semantic relationship is then generated around a set of nodes on the parse tree representation, the set of nodes including the two or more terms.
    Type: Grant
    Filed: June 11, 2019
    Date of Patent: October 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
  • Patent number: 11120215
    Abstract: Aspects of the present disclosure relate to identifying spans within unstructured electronic text. Natural language content is received. A part of speech and slot name of each word within the natural language content is identified. A parse tree representation is then generated based on the natural language content, wherein visual characteristics of each node of a plurality of nodes within the parse tree representation depend on the part of speech and slot name of each word. A bounding box identifying a span category is then generated around a set of nodes on the parse tree representation by a machine learning model.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
  • Patent number: 11113469
    Abstract: A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: September 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Roberto Delima, Chris Mwarabu, David Contreras, Kandhan Sekar, Krishna Mahajan
  • Patent number: 11113418
    Abstract: A method for de-identifying protected health information (PHI) associated with electronic medical records (EMRs) based on a common analysis structure (CAS) is provided. The method may include detecting a system event associated with a system comprising the EMRs. The method may further include, in response to detecting the system event, detecting a first CAS associated with the EMRs. The method may further include extracting first CAS data associated with the first CAS, wherein the first CAS data comprises unstructured data associated with the EMRs and normalized annotations based on CAS objects that are associated with the unstructured data. The method may further include obfuscating the unstructured data associated with the first CAS. The method may also include generating a second CAS comprising the obfuscated unstructured data and a copied version of the normalized annotations, wherein the copied version of the normalized annotations is correlated with the obfuscated unstructured data.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: September 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Aysu Ezen Can, Roberto Delima, Robert C. Sizemore
  • Patent number: 11017171
    Abstract: A method, computer system, and a computer program product for relevancy-based document quality assessment is provided. The present invention may include computing a document quality score based on at least one container relevancy score determined based on at least one domain link to a domain knowledge base.
    Type: Grant
    Filed: June 6, 2019
    Date of Patent: May 25, 2021
    Assignee: International Business Machines Corporation
    Inventors: Roberto Delima, Andrew R. Freed, Brien Muschett, Krishna Mahajan, David Contreras
  • Patent number: 10937551
    Abstract: Mechanisms are provided for performing entity differentiation. A cognitive medical system ingests a corpus of medical content having references to medical entities, and performs entity recognition on the medical content to identify the medical entities. Responsive to the cognitive medical system identifying a medical entity having a plurality of annotations for a same medical entity attribute, an entity differentiation component executes an ordered set of entity differentiation algorithms, corresponding to the medical entity, for differentiating medical entity attribute values. The entity differentiation component runs the ordered set of entity differentiation algorithms, in order, on the plurality of annotations for the attribute to generate a ranked list of medical entity attribute values corresponding to the annotations in the plurality of annotations. The cognitive medical system performs a cognitive operation on the medical entity based on the ranked list of medical entity attribute values.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: March 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Roberto DeLima, Aysu Ezen Can, Robert C. Sizemore
  • Patent number: 10902198
    Abstract: Natural language text and annotated text can be received. The annotated text can specify at least one anchor and at least one trigger contained in the natural language text and indicate a correspondence between the anchor and the trigger. The natural language text, the annotated text and at least one parse tree generated from the natural language text can be processed. Based on the processing, at least one natural language processing rule can be generated and output. The natural language processing rule can be configured to be executed by a processor to process other natural language text.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: January 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Yurdaer N. Doganata, Roberto Delima, Aysu Ezen Can
  • Publication number: 20200394267
    Abstract: Aspects of the present disclosure relate to identifying semantic relationships. Natural language content is received. A part of speech is determined for respective terms within the natural language content. A semantic type is determined for each of two or more terms within the natural language content. A parse tree representation containing a plurality of nodes is then generated based on the natural language content, each of the plurality of nodes corresponding to at least one term within the natural language content, wherein visual characteristics of respective nodes of the plurality of nodes within the parse tree representation depend on the part of speech and semantic type of the respective terms. A bounding box identifying a semantic relationship is then generated around a set of nodes on the parse tree representation, the set of nodes including the two or more terms.
    Type: Application
    Filed: June 11, 2019
    Publication date: December 17, 2020
    Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
  • Publication number: 20200387572
    Abstract: A method, system, and computer program product include providing a list of triggers, training a natural language processor with the list of triggers, providing to the natural language processor a text including one trigger, selecting nodes in the text to create an original potential span, predicting whether the original potential span includes another trigger, and adjusting, in response to predicting that the original potential span includes another trigger, the original potential span to exclude the other trigger to create a new potential span.
    Type: Application
    Filed: June 4, 2019
    Publication date: December 10, 2020
    Inventors: Corville O. Allen, Roberto Delima, David Contreras, Krishna Mahajan
  • Publication number: 20200387571
    Abstract: A method, computer system, and a computer program product for relevancy-based document quality assessment is provided. The present invention may include computing a document quality score based on at least one container relevancy score determined based on at least one domain link to a domain knowledge base.
    Type: Application
    Filed: June 6, 2019
    Publication date: December 10, 2020
    Inventors: Roberto Delima, Andrew R. Freed, Brien Muschett, Krishna Mahajan, David Contreras
  • Publication number: 20200342053
    Abstract: Aspects of the present disclosure relate to identifying spans within unstructured electronic text. Natural language content is received. A part of speech and slot name of each word within the natural language content is identified. A parse tree representation is then generated based on the natural language content, wherein visual characteristics of each node of a plurality of nodes within the parse tree representation depend on the part of speech and slot name of each word. A bounding box identifying a span category is then generated around a set of nodes on the parse tree representation by a machine learning model.
    Type: Application
    Filed: April 24, 2019
    Publication date: October 29, 2020
    Inventors: Chris Mwarabu, David Contreras, Roberto Delima, Corville O. Allen
  • Publication number: 20200334546
    Abstract: An approach is provided that receives a document and a document type of the document. The document type identifies a document category to which the received document belongs. A set of linguistic metrics are retrieved that correspond to the document type. A quality of the received document is automatically determined based on a set of linguistic features found in the document as compared to the retrieved set of linguistic metrics. The document is then ingested into a corpus that is utilized by a question-answering (QA) system. The ingestion of the document is based on the determined quality.
    Type: Application
    Filed: April 17, 2019
    Publication date: October 22, 2020
    Inventors: Brien H. Muschett, Andrew R. Freed, Roberto Delima, David Contreras, Krishna Mahajan
  • Publication number: 20200311197
    Abstract: A phrase may be received that includes a plurality of tokens in a natural language format. A plurality of levels relating to dependencies between tokens of the plurality of tokens within the phrase is determined. A matrix structure is generated for the phrase. The matrix structure utilizes a plurality of rows and a plurality of columns to store data of the phrase. The plurality of rows and the plurality of columns each indicate one of an order of tokens of the plurality of tokens or levels of the plurality of levels.
    Type: Application
    Filed: March 27, 2019
    Publication date: October 1, 2020
    Inventors: Corville O. Allen, Roberto Delima, Chris Mwarabu, David Contreras, Kandhan Sekar, Krishna Mahajan
  • Publication number: 20200302332
    Abstract: A computer-implemented method, system and computer program product for generating a client-specific document quality model, by: analyzing data using existing quality heuristics to identify new, unexpected or problem patterns in the data; forming the quality heuristics into one or more clusters for each container level of the data; exploring each of the clusters to identify sources of the patterns; and developing new quality heuristics based on the sources of the patterns, wherein the new quality heuristics are used to generate the client-specific document quality model.
    Type: Application
    Filed: March 20, 2019
    Publication date: September 24, 2020
    Inventors: David Contreras, Krishna Mahajan, Roberto Delima, Andrew R. Freed, Brien Muschett
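
As an illustration of one abstract in this listing (patent 11113469, which stores a phrase in a matrix whose rows and columns indicate token order and dependency levels), the following is a minimal sketch only. It assumes the dependencies are given as one head index per token and form a tree, and that a token's level is its depth below the root. The function names, the head-index encoding, and the example parse are hypothetical and are not taken from the patent.

```python
from typing import List, Optional


def dependency_levels(heads: List[Optional[int]]) -> List[int]:
    """Return each token's depth in the dependency tree (root = level 0).

    heads[i] is the index of token i's head, or None for the root.
    Assumes the heads form a tree (no cycles).
    """
    levels = []
    for i in range(len(heads)):
        depth, node = 0, i
        # Walk up the chain of heads until the root is reached.
        while heads[node] is not None:
            node = heads[node]
            depth += 1
        levels.append(depth)
    return levels


def token_level_matrix(tokens: List[str],
                       heads: List[Optional[int]]) -> List[List[str]]:
    """Build a matrix with one row per level and one column per token position."""
    levels = dependency_levels(heads)
    matrix = [["" for _ in tokens] for _ in range(max(levels) + 1)]
    for col, (token, level) in enumerate(zip(tokens, levels)):
        matrix[level][col] = token  # column keeps token order, row keeps level
    return matrix


# Hypothetical parse: "denies" is the root, "patient" and "pain" depend on it,
# and "chest" depends on "pain".
tokens = ["patient", "denies", "chest", "pain"]
heads = [1, None, 3, 1]
matrix = token_level_matrix(tokens, heads)
for row in matrix:
    print(row)
```

In this layout the column index preserves the original word order while the row index records the dependency level, so both can be read off a cell's position. The abstract allows either axis to carry either role; this sketch picks one arrangement.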