Patents by Inventor Ismini Lourentzou

Ismini Lourentzou has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11645464
    Abstract: Systems, computer-implemented methods, and computer program products to transform a lexicon that describes an information asset are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a term validation component that can determine from a subject matter expert, a validated term that can indicate validation of a candidate term that describes an information asset. The computer executable components can further comprise a lexicon transforming component that, based on the validated term, can transform a lexicon that describes the information asset, by incorporating the validated term into the lexicon.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Anna Lisa Gentile, Chad Eric DeLuca, Petar Ristoski, Ismini Lourentzou, Linda Ha Kato, Alfredo Alba, Daniel Gruhl, Steven R. Welch
  • Patent number: 11593419
    Abstract: One embodiment provides a method that includes determining candidate ontologies for alignment from multiple available knowledge bases. An initial target ontology is selected from the candidate ontologies, and the initial selected ontology is corrected with received refinement input. Concepts in the selected initial ontology are aligned with concepts of the target ontology using a deep learning hierarchical classification with received review input. A user is assisted to build, change and grow the selected initial ontology exploiting both the target ontology and new facts extracted from unstructured data.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: February 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Petar Ristoski, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Chris Kau, Chad DeLuca, Linda Kato, Ismini Lourentzou, Steven R. Welch
  • Patent number: 11551437
    Abstract: Embodiments relate to a system, program product, and method for information extraction and annotation of a data set. Neural models are utilized to automatically attach machine annotations to data elements within an unlabeled data set. The attached machine annotations are evaluated and a score is attached to the annotations. The score reflects a confidence of correctness of the annotations. A labeled data set is iteratively expanded with selectively evaluated annotations based on the attached score. The labeled data set is applied to an unexplored corpus to identify matching corpus data to populated instances of the labeled data set.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: January 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Ismini Lourentzou, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Petar Ristoski, Chad Eric DeLuca, Linda Ha Kato, Chris Kau, Steven R. Welch
  • Publication number: 20220300709
    Abstract: Systems, computer-implemented methods, and computer program products to transform a lexicon that describes an information asset are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a term validation component that can determine from a subject matter expert, a validated term that can indicate validation of a candidate term that describes an information asset. The computer executable components can further comprise a lexicon transforming component that, based on the validated term, can transform a lexicon that describes the information asset, by incorporating the validated term into the lexicon.
    Type: Application
    Filed: March 18, 2021
    Publication date: September 22, 2022
    Inventors: Anna Lisa Gentile, Chad Eric DeLuca, Petar Ristoski, Ismini Lourentzou, Linda Ha Kato, Alfredo Alba, Daniel Gruhl, Steven R. Welch
  • Patent number: 11416562
    Abstract: In an approach to corpus expansion using lexical signatures, one or more computer processors retrieve a donor corpus of text, wherein the donor corpus includes a plurality of documents. One or more computer processors generate a document signature for each of the plurality of documents in the donor corpus. One or more computer processors retrieve a target corpus of text for expansion. One or more computer processors generate a corpus signature for the target corpus. One or more computer processors compare each document signature to the corpus signature. Based on the comparison, one or more computer processors determine a similarity score for each document signature. One or more computer processors rank the plurality of documents by the similarity score. One or more computer processors add one or more top-ranked documents of the plurality of documents to the target corpus.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: August 16, 2022
    Assignee: International Business Machines Corporation
    Inventors: Daniel Gruhl, Anna Lisa Gentile, Petar Ristoski, Linda Ha Kato, Chad Eric DeLuca, Steven R. Welch, Alfredo Alba, Ismini Lourentzou
  • Patent number: 11379669
    Abstract: Embodiments relate to a system, program product, and method for dictionary membership management directed at identifying ambiguity in semantic resources. A dictionary of seed terms is applied to a text corpus and matching items in the corpus are identified. The linguistic properties for each matching item are characterized and a context pattern of each matching item is constructed. Each context pattern is applied to the dictionary and matching content between the seed terms and the context pattern is identified and quantified. Lexicon items from the dictionary that have anomalous behavior reflected in the quantification are identified. One or more seed words identified as having anomalous behavior are selectively removed from the dictionary.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: July 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: Anna Lisa Gentile, Anni R. Coden, Ismini Lourentzou, Daniel Gruhl, Chad Eric DeLuca, Petar Ristoski, Linda Ha Kato, Chris Kau, Steven R. Welch, Alfredo Alba
  • Publication number: 20220101188
    Abstract: An embodiment includes generating a query prompting a user to select from among a plurality of response options related to a first query set of objects. The embodiment also receives, responsive to the query, user input representative of a selected response option selected by the user from among the plurality of response options. The embodiment also calculates a plurality of weight values for respective ones of a plurality of similarity matrices based on the selected response option, where the plurality of similarity matrices include respective different sets of similarity values, each set of similarity values comprising similarity values representative of similarities of respective pairs of the plurality of objects. The embodiment stores a designated similarity matrix that is selected from among the plurality of similarity matrices based at least in part on a weight value from among the plurality of weight values assigned to the designated similarity matrix.
    Type: Application
    Filed: September 30, 2020
    Publication date: March 31, 2022
    Applicant: International Business Machines Corporation
    Inventors: Ismini Lourentzou, Daniel Gruhl, Steven R. Welch, Chad Eric DeLuca, Alfredo Alba, Linda Ha Kato, Petar Ristoski, Anna Lisa Gentile
  • Patent number: 11151175
    Abstract: One embodiment provides a method for on-demand relation extraction from unstructured text that includes obtaining a text corpus of domain-related unstructured text. Representations of the unstructured text that capture entity-specific syntactic knowledge are created. Initial user seeds of informative examples containing relations are received. Extraction models in a neural network are trained using the initial user seeds. Performance information and a confidence score are provided for each prediction for each extraction model. A next batch of informative examples is identified for annotation from the text corpus based on training a neural network classifier on a pool of labeled informative examples. Stopping criteria are determined based on differences of the performance information and the confidence score in relation to parameters for each extraction model. Based on the stopping criteria, it is determined whether to retrain a particular extraction model after the informative examples have been labeled.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ismini Lourentzou, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Chris Kau, Chad DeLuca, Linda Kato, Petar Ristoski, Steven R. Welch
  • Patent number: 11030402
    Abstract: Embodiments relate to a system, program product, and method for iterative expansion and application of a domain-specific dictionary. One or more dictionary instances are applied against a text corpus. The dictionary is iteratively expanded and selectively populated with one or more additional dictionary instances, including semantically similar instances to the applied dictionary instances and extension instances contextually related to the applied dictionary instances. The iteratively expanded dictionary is applied to an unexplored corpus to identify matching corpus data to populated instances of the dictionary.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Petar Ristoski, Daniel Gruhl, Alfredo Alba, Anna Lisa Gentile, Ismini Lourentzou, Chad Eric DeLuca, Linda Ha Kato, Steven R. Welch, Chris Kau
  • Publication number: 20210034704
    Abstract: Embodiments relate to a system, program product, and method for dictionary membership management directed at identifying ambiguity in semantic resources. A dictionary of seed terms is applied to a text corpus and matching items in the corpus are identified. The linguistic properties for each matching item are characterized and a context pattern of each matching item is constructed. Each context pattern is applied to the dictionary and matching content between the seed terms and the context pattern is identified and quantified. Lexicon items from the dictionary that have anomalous behavior reflected in the quantification are identified. One or more seed words identified as having anomalous behavior are selectively removed from the dictionary.
    Type: Application
    Filed: July 29, 2019
    Publication date: February 4, 2021
    Applicant: International Business Machines Corporation
    Inventors: Anna Lisa Gentile, Anni R. Coden, Ismini Lourentzou, Daniel Gruhl, Chad Eric DeLuca, Petar Ristoski, Linda Ha Kato, Chris Kau, Steven R. Welch, Alfredo Alba
  • Publication number: 20200380311
    Abstract: Embodiments relate to a system, program product, and method for information extraction and annotation of a data set. Neural models are utilized to automatically attach machine annotations to data elements within an unlabeled data set. The attached machine annotations are evaluated and a score is attached to the annotations. The score reflects a confidence of correctness of the annotations. A labeled data set is iteratively expanded with selectively evaluated annotations based on the attached score. The labeled data set is applied to an unexplored corpus to identify matching corpus data to populated instances of the labeled data set.
    Type: Application
    Filed: May 29, 2019
    Publication date: December 3, 2020
    Applicant: International Business Machines Corporation
    Inventors: Ismini Lourentzou, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Petar Ristoski, Chad Eric DeLuca, Linda Ha Kato, Chris Kau, Steven R. Welch
  • Publication number: 20200349226
    Abstract: Embodiments relate to a system, program product, and method for iterative expansion and application of a domain-specific dictionary. One or more dictionary instances are applied against a text corpus. The dictionary is iteratively expanded and selectively populated with one or more additional dictionary instances, including semantically similar instances to the applied dictionary instances and extension instances contextually related to the applied dictionary instances. The iteratively expanded dictionary is applied to an unexplored corpus to identify matching corpus data to populated instances of the dictionary.
    Type: Application
    Filed: May 3, 2019
    Publication date: November 5, 2020
    Applicant: International Business Machines Corporation
    Inventors: Petar Ristoski, Daniel Gruhl, Alfredo Alba, Anna Lisa Gentile, Ismini Lourentzou, Chad Eric DeLuca, Linda Ha Kato, Steven R. Welch, Chris Kau
  • Publication number: 20200097602
    Abstract: One embodiment provides a method that includes determining candidate ontologies for alignment from multiple available knowledge bases. An initial target ontology is selected from the candidate ontologies, and the initial selected ontology is corrected with received refinement input. Concepts in the selected initial ontology are aligned with concepts of the target ontology using a deep learning hierarchical classification with received review input. A user is assisted to build, change and grow the selected initial ontology exploiting both the target ontology and new facts extracted from unstructured data.
    Type: Application
    Filed: September 25, 2018
    Publication date: March 26, 2020
    Inventors: Petar Ristoski, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Chris Kau, Chad DeLuca, Linda Kato, Ismini Lourentzou, Steven R. Welch
  • Publication number: 20200097597
    Abstract: One embodiment provides a method for on-demand relation extraction from unstructured text that includes obtaining a text corpus of domain-related unstructured text. Representations of the unstructured text that capture entity-specific syntactic knowledge are created. Initial user seeds of informative examples containing relations are received. Extraction models in a neural network are trained using the initial user seeds. Performance information and a confidence score are provided for each prediction for each extraction model. A next batch of informative examples is identified for annotation from the text corpus based on training a neural network classifier on a pool of labeled informative examples. Stopping criteria are determined based on differences of the performance information and the confidence score in relation to parameters for each extraction model. Based on the stopping criteria, it is determined whether to retrain a particular extraction model after the informative examples have been labeled.
    Type: Application
    Filed: September 24, 2018
    Publication date: March 26, 2020
    Inventors: Ismini Lourentzou, Anna Lisa Gentile, Daniel Gruhl, Alfredo Alba, Chris Kau, Chad DeLuca, Linda Kato, Petar Ristoski, Steven R. Welch
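
Illustrative sketch (not taken from any of the patents listed): the abstract of patent 11416562 above describes generating a signature for each donor document, generating a signature for the target corpus, scoring each document against the corpus signature, ranking, and adding the top-ranked documents. The abstract does not say how signatures or similarity scores are computed, so the Python below assumes a simple word-count signature and cosine similarity. The names signature, cosine, expand_corpus, and top_k are hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def signature(text):
    """Assumed lexical signature: counts of lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count signatures (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_corpus(target, donor, top_k=1):
    """Return the target corpus plus the top_k donor documents most similar to it."""
    # Corpus signature: here, the summed signatures of all target documents (an assumption).
    corpus_sig = Counter()
    for doc in target:
        corpus_sig.update(signature(doc))

    # Score each donor document against the corpus signature and rank by score.
    ranked = sorted(
        donor,
        key=lambda doc: cosine(signature(doc), corpus_sig),
        reverse=True,
    )
    return list(target) + ranked[:top_k]


if __name__ == "__main__":
    target = [
        "ontology alignment with deep learning",
        "ontology concepts and knowledge bases",
    ]
    donor = [
        "recipes for baking bread",
        "aligning ontology concepts across knowledge bases",
    ]
    for doc in expand_corpus(target, donor, top_k=1):
        print(doc)
```

In this toy run the second donor document shares vocabulary with the target corpus and the first shares none, so only the second is added. A real system would likely use a richer signature (for example, weighted terms) and a similarity threshold rather than a fixed top_k.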