Patents by Inventor Alfio Massimiliano Gliozzo

Alfio Massimiliano Gliozzo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240111969
    Abstract: Methods, systems, and computer program products for natural language data generation using automated knowledge distillation techniques are provided herein. A computer-implemented method includes retrieving, in response to an input query, a set of passages from at least one knowledge base by processing the input query using a first set of artificial intelligence techniques; ranking at least a portion of the set of passages by processing the set of passages using a second set of artificial intelligence techniques; generating at least one natural language answer, in response to the input query, by processing a subset of the set of passages in connection with automated knowledge distillation techniques based on the ranking of the at least a portion of the set of passages; and performing automated actions based on the ranking of the at least a portion of the set of passages and/or the at least one generated natural language answer.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: Michael Robert Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
  • Patent number: 11941010
    Abstract: Embodiments of the present invention provide a computer system, a computer program product, and a method that comprises analyzing a performed query by identifying a plurality of indicative markers based on a pre-stored classification database associated with the performed query; generating a plurality of facets based on the analysis of the performed query; selecting at least two facets within the generated plurality of facets by determining a quantitative similarity value between each respective facet and the plurality of identified indicative markers associated with the performed query; dynamically ranking the selected facets by prioritizing the selected facets based on a calculated overall score associated with assigned weighted values for each selected facet in the generated plurality of facets using a supervised machine learning algorithm; and displaying the dynamically ranked facets within a user interface of a computing device associated with a user.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Soumitra Sarkar, Md Faisal Mahbub Chowdhury, Ruchi Mahindru, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
  • Patent number: 11907842
    Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: February 20, 2024
    Assignee: NTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
  • Publication number: 20230409806
    Abstract: An embodiment for encoding permutation-invariant representations of linearized tabular data. The embodiment may receive input including tabular data and linearize a column or row within the received tabular data. The embodiment may automatically assign an increasing sequence of position identifiers to each non-delimiting tokenized cell in the linearized column or row until a header delimiter is reached. The embodiment may, in response to reaching the header delimiter, automatically assign a monotonically increasing sequence of position identifiers for each non-delimiting tokenized cell positioned after the header delimiter, restarting from an integer corresponding to 1 greater than the position identifier assigned to the header delimiter for each non-delimiting tokenized cell positioned after cell delimiters.
    Type: Application
    Filed: June 17, 2022
    Publication date: December 21, 2023
    Inventors: Sarthak Dash, Sugato Bagchi, NANDANA MIHINDUKULASOORIYA, Alfio Massimiliano Gliozzo
  • Patent number: 11755843
    Abstract: Systems and techniques that facilitate spurious relationship filtration from external knowledge graphs based on distributional semantics of an input corpus are provided. In one or more embodiments, a context component can generate a context-based word embedding of one or more first terms in a document collection. The embedding can yield vector representations of the one or more first terms. The one or more first terms can correspond to knowledge terms in one or more first nodes of a knowledge graph. In one or more embodiments, a filtering component can filter out a relationship between the one or more first nodes and a second node of the knowledge graph based on a similarity value being less than a threshold. The similarity value can be a function of the vector representations of the one or more first terms. In various embodiments, cosine similarity can be used to compute the similarity value.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: September 12, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nandana Mihindukulasooriya, Robert G. Farrell, Nicolas Rodolfo Fauceglia, Alfio Massimiliano Gliozzo
  • Patent number: 11693896
    Abstract: Techniques regarding autonomous classification and/or identification of various types of noise comprised within a knowledge graph are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a knowledge extraction component, operatively coupled to the processor, that can classify a type of noise comprised within a knowledge graph. The type of noise can be generated by an information extraction process.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: July 4, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nandana Sampath Mihindukulasooriya, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo, Sarthak Dash
  • Patent number: 11694035
    Abstract: One embodiment of the present invention provides a method comprising receiving a text corpus, and generating a first list of triples based on the text corpus. Each triple of the first list comprises a first term representing a candidate hyponym, a second term representing a candidate hypernym, and a frequency value indicative of a number of times a hypernymy relation is observed between the candidate hyponym and the candidate hypernym in the text corpus. The method further comprises training a neural network for hypernym induction based on the first list. The trained neural network is a strict partial order network (SPON) model.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: July 4, 2023
    Assignee: International Business Machines Corporation
    Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Md Faisal Mahbub Chowdhury
  • Publication number: 20230177335
    Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
    Type: Application
    Filed: January 13, 2023
    Publication date: June 8, 2023
    Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
  • Patent number: 11645513
    Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.
    Type: Grant
    Filed: July 3, 2019
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
  • Patent number: 11625573
    Abstract: A first neural network is operated on a processor and a memory to encode a first natural language string into a first sentence encoding including a set of word encodings. Using a word-based attention mechanism with a context vector, a weight value for a word encoding within the first sentence encoding is adjusted to form an adjusted first sentence encoding. Using a sentence-based attention mechanism, a first relationship encoding corresponding to the adjusted first sentence encoding is determined. An absolute difference between the first relationship encoding and a second relationship encoding is computed. Using a multi-layer perceptron, a degree of analogical similarity between the first relationship encoding and a second relationship encoding is determined.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: April 11, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfio Massimiliano Gliozzo, Gaetano Rossiello, Robert G. Farrell
  • Patent number: 11615154
    Abstract: In an approach to unsupervised corpus expansion using domain-specific terms, one or more computer processors retrieve one or more domain-specific terms from a corpus of text. One or more computer processors search the World Wide Web for the one or more domain-specific terms to produce a plurality of web pages associated with each of the one or more domain-specific terms. One or more computer processors determine a confidence score for each of the plurality of web pages. One or more computer processors determine the confidence score of at least one of the plurality of web pages exceeds a pre-defined threshold. One or more computer processors add the at least one of the plurality of web pages to the corpus of text.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: March 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
  • Publication number: 20230087667
    Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. Embodiments of the present invention can, in response to receiving information, learn entity representations and cluster assignments of respective entity representations in a joint manner for both entities and relations of respective entities.
    Type: Application
    Filed: September 21, 2021
    Publication date: March 23, 2023
    Inventors: Sarthak Dash, Gaetano Rossiello, NANDANA MIHINDUKULASOORIYA, Sugato Bagchi, Alfio Massimiliano Gliozzo
  • Patent number: 11573994
    Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: February 7, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
  • Patent number: 11574179
    Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: February 7, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
  • Patent number: 11556712
    Abstract: Methods and systems for natural language processing include pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model, using a span selection training data set that associates a masked word with a passage. A natural language processing task is performed using the span selection pretrained machine learning model.
    Type: Grant
    Filed: October 8, 2019
    Date of Patent: January 17, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael R. Glass, Alfio Massimiliano Gliozzo
  • Publication number: 20230009946
    Abstract: Systems, devices, computer-implemented methods, and/or computer program products that facilitate generative relation linking for question answering over knowledge bases. In one example, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components can comprise a relation linking component. The relation linking component can map relations identified in a natural language question to corresponding relations of a knowledge base using a generative model.
    Type: Application
    Filed: July 12, 2021
    Publication date: January 12, 2023
    Inventors: Gaetano Rossiello, Nandana Mihindukulasooriya, Alfio Massimiliano Gliozzo
  • Patent number: 11526688
    Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.
    Type: Grant
    Filed: April 16, 2020
    Date of Patent: December 13, 2022
    Assignee: International Business Machines Corporation
    Inventors: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
  • Patent number: 11520762
    Abstract: A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: December 6, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sarthak Dash, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Robert G. Farrell, Bassem Makni, Avirup Sil, Vittorio Castelli, Radu Florian
  • Patent number: 11501070
    Abstract: An approach to induction of unknown terms into a term taxonomy graph may be provided. The approach may include analyzing a domain specific corpus to generate a term taxonomy graph using a term taxonomy graph generation model with a term knowledge base and determining which terms within the domain specific corpus are out of vocabulary (OOV) terms. The approach may also analyze the terms in the domain specific corpus with a semantic representation model to generate feature vectors of the OOV terms and terms known within the generated term taxonomy graph. The approach may determine if an OOV can be a hyponym of a term within the term taxonomy graph based on the feature vectors and insert the OOV term into the graph at the appropriate location.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: November 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Feifei Pan, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
  • Patent number: 11500910
    Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.
    Type: Grant
    Filed: March 21, 2018
    Date of Patent: November 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass