Patents by Inventor Alfio Massimiliano Gliozzo
Alfio Massimiliano Gliozzo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240111969
Abstract: Methods, systems, and computer program products for natural language data generation using automated knowledge distillation techniques are provided herein. A computer-implemented method includes retrieving, in response to an input query, a set of passages from at least one knowledge base by processing the input query using a first set of artificial intelligence techniques; ranking at least a portion of the set of passages by processing the set of passages using a second set of artificial intelligence techniques; generating at least one natural language answer, in response to the input query, by processing a subset of the set of passages in connection with automated knowledge distillation techniques based on the ranking of the at least a portion of the set of passages; and performing automated actions based on the ranking of the at least a portion of the set of passages and/or the at least one generated natural language answer.
Type: Application
Filed: September 30, 2022
Publication date: April 4, 2024
Inventors: Michael Robert Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
-
Patent number: 11941010
Abstract: Embodiments of the present invention provide a computer system, a computer program product, and a method that comprises analyzing a performed query by identifying a plurality of indicative markers based on a pre-stored classification database associated with the performed query; generating a plurality of facets based on the analysis of the performed query; selecting at least two facets within the generated plurality of facets by determining a quantitative similarity value between each respective facet and the plurality of identified indicative markers associated with the performed query; dynamically ranking the selected facets by prioritizing the selected facets based on a calculated overall score associated with assigned weighted values for each selected facet in the generated plurality of facets using a supervised machine learning algorithm; and displaying the dynamically ranked facets within a user interface of a computing device associated with a user.
Type: Grant
Filed: December 22, 2020
Date of Patent: March 26, 2024
Assignee: International Business Machines Corporation
Inventors: Soumitra Sarkar, Md Faisal Mahbub Chowdhury, Ruchi Mahindru, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
-
Patent number: 11907842
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Grant
Filed: January 13, 2023
Date of Patent: February 20, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Publication number: 20230409806
Abstract: An embodiment for encoding permutation-invariant representations of linearized tabular data. The embodiment may receive input including tabular data and linearize a column or row within the received tabular data. The embodiment may automatically assign an increasing sequence of position identifiers to each non-delimiting tokenized cell in the linearized column or row until a header delimiter is reached. The embodiment may, in response to reaching the header delimiter, automatically assign a monotonically increasing sequence of position identifiers for each non-delimiting tokenized cell positioned after the header delimiter, restarting from an integer corresponding to 1 greater than the position identifier assigned to the header delimiter for each non-delimiting tokenized cell positioned after cell delimiters.
Type: Application
Filed: June 17, 2022
Publication date: December 21, 2023
Inventors: Sarthak Dash, Sugato Bagchi, Nandana Mihindukulasooriya, Alfio Massimiliano Gliozzo
-
Patent number: 11755843
Abstract: Systems and techniques that facilitate spurious relationship filtration from external knowledge graphs based on distributional semantics of an input corpus are provided. In one or more embodiments, a context component can generate a context-based word embedding of one or more first terms in a document collection. The embedding can yield vector representations of the one or more first terms. The one or more first terms can correspond to knowledge terms in one or more first nodes of a knowledge graph. In one or more embodiments, a filtering component can filter out a relationship between the one or more first nodes and a second node of the knowledge graph based on a similarity value being less than a threshold. The similarity value can be a function of the vector representations of the one or more first terms. In various embodiments, cosine similarity can be used to compute the similarity value.
Type: Grant
Filed: May 18, 2021
Date of Patent: September 12, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nandana Mihindukulasooriya, Robert G. Farrell, Nicolas Rodolfo Fauceglia, Alfio Massimiliano Gliozzo
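The threshold-based filtering described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the toy two-dimensional embeddings, the threshold of 0.5, and the choice to keep edges whose terms have no vector are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_edges(edges, embeddings, threshold):
    """Keep an edge unless both terms have vectors and their similarity is below the threshold."""
    kept = []
    for head, tail in edges:
        if head not in embeddings or tail not in embeddings:
            # Assumption: with no corpus evidence for a term, the edge is left untouched.
            kept.append((head, tail))
            continue
        if cosine_similarity(embeddings[head], embeddings[tail]) >= threshold:
            kept.append((head, tail))
    return kept


# Hypothetical corpus-derived embeddings: in this toy corpus "jaguar" is used as an animal.
toy_embeddings = {"jaguar": [0.9, 0.1], "cat": [0.8, 0.2], "car": [0.1, 0.9]}
toy_edges = [("jaguar", "cat"), ("jaguar", "car")]
kept_edges = filter_edges(toy_edges, toy_embeddings, threshold=0.5)
```

With these toy vectors, the ("jaguar", "car") edge falls below the threshold and is dropped, while ("jaguar", "cat") is kept.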
-
Patent number: 11693896
Abstract: Techniques regarding autonomous classification and/or identification of various types of noise comprised within a knowledge graph are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a knowledge extraction component, operatively coupled to the processor, that can classify a type of noise comprised within a knowledge graph. The type of noise can be generated by an information extraction process.
Type: Grant
Filed: September 25, 2018
Date of Patent: July 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nandana Sampath Mihindukulasooriya, Oktie Hassanzadeh, Alfio Massimiliano Gliozzo, Sarthak Dash
-
Patent number: 11694035
Abstract: One embodiment of the present invention provides a method comprising receiving a text corpus, and generating a first list of triples based on the text corpus. Each triple of the first list comprises a first term representing a candidate hyponym, a second term representing a candidate hypernym, and a frequency value indicative of a number of times a hypernymy relation is observed between the candidate hyponym and the candidate hypernym in the text corpus. The method further comprises training a neural network for hypernym induction based on the first list. The trained neural network is a strict partial order network (SPON) model.
Type: Grant
Filed: June 9, 2021
Date of Patent: July 4, 2023
Assignee: International Business Machines Corporation
Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Md Faisal Mahbub Chowdhury
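The triple list described in this abstract (candidate hyponym, candidate hypernym, observation frequency) can be sketched as a simple counting step. The abstract does not say how hypernymy relations are detected in the text, so this sketch assumes the (hyponym, hypernym) observations have already been extracted; the function name and the sample pairs are hypothetical, and the SPON training step is not shown.

```python
from collections import Counter


def build_triples(observed_pairs):
    """Turn repeated (hyponym, hypernym) observations into (hyponym, hypernym, frequency) triples."""
    counts = Counter(observed_pairs)
    # most_common() orders by descending frequency; ties keep first-seen order.
    return [(hypo, hyper, freq) for (hypo, hyper), freq in counts.most_common()]


# Hypothetical observations, one per time the relation was seen in the corpus.
observations = [("dog", "animal"), ("cat", "animal"), ("dog", "animal")]
triples = build_triples(observations)
```

Each resulting triple pairs a candidate hyponym and hypernym with the number of times that relation was observed, which is the form of training input the abstract describes.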
-
Publication number: 20230177335
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Application
Filed: January 13, 2023
Publication date: June 8, 2023
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Patent number: 11645513
Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.
Type: Grant
Filed: July 3, 2019
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
-
Patent number: 11625573
Abstract: A first neural network is operated on a processor and a memory to encode a first natural language string into a first sentence encoding including a set of word encodings. Using a word-based attention mechanism with a context vector, a weight value for a word encoding within the first sentence encoding is adjusted to form an adjusted first sentence encoding. Using a sentence-based attention mechanism, a first relationship encoding corresponding to the adjusted first sentence encoding is determined. An absolute difference between the first relationship encoding and a second relationship encoding is computed. Using a multi-layer perceptron, a degree of analogical similarity between the first relationship encoding and a second relationship encoding is determined.
Type: Grant
Filed: October 29, 2018
Date of Patent: April 11, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alfio Massimiliano Gliozzo, Gaetano Rossiello, Robert G. Farrell
-
Patent number: 11615154
Abstract: In an approach to unsupervised corpus expansion using domain-specific terms, one or more computer processors retrieve one or more domain-specific terms from a corpus of text. One or more computer processors search the World Wide Web for the one or more domain-specific terms to produce a plurality of web pages associated with each of the one or more domain-specific terms. One or more computer processors determine a confidence score for each of the plurality of web pages. One or more computer processors determine the confidence score of at least one of the plurality of web pages exceeds a pre-defined threshold. One or more computer processors add the at least one of the plurality of web pages to the corpus of text.
Type: Grant
Filed: February 17, 2021
Date of Patent: March 28, 2023
Assignee: International Business Machines Corporation
Inventors: Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
-
Publication number: 20230087667
Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. Embodiments of the present invention can, in response to receiving information, learn entity representations and cluster assignments of respective entity representations in a joint manner for both entities and relations of respective entities.
Type: Application
Filed: September 21, 2021
Publication date: March 23, 2023
Inventors: Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Massimiliano Gliozzo
-
Patent number: 11573994
Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.
Type: Grant
Filed: April 14, 2020
Date of Patent: February 7, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
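The iterative merging described in this abstract resembles bottom-up (agglomerative) clustering, and a minimal sketch of that reading follows. Several details are assumptions not stated in the abstract: the similarity of two nodes is taken as the average of the cosine similarities of their spelling and context vectors, an entity node's vectors are the element-wise mean of its children's, an entity node's score is the similarity of the two nodes it merged, and leaves get a score of 1.0. All names and the toy vectors are hypothetical.

```python
import math
from dataclasses import dataclass, field
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


@dataclass
class Node:
    mentions: list            # mention strings covered by this node
    spelling: list            # first vector: spelling data
    context: list             # second vector: context data
    score: float = 1.0        # assumption: a single mention is trivially coherent
    children: list = field(default_factory=list)


def similarity(a, b):
    # Assumption: equal weighting of spelling and context similarity.
    return 0.5 * cosine(a.spelling, b.spelling) + 0.5 * cosine(a.context, b.context)


def mean(u, v):
    return [(x + y) / 2 for x, y in zip(u, v)]


def build_tree(leaves):
    """Repeatedly merge the most similar pair of nodes until a single root remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        a, b = max(combinations(nodes, 2), key=lambda pair: similarity(*pair))
        parent = Node(a.mentions + b.mentions, mean(a.spelling, b.spelling),
                      mean(a.context, b.context), similarity(a, b), [a, b])
        nodes = [n for n in nodes if n is not a and n is not b] + [parent]
    return nodes[0]


def entities(node, threshold):
    """Return the mention groups of the highest nodes whose score reaches the threshold."""
    if node.score >= threshold:
        return [node.mentions]
    return [group for child in node.children for group in entities(child, threshold)]


toy_leaves = [
    Node(["IBM"], [1.0, 0.0], [1.0, 0.0]),
    Node(["I.B.M."], [0.9, 0.1], [1.0, 0.1]),
    Node(["Apple"], [0.0, 1.0], [0.0, 1.0]),
]
root = build_tree(toy_leaves)
groups = entities(root, threshold=0.8)
```

On the toy mentions, "IBM" and "I.B.M." merge into one entity node with a high score, while the root that also covers "Apple" scores too low and is split back into its children. The exhaustive pairwise search is quadratic per iteration and is only meant for small illustrations.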
-
Patent number: 11574179
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Grant
Filed: January 7, 2019
Date of Patent: February 7, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Patent number: 11556712
Abstract: Methods and systems for natural language processing include pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model, using a span selection training data set that associates a masked word with a passage. A natural language processing task is performed using the span selection pretrained machine learning model.
Type: Grant
Filed: October 8, 2019
Date of Patent: January 17, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael R. Glass, Alfio Massimiliano Gliozzo
-
Publication number: 20230009946
Abstract: Systems, devices, computer-implemented methods, and/or computer program products that facilitate generative relation linking for question answering over knowledge bases. In one example, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components can comprise a relation linking component. The relation linking component can map relations identified in a natural language question to corresponding relations of a knowledge base using a generative model.
Type: Application
Filed: July 12, 2021
Publication date: January 12, 2023
Inventors: Gaetano Rossiello, Nandana Mihindukulasooriya, Alfio Massimiliano Gliozzo
-
Patent number: 11526688
Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.
Type: Grant
Filed: April 16, 2020
Date of Patent: December 13, 2022
Assignee: International Business Machines Corporation
Inventors: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
-
Patent number: 11520762
Abstract: A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.
Type: Grant
Filed: December 13, 2019
Date of Patent: December 6, 2022
Assignee: International Business Machines Corporation
Inventors: Sarthak Dash, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Robert G. Farrell, Bassem Makni, Avirup Sil, Vittorio Castelli, Radu Florian
-
Patent number: 11501070
Abstract: An approach to induction of unknown terms into a term taxonomy graph may be provided. The approach may include analyzing a domain specific corpus to generate a term taxonomy graph using a term taxonomy graph generation model with a term knowledge base and determining which terms within the domain specific corpus are out of vocabulary (OOV) terms. The approach may also analyze the terms in the domain specific corpus with a semantic representation model to generate feature vectors of the OOV terms and terms known within the generated term taxonomy graph. The approach may determine if an OOV can be a hyponym of a term within the term taxonomy graph based on the feature vectors and insert the OOV term into the graph at the appropriate location.
Type: Grant
Filed: July 1, 2020
Date of Patent: November 15, 2022
Assignee: International Business Machines Corporation
Inventors: Feifei Pan, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
-
Patent number: 11500910
Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.
Type: Grant
Filed: March 21, 2018
Date of Patent: November 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass