Patents by Inventor Gaetano Rossiello

Gaetano Rossiello has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NATURAL LANGUAGE DATA GENERATION USING AUTOMATED KNOWLEDGE DISTILLATION TECHNIQUES

Publication number: 20240111969

Abstract: Methods, systems, and computer program products for natural language data generation using automated knowledge distillation techniques are provided herein. A computer-implemented method includes retrieving, in response to an input query, a set of passages from at least one knowledge base by processing the input query using a first set of artificial intelligence techniques; ranking at least a portion of the set of passages by processing the set of passages using a second set of artificial intelligence techniques; generating at least one natural language answer, in response to the input query, by processing a subset of the set of passages in connection with automated knowledge distillation techniques based on the ranking of the at least a portion of the set of passages; and performing automated actions based on the ranking of the at least a portion of the set of passages and/or the at least one generated natural language answer.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Inventors: Michael Robert Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
Dynamic facet ranking

Patent number: 11941010

Abstract: Embodiments of the present invention provide a computer system, a computer program product, and a method that comprises analyzing a performed query by identifying a plurality of indicative markers based on a pre-stored classification database associated with the performed query; generating a plurality of facets based on the analysis of the performed query; selecting at least two facets within the generated plurality of facets by determining a quantitative similarity value between each respective facet and the plurality of identified indicative markers associated with the performed query; dynamically ranking the selected facets by prioritizing the selected facets based on a calculated overall score associated with assigned weighted values for each selected facet in the generated plurality of facets using a supervised machine learning algorithm; and displaying the dynamically ranked facets within a user interface of a computing device associated with a user.

Type: Grant

Filed: December 22, 2020

Date of Patent: March 26, 2024

Assignee: International Business Machines Corporation

Inventors: Soumitra Sarkar, Md Faisal Mahbub Chowdhury, Ruchi Mahindru, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
Relation extraction from text using machine learning

Patent number: 11625573

Abstract: A first neural network is operated on a processor and a memory to encode a first natural language string into a first sentence encoding including a set of word encodings. Using a word-based attention mechanism with a context vector, a weight value for a word encoding within the first sentence encoding is adjusted to form an adjusted first sentence encoding. Using a sentence-based attention mechanism, a first relationship encoding corresponding to the adjusted first sentence encoding is determined. An absolute difference between the first relationship encoding and a second relationship encoding is computed. Using a multi-layer perceptron, a degree of analogical similarity between the first relationship encoding and a second relationship encoding is determined.

Type: Grant

Filed: October 29, 2018

Date of Patent: April 11, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alfio Massimiliano Gliozzo, Gaetano Rossiello, Robert G. Farrell
CANONICALIZATION OF DATA WITHIN OPEN KNOWLEDGE GRAPHS

Publication number: 20230087667

Abstract: Embodiments of the present invention provide computer-implemented methods, computer program products and computer systems. Embodiments of the present invention can, in response to receiving information, learn entity representations and cluster assignments of respective entity representations in a joint manner for both entities and relations of respective entities.

Type: Application

Filed: September 21, 2021

Publication date: March 23, 2023

Inventors: Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Massimiliano Gliozzo
Encoding entity representations for cross-document coreference

Patent number: 11573994

Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.

Type: Grant

Filed: April 14, 2020

Date of Patent: February 7, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
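
The iterative merging described in the abstract above can be illustrated with a minimal sketch. This is not the patented implementation: the cosine measure, the equal weighting of spelling and context vectors, the averaging used to build an entity node, and all function and field names (`build_tree`, `entities`, `spelling`, `context`) are assumptions made for illustration only.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def similarity(a, b):
    # Assumption: weight the spelling and context similarities equally.
    return 0.5 * (cosine(a["spelling"], b["spelling"])
                  + cosine(a["context"], b["context"]))


def build_tree(mentions):
    """Merge the most similar pair of nodes repeatedly until one root remains.

    Returns the entity nodes in the order they were created; the last one is the root.
    """
    # Leaf nodes: one per mention.
    nodes = [dict(m, members=[m["text"]], score=1.0) for m in mentions]
    created = []
    while len(nodes) > 1:
        a, b = max(combinations(nodes, 2), key=lambda pair: similarity(*pair))
        entity = {
            # Assumption: an entity node averages the vectors of its two children.
            "spelling": mean([a["spelling"], b["spelling"]]),
            "context": mean([a["context"], b["context"]]),
            "members": a["members"] + b["members"],
            "score": similarity(a, b),
        }
        nodes = [n for n in nodes if n is not a and n is not b] + [entity]
        created.append(entity)
    return created


def entities(created, threshold):
    """Keep the entity nodes whose similarity score exceeds the threshold."""
    return [e["members"] for e in created if e["score"] > threshold]
```

Each mention is passed as a dict with `text`, `spelling`, and `context` keys. With a high threshold, only tightly matching mentions are reported as one entity, while the root node that joins everything is filtered out.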
GENERATIVE RELATION LINKING FOR QUESTION ANSWERING

Publication number: 20230009946

Abstract: Systems, devices, computer-implemented methods, and/or computer program products that facilitate generative relation linking for question answering over knowledge bases are provided. In one example, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components can comprise a relation linking component. The relation linking component can map relations identified in a natural language question to corresponding relations of a knowledge base using a generative model.

Type: Application

Filed: July 12, 2021

Publication date: January 12, 2023

Inventors: Gaetano Rossiello, Nandana Mihindukulasooriya, Alfio Massimiliano Gliozzo
Discovering ranked domain relevant terms using knowledge

Patent number: 11526688

Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.

Type: Grant

Filed: April 16, 2020

Date of Patent: December 13, 2022

Assignee: International Business Machines Corporation

Inventors: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
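
The three ranking stages named in the abstract above (frequency ranking, ontology-based re-ranking, cluster-based boosting) can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: the ontology is reduced to a plain set of terms, the clusters are given as ready-made lists rather than produced by a trained model, and the function names are hypothetical.

```python
from collections import Counter


def rank_by_frequency(tokens):
    """Initial ranking: terms ordered by how often they occur in the corpus."""
    return [term for term, _ in Counter(tokens).most_common()]


def rerank_with_ontology(ranked, ontology_terms):
    """Move terms found in the domain ontology ahead of the rest.

    sorted() is stable, so relative order inside each group is preserved.
    """
    return sorted(ranked, key=lambda term: term not in ontology_terms)


def boost_with_clusters(ranked, clusters, ontology_terms):
    """Promote terms that share a cluster with at least one ontology term."""
    related = set()
    for cluster in clusters:
        if ontology_terms & set(cluster):
            related.update(cluster)
    return sorted(
        ranked,
        key=lambda term: term not in related and term not in ontology_terms,
    )
```

In this sketch a frequent but generic word drops below domain terms after the second stage, and a rare term that clusters with a known domain term is lifted in the third stage.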
Performing fine-grained question type classification

Patent number: 11520762

Abstract: A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.

Type: Grant

Filed: December 13, 2019

Date of Patent: December 6, 2022

Assignee: International Business Machines Corporation

Inventors: Sarthak Dash, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Robert G. Farrell, Bassem Makni, Avirup Sil, Vittorio Castelli, Radu Florian
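
One way to picture the scoring step in the abstract above is shown below. The abstract does not say how the question vector and the type similarity matrix are combined, so everything here is an assumption: the question vector is the mean of its word embeddings, each type has a hypothetical prototype vector, the similarity matrix assigns 0.5 to parent/child pairs in the ontology, and the final score is the matrix product of the similarity matrix with the raw dot-product scores.

```python
def embed(question, embeddings):
    """Average the embeddings of known words (assumed pooling strategy)."""
    vectors = [embeddings[w] for w in question.lower().split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def type_similarity(types, parent):
    """Build a matrix: 1.0 for identical types, 0.5 for parent/child, else 0.0."""
    def sim(a, b):
        if a == b:
            return 1.0
        if parent.get(a) == b or parent.get(b) == a:
            return 0.5
        return 0.0
    return [[sim(a, b) for b in types] for a in types]


def score_types(question_vec, types, prototypes, similarity):
    """Score every type, letting related types reinforce each other."""
    # Raw score: dot product between the question and each type prototype.
    raw = [sum(q * p for q, p in zip(question_vec, prototypes[t])) for t in types]
    # Smoothed score: similarity matrix times the raw score vector.
    smoothed = [sum(s * r for s, r in zip(row, raw)) for row in similarity]
    return dict(zip(types, smoothed))
```

Multiplying by the similarity matrix means a fine-grained type also receives partial credit from its parent type, which is one plausible reason to build such a matrix from an ontology.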
Automated evaluation of information retrieval

Patent number: 11481404

Abstract: A method, system, and computer program product for automated evaluation of information retrieval systems are provided. The method accesses a natural language query from a set of natural language queries. The natural language query is associated with a query difficulty level. The method generates one or more natural language responses to the natural language query. Each natural language response is associated with at least one facet of the plurality of facets. The method generates a set of feedback cues. A set of search results for the natural language query are returned. The set of search results include a highest ranked natural language response of the one or more natural language responses. The method generates an evaluation result for the HCIR system for the query difficulty level based on the one or more natural language responses, the set of search results, and the set of feedback cues.

Type: Grant

Filed: September 16, 2020

Date of Patent: October 25, 2022

Assignee: International Business Machines Corporation

Inventors: Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Ruchi Mahindru, Nandana Mihindukulasooriya, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
Transformer-Based Model Knowledge Graph Link Prediction

Publication number: 20220327356

Abstract: A system, product, and method are provided for improving knowledge graph (KG) link prediction using transformer-based artificial neural networks. A first topic model is leveraged against a first dataset derived from a KG containing a plurality of first triples. The first triples include first entities and first edges connecting the first entities to represent relationships between the first connected entities. A first similarity function is applied to the first connected entities of the first triples to provide respective first similarity scores. A first subset of one or more first triples is selected from the plurality of first triples based upon the first similarity scores. An artificial neural network is trained using the selected first subset of one or more first triples.

Type: Application

Filed: April 12, 2021

Publication date: October 13, 2022

Applicant: International Business Machines Corporation

Inventors: Gaetano Rossiello, Alfio Massimiliano Gliozzo, Xuan Wang
Building pre-trained contextual embeddings for programming languages using specialized vocabulary

Patent number: 11429352

Abstract: A method, a computer system, and a computer program product for building pre-trained contextual embeddings are provided. Embodiments of the present invention may include collecting programming code. Embodiments of the present invention may include loading and preparing the programming code using a specialized programming language keywords-based vocabulary. Embodiments of the present invention may include creating contextual embeddings for the programming code. Embodiments of the present invention may include storing the contextual embeddings.

Type: Grant

Filed: July 1, 2020

Date of Patent: August 30, 2022

Assignee: International Business Machines Corporation

Inventors: Saurabh Pujar, Luca Buratti, Alessandro Morari, Jim Alain Laredo, Alfio Massimiliano Gliozzo, Gaetano Rossiello
DYNAMIC FACET RANKING

Publication number: 20220197916

Abstract: Embodiments of the present invention provide a computer system, a computer program product, and a method that comprises analyzing a performed query by identifying a plurality of indicative markers based on a pre-stored classification database associated with the performed query; generating a plurality of facets based on the analysis of the performed query; selecting at least two facets within the generated plurality of facets by determining a quantitative similarity value between each respective facet and the plurality of identified indicative markers associated with the performed query; dynamically ranking the selected facets by prioritizing the selected facets based on a calculated overall score associated with assigned weighted values for each selected facet in the generated plurality of facets using a supervised machine learning algorithm; and displaying the dynamically ranked facets within a user interface of a computing device associated with a user.

Type: Application

Filed: December 22, 2020

Publication date: June 23, 2022

Inventors: Soumitra Sarkar, Md Faisal Mahbub Chowdhury, Ruchi Mahindru, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
IMPLEMENTING RELATION LINKING FOR KNOWLEDGE BASES

Publication number: 20220129770

Abstract: A computer-implemented method according to one embodiment includes identifying a natural language query; translating the natural language query into an intermediate representation; converting the intermediate representation into one or more query triples; and performing relation linking between each of the one or more query triples and a plurality of knowledge base triples.

Type: Application

Filed: October 23, 2020

Publication date: April 28, 2022

Inventors: Nandana Mihindukulasooriya, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Pavan Kapanipathi Bangalore, Salim Roukos
AUTOMATED EVALUATION OF INFORMATION RETRIEVAL

Publication number: 20220083559

Abstract: A method, system, and computer program product for automated evaluation of information retrieval systems are provided. The method accesses a natural language query from a set of natural language queries. The natural language query is associated with a query difficulty level. The method generates one or more natural language responses to the natural language query. Each natural language response is associated with at least one facet of the plurality of facets. The method generates a set of feedback cues. A set of search results for the natural language query are returned. The set of search results include a highest ranked natural language response of the one or more natural language responses. The method generates an evaluation result for the HCIR system for the query difficulty level based on the one or more natural language responses, the set of search results, and the set of feedback cues.

Type: Application

Filed: September 16, 2020

Publication date: March 17, 2022

Inventors: Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Ruchi Mahindru, Nandana Mihindukulasooriya, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
BUILDING PRE-TRAINED CONTEXTUAL EMBEDDINGS FOR PROGRAMMING LANGUAGES USING SPECIALIZED VOCABULARY

Publication number: 20220004365

Abstract: A method, a computer system, and a computer program product for building pre-trained contextual embeddings are provided. Embodiments of the present invention may include collecting programming code. Embodiments of the present invention may include loading and preparing the programming code using a specialized programming language keywords-based vocabulary. Embodiments of the present invention may include creating contextual embeddings for the programming code. Embodiments of the present invention may include storing the contextual embeddings.

Type: Application

Filed: July 1, 2020

Publication date: January 6, 2022

Inventors: Saurabh Pujar, Luca Buratti, Alessandro Morari, Jim Alain Laredo, Alfio Massimiliano Gliozzo, Gaetano Rossiello
VULNERABILITY ANALYSIS USING CONTEXTUAL EMBEDDINGS

Publication number: 20220004642

Abstract: A method, a computer system, and a computer program product for vulnerability analysis using contextual embeddings are provided. Embodiments of the present invention may include collecting labeled code snippets. Embodiments of the present invention may include preparing the labeled code snippets. Embodiments of the present invention may include tokenizing the labeled code snippets. Embodiments of the present invention may include fine-tuning a model. Embodiments of the present invention may include collecting unlabeled code snippets. Embodiments of the present invention may include predicting a vulnerability of the unlabeled code snippets using the model.

Type: Application

Filed: July 1, 2020

Publication date: January 6, 2022

Inventors: Saurabh Pujar, Luca Buratti, Alessandro Morari, Jim Alain Laredo, Alfio Massimiliano Gliozzo, Gaetano Rossiello
DISCOVERING RANKED DOMAIN RELEVANT TERMS USING KNOWLEDGE

Publication number: 20210326636

Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.

Type: Application

Filed: April 16, 2020

Publication date: October 21, 2021

Inventors: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
ENCODING ENTITY REPRESENTATIONS FOR CROSS-DOCUMENT COREFERENCE

Publication number: 20210319054

Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.

Type: Application

Filed: April 14, 2020

Publication date: October 14, 2021

Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
Using relation suggestions to build a relational database

Patent number: 11080300

Abstract: Aspects of the invention include a system, a computer program product and a computer-implemented method for building a relational database between a first term and a second term. An embedding management system is used to determine a first term set based on the first term and a second term set based on the second term. An entity linking engine and a knowledge base engine are used to determine a relation match for the first term set when a relation of the first term set relates to a term from a second term set and for determining a relation match for the second term set when a relation of the second term set relates to a term from a first term set. A relation ranking engine selects a relation for the relational database having a selected number of matches.

Type: Grant

Filed: August 21, 2018

Date of Patent: August 3, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alfio Massimiliano Gliozzo, Gaetano Rossiello
PERFORMING FINE-GRAINED QUESTION TYPE CLASSIFICATION

Publication number: 20210182258

Abstract: A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.

Type: Application

Filed: December 13, 2019

Publication date: June 17, 2021

Inventors: Sarthak Dash, Gaetano Rossiello, Alfio Massimiliano Gliozzo, Robert G. Farrell, Bassem Makni, Avirup Sil, Vittorio Castelli, Radu Florian
