Patents by Inventor Michael Robert Glass

Michael Robert Glass has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NATURAL LANGUAGE DATA GENERATION USING AUTOMATED KNOWLEDGE DISTILLATION TECHNIQUES

Publication number: 20240111969

Abstract: Methods, systems, and computer program products for natural language data generation using automated knowledge distillation techniques are provided herein. A computer-implemented method includes retrieving, in response to an input query, a set of passages from at least one knowledge base by processing the input query using a first set of artificial intelligence techniques; ranking at least a portion of the set of passages by processing the set of passages using a second set of artificial intelligence techniques; generating at least one natural language answer, in response to the input query, by processing a subset of the set of passages in connection with automated knowledge distillation techniques based on the ranking of the at least a portion of the set of passages; and performing automated actions based on the ranking of the at least a portion of the set of passages and/or the at least one generated natural language answer.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Inventors: Michael Robert Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
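
The retrieve-then-rank flow described in the abstract above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the artificial intelligence techniques, so term overlap stands in for the first-stage retriever and Jaccard similarity stands in for the second-stage ranker. The function names are hypothetical, and the answer-generation and knowledge-distillation steps are omitted.

```python
def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple stand-in)."""
    return set(text.lower().split())


def retrieve(query, knowledge_base, k=3):
    """First stage: keep up to k passages that share the most terms with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p)), p) for p in knowledge_base]
    scored = [(s, p) for s, p in scored if s > 0]   # drop passages with no overlap
    scored.sort(key=lambda sp: -sp[0])              # stable sort, best overlap first
    return [p for _, p in scored[:k]]


def rerank(query, passages):
    """Second stage: reorder retrieved passages by Jaccard similarity to the query."""
    q = tokenize(query)

    def jaccard(passage):
        t = tokenize(passage)
        return len(q & t) / len(q | t)

    return sorted(passages, key=jaccard, reverse=True)


def top_passages(query, knowledge_base, k=3, n=1):
    """Return the n best passages, which a generator could then condition on."""
    return rerank(query, retrieve(query, knowledge_base, k))[:n]
```

In a real system the subset returned by `top_passages` would be passed to a generative model; here it only shows how the ranking narrows the retrieved set.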
Deep symbolic validation of information extraction systems

Patent number: 11907842

Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relations by training from output of the relation extraction component.

Type: Grant

Filed: January 13, 2023

Date of Patent: February 20, 2024

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
Training transfer-focused models for deep learning

Patent number: 11853877

Abstract: Whether to train a new neural network model can be determined based on similarity estimates between a sample data set and a plurality of source data sets associated with a plurality of prior-trained neural network models. A cluster among the plurality of prior-trained neural network models can be determined. A set of training data based on the cluster can be determined. The new neural network model can be trained based on the set of training data.

Type: Grant

Filed: April 2, 2019

Date of Patent: December 26, 2023

Assignee: International Business Machines Corporation

Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Matthew Leon Hill
DEEP SYMBOLIC VALIDATION OF INFORMATION EXTRACTION SYSTEMS

Publication number: 20230177335

Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relations by training from output of the relation extraction component.

Type: Application

Filed: January 13, 2023

Publication date: June 8, 2023

Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
Unary relation extraction using distant supervision

Patent number: 11645513

Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.

Type: Grant

Filed: July 3, 2019

Date of Patent: May 9, 2023

Assignee: International Business Machines Corporation

Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
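
The final two steps of the abstract above, converting predicted unary relations into binary relations and adding them to the knowledge graph, can be sketched as follows. This is an illustrative sketch only: it assumes a unary relation is encoded as a `relation:value` label attached to an entity, and that the knowledge graph is a set of `(subject, relation, object)` triples. Neither encoding comes from the abstract, and the function names are hypothetical.

```python
def unary_to_binary(entity, unary_relation, sep=":"):
    """Split an assumed 'relation:value' label into a (subject, relation, object) triple."""
    relation, fixed_argument = unary_relation.split(sep, 1)
    return (entity, relation, fixed_argument)


def add_predictions(knowledge_graph, predictions):
    """Add every predicted unary relation to the graph as a binary triple.

    knowledge_graph: set of (subject, relation, object) triples (assumed format).
    predictions: dict mapping an entity to its predicted unary relation labels.
    """
    for entity, unary_relations in predictions.items():
        for unary in unary_relations:
            knowledge_graph.add(unary_to_binary(entity, unary))
    return knowledge_graph
```

For example, the predicted label `occupation:Chef` for the entity `Bob` becomes the triple `("Bob", "occupation", "Chef")`.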
Deep symbolic validation of information extraction systems

Patent number: 11574179

Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relations by training from output of the relation extraction component.

Type: Grant

Filed: January 7, 2019

Date of Patent: February 7, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
Encoding entity representations for cross-document coreference

Patent number: 11573994

Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.

Type: Grant

Filed: April 14, 2020

Date of Patent: February 7, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
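
The iterative merging described in the abstract above resembles greedy agglomerative clustering, which the following Python sketch illustrates. Several details are assumptions, because the abstract does not specify them: similarity is taken as the average cosine similarity of the spelling and context vectors, an entity node's vectors are the mean of its children's vectors, and a merged node's score is the similarity between the two nodes it joins. The class and function names are hypothetical.

```python
import math
from itertools import combinations


class Node:
    """A tree node: a leaf holds one mention, an entity node holds merged mentions."""

    def __init__(self, mentions, spelling, context, children=(), score=1.0):
        self.mentions = mentions      # list of mention strings under this node
        self.spelling = spelling      # first vector (spelling data)
        self.context = context        # second vector (context data)
        self.children = children      # empty for leaves
        self.score = score            # similarity of the merged children


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity(n1, n2):
    """Assumed scoring: average of spelling and context cosine similarity."""
    return 0.5 * cosine(n1.spelling, n2.spelling) + 0.5 * cosine(n1.context, n2.context)


def mean(a, b):
    return [(x + y) / 2 for x, y in zip(a, b)]


def build_tree(leaves):
    """Merge the most similar pair of nodes repeatedly until one root remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        a, b = max(combinations(nodes, 2), key=lambda pair: similarity(*pair))
        parent = Node(a.mentions + b.mentions, mean(a.spelling, b.spelling),
                      mean(a.context, b.context), (a, b), similarity(a, b))
        nodes = [n for n in nodes if n is not a and n is not b] + [parent]
    return nodes[0]


def entities(node, threshold):
    """Walk down from the root and emit the topmost nodes scoring at or above threshold."""
    if not node.children or node.score >= threshold:
        return [sorted(node.mentions)]
    return [e for child in node.children for e in entities(child, threshold)]
```

With this scoring, mentions whose vectors are close end up under the same entity node, and the threshold decides how far up the tree a cluster is still treated as a single entity.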
Unsupervised hypernym induction machine learning

Patent number: 11507828

Abstract: Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term—sub-term glossary can be generated from the corpus, the preliminary super-term—sub-term glossary containing one or more super-term—sub-term pairs. A super-term—sub-term pair can be filtered from the preliminary super-term—sub-term glossary, responsive to detecting that the super-term—sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term—sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term—sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.

Type: Grant

Filed: October 29, 2019

Date of Patent: November 22, 2022

Assignee: International Business Machines Corporation

Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
Similarity based negative sampling analysis

Patent number: 11500910

Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.

Type: Grant

Filed: March 21, 2018

Date of Patent: November 15, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass
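
One way to read the abstract above is that negative examples are drawn with a preference for entities that are similar to the true one in the vector space. The Python sketch below illustrates that reading only. It assumes cosine similarity as the similarity metric and sampling probability proportional to similarity, neither of which is stated in the abstract, and `sample_negative` is a hypothetical name.

```python
import math
import random


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sample_negative(entity, vectors, rng, exclude=()):
    """Pick a replacement entity, favoring ones similar to `entity`.

    vectors: dict mapping entity name to its embedding (assumed representation).
    rng: a random.Random instance, passed in so results can be reproduced.
    exclude: entities that must not be returned (for example, known positives).
    """
    candidates = [e for e in vectors if e != entity and e not in exclude]
    # Negative similarities are clipped to zero so they can serve as weights.
    weights = [max(cosine(vectors[entity], vectors[e]), 0.0) for e in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)   # fall back to uniform sampling
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Sampling similar entities tends to produce harder negatives than uniform sampling, which is the usual motivation for weighting by similarity.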
OPTIMISTIC FACET SET SELECTION FOR DYNAMIC FACETED SEARCH

Publication number: 20220207087

Abstract: Determining an initial rank and a probability of relevance of each of a retrieved plurality of electronic documents relevant to a query. For each of a plurality of candidate facets, determine a revised rank for each of the retrieved plurality of electronic documents relevant to the query. Selecting, for each of the retrieved plurality of electronic documents relevant to the query, a minimum rank from among the initial rank and the revised rank for each of the plurality of candidate facets. Determine an expected discounted cumulative gain based on the probability of relevance and the minimum rank for each of the retrieved plurality of electronic documents relevant to the query. Select a set of optimistic facets based on maximizing the expected discounted cumulative gain.

Type: Application

Filed: December 26, 2020

Publication date: June 30, 2022

Inventors: Michael Robert Glass, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
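
The selection criterion in the abstract above can be sketched in Python. The sketch makes assumptions the abstract leaves open: ranks are 1-based, the discount is the standard 1 / log2(1 + rank), a document that a facet filters out has no revised rank for that facet, and the best facet set of a fixed size k is found by exhaustive search. The function names are hypothetical.

```python
import math
from itertools import combinations


def expected_dcg(prob, initial_rank, revised_ranks, facet_set):
    """Expected DCG when each document takes its best (minimum) rank.

    prob: dict doc -> probability of relevance.
    initial_rank: dict doc -> 1-based rank in the original result list.
    revised_ranks: dict facet -> dict doc -> 1-based rank after applying the facet.
    """
    total = 0.0
    for doc, p in prob.items():
        ranks = [initial_rank[doc]]
        ranks += [revised_ranks[f].get(doc, math.inf) for f in facet_set]
        total += p / math.log2(1 + min(ranks))
    return total


def select_facets(prob, initial_rank, revised_ranks, k):
    """Return the size-k facet tuple with the highest expected DCG (brute force)."""
    return max(
        combinations(sorted(revised_ranks), k),
        key=lambda fs: expected_dcg(prob, initial_rank, revised_ranks, fs),
    )
```

Brute force is only practical for a handful of candidate facets; a larger candidate pool would need a greedy or otherwise approximate search, which this sketch does not attempt.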
ANSWERING QUESTIONS WITH ARTIFICIAL INTELLIGENCE USING TABULAR DATA

Publication number: 20220101052

Abstract: A computer answers a question using a data table. The computer receives a user question and a target table containing a target cell corresponding to a target answer for the user question, with the target cell corresponding to a target column and a target row. The computer generates a first classifier to provide column correlation values reflecting the probability that a given column is the target column. The computer generates a second classifier that provides row correlation values reflecting the probability that a given row is the target row. The computer applies the first classifier to the target table to determine a column correlation value for each column. The computer applies the second classifier to the target table to determine a row correlation value for each row. The computer suggests, as the target cell, a cell having elevated column and row correlation values relative to other target table cells.

Type: Application

Filed: September 30, 2020

Publication date: March 31, 2022

Inventors: Mustafa Canim, Michael Robert Glass, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
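
The last step of the abstract above, suggesting a cell from per-column and per-row values, can be sketched in Python. The two classifiers are not implemented here: their outputs are passed in as plain lists of scores, and picking the highest-scoring row and column independently is an assumption about what "elevated" means. `suggest_cell` is a hypothetical name.

```python
def suggest_cell(table, column_scores, row_scores):
    """Return (row index, column index, cell value) for the best-scoring cell.

    table: list of rows, each a list of cell values (header row excluded).
    column_scores: one score per column, as a column classifier might produce.
    row_scores: one score per row, as a row classifier might produce.
    """
    col = max(range(len(column_scores)), key=column_scores.__getitem__)
    row = max(range(len(row_scores)), key=row_scores.__getitem__)
    return row, col, table[row][col]
```

For a question such as "What is the population of Rome?", a column classifier would be expected to favor the population column and a row classifier the Rome row, so the suggested cell is their intersection.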
ENCODING ENTITY REPRESENTATIONS FOR CROSS-DOCUMENT COREFERENCE

Publication number: 20210319054

Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.

Type: Application

Filed: April 14, 2020

Publication date: October 14, 2021

Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
Geographic location specific models for information extraction and knowledge discovery

Patent number: 11055491

Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.

Type: Grant

Filed: February 5, 2019

Date of Patent: July 6, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
UNSUPERVISED HYPERNYM INDUCTION MACHINE LEARNING

Publication number: 20210125058

Abstract: Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term-sub-term glossary can be generated from the corpus, the preliminary super-term-sub-term glossary containing one or more super-term-sub-term pairs. A super-term-sub-term pair can be filtered from the preliminary super-term-sub-term glossary, responsive to detecting that the super-term-sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term-sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term-sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.

Type: Application

Filed: October 29, 2019

Publication date: April 29, 2021

Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
UNARY RELATION EXTRACTION USING DISTANT SUPERVISION

Publication number: 20210004672

Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.

Type: Application

Filed: July 3, 2019

Publication date: January 7, 2021

Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
TRAINING TRANSFER-FOCUSED MODELS FOR DEEP LEARNING

Publication number: 20200320379

Abstract: Whether to train a new neural network model can be determined based on similarity estimates between a sample data set and a plurality of source data sets associated with a plurality of prior-trained neural network models. A cluster among the plurality of prior-trained neural network models can be determined. A set of training data based on the cluster can be determined. The new neural network model can be trained based on the set of training data.

Type: Application

Filed: April 2, 2019

Publication date: October 8, 2020

Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Matthew Leon Hill
GEOGRAPHIC LOCATION SPECIFIC MODELS FOR INFORMATION EXTRACTION AND KNOWLEDGE DISCOVERY

Publication number: 20200250275

Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.

Type: Application

Filed: February 5, 2019

Publication date: August 6, 2020

Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
DEEP SYMBOLIC VALIDATION OF INFORMATION EXTRACTION SYSTEMS

Publication number: 20200218968

Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relations by training from output of the relation extraction component.

Type: Application

Filed: January 7, 2019

Publication date: July 9, 2020

Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
IDENTIFYING TRANSFER MODELS FOR MACHINE LEARNING TASKS

Publication number: 20190354850

Abstract: Techniques regarding autonomously facilitating the selection of one or more transfer models to enhance the performance of one or more machine learning tasks are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise an assessment component that can assess a similarity metric between a source data set and a sample data set from a target machine learning task. The computer executable components can also comprise an identification component that can identify a pre-trained neural network model associated with the source data set based on the similarity metric to perform the target machine learning task.

Type: Application

Filed: May 17, 2018

Publication date: November 21, 2019

Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Siyu Huo, Matthew Leon Hill
SIMILARITY BASED NEGATIVE SAMPLING ANALYSIS

Publication number: 20190294694

Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.

Type: Application

Filed: March 21, 2018

Publication date: September 26, 2019

Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass