Patents by Inventor Michael Robert Glass
Michael Robert Glass has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240111969
Abstract: Methods, systems, and computer program products for natural language data generation using automated knowledge distillation techniques are provided herein. A computer-implemented method includes retrieving, in response to an input query, a set of passages from at least one knowledge base by processing the input query using a first set of artificial intelligence techniques; ranking at least a portion of the set of passages by processing the set of passages using a second set of artificial intelligence techniques; generating at least one natural language answer, in response to the input query, by processing a subset of the set of passages in connection with automated knowledge distillation techniques based on the ranking of the at least a portion of the set of passages; and performing automated actions based on the ranking of the at least a portion of the set of passages and/or the at least one generated natural language answer.
Type: Application
Filed: September 30, 2022
Publication date: April 4, 2024
Inventors: Michael Robert Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
-
Patent number: 11907842
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Grant
Filed: January 13, 2023
Date of Patent: February 20, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Patent number: 11853877
Abstract: Whether to train a new neural network model can be determined based on similarity estimates between a sample data set and a plurality of source data sets associated with a plurality of prior-trained neural network models. A cluster among the plurality of prior-trained neural network models can be determined. A set of training data based on the cluster can be determined. The new neural network model can be trained based on the set of training data.
Type: Grant
Filed: April 2, 2019
Date of Patent: December 26, 2023
Assignee: International Business Machines Corporation
Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Matthew Leon Hill
-
Publication number: 20230177335
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Application
Filed: January 13, 2023
Publication date: June 8, 2023
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Patent number: 11645513
Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.
Type: Grant
Filed: July 3, 2019
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
-
Patent number: 11574179
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Grant
Filed: January 7, 2019
Date of Patent: February 7, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Patent number: 11573994
Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.
Type: Grant
Filed: April 14, 2020
Date of Patent: February 7, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
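The merge loop described in this abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the cosine similarity, the equal weighting of spelling and context vectors, the averaging of child vectors, and every name here (`Node`, `build_tree`, `entities`) are assumptions chosen only for illustration.

```python
import math
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    spelling: list[float]          # first vector: spelling data
    context: list[float]           # second vector: context data
    mentions: list[str]            # mentions covered by this node
    score: float = 1.0             # similarity of the two merged children
    children: list["Node"] = field(default_factory=list)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity(a, b):
    # Assumption: equal weight on the spelling and context vectors.
    return 0.5 * cosine(a.spelling, b.spelling) + 0.5 * cosine(a.context, b.context)


def merge(a, b):
    # Assumption: an entity node averages the vectors of its two children.
    avg = lambda u, v: [(x + y) / 2 for x, y in zip(u, v)]
    return Node(avg(a.spelling, b.spelling), avg(a.context, b.context),
                a.mentions + b.mentions, similarity(a, b), [a, b])


def build_tree(leaves):
    """Merge the most similar pair repeatedly until one root remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        a, b = max(combinations(nodes, 2), key=lambda p: similarity(*p))
        nodes = [n for n in nodes if n is not a and n is not b]
        nodes.append(merge(a, b))
    return nodes[0]


def entities(node, threshold):
    """Return mention groups for the topmost nodes scoring above threshold."""
    if not node.children or node.score > threshold:
        return [node.mentions]
    return [group for child in node.children for group in entities(child, threshold)]
```

Given leaves for "IBM", "I.B.M." and "Apple" with suitably close vectors for the first two, `entities(build_tree(leaves), 0.8)` would group the two IBM mentions and leave "Apple" on its own.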
-
Patent number: 11507828
Abstract: Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term—sub-term glossary can be generated from the corpus, the preliminary super-term—sub-term glossary containing one or more super-term—sub-term pairs. A super-term—sub-term pair can be filtered from the preliminary super-term—sub-term glossary, responsive to detecting that the super-term—sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term—sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term—sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.
Type: Grant
Filed: October 29, 2019
Date of Patent: November 22, 2022
Assignee: International Business Machines Corporation
Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
-
Patent number: 11500910
Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.
Type: Grant
Filed: March 21, 2018
Date of Patent: November 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass
-
Publication number: 20220207087
Abstract: Determining an initial rank and a probability of relevance of each of a retrieved plurality of electronic documents relevant to a query. For each of a plurality of candidate facets, determine a revised rank for each of the retrieved plurality of electronic documents relevant to the query. Selecting, for each of the retrieved plurality of electronic documents relevant to the query, a minimum rank from among the initial rank and the revised rank for each of the plurality of candidate facets. Determine an expected discounted cumulative gain based on the probability of relevance and the minimum rank for each of the retrieved plurality of electronic documents relevant to the query. Select a set of optimistic facets based on maximizing the expected discounted cumulative gain.
Type: Application
Filed: December 26, 2020
Publication date: June 30, 2022
Inventors: Michael Robert Glass, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
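As a rough illustration of the selection step in this abstract, the sketch below scores a facet set by expected discounted cumulative gain, using each document's minimum rank over the initial ranking and the revised rankings. Several details are assumptions and do not come from the application: the standard DCG discount 1/log2(1 + rank), the exhaustive search over facet sets of a fixed size `k`, the input layout (revised ranks supplied as a dictionary per facet), and all function names.

```python
import math
from itertools import combinations


def min_ranks(initial, revised, facets):
    """Best (lowest) rank per document over the initial and per-facet rankings."""
    return {
        doc: min([rank] + [revised[f].get(doc, rank) for f in facets])
        for doc, rank in initial.items()
    }


def expected_dcg(prob, ranks):
    """Sum of relevance probabilities discounted by log2(1 + rank)."""
    return sum(p / math.log2(1 + ranks[doc]) for doc, p in prob.items())


def select_facets(initial, prob, revised, k):
    """Assumption: exhaustively pick the k facets with the highest expected DCG."""
    best = max(
        combinations(sorted(revised), k),
        key=lambda fs: expected_dcg(prob, min_ranks(initial, revised, fs)),
    )
    return set(best)
```

Here `initial` maps each document to its initial rank, `prob` maps it to its probability of relevance, and `revised[facet]` maps the documents that remain under that facet to their revised ranks. Exhaustive search is only practical for small candidate sets; a greedy variant would be a natural substitute.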
-
Publication number: 20220101052
Abstract: A computer answers a question using a data table. The computer receives a user question and a target table containing a target cell corresponding to a target answer for the user question, with the target cell corresponding to a target column and a target row. The computer generates a first classifier to provide column correlation values reflecting the probability that a given column is the target column. The computer generates a second classifier that provides row correlation values reflecting the probability that a given row is the target row. The computer applies the first classifier to the target table to determine a column correlation value for each column. The computer applies the second classifier to the target table to determine a row correlation value for each row. The computer suggests, as the target cell, a cell having elevated column and row correlation values relative to other target table cells.
Type: Application
Filed: September 30, 2020
Publication date: March 31, 2022
Inventors: Mustafa Canim, Michael Robert Glass, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia
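The final step of this abstract, combining per-column and per-row scores to suggest a cell, can be sketched as follows. The two classifiers are replaced here by a toy token-overlap scorer, which is purely a stand-in and not the classifiers the application describes. Multiplying the row and column scores and the names `overlap` and `suggest_cell` are likewise assumptions.

```python
def overlap(question, text):
    """Toy stand-in for a trained classifier: share of text tokens found in the question."""
    q = set(question.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(t) if t else 0.0


def suggest_cell(question, header, rows, col_score=overlap, row_score=overlap):
    """Return (row index, column index, cell) with the highest combined score."""
    col_vals = [col_score(question, h) for h in header]            # one value per column
    row_vals = [row_score(question, " ".join(r)) for r in rows]    # one value per row
    i, j = max(
        ((i, j) for i in range(len(rows)) for j in range(len(header))),
        key=lambda ij: row_vals[ij[0]] * col_vals[ij[1]],
    )
    return i, j, rows[i][j]
```

Passing real classifiers as `col_score` and `row_score` would keep the cell selection logic unchanged.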
-
Publication number: 20210319054
Abstract: A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.
Type: Application
Filed: April 14, 2020
Publication date: October 14, 2021
Inventors: Michael Robert Glass, Nicholas Brady Garvan Monath, Robert G. Farrell, Alfio Massimiliano Gliozzo, Gaetano Rossiello
-
Patent number: 11055491
Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.
Type: Grant
Filed: February 5, 2019
Date of Patent: July 6, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
-
Publication number: 20210125058
Abstract: Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term-sub-term glossary can be generated from the corpus, the preliminary super-term-sub-term glossary containing one or more super-term-sub-term pairs. A super-term-sub-term pair can be filtered from the preliminary super-term-sub-term glossary, responsive to detecting that the super-term-sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term-sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term-sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.
Type: Application
Filed: October 29, 2019
Publication date: April 29, 2021
Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
-
Publication number: 20210004672
Abstract: Methods and systems are described for populating knowledge graphs. A processor can identify a set of data in a knowledge graph. The processor can identify a plurality of portions of an unannotated corpus, where a portion includes at least one entity. The processor can cluster the plurality of portions into at least one data set based on the at least one entity of the plurality of portions. The processor can train a model using the at least one data set and the set of data identified from the knowledge graph. The processor can apply the model to a set of entities in the unannotated corpus to predict unary relations associated with the set of entities. The processor can convert the predicted unary relations into a set of binary relations associated with the set of entities. The processor can add the set of binary relations to the knowledge graph.
Type: Application
Filed: July 3, 2019
Publication date: January 7, 2021
Inventors: Michael Robert Glass, Alfio Massimiliano Gliozzo
-
Publication number: 20200320379
Abstract: Whether to train a new neural network model can be determined based on similarity estimates between a sample data set and a plurality of source data sets associated with a plurality of prior-trained neural network models. A cluster among the plurality of prior-trained neural network models can be determined. A set of training data based on the cluster can be determined. The new neural network model can be trained based on the set of training data.
Type: Application
Filed: April 2, 2019
Publication date: October 8, 2020
Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Matthew Leon Hill
-
Publication number: 20200250275
Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.
Type: Application
Filed: February 5, 2019
Publication date: August 6, 2020
Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
-
Publication number: 20200218968
Abstract: A system comprises a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components. The system includes a receiving component that receives a corpus of data; a relation extraction component that generates noisy knowledge graphs from the corpus; and a training component that acquires global representations of entities and relation by training from output of the relation extraction component.
Type: Application
Filed: January 7, 2019
Publication date: July 9, 2020
Inventors: Alfio Massimiliano Gliozzo, Sarthak Dash, Michael Robert Glass, Mustafa Canim
-
Publication number: 20190354850
Abstract: Techniques regarding autonomously facilitating the selection of one or more transfer models to enhance the performance of one or more machine learning tasks are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise an assessment component that can assess a similarity metric between a source data set and a sample data set from a target machine learning task. The computer executable components can also comprise an identification component that can identify a pre-trained neural network model associated with the source data set based on the similarity metric to perform the target machine learning task.
Type: Application
Filed: May 17, 2018
Publication date: November 21, 2019
Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Noel Christopher Codella, Brian Michael Belgodere, Parijat Dube, Michael Robert Glass, John Ronald Kender, Siyu Huo, Matthew Leon Hill
-
Publication number: 20190294694
Abstract: Techniques regarding similarity based negative sample analysis are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a similarity component that can determine similarity metrics for respective entities based on a vector space model. The respective entities can be represented by a dataset. Also, the computer executable components can comprise a sampling component that can perform a negative sampling analysis on the dataset based on the similarity metrics.
Type: Application
Filed: March 21, 2018
Publication date: September 26, 2019
Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Michael Robert Glass