Patents by Inventor Michael R. Glass

Michael R. Glass has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Span selection training for natural language processing

Patent number: 11556712

Abstract: Methods and systems for natural language processing include pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model, using a span selection training data set that associates a masked word with a passage. A natural language processing task is performed using the span selection pretrained machine learning model.

Type: Grant

Filed: October 8, 2019

Date of Patent: January 17, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael R. Glass, Alfio Massimiliano Gliozzo
Generating and augmenting transfer learning datasets with pseudo-labeled images

Patent number: 11151410

Abstract: A computer-implemented method for data labeling is provided. The computer-implemented method assigns pseudo-labels to unlabeled examples of data using a similarity metric on an embedding space to produce pseudo-labeled examples. A curriculum learning model is trained using the pseudo-labeled examples. The curriculum learning model trained with the pseudo-labeled examples is employed in in a fine-tuning task to enhance classification accuracy of the data.

Type: Grant

Filed: September 7, 2018

Date of Patent: October 19, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel C. Codella, Brian M. Belgodere, Parijat Dube, Michael R. Glass, John R. Kender, Matthew L. Hill
SPAN SELECTION TRAINING FOR NATURAL LANGUAGE PROCESSING

Publication number: 20210103775

Abstract: Methods and systems for natural language processing include pretraining a machine learning model that is based on a bidirectional encoder representations from transformers model, using a span selection training data set that associates a masked word with a passage. A natural language processing task is performed using the span selection pretrained machine learning model.

Type: Application

Filed: October 8, 2019

Publication date: April 8, 2021

Inventors: Michael R. Glass, Alfio Massimiliano Gliozzo
GENERATING AND AUGMENTING TRANSFER LEARNING DATASETS WITH PSEUDO-LABELED IMAGES

Publication number: 20200082210

Abstract: A computer-implemented method for data labeling is provided. The computer-implemented method assigns pseudo-labels to unlabeled examples of data using a similarity metric on an embedding space to produce pseudo-labeled examples. A curriculum learning model is trained using the pseudo-labeled examples. The curriculum learning model trained with the pseudo-labeled examples is employed in in a fine-tuning task to enhance classification accuracy of the data.

Type: Application

Filed: September 7, 2018

Publication date: March 12, 2020

Inventors: Patrick Watson, Bishwaranjan Bhattacharjee, Siyu Huo, Noel C. Codella, Brian M. Belgodere, Parijat Dube, Michael R. Glass, John R. Kender, Matthew L. Hill
Learning parameters in a feed forward probabilistic graphical model

Patent number: 10445654

Abstract: According to an aspect, learning parameters in a feed forward probabilistic graphical model includes creating an inference model via a computer processor. The creation of the inference model includes receiving a training set that includes multiple scenarios, each scenario comprised of one or more natural language statements, and each scenario corresponding to a plurality of candidate answers. The creation also includes constructing evidence graphs for each of the multiple scenarios based on the training set, and calculating weights for common features across the evidence graphs that will maximize a probability of the inference model locating correct answers from corresponding candidate answers across all of the multiple scenarios. In response to an inquiry from a user that includes a scenario, the inference model constructs an evidence graph and recursively constructs formulas to express a confidence of each node in the evidence graph in terms of its parents in the evidence graph.

Type: Grant

Filed: September 1, 2015

Date of Patent: October 15, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael R. Glass, James W. Murdock, IV
Learning parameters in a feed forward probabilistic graphical model

Patent number: 10430722

Abstract: According to an aspect, learning parameters in a feed forward probabilistic graphical model includes creating an inference model via a computer processor. The creation of the inference model includes receiving a training set that includes multiple scenarios, each scenario comprised of one or more natural language statements, and each scenario corresponding to a plurality of candidate answers. The creation also includes constructing evidence graphs for each of the multiple scenarios based on the training set, and calculating weights for common features across the evidence graphs that will maximize a probability of the inference model locating correct answers from corresponding candidate answers across all of the multiple scenarios. In response to an inquiry from a user that includes a scenario, the inference model constructs an evidence graph and recursively constructs formulas to express a confidence of each node in the evidence graph in terms of its parents in the evidence graph.

Type: Grant

Filed: November 23, 2015

Date of Patent: October 1, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael R. Glass, James W. Murdock, IV
Structured term recognition

Patent number: 10339214

Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t?T, each of the known terms t belonging to a set of types ? (t)={?1, . . . }, wherein each of the terms is comprised of a list of words, t=w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p??; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types ? for said each recognized term.

Type: Grant

Filed: November 2, 2012

Date of Patent: July 2, 2019

Assignee: International Business Machines Corporation

Inventors: Michael R. Glass, Alfio M. Gliozzo
Hybrid approach for short form detection and expansion to long forms

Patent number: 10282421

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Grant

Filed: June 29, 2018

Date of Patent: May 7, 2019

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
Hybrid approach for short form detection and expansion to long forms

Patent number: 10261990

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Grant

Filed: June 28, 2016

Date of Patent: April 16, 2019

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20180365210

Abstract: Embodiments provide a system and method for short form and long form detection. Given candidate short forms, the system can generate one or more n-gram combinations, resulting in one or more candidate short form and n-gram combination pairs. For each candidate short form and n-gram combination pair, the system can calculate an approximate string matching distance, calculate a best possible alignment score, calculate a confidence score, calculate a topic similarity score, and calculate a semantic similarity score. The system can determine the validity, through a meta learner, of the one or more valid candidate short form and n-gram combination pairs based upon each short form and n-gram combination pair's confidence score, topic similarity score, and semantic similarity score, and store the valid short form and n-gram combination pairs in a repository. The system has no language specific constraints and can extract short form and long form pairs from documents written in various languages.

Type: Application

Filed: August 22, 2018

Publication date: December 20, 2018

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20180307681

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Application

Filed: June 29, 2018

Publication date: October 25, 2018

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
Hybrid approach for short form detection and expansion to long forms

Patent number: 10083170

Abstract: Embodiments provide a system and method for short form and long form detection. Given candidate short forms, the system can generate one or more n-gram combinations, resulting in one or more candidate short form and n-gram combination pairs. For each candidate short form and n-gram combination pair, the system can calculate an approximate string matching distance, calculate a best possible alignment score, calculate a confidence score, calculate a topic similarity score, and calculate a semantic similarity score. The system can determine the validity, through a meta learner, of the one or more valid candidate short form and n-gram combination pairs based upon each short form and n-gram combination pair's confidence score, topic similarity score, and semantic similarity score, and store the valid short form and n-gram combination pairs in a repository. The system has no language specific constraints and can extract short form and long form pairs from documents written in various languages.

Type: Grant

Filed: June 28, 2016

Date of Patent: September 25, 2018

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
Evaluating passages in a question answering computer system

Patent number: 9946763

Abstract: According to an aspect, a processing system of a question answering computer system determines a first set of relations between one or more pairs of terms in a question. The processing system also determines a second set of relations between one or more pairs of terms in a candidate passage including a candidate answer to the question. The processing system matches the first set of relations to the second set of relations. A plurality of scores is determined by the processing system based on the matching. The processing system aggregates the scores to produce an answer score indicative of a level of support that the candidate answer correctly answers the question.

Type: Grant

Filed: November 5, 2014

Date of Patent: April 17, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Barborak, James J. Fan, Michael R. Glass, Aditya A. Kalyanpur, Adam P. Lally, James W. Murdock, IV, Benjamin P. Segal
Evaluating passages in a question answering computer system

Patent number: 9946764

Abstract: According to an aspect, a processing system of a question answering computer system determines a first set of relations between one or more pairs of terms in a question. The processing system also determines a second set of relations between one or more pairs of terms in a candidate passage including a candidate answer to the question. The processing system matches the first set of relations to the second set of relations. A plurality of scores is determined by the processing system based on the matching. The processing system aggregates the scores to produce an answer score indicative of a level of support that the candidate answer correctly answers the question.

Type: Grant

Filed: March 6, 2015

Date of Patent: April 17, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Barborak, James J. Fan, Michael R. Glass, Aditya A. Kalyanpur, Adam P. Lally, James W. Murdock, IV, Benjamin P. Segal
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20170371862

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Application

Filed: June 28, 2016

Publication date: December 28, 2017

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20170371857

Abstract: Embodiments provide a system and method for short form and long form detection. Given candidate short forms, the system can generate one or more n-gram combinations, resulting in one or more candidate short form and n-gram combination pairs. For each candidate short form and n-gram combination pair, the system can calculate an approximate string matching distance, calculate a best possible alignment score, calculate a confidence score, calculate a topic similarity score, and calculate a semantic similarity score. The system can determine the validity, through a meta learner, of the one or more valid candidate short form and n-gram combination pairs based upon each short form and n-gram combination pair's confidence score, topic similarity score, and semantic similarity score, and store the valid short form and n-gram combination pairs in a repository. The system has no language specific constraints and can extract short form and long form pairs from documents written in various languages.

Type: Application

Filed: June 28, 2016

Publication date: December 28, 2017

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
LEARNING PARAMETERS IN A FEED FORWARD PROBABILISTIC GRAPHICAL MODEL

Publication number: 20170061324

Abstract: According to an aspect, learning parameters in a feed forward probabilistic graphical model includes creating an inference model via a computer processor. The creation of the inference model includes receiving a training set that includes multiple scenarios, each scenario comprised of one or more natural language statements, and each scenario corresponding to a plurality of candidate answers. The creation also includes constructing evidence graphs for each of the multiple scenarios based on the training set, and calculating weights for common features across the evidence graphs that will maximize a probability of the inference model locating correct answers from corresponding candidate answers across all of the multiple scenarios. In response to an inquiry from a user that includes a scenario, the inference model constructs an evidence graph and recursively constructs formulas to express a confidence of each node in the evidence graph in terms of its parents in the evidence graph.

Type: Application

Filed: September 1, 2015

Publication date: March 2, 2017

Inventors: Michael R. Glass, James W. Murdock, IV
LEARNING PARAMETERS IN A FEED FORWARD PROBABILISTIC GRAPHICAL MODEL

Publication number: 20170061301

Abstract: According to an aspect, learning parameters in a feed forward probabilistic graphical model includes creating an inference model via a computer processor. The creation of the inference model includes receiving a training set that includes multiple scenarios, each scenario comprised of one or more natural language statements, and each scenario corresponding to a plurality of candidate answers. The creation also includes constructing evidence graphs for each of the multiple scenarios based on the training set, and calculating weights for common features across the evidence graphs that will maximize a probability of the inference model locating correct answers from corresponding candidate answers across all of the multiple scenarios. In response to an inquiry from a user that includes a scenario, the inference model constructs an evidence graph and recursively constructs formulas to express a confidence of each node in the evidence graph in terms of its parents in the evidence graph.

Type: Application

Filed: November 23, 2015

Publication date: March 2, 2017

Inventors: Michael R. Glass, James W. Murdock, IV
EVALUATING PASSAGES IN A QUESTION ANSWERING COMPUTER SYSTEM

Publication number: 20160125013

Abstract: According to an aspect, a processing system of a question answering computer system determines a first set of relations between one or more pairs of terms in a question. The processing system also determines a second set of relations between one or more pairs of terms in a candidate passage including a candidate answer to the question. The processing system matches the first set of relations to the second set of relations. A plurality of scores is determined by the processing system based on the matching. The processing system aggregates the scores to produce an answer score indicative of a level of support that the candidate answer correctly answers the question.

Type: Application

Filed: March 6, 2015

Publication date: May 5, 2016

Inventors: Michael A. Barborak, James J. Fan, Michael R. Glass, Aditya A. Kalyanpur, Adam P. Lally, James W. Murdock, IV, Benjamin P. Segal
EVALUATING PASSAGES IN A QUESTION ANSWERING COMPUTER SYSTEM

Publication number: 20160124962

Abstract: According to an aspect, a processing system of a question answering computer system determines a first set of relations between one or more pairs of terms in a question. The processing system also determines a second set of relations between one or more pairs of terms in a candidate passage including a candidate answer to the question. The processing system matches the first set of relations to the second set of relations. A plurality of scores is determined by the processing system based on the matching. The processing system aggregates the scores to produce an answer score indicative of a level of support that the candidate answer correctly answers the question.

Type: Application

Filed: November 5, 2014

Publication date: May 5, 2016

Inventors: Michael A. Barborak, James J. Fan, Michael R. Glass, Aditya A. Kalyanpur, Adam P. Lally, James W. Murdock, IV, Benjamin P. Segal