Patents by Inventor Alfio M. Gliozzo

Alfio M. Gliozzo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Joint learning of local and global features for entity linking via neural networks

Patent number: 11755885

Abstract: A system, method and computer program product for disambiguating one or more entity mentions in one or more documents. The method facilitates the simultaneous linking of entity mentions in a document based on convolution neural networks and recurrent neural networks that model both the local and global features for entity linking. The framework uses the capacity of convolution neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. The RNN functions to accumulate information about the previous entity mentions and/or target entities, and provide them as the global constraints for the linking process of a current entity mention.

Type: Grant

Filed: April 6, 2020

Date of Patent: September 12, 2023

Assignee: International Business Machines Corporation

Inventors: Nicolas R. Fauceglia, Alfio M. Gliozzo, Oktie Hassanzadeh, Thien H. Nguyen, Mariano Rodriguez Muro, Mohammad Sadoghi Hamedani
Structured term recognition

Patent number: 11222175

Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ∈ T, each of the known terms t belonging to a set of types Γ(t) = {γ1, . . . }, wherein each of the terms is comprised of a list of words, t = w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p → Γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types Γ for said each recognized term.

Type: Grant

Filed: May 24, 2019

Date of Patent: January 11, 2022

Assignee: International Business Machines Corporation

Inventors: Michael Glass, Alfio M. Gliozzo
Greedy active learning for reducing labeled data imbalances

Patent number: 11138523

Abstract: A method, system and computer-usable medium are disclosed for reducing labeled data imbalances when training an active learning system. The ratio of instances having positive labels or negative labels in a collection of labeled instances associated with an input category used for learning is determined. A first instance for annotation is selected from a collection of unlabeled instances if a first threshold for negative instances, and a first threshold confidence level of being a positive instance of the input category, have been met. A second instance for annotation is selected if a second threshold for positive instances, and a second threshold confidence level of being a negative instance of the input category, have been met. The first and second instances are respectively annotated with a positive and negative label and added to the collection of labeled instances, which are then used for training.

Type: Grant

Filed: July 27, 2016

Date of Patent: October 5, 2021

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Sarthak Dash, Alfio M. Gliozzo
JOINT LEARNING OF LOCAL AND GLOBAL FEATURES FOR ENTITY LINKING VIA NEURAL NETWORKS

Publication number: 20200234102

Abstract: A system, method and computer program product for disambiguating one or more entity mentions in one or more documents. The method facilitates the simultaneous linking of entity mentions in a document based on convolution neural networks and recurrent neural networks that model both the local and global features for entity linking. The framework uses the capacity of convolution neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. The RNN functions to accumulate information about the previous entity mentions and/or target entities, and provide them as the global constraints for the linking process of a current entity mention.

Type: Application

Filed: April 6, 2020

Publication date: July 23, 2020

Inventors: Nicolas R. Fauceglia, Alfio M. Gliozzo, Oktie Hassanzadeh, Thien H. Nguyen, Mariano Rodriguez Muro, Mohammad Sadoghi Hamedani
Joint learning of local and global features for entity linking via neural networks

Patent number: 10643120

Abstract: A system, method and computer program product for disambiguating one or more entity mentions in one or more documents. The method facilitates the simultaneous linking of entity mentions in a document based on convolution neural networks and recurrent neural networks that model both the local and global features for entity linking. The framework uses the capacity of convolution neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. The RNN functions to accumulate information about the previous entity mentions and/or target entities, and provide them as the global constraints for the linking process of a current entity mention.

Type: Grant

Filed: November 15, 2016

Date of Patent: May 5, 2020

Assignee: International Business Machines Corporation

Inventors: Nicolas R. Fauceglia, Alfio M. Gliozzo, Oktie Hassanzadeh, Thien H. Nguyen, Mariano Rodriguez Muro, Mohammad Sadoghi Hamedani
Methods and system for fast, adaptive correction of misspells

Patent number: 10579729

Abstract: Embodiments are directed to a spellcheck module for an enterprise search engine. The spellcheck module includes a candidate suggestion generation module that generates a number of candidate words that may be the correction of the misspelled word. The candidate suggestion generation module implements an algorithm for indexing, searching, and storing terms from an index with a constrained edit distance, using words in a collection of documents. The spellcheck module further includes a candidate suggestion ranking module. In one embodiment, a non-contextual approach using a linear combination of distance and probability scores is utilized; while in another embodiment, a context sensitive approach accounting for real-word misspells and adopting deep learning models is utilized. In use, a query is provided to the spellcheck module to generate results in the form of a ranked list of generated candidate entries that may be an entry a user accidentally misspelled.

Type: Grant

Filed: October 18, 2016

Date of Patent: March 3, 2020

Assignee: International Business Machines Corporation

Inventors: Alfio M. Gliozzo, Piero Molino
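
The non-contextual ranking described in the abstract above (candidates within a constrained edit distance, ranked by a linear combination of distance and probability scores) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the `max_distance` and `alpha` parameters, the normalization of the distance score, and the use of relative term frequency as the probability score are all assumptions made for illustration, and the patent's index structure is replaced here by a brute-force scan of the vocabulary.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_candidates(word, documents, max_distance=2, alpha=0.7):
    """Return vocabulary terms within max_distance of word, best first.

    Assumption: score = alpha * distance_score + (1 - alpha) * probability_score,
    where probability_score is the term's relative frequency in the documents.
    """
    counts = Counter(tok for doc in documents for tok in doc.lower().split())
    total = sum(counts.values())
    scored = []
    for term, count in counts.items():
        dist = edit_distance(word, term)
        if dist > max_distance:
            continue  # outside the constrained edit distance
        distance_score = 1.0 - dist / (max_distance + 1)
        probability_score = count / total
        score = alpha * distance_score + (1 - alpha) * probability_score
        scored.append((score, term))
    # Highest score first; ties broken alphabetically for determinism.
    return [term for _, term in sorted(scored, key=lambda s: (-s[0], s[1]))]


docs = [
    "the search engine indexes documents",
    "search the index",
    "a research search",
]
suggestions = rank_candidates("serch", docs)
```

With these toy documents, only "search" lies within two edits of "serch", so it is the sole suggestion. A production system would avoid scanning the whole vocabulary, which is what the indexing algorithm in the abstract addresses.
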
STRUCTURED TERM RECOGNITION

Publication number: 20190286693

Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ∈ T, each of the known terms t belonging to a set of types Γ(t) = {γ1, . . . }, wherein each of the terms is comprised of a list of words, t = w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p → Γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types Γ for said each recognized term.

Type: Application

Filed: May 24, 2019

Publication date: September 19, 2019

Inventors: Michael Glass, Alfio M. Gliozzo
Methods and system for fast, adaptive correction of misspells

Patent number: 10372814

Abstract: Embodiments are directed to a spellcheck module for an enterprise search engine. The spellcheck module includes a candidate suggestion generation module that generates a number of candidate words that may be the correction of the misspelled word. The candidate suggestion generation module implements an algorithm for indexing, searching, and storing terms from an index with a constrained edit distance, using words in a collection of documents. The spellcheck module further includes a candidate suggestion ranking module. In one embodiment, a non-contextual approach using a linear combination of distance and probability scores is utilized; while in another embodiment, a context sensitive approach accounting for real-word misspells and adopting deep learning models is utilized. In use, a query is provided to the spellcheck module to generate results in the form of a ranked list of generated candidate entries that may be an entry a user accidentally misspelled.

Type: Grant

Filed: October 18, 2016

Date of Patent: August 6, 2019

Assignee: International Business Machines Corporation

Inventors: Alfio M. Gliozzo, Piero Molino
Structured term recognition

Patent number: 10339214

Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ∈ T, each of the known terms t belonging to a set of types Γ(t) = {γ1, . . . }, wherein each of the terms is comprised of a list of words, t = w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p → Γ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types Γ for said each recognized term.

Type: Grant

Filed: November 2, 2012

Date of Patent: July 2, 2019

Assignee: International Business Machines Corporation

Inventors: Michael R. Glass, Alfio M. Gliozzo
Hybrid approach for short form detection and expansion to long forms

Patent number: 10282421

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Grant

Filed: June 29, 2018

Date of Patent: May 7, 2019

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
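
The candidate filtering step in the abstract above (part of speech, share of symbols or digits, corpus frequency, and presence of an uppercase letter) can be sketched in Python as follows. This is an illustrative sketch only, not the patented method: the function name, the allowed part-of-speech tags, and both thresholds are hypothetical, and the input is assumed to be already tokenized and tagged as `(token, pos)` pairs.

```python
from collections import Counter


def filter_short_form_candidates(tagged_tokens,
                                 allowed_pos=frozenset({"NOUN", "PROPN"}),
                                 max_symbol_ratio=0.5,
                                 max_relative_freq=0.2):
    """Keep tokens that pass all four illustrative filters.

    tagged_tokens: list of (token, part_of_speech) pairs.
    All thresholds and tag names are assumptions, not values from the patent.
    """
    counts = Counter(tok for tok, _ in tagged_tokens)
    total = len(tagged_tokens)
    kept = []
    for tok, pos in tagged_tokens:
        if not tok or tok in kept:
            continue  # skip empty tokens and repeats
        if pos not in allowed_pos:
            continue  # part-of-speech filter
        non_letters = sum(1 for ch in tok if not ch.isalpha())
        if non_letters / len(tok) > max_symbol_ratio:
            continue  # too many symbols or digits
        if counts[tok] / total > max_relative_freq:
            continue  # appears too frequently in the corpus
        if not any(ch.isupper() for ch in tok):
            continue  # needs at least one uppercase letter
        kept.append(tok)
    return kept


tagged = [
    ("The", "DET"), ("USPTO", "PROPN"), ("grants", "VERB"),
    ("patents", "NOUN"), ("under", "ADP"), ("35", "NUM"),
    ("U.S.C.", "PROPN"), ("and", "CCONJ"), ("A1-2-3", "PROPN"),
    ("rules", "NOUN"),
]
candidates = filter_short_form_candidates(tagged)
```

In this toy input, "USPTO" and "U.S.C." survive, while "A1-2-3" is rejected for its high share of non-letter characters and "patents" for having no uppercase letter.
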
Identifying salient terms for passage justification in a question answering system

Patent number: 10275454

Abstract: According to an aspect, a term saliency model is trained to identify salient terms that provide supporting evidence of a candidate answer in a question answering computer system based on a training dataset. The question answering computer system can perform term saliency weighting of a candidate passage to identify one or more salient terms and term weights in the candidate passage based on the term saliency model. The one or more salient terms and term weights can be provided to at least one passage scorer of the question answering computer system to determine whether the candidate passage is justified as providing supporting evidence of the candidate answer.

Type: Grant

Filed: October 13, 2014

Date of Patent: April 30, 2019

Assignee: International Business Machines Corporation

Inventors: Md Faisal Mahbub Chowdhury, Alfio M. Gliozzo, Adam Lally
Hybrid approach for short form detection and expansion to long forms

Patent number: 10261990

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Grant

Filed: June 28, 2016

Date of Patent: April 16, 2019

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20180365210

Abstract: Embodiments provide a system and method for short form and long form detection. Given candidate short forms, the system can generate one or more n-gram combinations, resulting in one or more candidate short form and n-gram combination pairs. For each candidate short form and n-gram combination pair, the system can calculate an approximate string matching distance, calculate a best possible alignment score, calculate a confidence score, calculate a topic similarity score, and calculate a semantic similarity score. The system can determine the validity, through a meta learner, of the one or more valid candidate short form and n-gram combination pairs based upon each short form and n-gram combination pair's confidence score, topic similarity score, and semantic similarity score, and store the valid short form and n-gram combination pairs in a repository. The system has no language specific constraints and can extract short form and long form pairs from documents written in various languages.

Type: Application

Filed: August 22, 2018

Publication date: December 20, 2018

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
HYBRID APPROACH FOR SHORT FORM DETECTION AND EXPANSION TO LONG FORMS

Publication number: 20180307681

Abstract: Embodiments provide a system and method for short form and long form detection. Using a language-independent process, the detection system can ingest a corpus of documents, pre-process those documents by tokenizing the documents and performing a part-of-speech analysis, and can filter one or more candidate short forms using one or more filters that select for semantic criteria. Semantic criteria can include the part of speech of a token, whether the token contains more than a pre-determined amount of symbols or digits, whether the token appears too frequently in the corpus of documents, and whether the token has at least one uppercase letter. The detection system can detect short forms independent of case and punctuation, and independent of language-specific metaphone variants.

Type: Application

Filed: June 29, 2018

Publication date: October 25, 2018

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
Hybrid approach for short form detection and expansion to long forms

Patent number: 10083170

Abstract: Embodiments provide a system and method for short form and long form detection. Given candidate short forms, the system can generate one or more n-gram combinations, resulting in one or more candidate short form and n-gram combination pairs. For each candidate short form and n-gram combination pair, the system can calculate an approximate string matching distance, calculate a best possible alignment score, calculate a confidence score, calculate a topic similarity score, and calculate a semantic similarity score. The system can determine the validity, through a meta learner, of the one or more valid candidate short form and n-gram combination pairs based upon each short form and n-gram combination pair's confidence score, topic similarity score, and semantic similarity score, and store the valid short form and n-gram combination pairs in a repository. The system has no language specific constraints and can extract short form and long form pairs from documents written in various languages.

Type: Grant

Filed: June 28, 2016

Date of Patent: September 25, 2018

Assignee: International Business Machines Corporation

Inventors: Md Faisal M. Chowdhury, Michael R. Glass, Alfio M. Gliozzo
JOINT LEARNING OF LOCAL AND GLOBAL FEATURES FOR ENTITY LINKING VIA NEURAL NETWORKS

Publication number: 20180137404

Abstract: A system, method and computer program product for disambiguating one or more entity mentions in one or more documents. The method facilitates the simultaneous linking of entity mentions in a document based on convolution neural networks and recurrent neural networks that model both the local and global features for entity linking. The framework uses the capacity of convolution neural networks to induce the underlying representations for local contexts and the advantage of recurrent neural networks to adaptively compress variable length sequences of predictions for global constraints. The RNN functions to accumulate information about the previous entity mentions and/or target entities, and provide them as the global constraints for the linking process of a current entity mention.

Type: Application

Filed: November 15, 2016

Publication date: May 17, 2018

Inventors: Nicolas R. Fauceglia, Alfio M. Gliozzo, Oktie Hassanzadeh, Thien H. Nguyen, Mariano Rodriguez Muro, Mohammad Sadoghi Hamedani
METHODS AND SYSTEM FOR FAST, ADAPTIVE CORRECTION OF MISSPELLS

Publication number: 20180107642

Abstract: Embodiments are directed to a spellcheck module for an enterprise search engine. The spellcheck module includes a candidate suggestion generation module that generates a number of candidate words that may be the correction of the misspelled word. The candidate suggestion generation module implements an algorithm for indexing, searching, and storing terms from an index with a constrained edit distance, using words in a collection of documents. The spellcheck module further includes a candidate suggestion ranking module. In one embodiment, a non-contextual approach using a linear combination of distance and probability scores is utilized; while in another embodiment, a context sensitive approach accounting for real-word misspells and adopting deep learning models is utilized. In use, a query is provided to the spellcheck module to generate results in the form of a ranked list of generated candidate entries that may be an entry a user accidentally misspelled.

Type: Application

Filed: October 18, 2016

Publication date: April 19, 2018

Inventors: Alfio M. Gliozzo, Piero Molino
METHODS AND SYSTEM FOR FAST, ADAPTIVE CORRECTION OF MISSPELLS

Publication number: 20180107643

Abstract: Embodiments are directed to a spellcheck module for an enterprise search engine. The spellcheck module includes a candidate suggestion generation module that generates a number of candidate words that may be the correction of the misspelled word. The candidate suggestion generation module implements an algorithm for indexing, searching, and storing terms from an index with a constrained edit distance, using words in a collection of documents. The spellcheck module further includes a candidate suggestion ranking module. In one embodiment, a non-contextual approach using a linear combination of distance and probability scores is utilized; while in another embodiment, a context sensitive approach accounting for real-word misspells and adopting deep learning models is utilized. In use, a query is provided to the spellcheck module to generate results in the form of a ranked list of generated candidate entries that may be an entry a user accidentally misspelled.

Type: Application

Filed: October 18, 2016

Publication date: April 19, 2018

Inventors: Alfio M. Gliozzo, Piero Molino
Greedy Active Learning for Reducing User Interaction

Publication number: 20180032901

Abstract: A method, system and computer-usable medium are disclosed for reducing user interaction when training an active learning system. Source input containing unlabeled instances and an input category are received. A Latent Semantic Analysis (LSA) similarity score, and a search engine score, are generated for each unlabeled instance, which in turn are used with the input category to rank the unlabeled instances. If a first threshold for negative instances has been met, a first unlabeled instance, having the highest ranking, is selected for annotation from the ranked collection of unlabeled instances and provided to a user for annotation with a positive label. If a second threshold for positive instances has been met, then a second unlabeled instance, having the lowest ranking, is selected for annotation from the ranked collection of unannotated instances and automatically annotated with a negative label. The annotated instances are then used to train an active learning system.

Type: Application

Filed: July 27, 2016

Publication date: February 1, 2018

Inventors: Md Faisal M. Chowdhury, Sarthak Dash, Alfio M. Gliozzo
Greedy Active Learning for Reducing Labeled Data Imbalances

Publication number: 20180032900

Abstract: A method, system and computer-usable medium are disclosed for reducing labeled data imbalances when training an active learning system. The ratio of instances having positive labels or negative labels in a collection of labeled instances associated with an input category used for learning is determined. A first instance for annotation is selected from a collection of unlabeled instances if a first threshold for negative instances, and a first threshold confidence level of being a positive instance of the input category, have been met. A second instance for annotation is selected if a second threshold for positive instances, and a second threshold confidence level of being a negative instance of the input category, have been met. The first and second instances are respectively annotated with a positive and negative label and added to the collection of labeled instances, which are then used for training.

Type: Application

Filed: July 27, 2016

Publication date: February 1, 2018

Inventors: Md Faisal M. Chowdhury, Sarthak Dash, Alfio M. Gliozzo
