Patents by Inventor Md Faisal Mahbub Chowdhury

Md Faisal Mahbub Chowdhury has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220067539
    Abstract: A method, a computer program product, and a computer system induce knowledge from a knowledge graph. The method includes receiving a request indicative of a domain. The method includes determining a corpus corresponding to the domain and determining a quality of the corpus in generating the knowledge graph relative to a quality threshold. If the quality threshold is not met, the method includes determining a candidate expansion corpus to incorporate further data therefrom into the corpus relative to an expansion threshold. If the expansion threshold is met, the method includes generating an expanded corpus by expanding the corpus with the further data. The method includes generating the knowledge graph based on the expanded corpus from which the knowledge is induced and generating a response to the request based on the knowledge graph.
    Type: Application
    Filed: September 1, 2020
    Publication date: March 3, 2022
    Inventors: Nandana Mihindukulasooriya, Md Faisal Mahbub Chowdhury, Yu Deng, Ruchi Mahindru, Nicolas Rodolfo Fauceglia, Alfio Massimiliano Gliozzo
  • Publication number: 20220004711
    Abstract: An approach to induction of unknown terms into a term taxonomy graph may be provided. The approach may include analyzing a domain specific corpus to generate a term taxonomy graph using a term taxonomy graph generation model with a term knowledge base and determining which terms within the domain specific corpus are out of vocabulary (OOV) terms. The approach may also analyze the terms in the domain specific corpus with a semantic representation model to generate feature vectors of the OOV terms and terms known within the generated term taxonomy graph. The approach may determine if an OOV term can be a hyponym of a term within the term taxonomy graph based on the feature vectors and insert the OOV term into the graph at the appropriate location.
    Type: Application
    Filed: July 1, 2020
    Publication date: January 6, 2022
    Inventors: Feifei Pan, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
  • Publication number: 20210383205
    Abstract: A system, computer program product, and method are provided for employing a graph neural network (GNN) to construct a taxonomy. The GNN is subject to a training cycle and an inference cycle. The training cycle encodes cross-domain term pairs from a set of noisy cross-domain pairs extracted from a corpus, and outputs a preliminary taxonomy. The inference cycle identifies candidate term pairs and subjects the candidate term pairs to selective filtering to produce a system predicted taxonomy from the preliminary taxonomy.
    Type: Application
    Filed: June 3, 2020
    Publication date: December 9, 2021
    Applicant: International Business Machines Corporation
    Inventors: Chao Shang, Sarthak Dash, Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo
  • Publication number: 20210342552
    Abstract: An embodiment of the present invention generates natural language content from a set of keywords in accordance with a template. Keyword vectors representing a context for the keywords are generated. The keywords are associated with language tags, while the template includes a series of language tags indicating an arrangement for the generated natural language content. Template vectors are generated from the series of language tags of the template and represent a context for the template. Contributions from the contexts for the keywords and the template are determined based on a comparison of the series of language tags of the template with the associated language tags of the keywords. One or more words for each language tag of the template are generated to produce the natural language content based on combined contributions from the contexts for the keywords and the template.
    Type: Application
    Filed: May 1, 2020
    Publication date: November 4, 2021
    Inventors: Abhijit Mishra, Md Faisal Mahbub Chowdhury, Sagar Manohar, Dan Gutfreund
  • Publication number: 20210326636
    Abstract: One embodiment of the invention provides a method for terminology ranking for use in natural language processing. The method comprises receiving a list of terms extracted from a corpus, where the list comprises a ranking of the terms based on frequencies of the terms across the corpus. The method further comprises accessing a domain ontology associated with the corpus, and re-ranking the list based on the domain ontology. The resulting re-ranked list comprises a different ranking of the terms based on relevance of the terms using knowledge from the domain ontology. The method further comprises generating clusters of terms via a trained model adapted to the corpus, and boosting a rank of at least one term of the re-ranked list based on the clusters to increase a relevance of the at least one term using knowledge from the trained model.
    Type: Application
    Filed: April 16, 2020
    Publication date: October 21, 2021
    Inventors: Nandana Mihindukulasooriya, Ruchi Mahindru, Md Faisal Mahbub Chowdhury, Yu Deng, Alfio Massimiliano Gliozzo, Sarthak Dash, Nicolas Rodolfo Fauceglia, Gaetano Rossiello
  • Publication number: 20210303800
    Abstract: One embodiment of the present invention provides a method comprising receiving a text corpus, and generating a first list of triples based on the text corpus. Each triple of the first list comprises a first term representing a candidate hyponym, a second term representing a candidate hypernym, and a frequency value indicative of a number of times a hypernymy relation is observed between the candidate hyponym and the candidate hypernym in the text corpus. The method further comprises training a neural network for hypernym induction based on the first list. The trained neural network is a strict partial order network (SPON) model.
    Type: Application
    Filed: June 9, 2021
    Publication date: September 30, 2021
    Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Md Faisal Mahbub Chowdhury
  • Patent number: 11068665
    Abstract: One embodiment of the present invention provides a method comprising receiving a text corpus, and generating a first list of triples based on the text corpus. Each triple of the first list comprises a first term representing a candidate hyponym, a second term representing a candidate hypernym, and a frequency value indicative of a number of times a hypernymy relation is observed between the candidate hyponym and the candidate hypernym in the text corpus. The method further comprises training a neural network for hypernym induction based on the first list. The trained neural network is a strict partial order network (SPON) model.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: July 20, 2021
    Assignee: International Business Machines Corporation
    Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Md Faisal Mahbub Chowdhury
  • Patent number: 11055491
    Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: July 6, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
  • Patent number: 11003701
    Abstract: A query-focused faceted structure generation method, system, and computer program product for generating a query-focused faceted structure from a taxonomy for searching a document corpus, including augmenting taxonomy types with new instances where the instances comprise entities within a proximity of existing instances of taxonomy types in a local embedding of entities parsed from the document corpus, ranking each instance in the augmented taxonomy with respect to its type as a function of both a distance from an instance to a query in a global embedding vector space of the entities trained from the document corpus and a distance of an instance to a type in the local embedding, and ranking the taxonomy types using expanded instances in the document corpus for each type.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: May 11, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Biying Kong, Nidhi Rajshree, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia, Robert G. Farrell, Md Faisal Mahbub Chowdhury, Anish Mathur
  • Publication number: 20210125058
    Abstract: Training a machine learning model such as a neural network, which can automatically extract a hypernym from unstructured data, is disclosed. A preliminary candidate list of hyponym-hypernym pairs can be parsed from the corpus. A preliminary super-term-sub-term glossary can be generated from the corpus, the preliminary super-term-sub-term glossary containing one or more super-term-sub-term pairs. A super-term-sub-term pair can be filtered from the preliminary super-term-sub-term glossary, responsive to detecting that the super-term-sub-term pair is not a candidate for hyponym-hypernym pair, to generate a final super-term-sub-term glossary. The preliminary candidate list of hyponym-hypernym pairs and the final super-term-sub-term glossary can be combined to generate a final list of hyponym-hypernym pairs. An artificial neural network can be trained using the final list of hyponym-hypernym pairs as a training data set, the artificial neural network trained to identify a hypernym given new text data.
    Type: Application
    Filed: October 29, 2019
    Publication date: April 29, 2021
    Inventors: Md Faisal Mahbub Chowdhury, Robert G. Farrell, Nicholas Brady Garvan Monath, Michael Robert Glass, Md Arafat Sultan
  • Publication number: 20210081500
    Abstract: One embodiment of the present invention provides a method comprising receiving a text corpus, and generating a first list of triples based on the text corpus. Each triple of the first list comprises a first term representing a candidate hyponym, a second term representing a candidate hypernym, and a frequency value indicative of a number of times a hypernymy relation is observed between the candidate hyponym and the candidate hypernym in the text corpus. The method further comprises training a neural network for hypernym induction based on the first list. The trained neural network is a strict partial order network (SPON) model.
    Type: Application
    Filed: September 18, 2019
    Publication date: March 18, 2021
    Inventors: Sarthak Dash, Alfio Massimiliano Gliozzo, Md Faisal Mahbub Chowdhury
  • Publication number: 20200349203
    Abstract: A query-focused faceted structure generation method, system, and computer program product for generating a query-focused faceted structure from a taxonomy for searching a document collection, including ingesting a document corpus, generating a vector space representation of a query and instances from a taxonomy of the document corpus, and producing a dynamic structure of relevant facet categories and facet values using a two-vector space representation from the generated vector space representation.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Biying Kong, Nidhi Rajshree, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia, Robert G. Farrell, Md Faisal Mahbub Chowdhury, Anish Mathur
  • Publication number: 20200349179
    Abstract: A query-focused faceted structure generation method, system, and computer program product for generating a query-focused faceted structure from a taxonomy for searching a document corpus, including augmenting taxonomy types with new instances where the instances comprise entities within a proximity of existing instances of taxonomy types in a local embedding of entities parsed from the document corpus, ranking each instance in the augmented taxonomy with respect to its type as a function of both a distance from an instance to a query in a global embedding vector space of the entities trained from the document corpus and a distance of an instance to a type in the local embedding, and ranking the taxonomy types using expanded instances in the document corpus for each type.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Biying Kong, Nidhi Rajshree, Alfio Massimiliano Gliozzo, Nicolas Rodolfo Fauceglia, Robert G. Farrell, Md Faisal Mahbub Chowdhury, Anish Mathur
  • Patent number: 10740559
    Abstract: A terminology extraction method, system, and computer program product include extracting terminology specific to a domain from a corpus of domain-specific text, where no external general domain reference corpus is required. The method assumes that terms which share common noun token(s) in a domain corpus are likely to be very related, that terms which are very related in a domain are likely to be equally or similarly important even though there might be large differences among their term frequencies, and that an abbreviation and its corresponding expansion have equal importance as terms.
    Type: Grant
    Filed: March 27, 2017
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo, Sharon Mary Trewin
  • Publication number: 20200250275
    Abstract: Computer-implemented methods, computer systems and computer program products for providing geographic location specific models for information extraction and knowledge discovery are provided. Aspects include receiving a body of input text using a processor having natural language processing functionality. Aspects also include using information extraction functionality of the processor to extract preliminary information including a relational table from the body of input text. Aspects also include determining one or more geographical contexts associated with the input text based on the preliminary information. Aspects also include determining inferred information based on the preliminary information and the one or more geographical contexts associated with the input text. Aspects also include augmenting the relational table with the inferred information.
    Type: Application
    Filed: February 5, 2019
    Publication date: August 6, 2020
    Inventors: Md Faisal Mahbub Chowdhury, Michael Robert Glass
  • Patent number: 10275454
    Abstract: According to an aspect, a term saliency model is trained to identify salient terms that provide supporting evidence of a candidate answer in a question answering computer system based on a training dataset. The question answering computer system can perform term saliency weighting of a candidate passage to identify one or more salient terms and term weights in the candidate passage based on the term saliency model. The one or more salient terms and term weights can be provided to at least one passage scorer of the question answering computer system to determine whether the candidate passage is justified as providing supporting evidence of the candidate answer.
    Type: Grant
    Filed: October 13, 2014
    Date of Patent: April 30, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Md Faisal Mahbub Chowdhury, Alfio M. Gliozzo, Adam Lally
  • Publication number: 20180276196
    Abstract: A terminology extraction method, system, and computer program product include extracting terminology specific to a domain from a corpus of domain-specific text, where no external general domain reference corpus is required. The method assumes that terms which share common noun token(s) in a domain corpus are likely to be very related, that terms which are very related in a domain are likely to be equally or similarly important even though there might be large differences among their term frequencies, and that an abbreviation and its corresponding expansion have equal importance as terms.
    Type: Application
    Filed: March 27, 2017
    Publication date: September 27, 2018
    Inventors: Md Faisal Mahbub Chowdhury, Alfio Massimiliano Gliozzo, Sharon Mary Trewin
  • Publication number: 20160104075
    Abstract: According to an aspect, a term saliency model is trained to identify salient terms that provide supporting evidence of a candidate answer in a question answering computer system based on a training dataset. The question answering computer system can perform term saliency weighting of a candidate passage to identify one or more salient terms and term weights in the candidate passage based on the term saliency model. The one or more salient terms and term weights can be provided to at least one passage scorer of the question answering computer system to determine whether the candidate passage is justified as providing supporting evidence of the candidate answer.
    Type: Application
    Filed: October 13, 2014
    Publication date: April 14, 2016
    Inventors: Md Faisal Mahbub Chowdhury, Alfio M. Gliozzo, Adam P. Lally
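
Several of the abstracts listed above (for example, Patent number 11068665) describe generating a list of triples from a text corpus, each holding a candidate hyponym, a candidate hypernym, and a count of how often the relation is observed. The following is a minimal illustrative sketch of that data structure only, not the patented method. The abstracts do not say how relations are observed, so the "X such as Y" pattern, the function name build_triples, and the sample sentences are all assumptions made for illustration. The neural network training step is not shown.

```python
import re
from collections import Counter

# Assumption: relations are observed with a single "<hypernym> such as <hyponym>"
# pattern. The abstract does not specify the extraction technique.
PATTERN = re.compile(r"\b(\w+) such as (\w+)\b")


def build_triples(corpus):
    """Return (hyponym, hypernym, frequency) triples from a list of sentences.

    Hypothetical helper: counts how many times each candidate pair is
    matched by PATTERN across the corpus, most frequent first.
    """
    counts = Counter()
    for sentence in corpus:
        for hypernym, hyponym in PATTERN.findall(sentence.lower()):
            counts[(hyponym, hypernym)] += 1
    return [(hypo, hyper, freq) for (hypo, hyper), freq in counts.most_common()]


# Made-up sample corpus, for illustration only.
corpus = [
    "Fruits such as apples are sweet.",
    "She likes fruits such as apples.",
    "Metals such as copper conduct heat.",
]
triples = build_triples(corpus)
print(triples)
```

In this sketch the terms are kept as they appear in the text (lowercased, not lemmatized), and the resulting list would serve as the kind of frequency-annotated input that the abstract says is used to train a model for hypernym induction.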