Patents by Inventor Saneem Ahmed Chemmengath

Saneem Ahmed Chemmengath has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Artificial intelligence factsheet generation for speech recognition

Patent number: 12367859

Abstract: A method, system, and computer program product for automated artificial intelligence (AI) factsheet generation for modeling and model customization in speech to text (STT) services. The method receives audio data for a user. The audio data contains human speech. Text data is generated, using a first speech to text model, to represent the human speech of the audio data. A set of transcription errors of the first speech to text model are identified. A set of AI factsheets are generated to describe model metadata for the first speech to text model. Based on the set of transcription errors and the set of AI factsheets, the method generates a second speech to text model customized to the user.

Type: Grant

Filed: June 27, 2022

Date of Patent: July 22, 2025

Assignee: International Business Machines Corporation

Inventors: Shreya Khare, Ashish R. Mittal, Saneem Ahmed Chemmengath, Samarth Bharadwaj, Karthik Sankaranarayanan

Multi-instance, multi-answer training for table and text question answering

Patent number: 12210538

Abstract: Techniques for enhanced table and text question answering based on multi-instance, multi-answer training are presented. An answer extractor component can determine answer scores associated with candidate answer data items based on analysis of a set of data, comprising row data items of a table and passage data items associated with the table, and a context of a query of the set of data. The answer extractor component can be trained based on application of denoised single-instance and multiple-instance answer matching data associated with contexts to an answer extractor model to generate a trained answer extractor model of the answer extractor component. A query response component can determine a correct answer data item responsive to the query from the candidate answer data items based on the answer scores associated with the candidate answer data items, wherein the candidate answer data items can be reranked based on reweighted answer scores.

Type: Grant

Filed: November 8, 2022

Date of Patent: January 28, 2025

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Vishwajeet Kumar, Saneem Ahmed Chemmengath, Jaydeep Sen

Natural language question answering using non-relational tables

Patent number: 12182508

Abstract: A question answering bot that digests non-relational data tables is provided. A processor receives a question regarding a non-relational data table. A processor extracts at least one feature of the question using a natural language processing (NLP) model. A processor extracts at least one similar feature of the non-relational data table to the extracted at least one feature of the question. A processor determines at least one relevant cell in the non-relational data table based on the at least one feature of the question and the at least one similar feature of the non-relational data table. A processor provides an answer to the question, where the answer is based on the at least one relevant cell.

Type: Grant

Filed: March 25, 2022

Date of Patent: December 31, 2024

Assignee: International Business Machines Corporation

Inventors: Vishwajeet Kumar, Jaydeep Sen, Samarth Bharadwaj, Saneem Ahmed Chemmengath, Ioannis Katsis, Mustafa Canim

SYSTEMS AND METHODS TO BUILD ONEQG: A UNIFIED QUESTION GENERATION SYSTEM ACROSS MODALITIES

Publication number: 20240386218

Abstract: One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to building a unified question generation system across languages and modalities. The computer-implemented system can comprise a memory that can store computer executable components. The computer-implemented system can further comprise a processor that can execute the computer executable components stored in the memory, wherein the computer executable components can comprise a training component that can train a unified question generation model to generate questions in a language from a first modality in the language using training data comprising one or more second modalities in the language different from the first modality, wherein the first modality and the one or more second modalities can include at least one of one or more tables, one or more passages, or a combination of the one or more tables and the one or more passages.

Type: Application

Filed: May 15, 2023

Publication date: November 21, 2024

Inventors: Vishwajeet Kumar, Jaydeep Sen, Saneem Ahmed Chemmengath, Rudra V Murthy

Retrieval Aware Question Generation

Publication number: 20240370471

Abstract: Retrieval aware natural language question generation for open domain document retrieval is provided. In one aspect, a system for retrieval aware question generation includes: a question decontextualizer configured to decontextualize a question generated from a context of a target document by adding terms from the context into the question itself to create a decontextualized question, where the decontextualized question alone enables open domain document retrieval without a need for also providing the context. The system can also include a detect document identifier configured to find the terms in the context; and a retriever configured to retrieve documents from the corpus of documents using the decontextualized question. A method for retrieval aware question generation using the present system is also provided.

Type: Application

Filed: May 5, 2023

Publication date: November 7, 2024

Inventors: Saneem Ahmed Chemmengath, Vishwajeet Kumar, Jaydeep Sen

QUESTION GENERATION OVER TABLES AND TEXT

Publication number: 20240330723

Abstract: One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to a process to facilitate a Question Generation System. A system can comprise a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory, wherein the computer executable components can comprise a receiving component that receives a corpus of documents that contain Tables (Ts) and Passages (Ps) for performing natural language processing (NLP); an executing component that executes the NLP by employing the tables (Ts) and passages (Ps) as primary inputs; and a query component that generates an output Question (Q) based on a subset of the tables (Ts) and passages (Ps).

Type: Application

Filed: March 31, 2023

Publication date: October 3, 2024

Inventors: Saneem Ahmed Chemmengath, Vishwajeet Kumar, Jaydeep Sen, Rudra V Murthy

Contextual dialogue framework over dynamic tables

Patent number: 12050877

Abstract: Methods, systems, and computer program products for providing a contextual dialogue framework over dynamic tables are provided herein. A computer-implemented method includes maintaining a context space for a natural language conversation of a user, wherein the context space comprises a dynamic set of one or more tables used for processing at least one query of the natural language conversation; obtaining an additional table associated with an additional query of the natural language conversation; discovering one or more implicit links between the additional table and the dynamic set of tables; updating the context space with the one or more implicit links; and answering the additional query based at least in part on the updated context space.

Type: Grant

Filed: December 6, 2021

Date of Patent: July 30, 2024

Assignee: International Business Machines Corporation

Inventors: Jaydeep Sen, Samarth Bharadwaj, Saneem Ahmed Chemmengath, Vishwajeet Kumar

MULTI-INSTANCE, MULTI-ANSWER TRAINING FOR TABLE AND TEXT QUESTION ANSWERING

Publication number: 20240160634

Abstract: Techniques for enhanced table and text question answering based on multi-instance, multi-answer training are presented. An answer extractor component can determine answer scores associated with candidate answer data items based on analysis of a set of data, comprising row data items of a table and passage data items associated with the table, and a context of a query of the set of data. The answer extractor component can be trained based on application of denoised single-instance and multiple-instance answer matching data associated with contexts to an answer extractor model to generate a trained answer extractor model of the answer extractor component. A query response component can determine a correct answer data item responsive to the query from the candidate answer data items based on the answer scores associated with the candidate answer data items, wherein the candidate answer data items can be reranked based on reweighted answer scores.

Type: Application

Filed: November 8, 2022

Publication date: May 16, 2024

Inventors: Vishwajeet Kumar, Saneem Ahmed Chemmengath, Jaydeep Sen

Secure ensemble training and inference using heterogeneous private machine learning models

Patent number: 11861476

Abstract: One embodiment provides a method, including: receiving a query from a user; providing the query to data owners, wherein each of the data owners has a local machine learning model and wherein the plurality of data owners train a meta-model; secret sharing model output from the data owners between the other data owners, wherein the model output comprises an output responsive to the query computed using the local machine learning model; receiving, from each of the plurality of data owners, a set of meta-features corresponding to the query; and generating a response to the query, wherein the generating comprises determining, by evaluating the meta-model using the set of meta-features received from each of the plurality of data owners, weights for outputs from the local machine learning models and aggregating the outputs in view of the weights.

Type: Grant

Filed: August 18, 2021

Date of Patent: January 2, 2024

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dhinakaran Vinayagamurthy, Sandeep Nishad, Harsh Chaudhari, Pankaj Satyanarayan Dayama, Saneem Ahmed Chemmengath
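
The abstract above mentions two building blocks: secret sharing of model outputs among data owners, and a weighted aggregation of those outputs. The following is a minimal sketch of both, not the patented method. It assumes additive secret sharing over a prime modulus (the abstract does not name a scheme), integer-encoded outputs, and weights that are simply given rather than produced by a meta-model. The names `share`, `reconstruct`, `aggregate`, and `PRIME` are hypothetical.

```python
import secrets

# Hypothetical modulus for additive shares; the abstract does not specify one.
PRIME = 2**61 - 1


def share(value: int, n_parties: int) -> list[int]:
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum back to the value.
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME


def aggregate(outputs: list[float], weights: list[float]) -> float:
    """Weighted average of per-owner outputs (weights assumed positive)."""
    total = sum(weights)
    return sum(o * w for o, w in zip(outputs, weights)) / total


if __name__ == "__main__":
    parts = share(42, n_parties=3)
    print(reconstruct(parts))
    print(aggregate([1.0, 0.0], [3.0, 1.0]))
```

No single share reveals the underlying value; only the sum of all shares does. How the meta-model turns meta-features into weights is left out of this sketch because the abstract does not describe it.
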
ARTIFICIAL INTELLIGENCE FACTSHEET GENERATION FOR SPEECH RECOGNITION

Publication number: 20230419950

Abstract: A method, system, and computer program product for automated artificial intelligence (AI) factsheet generation for modeling and model customization in speech to text (STT) services. The method receives audio data for a user. The audio data contains human speech. Text data is generated, using a first speech to text model, to represent the human speech of the audio data. A set of transcription errors of the first speech to text model are identified. A set of AI factsheets are generated to describe model metadata for the first speech to text model. Based on the set of transcription errors and the set of AI factsheets, the method generates a second speech to text model customized to the user.

Type: Application

Filed: June 27, 2022

Publication date: December 28, 2023

Inventors: Shreya Khare, Ashish R. Mittal, Saneem Ahmed Chemmengath, Samarth Bharadwaj, Karthik Sankaranarayanan

SYSTEM AND METHOD FOR GENERATING CONTRASTIVE EXPLANATIONS FOR TEXT GUIDED BY ATTRIBUTES

Publication number: 20230409832

Abstract: A method, computer program product, and system are provided to generate perturbed text. A processor receives a string of text from a user. A processor determines one or more classifications for at least one word in the string of text by a classification model. A processor determines a plurality of perturbations of the at least one word based on the one or more classifications, where the plurality of perturbations do not share the same one or more classifications as the at least one word in the string of text. A processor selects a perturbation of the string of text based on (i) an edit distance between the string of text and the plurality of perturbations, and (ii) a fluency metric for each of the plurality of perturbations. A processor provides the perturbation of the string of text to the user.

Type: Application

Filed: June 16, 2022

Publication date: December 21, 2023

Inventors: Saneem Ahmed Chemmengath, Amar Prakash Azad, Ronny Luss, Amit Dhurandhar
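
The selection step in the abstract above combines an edit distance with a fluency metric. A minimal sketch of one way to do that follows. It assumes a word-level Levenshtein distance, a caller-supplied fluency function, and a linear trade-off controlled by `alpha`; none of these choices come from the abstract. Candidate generation and classification filtering are assumed to have happened already. The names `edit_distance` and `select_perturbation` are hypothetical.

```python
from typing import Callable, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_perturbation(text: str,
                        candidates: Sequence[str],
                        fluency: Callable[[str], float],
                        alpha: float = 1.0) -> str:
    """Pick the candidate with the lowest (distance - alpha * fluency)."""
    tokens = text.split()

    def score(candidate: str) -> float:
        return edit_distance(tokens, candidate.split()) - alpha * fluency(candidate)

    return min(candidates, key=score)


if __name__ == "__main__":
    options = ["the movie was awful", "awful the film was really"]
    # Constant fluency stands in for a real language-model score.
    print(select_perturbation("the movie was great", options, lambda s: 0.0))
```

With a constant fluency score the choice reduces to the smallest edit distance, so the sketch prefers the candidate that changes a single word.
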
NATURAL LANGUAGE QUESTION ANSWERING USING NON-RELATIONAL TABLES

Publication number: 20230306199

Abstract: A question answering bot that digests non-relational data tables is provided. A processor receives a question regarding a non-relational data table. A processor extracts at least one feature of the question using a natural language processing (NLP) model. A processor extracts at least one similar feature of the non-relational data table to the extracted at least one feature of the question. A processor determines at least one relevant cell in the non-relational data table based on the at least one feature of the question and the at least one similar feature of the non-relational data table. A processor provides an answer to the question, where the answer is based on the at least one relevant cell.

Type: Application

Filed: March 25, 2022

Publication date: September 28, 2023

Inventors: Vishwajeet Kumar, Jaydeep Sen, Samarth Bharadwaj, Saneem Ahmed Chemmengath, Ioannis Katsis, Mustafa Canim

Confidential information identification based upon communication recipient

Patent number: 11709962

Abstract: One embodiment provides a method, including: receiving an indication of an addition of a new participant in a textual communication between at least two existing participants; identifying at least one confidential topic contained within the textual communication by (i) parsing the textual communication and (ii) identifying at least one topic contained within the textual communication; the identifying comprising (i) accessing a confidentiality graph comprising (a) nodes representing participants and (b) edges representing confidential concepts that are acceptable discussion topics between participants connected by a corresponding edge and (ii) determining that an edge corresponding to the at least one confidential topic does not connect the new participant with both of the existing participants; and alerting one of the existing participants that the at least one confidential topic is included in the textual communication to be sent to the new participant.

Type: Grant

Filed: December 11, 2019

Date of Patent: July 25, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Pooja Aggarwal, Prateeti Mohapatra, Saneem Ahmed Chemmengath, Kuntal Dey
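
The confidentiality graph described in the abstract above can be illustrated with a small sketch. It assumes the graph is stored as a dictionary from unordered participant pairs to the set of topics they may discuss, and that topic extraction from the message has already been done. The representation and the names `allowed` and `topics_to_flag` are hypothetical, not taken from the patent.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical representation: each edge (an unordered pair of participants)
# carries the confidential topics those two participants may discuss.
Graph = Dict[FrozenSet[str], Set[str]]


def allowed(graph: Graph, a: str, b: str, topic: str) -> bool:
    """True if an edge between a and b permits the given topic."""
    return topic in graph.get(frozenset((a, b)), set())


def topics_to_flag(graph: Graph,
                   existing: Iterable[str],
                   newcomer: str,
                   topics: Iterable[str]) -> List[str]:
    """Return topics lacking an edge from the newcomer to every existing participant."""
    existing = list(existing)
    return [t for t in topics
            if not all(allowed(graph, newcomer, p, t) for p in existing)]


if __name__ == "__main__":
    graph = {
        frozenset(("alice", "bob")): {"merger", "budget"},
        frozenset(("alice", "carol")): {"budget"},
        frozenset(("bob", "carol")): {"budget"},
    }
    print(topics_to_flag(graph, ["alice", "bob"], "carol", ["merger", "budget"]))
```

In this toy graph, "merger" is flagged when carol joins because no edge carrying that topic connects her to alice and bob, while "budget" passes.
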
AUTOMATED FEW-SHOT LEARNING TECHNIQUES FOR ARTIFICIAL INTELLIGENCE-BASED QUERY ANSWERING SYSTEMS

Publication number: 20230186147

Abstract: Methods, systems, and computer program products for automated few-shot learning techniques for artificial intelligence-based query answering systems are provided herein. A computer-implemented method includes obtaining multiple sets of queries and answers associated with one or more tables; determining a level of complexity attributed to at least a portion of the queries from the multiple sets of queries and answers; determining, based at least in part on the determined level of complexity attributed to the at least a portion of the queries, one or more new queries for use in training at least one artificial intelligence-based query answering system; facilitating annotation of the one or more new queries; training the at least one artificial intelligence-based query answering system using at least a portion of the one or more annotated new queries; and performing at least one automated action using the at least one trained artificial intelligence-based query answering system.

Type: Application

Filed: December 13, 2021

Publication date: June 15, 2023

Inventors: Jaydeep Sen, Saneem Ahmed Chemmengath, Vishwajeet Kumar, Samarth Bharadwaj

CONTEXTUAL DIALOGUE FRAMEWORK OVER DYNAMIC TABLES

Publication number: 20230177277

Abstract: Methods, systems, and computer program products for providing a contextual dialogue framework over dynamic tables are provided herein. A computer-implemented method includes maintaining a context space for a natural language conversation of a user, wherein the context space comprises a dynamic set of one or more tables used for processing at least one query of the natural language conversation; obtaining an additional table associated with an additional query of the natural language conversation; discovering one or more implicit links between the additional table and the dynamic set of tables; updating the context space with the one or more implicit links; and answering the additional query based at least in part on the updated context space.

Type: Application

Filed: December 6, 2021

Publication date: June 8, 2023

Inventors: Jaydeep Sen, Samarth Bharadwaj, Saneem Ahmed Chemmengath, Vishwajeet Kumar

AUTOMATICALLY GENERATING FACTSHEETS FOR ARTIFICIAL INTELLIGENCE-BASED QUESTION ANSWERING SYSTEMS

Publication number: 20230169363

Abstract: Methods, systems, and computer program products for automatically generating factsheets for artificial intelligence-based question answering systems are provided herein.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Inventors: Jaydeep Sen, Saneem Ahmed Chemmengath, Vishwajeet Kumar, Samarth Bharadwaj

SECURE ENSEMBLE TRAINING AND INFERENCE USING HETEROGENEOUS PRIVATE MACHINE LEARNING MODELS

Publication number: 20230058219

Abstract: One embodiment provides a method, including: receiving a query from a user; providing the query to data owners, wherein each of the data owners has a local machine learning model and wherein the plurality of data owners train a meta-model; secret sharing model output from the data owners between the other data owners, wherein the model output comprises an output responsive to the query computed using the local machine learning model; receiving, from each of the plurality of data owners, a set of meta-features corresponding to the query; and generating a response to the query, wherein the generating comprises determining, by evaluating the meta-model using the set of meta-features received from each of the plurality of data owners, weights for outputs from the local machine learning models and aggregating the outputs in view of the weights.

Type: Application

Filed: August 18, 2021

Publication date: February 23, 2023

Inventors: Dhinakaran Vinayagamurthy, Sandeep Nishad, Harsh Chaudhari, Pankaj Satyanarayan Dayama, Saneem Ahmed Chemmengath

SELF-SUPERVISION IN TABLE QUESTION ANSWERING

Publication number: 20220309107

Abstract: Methods, systems, and computer program products for self-supervision in table question answering are provided herein. A computer-implemented method includes obtaining a table comprising a plurality of entries, wherein each entry corresponds to a particular column and particular row of the table; identifying one or more of the entries in the table that correspond to a target answer of a natural language query; generating an intermediate representation of the table comprising the rows corresponding to the identified one or more entries, wherein the intermediate representation masks each of the identified one or more entries; and generating a set of natural language question and answer pairs based on the intermediate representation.

Type: Application

Filed: March 29, 2021

Publication date: September 29, 2022

Inventors: Jaydeep Sen, Saneem Ahmed Chemmengath, Samarth Bharadwaj, Vishwajeet Kumar, Mustafa Canim
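
The intermediate representation in the abstract above keeps the rows that contain target entries and masks those entries. A minimal sketch of that masking step follows. It assumes the table is a list of column-to-value dictionaries and that target entries are given as (row index, column name) pairs. The `[MASK]` token and the function name `intermediate_representation` are hypothetical, and the question generation step is not covered.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical placeholder token; the abstract does not name one.
MASK = "[MASK]"

Table = List[Dict[str, str]]


def intermediate_representation(table: Table,
                                targets: Set[Tuple[int, str]]) -> Table:
    """Keep only rows holding a target entry and mask each target cell."""
    rows = sorted({row_idx for row_idx, _ in targets})
    return [
        {col: (MASK if (row_idx, col) in targets else value)
         for col, value in table[row_idx].items()}
        for row_idx in rows
    ]


if __name__ == "__main__":
    table = [
        {"city": "Paris", "country": "France"},
        {"city": "Rome", "country": "Italy"},
    ]
    print(intermediate_representation(table, {(1, "country")}))
```

The input table is left unmodified; the function returns new row dictionaries, so the original values stay available as the answers paired with any generated questions.
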
CONFIDENTIAL INFORMATION IDENTIFICATION BASED UPON COMMUNICATION RECIPIENT

Publication number: 20210182420

Abstract: One embodiment provides a method, including: receiving an indication of an addition of a new participant in a textual communication between at least two existing participants; identifying at least one confidential topic contained within the textual communication by (i) parsing the textual communication and (ii) identifying at least one topic contained within the textual communication; the identifying comprising (i) accessing a confidentiality graph comprising (a) nodes representing participants and (b) edges representing confidential concepts that are acceptable discussion topics between participants connected by a corresponding edge and (ii) determining that an edge corresponding to the at least one confidential topic does not connect the new participant with both of the existing participants; and alerting one of the existing participants that the at least one confidential topic is included in the textual communication to be sent to the new participant.

Type: Application

Filed: December 11, 2019

Publication date: June 17, 2021

Inventors: Pooja Aggarwal, Prateeti Mohapatra, Saneem Ahmed Chemmengath, Kuntal Dey