Patents by Inventor Diptikalyan Saha
Diptikalyan Saha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
LOCALIZATION-BASED TEST GENERATION FOR INDIVIDUAL FAIRNESS TESTING OF ARTIFICIAL INTELLIGENCE MODELS
Publication number: 20220343179
Abstract: Methods, systems, and computer program products for localization-based test generation for individual fairness testing of AI models are provided herein. A computer-implemented method includes obtaining at least one artificial intelligence model and training data related to the at least one artificial intelligence model; identifying one or more boundary regions associated with the at least one artificial intelligence model based at least in part on results of processing at least a portion of the training data using the at least one artificial model; generating, in accordance with at least one of the one or more identified boundary regions, one or more synthetic data points for inclusion with the training data; and executing one or more fairness tests on the at least one artificial intelligence model using at least a portion of the one or more generated synthetic data points and at least a portion of the training data.
Type: Application
Filed: April 26, 2021
Publication date: October 27, 2022
Inventors: Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans
-
Publication number: 20220335217
Abstract: Methods, systems, and computer program products for detecting contextual bias in text are provided herein. A computer-implemented method includes identifying, by a machine learning network, a protected attribute in one or more data samples; processing the identified data samples using a first sub-network of the machine learning network, wherein the first sub-network is configured to determine a plurality of contexts of the protected attribute across the identified data samples; determining an impact of each of the plurality of contexts on a second sub-network of the machine learning network, wherein the second sub-network of the machine learning network is configured to classify a given data sample into one of a plurality of classes; and adjusting the second sub-network of the machine learning to account for the impact of at least one of the plurality of contexts on the second sub-network.
Type: Application
Filed: April 19, 2021
Publication date: October 20, 2022
Inventors: Naveen Panwar, Nishtha Madaan, Deepak Vijaykeerthy, Pranay Kumar Lohia, Diptikalyan Saha
-
Patent number: 11475331
Abstract: A source of bias identification (SoBI) tool is provided that identifies sources of bias in a dataset. A bias detection operation is performed on results of a computer model, based on an input dataset, to generate groupings of values for a protected attribute corresponding to a detected bias in the operation of the computer model. The SoBI tool generates a plurality of sub-groups for each grouping of values. Each sub-group comprises an individual value, or a sub-range, for the protected attribute. The SoBI tool analyzes each of the sub-groups in the plurality of sub-groups, based on at least one source of bias identification criterion, to identify one or more sources of bias in the input dataset. The SoBI tool outputs a bias notification to an authorized computing device specifying the one or more sources of bias in the input dataset.
Type: Grant
Filed: June 25, 2020
Date of Patent: October 18, 2022
Assignee: International Business Machines Corporation
Inventors: Manish Anand Bhide, Pranay Kumar Lohia, Diptikalyan Saha, Madhavi Katari
-
Patent number: 11455554
Abstract: Methods, systems, and computer program products for improving trustworthiness of artificial intelligence models in presence of anomalous data are provided herein. A method includes obtaining a machine learning model and a set of training data; determining one or more anomalous data points in said set of training data; for a given one of said anomalous data points, identifying attributes that decrease confidence with respect to at least one output of said machine learning model; determining that a root cause of said decreased confidence corresponds to one of: a class imbalance issue related to said at least one attribute, a confused class issue related to said at least one attribute, a low density issue related to said at least one attribute, and an adversarial issue related to said at least one attribute; and performing step(s) to improve said confidence based at least in part on said determined root cause.
Type: Grant
Filed: November 25, 2019
Date of Patent: September 27, 2022
Assignee: International Business Machines Corporation
Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Rema Ananthanarayanan, Samiulla Zakir Hussain Shaikh, Sandeep Hans
-
Publication number: 20220261535
Abstract: Methods, systems, and computer program products for automatically modifying responses from generative models using artificial intelligence techniques are provided herein. A computer-implemented method includes obtaining data pertaining to at least one conversation involving at least one automated conversation exchange software program and at least one user; identifying, among words proposed by the at least one automated conversation exchange software program in connection with the at least one conversation, words qualifying as belonging to one or more predetermined categories by processing the obtained data using artificial intelligence techniques; determining, by processing the identified words and at least one word-based data source, one or more alternate words; modifying at least a portion of the proposed words by replacing at least a portion of the identified words with at least a portion of the one or more alternate words; and performing at least one automated action based on the modifying.
Type: Application
Filed: February 18, 2021
Publication date: August 18, 2022
Inventors: Nishtha Madaan, Naveen Panwar, Deepak Vijaykeerthy, Pranay Kumar Lohia, Diptikalyan Saha
-
Publication number: 20220237415
Abstract: Methods, systems, and computer program products for priority-based, accuracy-controlled individual fairness of unstructured text are provided herein. A method includes identifying one or more samples in a set of data used to train a machine learning model having at least one attribute; generating counterfactual samples for each of the one or more identified samples; calculating scores for the one or more identified samples based at least in part on output of the machine learning model with respect to the counterfactual samples, wherein the scores indicate a relative level of bias between the one or more identified samples corresponding to the at least one attribute; creating an enhanced set of data at least in part by supplementing at least a portion of the identified samples with the corresponding counterfactual samples based on the calculated scores; and training the machine learning model using the enhanced set of data.
Type: Application
Filed: January 28, 2021
Publication date: July 28, 2022
Inventors: Pranay Kumar Lohia, Deepak Vijaykeerthy, Diptikalyan Saha, Nishtha Madaan, Naveen Panwar
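Illustrative sketch (not taken from the application): the counterfactual scoring and augmentation steps summarized in the abstract above might look roughly like the following. The swap table, the scoring rule (absolute change in model output), the threshold, and the names `counterfactual`, `bias_scores`, `augment`, and `toy_model` are all assumptions made for illustration; the priority and accuracy-control aspects named in the title are not modeled here.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical swap table for one protected attribute (illustration only).
SWAPS: Dict[str, str] = {"he": "she", "she": "he"}


def counterfactual(text: str) -> str:
    """Build a counterfactual by swapping protected-attribute tokens."""
    return " ".join(SWAPS.get(tok, tok) for tok in text.lower().split())


def bias_scores(
    samples: List[str], model: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each sample by how much the model output moves on its counterfactual."""
    return [(s, abs(model(s) - model(counterfactual(s)))) for s in samples]


def augment(
    samples: List[str], model: Callable[[str], float], threshold: float
) -> List[str]:
    """Append the counterfactual of every sample whose score exceeds the threshold."""
    enhanced = list(samples)
    for sample, score in bias_scores(samples, model):
        if score > threshold:
            enhanced.append(counterfactual(sample))
    return enhanced


def toy_model(text: str) -> float:
    """Stand-in classifier that is deliberately biased toward the token 'he'."""
    return 0.9 if "he" in text.lower().split() else 0.4
```

In this sketch the enhanced list would then be passed to whatever training routine the model uses; that step is omitted because it depends entirely on the model.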
-
Publication number: 20220237074
Abstract: A system, computer program product, and method are presented for providing replacement data for data in a time series data stream that has issues indicative of errors, where the data issues and the replacement data are related to one or more KPIs. The method includes determining one or more predicted replacement values for potentially erroneous data instances in the time series data stream. The method further includes resolving the potentially erroneous data instances with one predicted replacement value of the one or more predicted replacement values in the time series data stream.
Type: Application
Filed: March 29, 2022
Publication date: July 28, 2022
Inventors: Vitobha Munigala, Diptikalyan Saha, Sattwati Kundu, Geetha Adinarayan
-
Patent number: 11379347
Abstract: Methods, systems and computer program products for automated test case generation are provided herein. A computer-implemented method includes selecting sample input data as a test case for a system under test, executing the test case on the system under test to obtain a result, and applying the result to a local explainer function to obtain at least a portion of a corresponding decision tree. The method further includes determining at least one path constraint from the decision tree, solving the path constraint to obtain a solution, and generating at least one other test case for the system under test based at least in part on the solution of the path constraint. The steps of the method are illustratively repeated in each of one or more additional iterations until at least one designated stopping criterion is met. The resulting test cases form a test suite for testing of a deep neural network (DNN) or other system.
Type: Grant
Filed: December 28, 2020
Date of Patent: July 5, 2022
Assignee: International Business Machines Corporation
Inventors: Diptikalyan Saha, Aniya Aggarwal, Pranay Lohia, Kuntal Dey
-
Patent number: 11321304
Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describe the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 3, 2022
Assignee: International Business Machines Corporation
Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
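Illustrative sketch (not taken from the patent): constraint-based anomaly detection with per-anomaly explanations, as summarized in the abstract above, can be approximated as follows. The abstract describes the repository as a knowledge graph; this sketch substitutes a plain nested dictionary keyed by domain and attribute. The `REPOSITORY` contents, the "healthcare" domain, and the function name `detect_anomalies` are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# A constraint pairs a human-readable description with a predicate.
Constraint = Tuple[str, Callable[[Any], bool]]

# Hypothetical domain-indexed repository: domain -> attribute -> constraints.
# (A nested dict stands in for the knowledge graph named in the abstract.)
REPOSITORY: Dict[str, Dict[str, List[Constraint]]] = {
    "healthcare": {
        "age": [("age must be between 0 and 120", lambda v: 0 <= v <= 120)],
        "heart_rate": [("heart_rate must be positive", lambda v: v > 0)],
    }
}


def detect_anomalies(
    rows: List[Dict[str, Any]], domain: str
) -> List[Tuple[int, str]]:
    """Return (row index, explanation) for every violated constraint."""
    constraints = REPOSITORY.get(domain, {})
    findings: List[Tuple[int, str]] = []
    for index, row in enumerate(rows):
        for attribute, rules in constraints.items():
            if attribute not in row:
                continue  # no value to check for this attribute
            for description, check in rules:
                if not check(row[attribute]):
                    findings.append(
                        (index, f"{attribute}={row[attribute]!r} violates '{description}'")
                    )
    return findings
```

Each finding names the attribute and the violated constraint, which is the kind of explanation the abstract refers to; drift detection is not covered by this sketch.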
-
Patent number: 11314584
Abstract: A system, computer program product, and method are presented for providing confidence values for replacement data for data that has issues indicative of errors, where the data issues, the replacement data, and confidence values are related to one or more KPIs. The method includes identifying one or more potentially erroneous data instances and determining one or more predicted replacement values for the potentially erroneous data instances. The method further includes determining a confidence value for each predicted replacement value and resolving the one or more potentially erroneous data instances with one predicted replacement value of the one or more predicted replacement values. The method also includes generating an explanatory basis for the resolution of the one or more potentially erroneous data instances.
Type: Grant
Filed: November 25, 2020
Date of Patent: April 26, 2022
Assignee: International Business Machines Corporation
Inventors: Vitobha Munigala, Diptikalyan Saha, Sattwati Kundu, Geetha Adinarayan
-
Patent number: 11302096
Abstract: Methods, systems, and computer program products for determining model-related bias associated with training data are provided herein. A computer-implemented method includes obtaining, via execution of a first model, class designations attributed to data points used to train the first model; identifying any of the data points associated with an inaccurate class designation and/or a low-confidence class designation; training a second model using the data points from the dataset, but excluding the identified data points; determining bias related to at least a portion of those data points used to train the second model by: modifying one or more of the data points used to train the second model; executing the first model using the modified data points; and identifying a change to one or more class designations attributed to the modified data points as compared to before the modifying; and outputting identifying information pertaining to the determined bias.
Type: Grant
Filed: November 21, 2019
Date of Patent: April 12, 2022
Assignee: International Business Machines Corporation
Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Manish Anand Bhide, Sameep Mehta
-
Patent number: 11294907
Abstract: One embodiment provides a method, including: receiving a query from a user; identifying that a desired definition of the at least one term is unknown, by determining that the at least one term does not map to a term having a known definition; receiving the definition of the at least one term from the user; adding the definition to a domain grammar comprising (i) domain-specific terminology and (ii) definitions corresponding to the terms within the domain grammar, wherein the adding comprises (a) extracting expressions from the requested definition and (b) adding, for the at least one term, the expressions into a structured format within the domain grammar; combining (iii) the requested definition and (iv) terms from the parsed query having previously known definitions into a complete query; and providing a response to the query by executing the complete query on a knowledge store.
Type: Grant
Filed: March 5, 2020
Date of Patent: April 5, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jaydeep Sen, Ashish Mittal, Diptikalyan Saha, Karthik Sankaranarayanan
-
Publication number: 20220083897
Abstract: A method, system, and computer program product for explaining predictions made by black box time series models. The method may include identifying a black box time series model. The method may also include predicting one or more time instances using the black box time series model. The method may also include selecting a predicted time instance from the predicted data. The method may also include receiving training data for the black box time series model. The method may also include generating a set of white box time series models similar to the black box time series model. The method may also include selecting a preferred white box time series model. The method may also include analyzing behavior of the preferred white box time series model. The method may also include generating an explanation illustrating why the black box time series model forecasted the predicted time instance.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Inventors: Diptikalyan Saha, Philips George John, Vitobha Munigala
-
Publication number: 20210406712
Abstract: A source of bias identification (SoBI) tool is provided that identifies sources of bias in a dataset. A bias detection operation is performed on results of a computer model, based on an input dataset, to generate groupings of values for a protected attribute corresponding to a detected bias in the operation of the computer model. The SoBI tool generates a plurality of sub-groups for each grouping of values. Each sub-group comprises an individual value, or a sub-range, for the protected attribute. The SoBI tool analyzes each of the sub-groups in the plurality of sub-groups, based on at least one source of bias identification criterion, to identify one or more sources of bias in the input dataset. The SoBI tool outputs a bias notification to an authorized computing device specifying the one or more sources of bias in the input dataset.
Type: Application
Filed: June 25, 2020
Publication date: December 30, 2021
Inventors: Manish Anand Bhide, Pranay Kumar Lohia, Diptikalyan Saha, Madhavi Katari
-
Patent number: 11200222
Abstract: Embodiments are disclosed for correcting a natural language interface database (NLIDB) system. The techniques include receiving feedback indicating that an answer provided in response to a question for an NLIDB system is inaccurate. The techniques further include finding an ontology element for a datastore of the NLIDB system that matches to the feedback. The techniques also include selecting candidate annotations for the NLIDB system based on the ontology element and a data type of the ontology element. Additionally, the techniques include generating a question-answer (QA) pair for each of the candidate annotations. Further, the techniques include adding one of the candidate annotations to annotations for a natural language query (NLQ) engine of the NLIDB system based on a client verification of the QA pair.
Type: Grant
Filed: April 24, 2019
Date of Patent: December 14, 2021
Assignee: International Business Machines Corporation
Inventors: Jaydeep Sen, Diptikalyan Saha, Karthik Sankaranarayanan, Ashish Mittal, Manasa Jammi
-
Patent number: 11163960
Abstract: Techniques for the automatic semantic analysis and comparison of chatbot capabilities are disclosed. A first chatbot specification associated with a first chatbot is obtained that includes a first plurality of characteristics arranged in a plurality of categories. A second chatbot specification associated with a second chatbot is obtained that includes a second plurality of characteristics arranged in the plurality of categories. One or more differences between the first plurality of characteristics and the second plurality of characteristics for each of the plurality of categories are identified based at least in part on the first plurality of characteristics and the second plurality of characteristics. A natural language expression corresponding to the identified one or more differences is generated and presented to a user via a graphical user interface.
Type: Grant
Filed: April 18, 2019
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Diptikalyan Saha, Rema Ananthanarayanan
-
Publication number: 20210286945
Abstract: According to one embodiment of the present invention, a system for modifying content associated with an item comprises at least one processor. Features of interest of the item to a plurality of different groups are determined based on user comments produced by members of the plurality of different groups. The members within each group have a common characteristic. The features of interest to each group within the content associated with the item are identified, and the content associated with the item is modified by balancing the features of interest to the plurality of different groups within the content associated with the item. Embodiments of the present invention further include a method and computer program product for modifying content associated with an item in substantially the same manner described above.
Type: Application
Filed: March 13, 2020
Publication date: September 16, 2021
Inventors: Seema Nagar, Kuntal Dey, Nishtha Madaan, Manish Anand Bhide, Sameep Mehta, Diptikalyan Saha
-
Publication number: 20210279243
Abstract: One embodiment provides a method, including: receiving a query from a user; identifying that a desired definition of the at least one term is unknown, by determining that the at least one term does not map to a term having a known definition; receiving the definition of the at least one term from the user; adding the definition to a domain grammar comprising (i) domain-specific terminology and (ii) definitions corresponding to the terms within the domain grammar, wherein the adding comprises (a) extracting expressions from the requested definition and (b) adding, for the at least one term, the expressions into a structured format within the domain grammar; combining (iii) the requested definition and (iv) terms from the parsed query having previously known definitions into a complete query; and providing a response to the query by executing the complete query on a knowledge store.
Type: Application
Filed: March 5, 2020
Publication date: September 9, 2021
Inventors: Jaydeep Sen, Ashish Mittal, Diptikalyan Saha, Karthik Sankaranarayanan
-
Publication number: 20210279607
Abstract: A computer-implemented method according to one embodiment includes identifying an occurrence of accuracy drift by a trained model; identifying data associated with the accuracy drift, utilizing a drift detection model (DDM) constructed for the trained model; applying the data associated with the accuracy drift to a decision tree to determine a feature space and specific subset of the data causing the accuracy drift; analyzing a distribution of features within the feature space for the specific subset of the data causing the accuracy drift to determine specific features of the data causing the accuracy drift; and returning the specific features of the data causing the accuracy drift.
Type: Application
Filed: March 9, 2020
Publication date: September 9, 2021
Inventors: Manish Anand Bhide, Pranay Kumar Lohia, Diptikalyan Saha, Madhavi Katari
-
Publication number: 20210248455
Abstract: Methods, systems, and computer program products for generating explanations for a semantic parser are provided herein. A computer-implemented method includes providing to a generative model (i) at least one query and (ii) a context of at least one dataset applicable to the at least one query, wherein the generative model generates a plurality of perturbations for the at least one input query based on the context; providing the plurality of perturbations as inputs to a context aware sequence-to-sequence model, thereby obtaining a plurality of outputs; and generating, for (i) an additional query provided as input to the context aware sequence-to-sequence model and (ii) a context applicable to the additional query, an explanation indicative of one or more parts of the additional query that contributes to an output corresponding to the additional query, based at least in part on the plurality of outputs corresponding to the perturbations.
Type: Application
Filed: February 6, 2020
Publication date: August 12, 2021
Inventors: Rachamalla Anirudh Reddy, Pranay Kumar Lohia, Samiulla Zakir Hussain Shaikh, Diptikalyan Saha, Sameep Mehta