Patents by Inventor Sandeep Hans
Sandeep Hans has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12045317
Abstract: An example system includes a processor to receive a set of features, a set of relations between the features, and a set of target features. Each of the target features is associated with a number of the relations. The processor can generate a hypergraph based on the features and the relations. The processor also can select a subset of features based on a transitive closure of the hypergraph for each of the target features. The processor can transmit the selected subset of features.
Type: Grant
Filed: November 23, 2021
Date of Patent: July 23, 2024
Assignee: International Business Machines Corporation
Inventors: Eliran Roffe, Sandeep Hans, Eitan Daniel Farchi, Diptikalyan Saha
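As a rough illustration of the idea in this abstract, the sketch below treats each relation as a hyperedge (a set of feature names) and collects every feature reachable from a target through shared hyperedges. The feature names, the `closure` and `select_features` helpers, and the undirected reading of a relation are all assumptions made for the example; the patent does not specify them.

```python
from collections import deque

def closure(relations, target):
    """Return every feature reachable from `target` through shared relations.

    Each relation is modelled as a hyperedge: a set of feature names.
    This undirected reading is an assumption of the sketch.
    """
    seen = {target}
    queue = deque([target])
    while queue:
        feature = queue.popleft()
        for edge in relations:
            if feature in edge:
                for other in edge:
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return seen

def select_features(relations, targets):
    """Union of the closures of all target features."""
    selected = set()
    for target in targets:
        selected |= closure(relations, target)
    return selected

# Hypothetical features and relations, for illustration only.
relations = [
    {"age", "income"},
    {"income", "credit_score", "loan"},
    {"zip", "region"},
]
selected = select_features(relations, ["loan"])
```

In this toy input, `zip` and `region` are never selected for the target `loan` because no chain of relations connects them to it.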
-
Patent number: 12026087
Abstract: Methods, systems, and computer program products for automatically testing AI models in connection with enterprise-related properties are provided herein.
Type: Grant
Filed: November 29, 2021
Date of Patent: July 2, 2024
Assignee: International Business Machines Corporation
Inventors: Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal
-
Patent number: 11886385
Abstract: An embodiment for identifying and sorting duplicate datasets within a large pool of heterogeneous datasets may include receiving a plurality of heterogeneous datasets. The embodiment may automatically compare schema information and metadata within each of the received plurality of heterogeneous datasets to generate name-based similarity scores for each dataset. The embodiment may also automatically compare data distribution information within each of the received plurality of heterogeneous datasets to generate a plurality of data distribution similarity scores for each heterogeneous dataset. The embodiment may further include automatically calculating an overall distance metric using the name-based similarity scores and plurality of data distribution similarity scores. The embodiment may also include, based on the calculated overall distance metric, automatically generating distance graphs that identify clusters of similar datasets and illustrate inferred lineage for the clusters of similar datasets.
Type: Grant
Filed: June 2, 2022
Date of Patent: January 30, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Praduemn K. Goyal, Sandeep Hans, Samiulla Zakir Hussain Shaikh, Diptikalyan Saha
-
Publication number: 20230394011
Abstract: An embodiment for identifying and sorting duplicate datasets within a large pool of heterogeneous datasets may include receiving a plurality of heterogeneous datasets. The embodiment may automatically compare schema information and metadata within each of the received plurality of heterogeneous datasets to generate name-based similarity scores for each dataset. The embodiment may also automatically compare data distribution information within each of the received plurality of heterogeneous datasets to generate a plurality of data distribution similarity scores for each heterogeneous dataset. The embodiment may further include automatically calculating an overall distance metric using the name-based similarity scores and plurality of data distribution similarity scores. The embodiment may also include, based on the calculated overall distance metric, automatically generating distance graphs that identify clusters of similar datasets and illustrate inferred lineage for the clusters of similar datasets.
Type: Application
Filed: June 2, 2022
Publication date: December 7, 2023
Inventors: Praduemn K. Goyal, Sandeep Hans, Samiulla Zakir Hussain Shaikh, Diptikalyan Saha
-
Publication number: 20230168994
Abstract: Methods, systems, and computer program products for automatically testing AI models in connection with enterprise-related properties are provided herein.
Type: Application
Filed: November 29, 2021
Publication date: June 1, 2023
Inventors: Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal
-
Publication number: 20230025731
Abstract: A computer-implemented method comprising, automatically: analyzing a machine learning dataset which comprises multiple datapoints, to deduce constraints on features of the datapoints; generating a first set of CSP (Constraint Satisfaction Problem) rules expressing the constraints; based on a machine learning model which was trained on the dataset, generating a second set of CSP rules that define one or more perturbation candidates among the features of one of the datapoints; formulating a CSP based on the first and second sets of CSP rules; solving the formulated CSP using a solver; and using the solution of the CSP as a counterfactual explanation of a prediction made by the machine learning model with respect to the one datapoint.
Type: Application
Filed: July 19, 2021
Publication date: January 26, 2023
Inventors: Michael Vinov, Oleg Blinder, Diptikalyan Saha, Sandeep Hans, Aniya Aggarwal, Omer Yehuda Boehm, Eyal Bin
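The general shape of this method can be sketched on a toy problem. In the example below, the `domains` table plays the role of constraints deduced from a dataset, the `mutable` list names the perturbation candidates, and an exhaustive search over the small finite domains stands in for a real CSP solver. The model, the feature names, and the absolute-difference cost are hypothetical and chosen only for illustration; none of them come from the patent.

```python
from itertools import product

def model(point):
    """Hypothetical classifier: approve when income - 2 * debt >= 50."""
    return point["income"] - 2 * point["debt"] >= 50

# Assumed feature constraints, standing in for rules deduced from a dataset.
domains = {"income": range(0, 101, 10), "debt": range(0, 51, 10)}

def counterfactual(point, mutable):
    """Find the closest in-domain point whose prediction differs.

    Brute-force enumeration replaces the CSP solver in this sketch.
    Returns None when no candidate flips the prediction.
    """
    original = model(point)
    names = list(mutable)
    best = None
    for values in product(*(domains[name] for name in names)):
        candidate = dict(point)
        candidate.update(zip(names, values))
        if model(candidate) == original:
            continue  # prediction unchanged, not a counterfactual
        cost = sum(abs(candidate[name] - point[name]) for name in names)
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return None if best is None else best[1]

point = {"income": 60, "debt": 20}
cf = counterfactual(point, ["income", "debt"])
```

The returned point `cf` satisfies the domain constraints, changes only the features listed as mutable, and receives a different prediction from `point`, which is what makes it usable as a counterfactual explanation.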
-
LOCALIZATION-BASED TEST GENERATION FOR INDIVIDUAL FAIRNESS TESTING OF ARTIFICIAL INTELLIGENCE MODELS
Publication number: 20220343179
Abstract: Methods, systems, and computer program products for localization-based test generation for individual fairness testing of AI models are provided herein. A computer-implemented method includes obtaining at least one artificial intelligence model and training data related to the at least one artificial intelligence model; identifying one or more boundary regions associated with the at least one artificial intelligence model based at least in part on results of processing at least a portion of the training data using the at least one artificial model; generating, in accordance with at least one of the one or more identified boundary regions, one or more synthetic data points for inclusion with the training data; and executing one or more fairness tests on the at least one artificial intelligence model using at least a portion of the one or more generated synthetic data points and at least a portion of the training data.
Type: Application
Filed: April 26, 2021
Publication date: October 27, 2022
Inventors: Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans
-
Patent number: 11455554
Abstract: Methods, systems, and computer program products for improving trustworthiness of artificial intelligence models in presence of anomalous data are provided herein. A method includes obtaining a machine learning model and a set of training data; determining one or more anomalous data points in said set of training data; for a given one of said anomalous data points, identifying attributes that decrease confidence with respect to at least one output of said machine learning model; determining that a root cause of said decreased confidence corresponds to one of: a class imbalance issue related to said at least one attribute, a confused class issue related to said at least one attribute, a low density issue related to said at least one attribute, and an adversarial issue related to said at least one attribute; and performing step(s) to improve said confidence based at least in part on said determined root cause.
Type: Grant
Filed: November 25, 2019
Date of Patent: September 27, 2022
Assignee: International Business Machines Corporation
Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Rema Ananthanarayanan, Samiulla Zakir Hussain Shaikh, Sandeep Hans
-
Patent number: 11321304
Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describes the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
Type: Grant
Filed: September 27, 2019
Date of Patent: May 3, 2022
Assignee: International Business Machines Corporation
Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
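A minimal sketch of constraint-based anomaly detection with explanations might look like the following. A plain nested dictionary stands in for the knowledge-graph repository described in the abstract, and the domain name, attributes, and rules are hypothetical values invented for the example.

```python
# Hypothetical domain-indexed repository: domain -> attribute -> rules.
# Each rule pairs a human-readable description with a predicate.
REPOSITORY = {
    "healthcare": {
        "age": [("age must be between 0 and 120", lambda v: 0 <= v <= 120)],
        "heart_rate": [("heart_rate must be positive", lambda v: v > 0)],
    },
}

def detect_anomalies(rows, domain):
    """Return (row index, attribute, explanation) for each violated rule."""
    constraints = REPOSITORY.get(domain, {})
    findings = []
    for index, row in enumerate(rows):
        for attribute, rules in constraints.items():
            if attribute not in row:
                continue  # no value to check for this attribute
            for description, check in rules:
                if not check(row[attribute]):
                    findings.append((index, attribute, description))
    return findings

rows = [
    {"age": 34, "heart_rate": 72},
    {"age": 150, "heart_rate": 80},
    {"age": 51, "heart_rate": -5},
]
findings = detect_anomalies(rows, "healthcare")
```

Because every finding carries the description of the rule it violated, the output doubles as the explanation for why a row was flagged.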
-
Publication number: 20210158183
Abstract: Methods, systems, and computer program products for improving trustworthiness of artificial intelligence models in presence of anomalous data are provided herein. A method includes obtaining a machine learning model and a set of training data; determining one or more anomalous data points in said set of training data; for a given one of said anomalous data points, identifying attributes that decrease confidence with respect to at least one output of said machine learning model; determining that a root cause of said decreased confidence corresponds to one of: a class imbalance issue related to said at least one attribute, a confused class issue related to said at least one attribute, a low density issue related to said at least one attribute, and an adversarial issue related to said at least one attribute; and performing step(s) to improve said confidence based at least in part on said determined root cause.
Type: Application
Filed: November 25, 2019
Publication date: May 27, 2021
Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Rema Ananthanarayanan, Samiulla Zakir Hussain Shaikh, Sandeep Hans
-
Publication number: 20210097052
Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describes the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
Type: Application
Filed: September 27, 2019
Publication date: April 1, 2021
Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
-
Patent number: 9367586
Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving, via the user interface, a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database and an address of a location of executable code corresponding to the data validation rule. The data validation service also includes transmitting the configuration file and remote methods to the computer device over the network. The remote methods are configured to execute the data validation rule with respect to the data and compile results of the execution.
Type: Grant
Filed: July 31, 2012
Date of Patent: June 14, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni
-
Publication number: 20130006692
Abstract: Methods and arrangements for generating process recommendations. Customer information is assimilated, the customer information including a number of present customers. An efficiency matrix is assimilated, which indicates efficiency for each of a plurality of actual resources with respect to each of a plurality of services. At least one hypothetical resource is created, which incorporates a best efficiency from among the actual resources with respect to each of the plurality of services, and a scheduling policy is assimilated. A customer queue is generated with respect to each hypothetical resource and in accordance with the at least one scheduling policy. Each hypothetical resource is mapped to each actual resource and a minimum parameter increase from among pairs comprising a hypothetical resource and an actual resource is determined.
Type: Application
Filed: June 28, 2011
Publication date: January 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sandeep Hans, Sameep Mehta, Gyana Ranjan Parija, Pimplikar Rakesh Rameshrao
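The "hypothetical resource" step in this abstract can be pictured with a small sketch. It assumes the efficiency matrix is a mapping from resource to per-service scores and that a higher score means better efficiency; the resource names, service names, and numbers are invented for the example, and the queueing and mapping steps are not modelled.

```python
# Hypothetical efficiency matrix: resource -> service -> efficiency score.
# Assumption of this sketch: a higher score is better.
efficiency = {
    "alice": {"billing": 0.9, "support": 0.6},
    "bob": {"billing": 0.7, "support": 0.8},
}

def hypothetical_resource(matrix):
    """Build a resource holding the best score seen for every service."""
    services = set()
    for scores in matrix.values():
        services.update(scores)
    return {
        service: max(
            scores[service] for scores in matrix.values() if service in scores
        )
        for service in sorted(services)
    }

best = hypothetical_resource(efficiency)
```

Here the hypothetical resource combines the billing score of one actual resource with the support score of the other, giving an upper bound against which each actual resource can be compared.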
-
Publication number: 20120310904
Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database. The service also includes transmitting the configuration file and a thin client application to the computer device over a network, the thin client application configured to read the configuration file and pull data from the database. The service further includes receiving the data from the computer device via the network, performing the data validation by executing the data validation rule with respect to the data, and compiling results of the data validation and providing the results to the computer device.
Type: Application
Filed: June 1, 2011
Publication date: December 6, 2012
Applicant: INTERNATIONAL BUSINESS MACHINE CORPORATION
Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni
-
Publication number: 20120310905
Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving, via the user interface, a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database and an address of a location of executable code corresponding to the data validation rule. The data validation service also includes transmitting the configuration file and remote methods to the computer device over the network. The remote methods are configured to execute the data validation rule with respect to the data and compile results of the execution.
Type: Application
Filed: July 31, 2012
Publication date: December 6, 2012
Applicant: INTERNATIONAL BUSINESS MACHINE CORPORATION
Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni