Patents by Inventor Sandeep Hans

Sandeep Hans has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12045317
    Abstract: An example system includes a processor to receive a set of features, a set of relations between the features, and a set of target features. Each of the target features is associated with a number of the relations. The processor can generate a hypergraph based on the features and the relations. The processor also can select a subset of features based on a transitive closure of the hypergraph for each of the target features. The processor can transmit the selected subset of features.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: July 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Eliran Roffe, Sandeep Hans, Eitan Daniel Farchi, Diptikalyan Saha
  • Patent number: 12026087
    Abstract: Methods, systems, and computer program products for automatically testing AI models in connection with enterprise-related properties are provided herein.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: July 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal
  • Patent number: 11886385
    Abstract: An embodiment for identifying and sorting duplicate datasets within a large pool of heterogeneous datasets may include receiving a plurality of heterogeneous datasets. The embodiment may automatically compare schema information and metadata within each of the received plurality of heterogeneous datasets to generate name-based similarity scores for each dataset. The embodiment may also automatically compare data distribution information within each of the received plurality of heterogeneous datasets to generate a plurality of data distribution similarity scores for each heterogeneous dataset. The embodiment may further include automatically calculating an overall distance metric using the name-based similarity scores and plurality of data distribution similarity scores. The embodiment may also include, based on the calculated overall distance metric, automatically generating distance graphs that identify clusters of similar datasets and illustrate inferred lineage for the clusters of similar datasets.
    Type: Grant
    Filed: June 2, 2022
    Date of Patent: January 30, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Praduemn K. Goyal, Sandeep Hans, Samiulla Zakir Hussain Shaikh, Diptikalyan Saha
  • Publication number: 20230394011
    Abstract: An embodiment for identifying and sorting duplicate datasets within a large pool of heterogeneous datasets may include receiving a plurality of heterogeneous datasets. The embodiment may automatically compare schema information and metadata within each of the received plurality of heterogeneous datasets to generate name-based similarity scores for each dataset. The embodiment may also automatically compare data distribution information within each of the received plurality of heterogeneous datasets to generate a plurality of data distribution similarity scores for each heterogeneous dataset. The embodiment may further include automatically calculating an overall distance metric using the name-based similarity scores and plurality of data distribution similarity scores. The embodiment may also include, based on the calculated overall distance metric, automatically generating distance graphs that identify clusters of similar datasets and illustrate inferred lineage for the clusters of similar datasets.
    Type: Application
    Filed: June 2, 2022
    Publication date: December 7, 2023
    Inventors: Praduemn K. Goyal, Sandeep Hans, Samiulla Zakir Hussain Shaikh, Diptikalyan Saha
  • Publication number: 20230168994
    Abstract: Methods, systems, and computer program products for automatically testing AI models in connection with enterprise-related properties are provided herein.
    Type: Application
    Filed: November 29, 2021
    Publication date: June 1, 2023
    Inventors: Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal
  • Publication number: 20230025731
    Abstract: A computer-implemented method comprising, automatically: analyzing a machine learning dataset which comprises multiple datapoints, to deduce constraints on features of the datapoints; generating a first set of CSP (Constraint Satisfaction Problem) rules expressing the constraints; based on a machine learning model which was trained on the dataset, generating a second set of CSP rules that define one or more perturbation candidates among the features of one of the datapoints; formulating a CSP based on the first and second sets of CSP rules; solving the formulated CSP using a solver; and using the solution of the CSP as a counterfactual explanation of a prediction made by the machine learning model with respect to the one datapoint.
    Type: Application
    Filed: July 19, 2021
    Publication date: January 26, 2023
    Inventors: Michael Vinov, Oleg Blinder, Diptikalyan Saha, Sandeep Hans, Aniya Aggarwal, Omer Yehuda Boehm, Eyal Bin
  • Publication number: 20220343179
    Abstract: Methods, systems, and computer program products for localization-based test generation for individual fairness testing of AI models are provided herein. A computer-implemented method includes obtaining at least one artificial intelligence model and training data related to the at least one artificial intelligence model; identifying one or more boundary regions associated with the at least one artificial intelligence model based at least in part on results of processing at least a portion of the training data using the at least one artificial intelligence model; generating, in accordance with at least one of the one or more identified boundary regions, one or more synthetic data points for inclusion with the training data; and executing one or more fairness tests on the at least one artificial intelligence model using at least a portion of the one or more generated synthetic data points and at least a portion of the training data.
    Type: Application
    Filed: April 26, 2021
    Publication date: October 27, 2022
    Inventors: Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans
  • Patent number: 11455554
    Abstract: Methods, systems, and computer program products for improving trustworthiness of artificial intelligence models in presence of anomalous data are provided herein. A method includes obtaining a machine learning model and a set of training data; determining one or more anomalous data points in said set of training data; for a given one of said anomalous data points, identifying attributes that decrease confidence with respect to at least one output of said machine learning model; determining that a root cause of said decreased confidence corresponds to one of: a class imbalance issue related to said at least one attribute, a confused class issue related to said at least one attribute, a low density issue related to said at least one attribute, and an adversarial issue related to said at least one attribute; and performing step(s) to improve said confidence based at least in part on said determined root cause.
    Type: Grant
    Filed: November 25, 2019
    Date of Patent: September 27, 2022
    Assignee: International Business Machines Corporation
    Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Rema Ananthanarayanan, Samiulla Zakir Hussain Shaikh, Sandeep Hans
  • Patent number: 11321304
    Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describes the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
  • Publication number: 20210158183
    Abstract: Methods, systems, and computer program products for improving trustworthiness of artificial intelligence models in presence of anomalous data are provided herein. A method includes obtaining a machine learning model and a set of training data; determining one or more anomalous data points in said set of training data; for a given one of said anomalous data points, identifying attributes that decrease confidence with respect to at least one output of said machine learning model; determining that a root cause of said decreased confidence corresponds to one of: a class imbalance issue related to said at least one attribute, a confused class issue related to said at least one attribute, a low density issue related to said at least one attribute, and an adversarial issue related to said at least one attribute; and performing step(s) to improve said confidence based at least in part on said determined root cause.
    Type: Application
    Filed: November 25, 2019
    Publication date: May 27, 2021
    Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Rema Ananthanarayanan, Samiulla Zakir Hussain Shaikh, Sandeep Hans
  • Publication number: 20210097052
    Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describes the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
    Type: Application
    Filed: September 27, 2019
    Publication date: April 1, 2021
    Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
  • Patent number: 9367586
    Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving, via the user interface, a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database and an address of a location of executable code corresponding to the data validation rule. The data validation service also includes transmitting the configuration file and remote methods to the computer device over the network. The remote methods are configured to execute the data validation rule with respect to the data and compile results of the execution.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: June 14, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni
  • Publication number: 20130006692
    Abstract: Methods and arrangements for generating process recommendations. Customer information is assimilated, the customer information including a number of present customers. An efficiency matrix is assimilated, which indicates efficiency for each of a plurality of actual resources with respect to each of a plurality of services. At least one hypothetical resource is created, which incorporates a best efficiency from among the actual resources with respect to each of the plurality of services, and a scheduling policy is assimilated. A customer queue is generated with respect to each hypothetical resource and in accordance with the at least one scheduling policy. Each hypothetical resource is mapped to each actual resource and a minimum parameter increase from among pairs comprising a hypothetical resource and an actual resource is determined.
    Type: Application
    Filed: June 28, 2011
    Publication date: January 3, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandeep Hans, Sameep Mehta, Gyana Ranjan Parija, Pimplikar Rakesh Rameshrao
  • Publication number: 20120310904
    Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database. The service also includes transmitting the configuration file and a thin client application to the computer device over a network, the thin client application configured to read the configuration file and pull data from the database. The service further includes receiving the data from the computer device via the network, performing the data validation by executing the data validation rule with respect to the data, and compiling results of the data validation and providing the results to the computer device.
    Type: Application
    Filed: June 1, 2011
    Publication date: December 6, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni
  • Publication number: 20120310905
    Abstract: A data validation service includes providing a user interface to a subscriber of the service via a computer device of the subscriber, receiving, via the user interface, a data validation rule specified by the subscriber and an address of a database subject to the data validation, and generating a configuration file that includes the address of the database and an address of a location of executable code corresponding to the data validation rule. The data validation service also includes transmitting the configuration file and remote methods to the computer device over the network. The remote methods are configured to execute the data validation rule with respect to the data and compile results of the execution.
    Type: Application
    Filed: July 31, 2012
    Publication date: December 6, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandeep Hans, Sameep Mehta, Soujanya Soni
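
As a rough illustration of the data validation abstract above (publication 20120310905), the following minimal Python sketch shows how a configuration that pairs a database address with the address of rule code might drive a validation run. It is not taken from the patent: the names `ValidationConfig`, `RULE_REGISTRY`, `run_validation`, and `fake_fetch`, the example addresses, and the in-memory rows are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical registry standing in for "a location of executable code".
# The key is an illustrative address, not something named in the abstract.
RULE_REGISTRY: Dict[str, Callable[[dict], bool]] = {
    "rules/non_negative_amount": lambda row: row.get("amount", 0) >= 0,
}


@dataclass
class ValidationConfig:
    """Assumed shape of the configuration file described in the abstract."""
    database_address: str  # address of the database subject to validation
    rule_address: str      # address of the code implementing the rule


def run_validation(
    config: ValidationConfig,
    fetch_rows: Callable[[str], List[dict]],
) -> dict:
    """Execute the configured rule against the data and compile the results."""
    rule = RULE_REGISTRY[config.rule_address]
    rows = fetch_rows(config.database_address)
    failures = [row for row in rows if not rule(row)]
    return {"checked": len(rows), "failed": len(failures), "failures": failures}


def fake_fetch(address: str) -> List[dict]:
    """Stand-in for a real database read; returns fixed illustrative rows."""
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}]


if __name__ == "__main__":
    config = ValidationConfig("db://example", "rules/non_negative_amount")
    print(run_validation(config, fake_fetch))
```

In this sketch the rule runs where the data lives and only the compiled summary is returned, which mirrors the abstract's description of remote methods executing the rule and compiling results; how the actual service locates and loads rule code is not specified in the listing.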