Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DOMAIN AWARE EXPLAINABLE ANOMALY AND DRIFT DETECTION FOR MULTI-VARIATE RAW DATA USING A CONSTRAINT REPOSITORY

Publication number: 20210097052

Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describe the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
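The constraint lookup and explanation steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the repository here is a plain nested dictionary rather than a knowledge graph, and `REPOSITORY`, `detect_anomalies`, the `healthcare` domain, and the sample constraints are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the domain-indexed constraint repository:
# domain -> attribute -> list of (description, predicate) constraints.
Constraint = Tuple[str, Callable[[float], bool]]
REPOSITORY: Dict[str, Dict[str, List[Constraint]]] = {
    "healthcare": {
        "age": [("age must be between 0 and 120", lambda v: 0 <= v <= 120)],
        "heart_rate": [("heart_rate must be positive", lambda v: v > 0)],
    }
}


def detect_anomalies(records: List[Dict[str, float]], domain: str) -> List[Tuple[int, str]]:
    """Return (row index, explanation) for every violated constraint."""
    constraints = REPOSITORY.get(domain, {})
    anomalies = []
    for i, record in enumerate(records):
        for attribute, value in record.items():
            for description, holds in constraints.get(attribute, []):
                if not holds(value):
                    # The explanation names the attribute and the violated constraint.
                    anomalies.append((i, f"{attribute}={value!r} violates: {description}"))
    return anomalies


records = [{"age": 34, "heart_rate": 72}, {"age": 150, "heart_rate": 80}]
for row, explanation in detect_anomalies(records, "healthcare"):
    print(row, explanation)
```

Each reported anomaly carries the attribute and the constraint it violated, which is the "explanation" in this simplified reading of the abstract.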

Type: Application

Filed: September 27, 2019

Publication date: April 1, 2021

Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta

Stolen machine learning model identification

Patent number: 10936704

Abstract: One embodiment provides a method, including: assigning a machine learning model signature to a machine learning model, wherein the machine learning model signature is generated using (i) data points and (ii) corresponding data labels from training data; receiving input comprising identification of a target machine learning model; acquiring a target signature for the target machine learning model by generating a signature for the target machine learning model using (i) data points from the assigned machine learning model signature and (ii) labels assigned to those data points by the target machine learning model; determining a stolen score by comparing the target signature to the machine learning model signature and identifying the number of data labels that match between the target signature and the machine learning model signature; and classifying the target machine learning model as stolen based upon the stolen score reaching a predetermined threshold.
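The stolen-score comparison described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: models are represented as plain callables, and `stolen_score`, `is_stolen`, the toy data, and the threshold of 3 are hypothetical and do not come from the patent.

```python
from typing import Callable, Sequence


def stolen_score(
    signature_points: Sequence[float],
    signature_labels: Sequence[int],
    target_model: Callable[[float], int],
) -> int:
    """Count how many signature labels the target model reproduces."""
    target_labels = [target_model(x) for x in signature_points]
    return sum(1 for own, other in zip(signature_labels, target_labels) if own == other)


def is_stolen(score: int, threshold: int) -> bool:
    """Classify the target as stolen once the score reaches the threshold."""
    return score >= threshold


# Toy signature: data points and the labels from the owner's training data.
points = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]

suspect = lambda x: int(x > 0.5)    # agrees with every signature label
unrelated = lambda x: int(x < 0.3)  # agrees with only one of them

for name, model in [("suspect", suspect), ("unrelated", unrelated)]:
    score = stolen_score(points, labels, model)
    print(name, score, is_stolen(score, threshold=3))
```

In practice the signature points would be chosen so that an independently trained model is unlikely to reproduce their labels; how they are chosen is outside this sketch.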

Type: Grant

Filed: February 21, 2018

Date of Patent: March 2, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sameep Mehta, Rakesh R. Pimplikar, Karthik Sankaranarayanan

TRAINING ARTIFICIAL INTELLIGENCE MODELS USING ACTIVE LEARNING

Publication number: 20210035014

Abstract: Aspects of the present invention provide an approach for reducing bias in active learning. In an embodiment, a data point is selected from a training dataset for a current training iteration while monitoring for data bias at each addition of data to a virtual training dataset. In addition, a machine learning model is examined for bias after adding the selected data point to the virtual training dataset. When data bias and/or model bias is detected, the data point is considered for potential label modification. The selected data point is modified and, if the raw value of the modified data point is within a predefined tolerance and within a bin of a desired class, the modified data point having a label of the target class is retained. Otherwise, it can be discarded.

Type: Application

Filed: July 31, 2019

Publication date: February 4, 2021

Inventors: Kuntal Dey, Sameep Mehta, Manish Anand Bhide

ANNOTATION TASK INSTRUCTION GENERATION

Publication number: 20210034698

Abstract: One embodiment provides a method, including: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task; assigning the subset to a plurality of annotators; obtaining (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; identifying improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying discrepancies made by the annotators in view of the response time; and generating a new set of instructions, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature.

Type: Application

Filed: July 31, 2019

Publication date: February 4, 2021

Inventors: Shashank Mujumdar, Nitin Gupta, Arvind Agarwal, Sameep Mehta

Generating a Framework for Prioritizing Machine Learning Model Offerings Via a Platform

Publication number: 20210012404

Abstract: Methods, systems, and computer program products for generating a framework for prioritizing machine learning model offerings via a platform are provided herein. A computer-implemented method includes processing, via a computing platform, a machine learning model input by a first user and metadata corresponding to the machine learning model input by the first user; automatically comparing, via the computing platform, the metadata corresponding to the machine learning model with metadata corresponding to one or more existing machine learning models stored by the computing platform; automatically calculating, via the computing platform, initial pricing information for the machine learning model based on the comparison; and outputting, via an interactive user interface of the computing platform, the machine learning model to one or more additional users for purchase in accordance with the calculated initial pricing information.

Type: Application

Filed: July 8, 2019

Publication date: January 14, 2021

Inventors: Kalapriya Kannan, Samiulla Zakir Hussain Shaikh, Pranay Kumar Lohia, Vijay Arya, Sameep Mehta

Information set purchase recommendations

Patent number: 10885571

Abstract: One embodiment provides a method, including: receiving, at a data service provider, a request from an information purchaser, wherein the request comprises (i) a budget identifying an amount of money to be spent on information and (ii) an objective function identifying a type of information that the information purchaser is requesting; accessing at least a subset of at least one information set of at least one information seller, wherein each of the at least one information sets comprises an information set available for purchase from the information seller; identifying whether at least one accessed information set fulfills the received request; and providing, if at least one accessed information set fulfills the received request, a recommendation of an information set for purchase by the information purchaser, wherein the provided recommendation comprises at least one of the identified information sets that fulfills the received request.

Type: Grant

Filed: May 16, 2018

Date of Patent: January 5, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Akshar Kaul, Manish Kesarwani, Gagandeep Singh, Sameep Mehta

ENCRYPTION SCHEME RECOMMENDATION

Publication number: 20200387518

Abstract: One embodiment provides a method, including: receiving, from a user, a dataset for encryption before its storage at a data storage location, wherein the dataset comprises a plurality of portions; identifying (i) attributes of the dataset and (ii) dataset dependencies; generating a recommendation for an encryption scheme to be used for the dataset, wherein the generating comprises (i) generating, based upon the attributes and the dataset dependencies, a recommendation of an encryption scheme for each portion of the dataset and (ii) identifying, based upon the dataset dependencies, a key label for each portion of the dataset, wherein the key label identified for a portion of the dataset that is dependent on another portion of the dataset is the same as the key label identified for said another portion of the dataset; and providing, to the user, (i) the generated recommendation and (ii) a description identifying reasons for the generated recommendation.

Type: Application

Filed: June 6, 2019

Publication date: December 10, 2020

Inventors: Manish Kesarwani, Akshar Kaul, Gagandeep Singh, Sameep Mehta, Hong Min, James Willis Pickel

DEEP LEARNING MODEL INSIGHTS USING PROVENANCE DATA

Publication number: 20200380367

Abstract: A method, a computer system, and a computer program product for generating deep learning model insights using provenance data are provided. Embodiments of the present invention may include collecting provenance data. Embodiments of the present invention may include generating model insights based on the collected provenance data. Embodiments of the present invention may include generating a training model based on the generated model insights. Embodiments of the present invention may include reducing the training model size. Embodiments of the present invention may include creating a final trained model.

Type: Application

Filed: June 3, 2019

Publication date: December 3, 2020

Inventors: Nitin Gupta, Himanshu Gupta, Rajmohan Chandrahasan, Sameep Mehta, Pranay Kumar Lohia

MODEL QUALITY AND RELATED MODELS USING PROVENANCE DATA

Publication number: 20200372398

Abstract: A method, a computer system, and a computer program product for utilizing provenance data to improve machine learning are provided. Embodiments of the present invention may include collecting provenance data. Embodiments of the present invention may include identifying model quality improvements based on the collected provenance data. Embodiments of the present invention may include identifying related models based on the collected provenance data. Embodiments of the present invention may include recommending model quality improvements to a user.

Type: Application

Filed: May 22, 2019

Publication date: November 26, 2020

Inventors: Samiulla Zakir Hussain Shaikh, Himanshu Gupta, Rajmohan Chandrahasan, Sameep Mehta, Manish Anand Bhide

AUTOMATIC SUMMARIZATION WITH BIAS MINIMIZATION

Publication number: 20200372056

Abstract: A processor may receive a record. The record may include one or more segments of text. The processor may tag each segment of text with an indicator. The indicator may denote a specific instance of bias in each of a respective segment of text. The processor may automatically generate a summary of the record. The summary of the record may include a set of segments of text. The set of segments of text may have a different overall bias than the record. The processor may display the summary of the record to a user.

Type: Application

Filed: May 23, 2019

Publication date: November 26, 2020

Inventors: Manish Anand Bhide, Kuntal Dey, Nishtha Madaan, Seema Nagar, Sameep Mehta

AUTOMATIC SUMMARIZATION WITH BIAS MINIMIZATION

Publication number: 20200372101

Abstract: A processor may receive a record. The record may include one or more segments of text. The processor may automatically generate a first summary of the record. The processor may determine an overall bias of the first summary. The overall bias of the first summary may be identified from one or more instances of bias in the first summary. The processor may generate a second summary of the record. The second summary of the record may include an indicator of the overall bias of the first summary. The indicator may include a description of a type of overall bias of the first summary and a numerical value of the overall bias of the first summary. The processor may determine an overall bias of the second summary. The processor may display the second summary of the record to a user.

Type: Application

Filed: May 23, 2019

Publication date: November 26, 2020

Inventors: Manish Anand Bhide, Kuntal Dey, Nishtha Madaan, Seema Nagar, Sameep Mehta

OPERATIONS TO TRANSFORM DATASET TO INTENT

Publication number: 20200364235

Abstract: One embodiment provides a method, including: receiving, from a user, (i) a dataset and (ii) an intended output from the dataset that is generated in view of a given analytical framework for the dataset, wherein the intended output identifies an output that the user wants from the dataset and wherein the dataset is related to an analytical domain; identifying a plurality of dataset functions related to the analytical domain; determining one or more dataset functions for each of one or more operations identified, wherein the one or more operations are identified using the repository to identify operations used to result in an intended output similar to the received intended output; and recommending an ordered subset of the one or more dataset functions to be used to transform the dataset to the intended output, wherein the ordered subset comprises (i) one dataset function for each of the one or more operations and (ii) an order for performing the one or more operations.

Type: Application

Filed: May 14, 2019

Publication date: November 19, 2020

Inventors: Kalapriya Kannan, Sameep Mehta

Edit distance computation on encrypted data

Patent number: 10824755

Abstract: One embodiment provides a method, including: receiving, at a third-party storage provider and from a data owner, a plurality of encrypted documents, wherein each of the plurality of encrypted documents is encrypted by the data owner using at least one encryption key; receiving, from a query user, an encrypted query, wherein the query is encrypted using the at least one encryption key; computing an edit distance value between the encrypted query and at least a portion of the plurality of encrypted documents, wherein the computing comprises communicating with an entity to work together to compute the edit distance value; the communicating comprising (i) providing, from the third-party storage provider to the entity, an encrypted function of an edit distance matrix and (ii) receiving an encrypted edit distance value computed by the entity from the encrypted function; and returning the encrypted edit distance value to the query user.
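For readers unfamiliar with the quantity being computed, the following Python sketch builds the edit distance matrix on plaintext strings using the standard dynamic-programming recurrence (Levenshtein distance). It deliberately omits everything that makes the patented method distinctive: there is no encryption, no third-party storage provider, and no cooperating entity. The function name `edit_distance` is chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two plaintext strings."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[-1][-1]


print(edit_distance("kitten", "sitting"))  # prints 3
```

In the patented setting the storage provider would never see these cell values in the clear; the abstract states only that an encrypted function of the matrix is exchanged with another entity.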

Type: Grant

Filed: October 11, 2018

Date of Patent: November 3, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Akshar Kaul, Sameep Mehta, Shashank Srivastava

Detecting and delaying effect of machine learning model attacks

Patent number: 10824721

Abstract: One embodiment provides a method for delaying malicious attacks on machine learning models that are trained using input captured from a plurality of users, including: deploying a model, said model designed to be used with an application, for responding to requests received from users, wherein the model comprises a machine learning model that has been previously trained using a data set; receiving input from one or more users; determining, using a malicious input detection technique, if the received input comprises malicious input; if the received input comprises malicious input, removing the malicious input from the input to be used to retrain the model; retraining the model using received input that is determined to not be malicious input; and providing, using the retrained model, a response to a received user query, the retrained model delaying the effect of malicious input on provided responses by removing malicious input from retraining input.

Type: Grant

Filed: May 22, 2018

Date of Patent: November 3, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Manish Kesarwani, Atul Kumar, Vijay Arya, Rakesh R. Pimplikar, Sameep Mehta

BIAS DETECTION FOR UNSTRUCTURED TEXT

Publication number: 20200327424

Abstract: One embodiment provides a method, including: receiving a target unstructured document for determining whether the target unstructured document comprises biased information; identifying an objective of the target unstructured document by extracting, from the target unstructured document, (i) entities and (ii) relationships between the entities; creating a structured knowledge base, wherein the creating comprises (i) creating an entry in the structured knowledge base corresponding to the target unstructured document, (ii) identifying other unstructured documents having a similarity to the target unstructured document, and (iii) generating an entry in the structured knowledge base corresponding to each of the other unstructured documents; applying a bias detection technique on the structured knowledge base; and providing an indication of whether the target unstructured document comprises bias.

Type: Application

Filed: April 15, 2019

Publication date: October 15, 2020

Inventors: Pranay Kumar Lohia, Rajmohan Chandrahasan, Himanshu Gupta, Samiulla Zakir Hussain Shaikh, Sameep Mehta, Atul Kumar

COMMENT-BASED ARTICLE AUGMENTATION

Publication number: 20200302006

Abstract: An article is automatically augmented. The article and one or more comments are received. Comment elements are extracted from the one or more comments, and article elements are extracted from the article. Alignment scores are generated for comment-article pairs based on the extracted comment and article elements. Further, it is determined that at least one comment-article pair has an alignment score at or above a threshold alignment score. At least one augmentation feature is then generated.

Type: Application

Filed: July 15, 2019

Publication date: September 24, 2020

Inventors: Manish Anand Bhide, Nishtha Madaan, Seema Nagar, Sameep Mehta, Kuntal Dey

COMMENT-BASED ARTICLE AUGMENTATION

Publication number: 20200302005

Abstract: An article is automatically augmented. The article and one or more comments are received. Comment elements are extracted from the one or more comments, and article elements are extracted from the article. Alignment scores are generated for comment-article pairs based on the extracted comment and article elements. Further, it is determined that at least one comment-article pair has an alignment score at or above a threshold alignment score. At least one augmentation feature is then generated.

Type: Application

Filed: March 22, 2019

Publication date: September 24, 2020

Inventors: Manish Anand Bhide, Nishtha Madaan, Seema Nagar, Sameep Mehta, Kuntal Dey

Generating a recommended shaping function to integrate data within a data repository

Patent number: 10783161

Abstract: A method includes determining, by a controller, a portion of data that is selected by a user. The portion of data includes source data that is to be transformed by at least one shaping function. The method also includes generating, by the controller, a first output recommendation data that communicates at least one recommended shaping function to apply to the portion of data. The first output recommendation data is generated based on patterns of shaping functions that have been previously chosen. The patterns of shaping functions that have been previously chosen can be chosen by a plurality of system users. The method also includes determining whether to apply the at least one recommended shaping function to the portion of data. The method also includes applying the at least one recommended shaping function based on the determining.

Type: Grant

Filed: December 15, 2017

Date of Patent: September 22, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Manish Bhide, Shabharesh Gudla, Sameep Mehta, Prishni Rateria, Samiulla Shaikh, Neelesh K. Shukla, Paul S. Taylor

Tracking missing data using provenance traces and data simulation

Patent number: 10740209

Abstract: Methods, systems, and computer program products for tracking missing data using provenance traces and data simulation are provided herein. A computer-implemented method includes generating, for each of multiple stages in a data curation sequence, a machine learning model of the data curation sequence, wherein the model is based on historical input records within the data curation sequence, historical output records within the data curation sequence, and provenance data within the data curation sequence; creating a simulated output record based on a detected anomaly corresponding to the data curation sequence; predicting the content of absent input records that precede the simulated output record in the data curation sequence and provenance data corresponding to the simulated output record; and outputting, to a user, in response to a query pertaining to the detected anomaly, the predicted input records and information relating the predicted input records to the detected anomaly.

Type: Grant

Filed: August 20, 2018

Date of Patent: August 11, 2020

Assignee: International Business Machines Corporation

Inventors: Salil Joshi, Hima Prasad Karanam, Manish Kesarwani, Sameep Mehta

Half-pyramid data encryption

Patent number: 10742401

Abstract: One embodiment provides a method, including: receiving, from a data owner, an input string of plaintext data comprising a plurality of characters for storage in a database of a third-party storage provider; arranging the plurality of characters of the input string as a half pyramid, wherein the half pyramid comprises a plurality of rows, each row comprising at least one more character than a preceding row; encrypting, using a secure encryption scheme and based upon a key, each row of the half pyramid independently from each other row of the half pyramid; and storing, in the database of the third-party storage provider, the encrypted rows of the half pyramid. Other aspects are claimed and described.
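The row arrangement described in this abstract can be pictured with a short Python sketch. The abstract states only that each row holds at least one more character than the row before it; this sketch assumes, as one arrangement that satisfies that property, that row i is the prefix of length i. That choice and the name `half_pyramid_rows` are assumptions made for illustration, and the encryption step is left out entirely.

```python
from typing import List


def half_pyramid_rows(text: str) -> List[str]:
    """Arrange text as rows of growing prefixes (assumed arrangement)."""
    # Row i (1-based) holds the first i characters, so every row is
    # exactly one character longer than the row before it.
    return [text[:i] for i in range(1, len(text) + 1)]


for row in half_pyramid_rows("data"):
    print(row)
# d
# da
# dat
# data
```

Under the method in the abstract, each of these rows would then be encrypted independently with a keyed scheme before being stored at the third-party provider.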

Type: Grant

Filed: December 19, 2017

Date of Patent: August 11, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Akshar Kaul, Manish Kesarwani, Sameep Mehta, Prasad G. Naldurg, Gagandeep Singh
