Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Bucketing records using temporal point processes

Patent number: 11768860

Abstract: An embodiment establishes a designated attribute value as a semantic criterion for grouping records in a bucket, identifies a first set of records having attribute values that satisfy the semantic criterion, and adds the first set of records to the bucket. The embodiment detects that the first set of records represent a first series of events that occurred in succession at respective times. The embodiment derives a temporal attribute value representative of a time pattern formed by the times of the first series of events and designates the temporal attribute value as a temporal criterion for grouping records in the bucket. The embodiment identifies a second set of records that represent a second series of events and satisfy the temporal criterion and adds the second set of records to the bucket based at least in part on the second set of records satisfying the temporal criterion.

Type: Grant

Filed: November 3, 2021

Date of Patent: September 26, 2023

Assignee: International Business Machines Corporation

Inventors: Avirup Saha, Balaji Ganesan, Shettigar Parkala Srinivas, Sumit Bhatia, Sameep Mehta, Soma Shekar Naganna
AUTOMATED MODEL LINEAGE INFERENCE

Publication number: 20230289649

Abstract: A computer-implemented method, a computer program product, and a computer system for automated model lineage inference. A computer system identifies training datasets which are used to train a machine learning model. A computer system identifies parent datasets from which the training datasets are derived. A computer system identifies associated feature transformations when the training datasets are derived from the parent datasets.

Type: Application

Filed: March 11, 2022

Publication date: September 14, 2023

Inventors: Rajmohan Chandrahasan, Kriti Rajput, Nitin Gupta, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Manish Anand Bhide
GENERATING SMART AUTOMATED DATA MOVEMENT WORKFLOWS

Publication number: 20230281212

Abstract: A computer-implemented method generates an automated data movement workflow. The method includes transforming a received request for data, which was received in a restricted natural language form, into a form suitable for accessing a metadata repository. The method further includes identifying data and data dependencies using the transformed request for data. The method further includes building a workflow using the identified data and data dependencies. The method further includes, upon applying at least one governance rule to the workflow, modifying the built workflow to be compliant with the at least one governance rule, and if no compliance with the at least one governance rule is achievable, recommending a change to the built workflow.

Type: Application

Filed: March 7, 2022

Publication date: September 7, 2023

Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
SELECTING BEST CLOUD COMPUTING ENVIRONMENT IN A HYBRID CLOUD SCENARIO

Publication number: 20230259401

Abstract: Embodiments for identifying an optimal cloud computing environment for a computing task are disclosed. Embodiments comprise receiving a computing task to be executed in a cloud computing environment, wherein the computing task requires a set of cloud computing environment parameter values of the cloud computing environment, pre-selecting a set of candidate cloud computing environments, each of which meets the set of cloud computing environment parameter values, ranking the candidate cloud computing environments using reward-based ranking parameter values of the candidate cloud computing environments as an additional selection constraint, and selecting the highest ranking cloud computing environment as the optimal cloud computing environment for the computing task.

Type: Application

Filed: February 15, 2022

Publication date: August 17, 2023

Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
Automated classification of data types for databases

Patent number: 11720533

Abstract: Techniques for automatically determining different data types found in databases are disclosed. In one example, a computer implemented method comprises receiving a portion of identifying information for one or more components of a database, and generating one or more descriptions for the one or more components based at least in part on the portion of the identifying information for the one or more components. The one or more descriptions are inputted to one or more machine learning models, and, using the one or more machine learning models, one or more data types associated with the one or more components are predicted. The prediction is based at least in part on the one or more descriptions.

Type: Grant

Filed: November 29, 2021

Date of Patent: August 8, 2023

Assignee: International Business Machines Corporation

Inventors: Rajmohan Chandrahasan, Ankush Gupta, Venkata Nagaraju Pavuluri, Arvind Agarwal, Sameep Mehta
Content modification using natural language processing to include features of interest to various groups

Patent number: 11714963

Abstract: According to one embodiment of the present invention, a system for modifying content associated with an item comprises at least one processor. Features of interest of the item to a plurality of different groups are determined based on user comments produced by members of the plurality of different groups. The members within each group have a common characteristic. The features of interest to each group within the content associated with the item are identified, and the content associated with the item is modified by balancing the features of interest to the plurality of different groups within the content associated with the item. Embodiments of the present invention further include a method and computer program product for modifying content associated with an item in substantially the same manner described above.

Type: Grant

Filed: March 13, 2020

Date of Patent: August 1, 2023

Assignee: International Business Machines Corporation

Inventors: Seema Nagar, Kuntal Dey, Nishtha Madaan, Manish Anand Bhide, Sameep Mehta, Diptikalyan Saha
EFFECTIVE PERFORMANCE ASSESSMENT

Publication number: 20230186197

Abstract: In an approach for effective performance assessment, a processor classifies relevancy of a goal submitted by an employee. A processor classifies the goal into one of pre-defined dimensions. A processor receives feedback about the goal from a manager. A processor classifies whether the feedback is actionable with respect to the corresponding goal. A processor classifies consistency of the feedback with the corresponding dimension of the goal. A processor classifies consistency of the feedback with the corresponding position level of the employee. A processor converts the feedback along the corresponding dimension into a rating for the dimension on a pre-defined scale.

Type: Application

Filed: December 14, 2021

Publication date: June 15, 2023

Inventors: Rakesh Rameshrao Pimplikar, Sameep Mehta, Nazia Hasan, Varun Gupta, Kingshuk Banerjee
PRIORITIZED DATA CLEANING

Publication number: 20230185791

Abstract: Methods, systems, and computer program products for prioritized data cleaning are provided herein. A computer-implemented method includes obtaining a dataset comprising a plurality of data issues; determining a priority of one or more features of the dataset; generating a respective model for each of a plurality of data resolution algorithms, wherein each model indicates computing costs of the corresponding data resolution algorithm for resolving at least a portion of the plurality of data issues in an order of the priority of the features; and applying one or more of the plurality of data resolution algorithms to resolve at least a portion of the data issues in the order of the priority of the features based at least in part on the generated models.

Type: Application

Filed: December 9, 2021

Publication date: June 15, 2023

Inventors: Ritwik Chaudhuri, Sameep Mehta
PRIVACY-PRESERVING CLASS LABEL STANDARDIZATION IN FEDERATED LEARNING SETTINGS

Publication number: 20230177113

Abstract: Methods, systems, and computer program products for privacy-preserving class label standardization in federated learning settings are provided herein. A computer-implemented method includes determining, using one or more data privacy-preserving techniques, a signature for each of one or more classes of data for each of multiple client devices within a federated learning environment; identifying one or more signature matches across at least a portion of the multiple client devices; generating one or more class labels for the one or more classes of data associated with the one or more signature matches; labeling, across the at least a portion of the multiple client devices, the one or more classes of data associated with the one or more signature matches with the one or more generated class labels; and performing one or more automated actions based at least in part on the one or more labeled classes of data.

Type: Application

Filed: December 2, 2021

Publication date: June 8, 2023

Inventors: Shonda Adena Witherspoon, Ramasuri Narayanam, Hima Patel, Sameep Mehta
ADJUSTING MACHINE LEARNING MODELS BASED ON SIMULATED FAIRNESS IMPACT

Publication number: 20230177383

Abstract: Methods, systems, and computer program products for adjusting machine learning models based on simulated fairness impact are provided herein. A computer-implemented method includes obtaining, by a central simulation system, policies to be used for performing a simulation involving machine learning models, implemented on different systems, interacting with a target population; providing information for configuring simulators on the different systems, each simulator representing at least the machine learning model of a given one of the different systems; performing iterations of the simulation for the policies, wherein, for each iteration, the central simulation system: predicts a state of the target population, provides the state to the simulators, and collects metrics based on results of the simulators; and selecting and sending one of the policies to at least one of the different systems based on the collected metrics.

Type: Application

Filed: December 7, 2021

Publication date: June 8, 2023

Inventors: Pranay Kumar Lohia, Kushal Mukherjee, Rakesh Rameshrao Pimplikar, Monika Gupta, Sameep Mehta, Stacy F. Hobson
AUTOMATED FAIRNESS-DRIVEN GRAPH NODE LABEL CLASSIFICATION

Publication number: 20230177355

Abstract: Methods, systems, and computer program products for automated fairness-driven graph node label classification are provided herein. A computer-implemented method includes obtaining at least one input graph; predicting one or more node labels associated with the at least one input graph by processing at least a portion of the at least one input graph using a graph node label prediction model, wherein the graph node label prediction model includes at least one loss function; generating an updated version of the graph node label prediction model based at least in part on the one or more predicted node labels and one or more group fairness-based constraints relevant to the at least one input graph; and performing one or more automated actions using the updated version of the graph node label prediction model.

Type: Application

Filed: December 6, 2021

Publication date: June 8, 2023

Inventors: Ramasuri Narayanam, Sameep Mehta, Rakesh Rameshrao Pimplikar, Pranay Kumar Lohia
AUTOMATED CLASSIFICATION OF DATA TYPES FOR DATABASES

Publication number: 20230169050

Abstract: Techniques for automatically determining different data types found in databases are disclosed. In one example, a computer implemented method comprises receiving a portion of identifying information for one or more components of a database, and generating one or more descriptions for the one or more components based at least in part on the portion of the identifying information for the one or more components. The one or more descriptions are inputted to one or more machine learning models, and, using the one or more machine learning models, one or more data types associated with the one or more components are predicted. The prediction is based at least in part on the one or more descriptions.

Type: Application

Filed: November 29, 2021

Publication date: June 1, 2023

Inventors: Rajmohan Chandrahasan, Ankush Gupta, Venkata Nagaraju Pavuluri, Arvind Agarwal, Sameep Mehta
Data Transformations for Mapping Enterprise Applications

Publication number: 20230169070

Abstract: A computer implemented method, computer system, and computer program product for transforming mapped data fields of enterprise applications. A number of processor units receive a matching from a source data field to a target data field. The processor units receive a number of annotated examples of transformations from a source format to a target format. Based on the annotated examples, the processor units autogenerate a query language expression for transforming data items from the source format to the target format.

Type: Application

Filed: November 29, 2021

Publication date: June 1, 2023

Inventors: Ramkumar Ramalingam, Nagarjuna Surabathina, Thanmayi Mruthyunjaya, Nitin Gupta, Pranay Kumar Lohia, Shanmukha Chaitanya Guttula, Hima Patel, Sameep Mehta, Matu Agarwal, Mudit Mehrotra
AUTOMATIC DATA DOMAIN IDENTIFICATION

Publication number: 20230153537

Abstract: An apparatus is disclosed which includes at least one processing device comprising a processor coupled to a memory. The at least one processing device, when executing program code, is configured to: extract one or more entities identified in a plurality of data artifacts based at least in part on one or more datasets, extract one or more entities identified in a plurality of code artifacts based at least in part on the one or more datasets, extract one or more entities identified in a plurality of user interface artifacts based at least in part on the one or more datasets, generate a set of dependency graphs each based at least in part on one or more relationships among the respective extracted one or more entities, and perform one or more of a lexical analysis and a semantic analysis on the set of dependency graphs to identify a data domain of the one or more datasets.

Type: Application

Filed: November 18, 2021

Publication date: May 18, 2023

Inventors: Malolan Chetlur, Arvind Agarwal, Subhendu Dey, Sameep Mehta, Sandipan Sarkar
BUCKETING RECORDS USING TEMPORAL POINT PROCESSES

Publication number: 20230135407

Abstract: An embodiment establishes a designated attribute value as a semantic criterion for grouping records in a bucket, identifies a first set of records having attribute values that satisfy the semantic criterion, and adds the first set of records to the bucket. The embodiment detects that the first set of records represent a first series of events that occurred in succession at respective times. The embodiment derives a temporal attribute value representative of a time pattern formed by the times of the first series of events and designates the temporal attribute value as a temporal criterion for grouping records in the bucket. The embodiment identifies a second set of records that represent a second series of events and satisfy the temporal criterion and adds the second set of records to the bucket based at least in part on the second set of records satisfying the temporal criterion.

Type: Application

Filed: November 3, 2021

Publication date: May 4, 2023

Applicant: International Business Machines Corporation

Inventors: Avirup Saha, Balaji Ganesan, Shettigar Parkala Srinivas, Sumit Bhatia, Sameep Mehta, Soma Shekar Naganna
FEDERATED LEARNING DATA SOURCE SELECTION

Publication number: 20230128548

Abstract: One embodiment provides a method, including: receiving, at a central server, data from each of a plurality of data sources, the plurality of data sources being within a plurality of data storage locations, wherein the central server includes a validation dataset having a plurality of annotated datapoints; computing, at the central server, an influential score for each of the plurality of data sources based upon the data provided to the central server from each of the plurality of data sources, wherein an influential score of a data source identifies an influence of the data source in accurately predicting annotations of the validation dataset; selecting, at the central server and based upon the influential score of the plurality of data sources, a subset of the plurality of data sources; and generating, at the central server, the training dataset utilizing the data of the data sources included within the subset.

Type: Application

Filed: October 25, 2021

Publication date: April 27, 2023

Inventors: Ruhi Sharma Mittal, Ramasuri Narayanam, Lokesh Nagalapatti, Sameep Mehta
Determining data representative of bias within a model

Patent number: 11636386

Abstract: Methods, systems, and computer program products for determining data representative of bias within a model are provided herein. A computer-implemented method includes obtaining a first dataset on which a model was trained, wherein the first dataset contains protected attributes, and a second dataset on which the model was trained, wherein the protected attributes have been removed from the second dataset; identifying, for each of the one or more protected attributes in the first dataset, one or more attributes in the second dataset correlated therewith; determining bias among at least a portion of the identified correlated attributes; and outputting, to at least one user, identifying information pertaining to the one or more instances of bias.

Type: Grant

Filed: November 21, 2019

Date of Patent: April 25, 2023

Assignee: International Business Machines Corporation

Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Manish Anand Bhide, Sameep Mehta
Extract-transform-load script generation

Patent number: 11630833

Abstract: One embodiment provides a computer implemented method, including: receiving, from a user, a natural language query for data contained within at least one data repository; identifying at least one concept from the natural language query, wherein the at least one concept includes an entity and an intent; identifying a plurality of datasets satisfying the natural language query by querying the at least one data repository utilizing the at least one concept; ranking the dataset based on relevance to the query; generating an extract-transform-load script that extracts, transforms, and loads a dataset selected by the user from the plurality of datasets; and retrieving data included in the dataset utilizing the extract-transform-load script, wherein the retrieving includes returning the data to the user.

Type: Grant

Filed: October 29, 2020

Date of Patent: April 18, 2023

Assignee: International Business Machines Corporation

Inventors: Manish Kesarwani, Sumit Bhatia, Sameep Mehta
FEDERATED DATA STANDARDIZATION USING DATA PRIVACY TECHNIQUES

Publication number: 20230021563

Abstract: Methods, systems, and computer program products for federated data standardization using data privacy techniques are provided herein. A computer-implemented method includes obtaining multiple datasets from multiple clients in accordance with one or more data privacy techniques; determining one or more similar data columns across at least a portion of the multiple datasets; generating one or more column labels for the one or more similar data columns; standardizing at least a portion of data within the one or more similar data columns by processing the one or more generated column labels using at least one federated learning technique; and performing one or more automated actions based at least in part on results of the standardizing of the at least a portion of data within the one or more similar data columns.

Type: Application

Filed: July 23, 2021

Publication date: January 26, 2023

Inventors: Ramasuri Narayanam, Hima Patel, Sameep Mehta
Bias detection for unstructured text

Patent number: 11551102

Abstract: One embodiment provides a method, including: receiving a target unstructured document for determining whether the target unstructured document comprises biased information; identifying an objective of the target unstructured document by extracting, from the target unstructured document, (i) entities and (ii) relationships between the entities; creating a structured knowledge base, wherein the creating comprises (i) creating an entry in the structured knowledge base corresponding to the target unstructured document, (ii) identifying other unstructured documents having a similarity to the target unstructured document, and (iii) generating an entry in the structured knowledge base corresponding to each of the other unstructured documents; applying a bias detection technique on the structured knowledge base; and providing an indication of whether the target unstructured document comprises bias.

Type: Grant

Filed: April 15, 2019

Date of Patent: January 10, 2023

Assignee: International Business Machines Corporation

Inventors: Pranay Kumar Lohia, Rajmohan Chandrahasan, Himanshu Gupta, Samiulla Zakir Hussain Shaikh, Sameep Mehta, Atul Kumar
