Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Schema recommendations

Patent number: 12688168

Abstract: Embodiments analyze at least one data source from at least one database to extract characteristics for the at least one data source; generate at least one recommendation based on the extracted characteristics for the at least one data source; perform conflict resolution on the generated at least one recommendation by removing at least one conflicted recommendation in the at least one recommendation; determine a chosen recommendation (CR) set based on the conflict resolution; rank each recommendation in the CR set based on a ranking score; and display a visual summary of each recommendation in the CR set.

Type: Grant

Filed: January 23, 2025

Date of Patent: July 21, 2026

Assignee: International Business Machines Corporation

Inventors: Akshar Kaul, Hima Patel, Shazia Afzal, Sameep Mehta, Vaibhav Sudhakar Dantale

TRANSFORMING SEMANTICALLY EQUIVALENT DATA USING LARGE LANGUAGE MODELS

Publication number: 20260178844

Abstract: A computer-implemented method for transforming data is provided. A processor set receives a number of data pairs. The processor set creates a program graph for each data pair in the number of data pairs. The processor set identifies a number of paths between nodes in each program graph. The processor set identifies a number of common paths from the number of paths based on common characters between the input data and the output data for each data pair. The processor set identifies a set of nodes in the program graphs based on the number of common paths. The processor set generates a prompt for a large language model based on the number of data pairs and the set of nodes that represent positions of unmatched characters between input data and output data in the number of data pairs.

Type: Application

Filed: December 24, 2024

Publication date: June 25, 2026

Inventors: Nitin Gupta, Shramona Chakraborty, Hima Patel, Sameep Mehta

Automated model lineage inference

Patent number: 12619907

Abstract: A computer-implemented method, a computer program product, and a computer system for automated model lineage inference. A computer system identifies training datasets which are used to train a machine learning model. A computer system identifies parent datasets from which the training datasets are derived. A computer system identifies associated feature transformations when the training datasets are derived from the parent datasets.

Type: Grant

Filed: March 11, 2022

Date of Patent: May 5, 2026

Assignee: International Business Machines Corporation

Inventors: Rajmohan Chandrahasan, Kriti Rajput, Nitin Gupta, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Manish Anand Bhide

System and method for robust GraphQL query evaluation

Patent number: 12561319

Abstract: A system includes a processor that executes computer executable components stored in a memory. The computer executable components can comprise a generation component that generates a GraphQL query. The computer executable components can further comprise a comparison component that compares a predefined GraphQL query to the generated GraphQL query. The computer executable components can further comprise a data mutation component that creates data variations to execute on the predefined GraphQL query and the generated GraphQL query to identify false positives or false negatives of execution equivalence of the queries. The computer executable components can further comprise an evaluation component that compares results of the executed queries to determine sufficiency of the generated GraphQL query.

Type: Grant

Filed: January 3, 2025

Date of Patent: February 24, 2026

Assignee: International Business Machines Corporation

Inventors: Sambit Ghosh, Manish Kesarwani, Nitin Gupta, Renuka Sindhgatta Rajan, Sameep Mehta

PERFORMING IN-CONTEXT LEARNING USING SELECT POOL ENTRIES

Publication number: 20260050653

Abstract: A method, according to one approach, includes: receiving a test sample having schema information and natural language information. The schema information is compared to a pool of entries that correspond to a given query language. One or more entries in the pool that match the schema information of the test sample are identified. One or more entries in the pool that match the natural language information of the test sample are also identified. The method also includes merging selected ones of the entries that match the schema information and selected ones of the entries that match the natural language information. Furthermore, a large language model performs in-context learning using the merged entries and the test sample.

Type: Application

Filed: August 13, 2024

Publication date: February 19, 2026

Inventors: Nitin Gupta, Manish Kesarwani, Sambit Ghosh, Sameep Mehta, Carlos Eberhardt, Daniel Debrunner

Semantically similar historical example based data transformation

Patent number: 12524431

Abstract: An embodiment computes a plurality of similarity scores, each similarity score in the plurality of similarity scores measuring a similarity of task data of a first data transformation task to data of a stored data transformation in a plurality of stored data transformations, wherein each of the plurality of stored data transformations comprises a stored data transformation program, each stored data transformation program comprising a data transformation from a first data format to a second data format. An embodiment generates, from a first stored data transformation program in the plurality of stored data transformations, a generated data transformation program. An embodiment performs, by modifying a plurality of records described by the first data transformation task into a second plurality of records according to the generated data transformation program, the first data transformation task.

Type: Grant

Filed: January 12, 2024

Date of Patent: January 13, 2026

Assignee: International Business Machines Corporation

Inventors: Nitin Gupta, Shramona Chakraborty, Hima Patel, Nagarjuna Surabathina, Sameep Mehta, Ramkumar Ramalingam, Matu Agarwal

Prioritized data cleaning

Patent number: 12517887

Abstract: Methods, systems, and computer program products for prioritized data cleaning are provided herein. A computer-implemented method includes obtaining a dataset comprising a plurality of data issues; determining a priority of one or more features of the dataset; generating a respective model for each of a plurality of data resolution algorithms, wherein each model indicates computing costs of the corresponding data resolution algorithm for resolving at least a portion of the plurality of data issues in an order of the priority of the features; and applying one or more of the plurality of data resolution algorithms to resolve at least a portion of the data issues in the order of the priority of the features based at least in part on the generated models.

Type: Grant

Filed: December 9, 2021

Date of Patent: January 6, 2026

Assignee: International Business Machines Corporation

Inventors: Ritwik Chaudhuri, Sameep Mehta

Generation of training data from redacted information

Patent number: 12468778

Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.

Type: Grant

Filed: December 11, 2020

Date of Patent: November 11, 2025

Assignee: International Business Machines Corporation

Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta

Distributed entity re-resolution based on streaming updates

Patent number: 12455915

Abstract: Mechanisms are provided for dynamic re-resolution of entities in a knowledge graph (KG) based on streaming updates. The KG and corresponding initial clusters associated with first entities are received along with a dynamic data stream having second documents referencing second entities. Clustering on the second documents based on the set of initial clusters, and document features of the second documents, is performed to provide a set of second document clusters. For second document clusters that should be modified based on entities associated with the second document cluster, a cluster modification operation is performed. Updated clusters are generated based on the clustering and modification of clusters. Entity re-resolution is dynamically performed on the entities in the KG based on the second entities associated with the updated clusters to generate an updated knowledge graph data structure.

Type: Grant

Filed: August 4, 2022

Date of Patent: October 28, 2025

Assignee: International Business Machines Corporation

Inventors: Avirup Saha, Balaji Ganesan, Soma Shekar Naganna, Sameep Mehta

CAUSALLY-AWARE ATTRIBUTE CONTROLLED STATEMENT GENERATION IN LANGUAGE MODELS

Publication number: 20250284886

Abstract: Various systems and methods are presented regarding reducing/mitigating generation of one or more statements by a language model (LM), wherein the statements can be any of toxic, offensive, biased, etc. A statement automatically generated by the LM, e.g., in response to a prompt, can be assessed with regard to a probability of the statement being associated with a negative attribute(s). The statement can be further reviewed to identify tokens within the statement causing the association with the negative attribute(s). The tokens can be replaced with counterfactuals, and further assessment(s) made to determine the effect of the statement having a token replaced by a counterfactual with regard to probability of the modified statement being associated with the attribute. The LM can undergo further finetuning to mitigate generation of an offensive statement being generated by the LM.

Type: Application

Filed: July 19, 2023

Publication date: September 11, 2025

Inventors: Kahini Wadhawan, Rahul Madhavan, Rishabh Garg, Sameep Mehta, Vijay Arya

MODIFYING DATA PIPELINE AND DATASET QUALITY USING DATA OBSERVABILITY

Publication number: 20250265234

Abstract: A computer-implemented process for modifying one or more of a plurality of data pipelines within a computer architecture that processes data from a plurality of datasets includes the following operations. Real-time observability data regarding the plurality of data pipelines and the plurality of datasets is ingested. A dataset from the plurality of datasets is classified based upon usage and reliability to generate a classification of the dataset. Based upon the classification, a recommendation to modify at least one of the plurality of data pipelines or the quality of one of the plurality of datasets is generated.

Type: Application

Filed: February 20, 2024

Publication date: August 21, 2025

Inventors: Rajmohan Chandrahasan, Nishtha Madaan, Himanshu Gupta, Sameep Mehta

Selecting best cloud computing environment in a hybrid cloud scenario

Patent number: 12393464

Abstract: Embodiments for identifying an optimal cloud computing environment for a computing task are disclosed. Embodiments comprise receiving a computing task to be executed in a cloud computing environment, wherein the computing task requires a set of cloud computing environment parameter values of the cloud computing environment, pre-selecting a set of candidate cloud computing environments, each of which meets the set of cloud computing environment parameter values, ranking the candidate cloud computing environments using reward-based ranking parameter values of the candidate cloud computing environments as an additional selection constraint, and selecting the highest ranking cloud computing environment as the optimal cloud computing environment for the computing task.

Type: Grant

Filed: February 15, 2022

Date of Patent: August 19, 2025

Assignee: International Business Machines Corporation

Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta

SEMANTICALLY SIMILAR HISTORICAL EXAMPLE BASED DATA TRANSFORMATION

Publication number: 20250231958

Abstract: An embodiment computes a plurality of similarity scores, each similarity score in the plurality of similarity scores measuring a similarity of task data of a first data transformation task to data of a stored data transformation in a plurality of stored data transformations, wherein each of the plurality of stored data transformations comprises a stored data transformation program, each stored data transformation program comprising a data transformation from a first data format to a second data format. An embodiment generates, from a first stored data transformation program in the plurality of stored data transformations, a generated data transformation program. An embodiment performs, by modifying a plurality of records described by the first data transformation task into a second plurality of records according to the generated data transformation program, the first data transformation task.

Type: Application

Filed: January 12, 2024

Publication date: July 17, 2025

Applicant: International Business Machines Corporation

Inventors: Nitin Gupta, Shramona Chakraborty, Hima Patel, Nagarjuna Surabathina, Sameep Mehta, Ramkumar Ramalingam, Matu Agarwal

Automatic data domain identification

Patent number: 12333253

Abstract: An apparatus is disclosed which includes at least one processing device comprising a processor coupled to a memory. The at least one processing device, when executing program code, is configured to: extract one or more entities identified in a plurality of data artifacts based at least in part on one or more datasets, extract one or more entities identified in a plurality of code artifacts based at least in part on the one or more datasets, extract one or more entities identified in a plurality of user interface artifacts based at least in part on the one or more datasets, generate a set of dependency graphs each based at least in part on one or more relationships among the respective extracted one or more entities, and perform one or more of a lexical analysis and a semantic analysis on the set of dependency graphs to identify a data domain of the one or more datasets.

Type: Grant

Filed: November 18, 2021

Date of Patent: June 17, 2025

Assignee: International Business Machines Corporation

Inventors: Malolan Chetlur, Arvind Agarwal, Subhendu Dey, Sameep Mehta, Sandipan Sarkar

Identifying anomalous transformations using lineage data

Patent number: 12326852

Abstract: Methods, systems, and computer program products for identifying anomalous transformations using lineage data are provided herein. A computer-implemented method includes generating a set of column profiles for a corresponding set of columns within one or more datasets based at least in part on lineage data and glossary data, wherein the lineage data comprises information related to transformations performed on each column in the set by a computing platform, and wherein the glossary data comprises information related to one or more terms assigned to one or more of the columns; obtaining information related to a new transformation involving at least one column in the set of columns; comparing the new transformation to the set of column profiles to determine whether the new transformation is anomalous; and in response to determining the new transformation is anomalous, outputting an alert to a user of the computing platform.

Type: Grant

Filed: April 26, 2021

Date of Patent: June 10, 2025

Assignee: International Business Machines Corporation

Inventors: Rajmohan Chandrahasan, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Andrzej Jan Wrobel

AUTOMATIC GENERATION AND UPDATION OF DIALOG FLOWS WITH NEW CAPABILITIES

Publication number: 20250138920

Abstract: Various systems and methods are presented regarding implementing one or more capabilities into a dialog flow occurring at an automated interface (e.g., a chatbot). A capability can be invoked at the interface in accordance with a user's requirements, e.g., the capability is a function to review data. An application programming interface (API) can be generated from the capability, wherein the API has features, parameters, metadata, etc., generated based on those of the capability. The API can be incorporated into a dialog, wherein the dialog can be subsequently presented on the interface (e.g., as part of a dialog flow). Interaction between the user and the dialog can cause the capability to be executed. Based upon the API features, etc., the API can be incorporated into a dialog, for example, by cloning a dialog, appending a dialog with the API, replacing a pre-existing API with the API in a dialog, and suchlike.

Type: Application

Filed: October 26, 2023

Publication date: May 1, 2025

Inventors: Manish Kesarwani, Ankush Gupta, Arvind Agarwal, Binayak Dutta, Soujanya Soni, Sameep Mehta

Microservice creation using runtime metadata

Patent number: 12288044

Abstract: A computer implemented method creates microservices for an application. A number of processor units clusters programs and data structures for the application using runtime metadata to form groups of the programs and data structures. The runtime metadata is obtained from running the application. The number of processor units creates a design for the microservices for the application using the groups of the programs and the data structures.

Type: Grant

Filed: January 26, 2023

Date of Patent: April 29, 2025

Assignee: International Business Machines Corporation

Inventors: Akshar Kaul, Himanshu Gupta, Sameep Mehta, Srikanth Govindaraj Tamilselvam, Amith Singhee, Vaibhav Sudhakar Dantale, Ravi Vishnu Israni

Factchecking artificial intelligence models using blockchain

Patent number: 12277507

Abstract: Methods, systems, and computer program products for factchecking artificial intelligence models using blockchain are provided herein. A computer-implemented method includes obtaining at least one artificial intelligence model and at least one set of data related to the at least one artificial intelligence model; determining a set of characteristics based at least in part on the at least one artificial intelligence model and the at least one set of data; selecting one of a plurality of networks based at least in part on a target deployment of the at least one artificial intelligence model to verify the set of characteristics; generating a report based at least in part on verifying the set of characteristics using the selected network, wherein the report establishes a threshold level of trust for the at least one artificial intelligence model; and storing the report on a blockchain.

Type: Grant

Filed: January 22, 2021

Date of Patent: April 15, 2025

Assignee: International Business Machines Corporation

Inventors: Srikanth Govindaraj Tamilselvam, Sai Koti Reddy Danda, Senthil Kumar Kumarasamy Mani, Kalapriya Kannan, Sameep Mehta

TRAINING FOUNDATION MODELS ON TABULAR DATA

Publication number: 20250110970

Abstract: A processor set is configured to receive tabular data records and generate a plurality of clusters, associated with specific real-world entities, within the received tabular data records, wherein each cluster is associated with a specific real-world entity. The processor set may further identify informative features within a first cluster and mask a subset of the informative features. Based on the masked subset of informative features and using self-supervision techniques, the processor set may train a tabular foundation model.

Type: Application

Filed: September 29, 2023

Publication date: April 3, 2025

Inventors: Balaji Ganesan, Avirup Saha, Muhammed Abdul Majeed Ameen, Soma Shekar Naganna, Sameep Mehta

Automatically orchestrating a computerized workflow

Patent number: 12248810

Abstract: The method performs at the orchestration interface at which update information, including changes to tasks of a workflow, is received from a task manager system (TMS), where the workflow includes a set of tasks, inputs to the tasks, and outputs from the tasks. The inputs and outputs determine runtime dependencies between the tasks. Based on the update information received, the orchestration interface populates a topology of nodes and edges as a directed acyclic graph (DAG) that maps nodes to tasks and edges to runtime dependencies between tasks, based on node inputs and outputs. The orchestration interface instructs the execution of the tasks and handling dependencies by interacting with a task execution system (TES) and by traversing the DAG, the orchestration interface identifies tasks that depend on completed tasks as per the runtime dependencies and instructs the TES to execute the dependent tasks identified.

Type: Grant

Filed: June 15, 2022

Date of Patent: March 11, 2025

Assignee: International Business Machines Corporation

Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
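
The workflow orchestration abstract above describes deriving runtime dependencies from task inputs and outputs, representing them as a directed acyclic graph, and running each task once the tasks it depends on have completed. The following is a minimal illustrative sketch of that general idea only, not the patented implementation. The task names, the `tasks` dictionary layout, and the `execute` callback (standing in for a task execution system) are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_dependencies(tasks):
    """Map each task to the set of tasks that produce one of its inputs."""
    # Hypothetical layout: tasks[name] = {"inputs": [...], "outputs": [...]}
    producer = {
        out: name for name, spec in tasks.items() for out in spec["outputs"]
    }
    return {
        name: {producer[i] for i in spec["inputs"] if i in producer}
        for name, spec in tasks.items()
    }


def run_workflow(tasks, execute):
    """Run tasks in dependency order and return the order used."""
    sorter = TopologicalSorter(build_dependencies(tasks))
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # get_ready() yields tasks whose dependencies have all completed
        for name in sorter.get_ready():
            execute(name)      # assumed stand-in for the execution system
            order.append(name)
            sorter.done(name)  # unblocks tasks that depend on this one
    return order


# Hypothetical three-step workflow, chained through file names
tasks = {
    "extract": {"inputs": [], "outputs": ["raw.csv"]},
    "clean": {"inputs": ["raw.csv"], "outputs": ["clean.csv"]},
    "report": {"inputs": ["clean.csv"], "outputs": ["report.pdf"]},
}
order = run_workflow(tasks, execute=lambda name: None)
```

In this sketch the dependency graph is rebuilt from the task specifications on each run. A system that receives incremental updates, as the abstract describes, would instead update the graph in place.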
