Patents by Inventor Sameep Mehta
Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12619907
Abstract: A computer-implemented method, a computer program product, and a computer system for automated model lineage inference. A computer system identifies training datasets which is used to train a machine learning model. A computer system identifies parent datasets from which the training datasets are derived. A computer system identifies associated feature transformations when the training datasets are derived from the parent datasets.
Type: Grant
Filed: March 11, 2022
Date of Patent: May 5, 2026
Assignee: International Business Machines Corporation
Inventors: Rajmohan Chandrahasan, Kriti Rajput, Nitin Gupta, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Manish Anand Bhide
-
Patent number: 12561319
Abstract: A system includes a processor that executes computer executable components stored in a memory. The computer executable components can comprise a generation component that generates a GraphQL query. The computer executable components can further comprise a comparison component that compares a predefined GraphQL query to the generated GraphQL query. The computer executable components can further comprise a data mutation component that creates data variations to execute on the predefined GraphQL query and the generated GraphQL query to identify false positives or false negatives of execution equivalence of the queries. The computer executable components can further comprise an evaluation component that compares results of the executed queries to determine sufficiency of the generated GraphQL query.
Type: Grant
Filed: January 3, 2025
Date of Patent: February 24, 2026
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sambit Ghosh, Manish Kesarwani, Nitin Gupta, Renuka Sindhgatta Rajan, Sameep Mehta
-
Publication number: 20260050653
Abstract: A method, according to one approach, includes: receiving a test sample having schema information and natural language information. The schema information is compared to a pool of entries that correspond to a given query language. One or more entries in the pool that match the schema information of the test sample are identified. One or more entries in the pool that match the natural language information of the test sample are also identified. The method also includes merging selected ones of the entries that match the schema information and selected ones of the entries that match the natural language information. Furthermore, a large language model performs in-context learning using the merged entries and the test sample.
Type: Application
Filed: August 13, 2024
Publication date: February 19, 2026
Inventors: Nitin Gupta, Manish Kesarwani, Sambit Ghosh, Sameep Mehta, Carlos Eberhardt, Daniel Debrunner
-
Patent number: 12524431
Abstract: An embodiment computes a plurality of similarity scores, each similarity score in the plurality of similarity scores measuring a similarity of task data of a first data transformation task to data of a stored data transformation in a plurality of stored data transformations, wherein each of the plurality of stored data transformations comprises a stored data transformation program, each stored data transformation program comprising a data transformation from a first data format to a second data format. An embodiment generates, from a first stored data transformation program in the plurality of stored data transformations, a generated data transformation program. An embodiment performs, by modifying a plurality of records described by the first data transformation task into a second plurality of records according to the generated data transformation program, the first data transformation task.
Type: Grant
Filed: January 12, 2024
Date of Patent: January 13, 2026
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nitin Gupta, Shramona Chakraborty, Hima Patel, Nagarjuna Surabathina, Sameep Mehta, Ramkumar Ramalingam, Matu Agarwal
-
Patent number: 12517887
Abstract: Methods, systems, and computer program products for prioritized data cleaning are provided herein. A computer-implemented method includes obtaining a dataset comprising a plurality of data issues; determining a priority of one or more features of the dataset; generating a respective model for each of a plurality of data resolution algorithms, wherein each model indicates computing costs of the corresponding data resolution algorithm for resolving at least portion of the plurality of data issues in an order of the priority of the features; and applying one or more of the plurality of data resolutions algorithm to resolve at least a portion of the data issues in the order of the priority of the features based at least in part on the generated models.
Type: Grant
Filed: December 9, 2021
Date of Patent: January 6, 2026
Assignee: International Business Machines Corporation
Inventors: Ritwik Chaudhuri, Sameep Mehta
-
Patent number: 12468778
Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.
Type: Grant
Filed: December 11, 2020
Date of Patent: November 11, 2025
Assignee: International Business Machines Corporation
Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta
-
Patent number: 12455915
Abstract: Mechanisms are provided for dynamic re-resolution of entities in a knowledge graph (KG) based on streaming updates. The KG and corresponding initial clusters associated with first entities are received along with a dynamic data stream having second documents referencing second entities. Clustering on the second documents based on the set of initial clusters, and document features of the second documents, is performed to provide a set of second document clusters. For second document clusters that should be modified based on entities associated with the second document cluster, a cluster modification operation is performed. Updated clusters are generated based on the clustering and modification of clusters. Entity re-resolution is dynamically performed on the entities in the KG based on the second entities associated with the updated clusters to generate an updated knowledge graph data structure.
Type: Grant
Filed: August 4, 2022
Date of Patent: October 28, 2025
Assignee: International Business Machines Corporation
Inventors: Avirup Saha, Balaji Ganesan, Soma Shekar Naganna, Sameep Mehta
-
Publication number: 20250284886
Abstract: Various systems and methods are presented regarding reducing/mitigating generation of one or more statements by a language model (LM), wherein the statements can be any of toxic, offensive, biased, etc. A statement automatically generated by the LM, e.g., in response to a prompt, can be assessed with regard to a probability of the statement being associated with a negative attribute(s). The statement can be further reviewed to identify tokens within the statement causing the association with the negative attribute(s). The tokens can be replaced with counterfactuals, and further assessment(s) made to determine the effect of the statement having a token replaced by a counterfactual with regard to probability of the modified statement being associated with the attribute. The LM can undergo further finetuning to mitigate generation of an offensive statement being generated by the LM.
Type: Application
Filed: July 19, 2023
Publication date: September 11, 2025
Inventors: Kahini Wadhawan, Rahul Madhavan, Rishabh Garg, Sameep Mehta, Vijay Arya
-
Publication number: 20250265234
Abstract: A computer-implemented process for modifying one or more of a plurality of data pipelines within a computer architecture that processes data from a plurality of datasets includes the following operations. Real-time observability data regarding the plurality of data pipelines and the plurality of datasets is ingested. A dataset from the plurality of datasets is classified based upon usage and reliability to generate a classification of the dataset. Based upon the classification, a recommendation to modify at least one of the plurality of data pipelines or the quality of one of the plurality of datasets is generated.
Type: Application
Filed: February 20, 2024
Publication date: August 21, 2025
Inventors: Rajmohan Chandrahasan, Nishtha Madaan, Himanshu Gupta, Sameep Mehta
-
Patent number: 12393464
Abstract: Embodiments for identifying an optimal cloud computing environment for a computing task is disclosed. Embodiments comprises receiving a computing task to be executed in a cloud computing environment, wherein the computing task requires a set of cloud computing environment parameter values of the cloud computing environment, pre-selecting a set of candidate cloud computing environments, each of which meets the set of cloud computing environment parameter values, ranking the candidate cloud computing environments using reward-based ranking parameter values of the candidate cloud computing environments as an additional selection constraint, and selecting the highest ranking cloud computing environment as the optimal cloud computing environment for the computing task.
Type: Grant
Filed: February 15, 2022
Date of Patent: August 19, 2025
Assignee: International Business Machines Corporation
Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
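Illustrative sketch (not taken from the patent): the pre-select, rank, and select steps named in the abstract of patent 12393464 could look roughly like the Python below. The `CloudEnv` class, the `select_environment` function, the parameter names, and the reading of "meets" as "offers at least the required value" are all assumptions made for illustration; the abstract does not say how reward values are computed.

```python
from dataclasses import dataclass


@dataclass
class CloudEnv:
    """Hypothetical description of one cloud computing environment."""
    name: str
    params: dict   # offered parameter values, e.g. {"cpus": 16, "memory_gb": 64}
    reward: float  # hypothetical reward-based ranking value (higher is better)


def select_environment(required, environments):
    """Return the highest-ranked environment that meets every requirement.

    Assumption: an environment "meets" a requirement when it offers at least
    the required numeric value for that parameter.
    """
    # Pre-selection: keep only environments that satisfy all required values.
    candidates = [
        env for env in environments
        if all(env.params.get(key, 0) >= value for key, value in required.items())
    ]
    if not candidates:
        return None
    # Ranking and selection: the reward value acts as the extra constraint.
    return max(candidates, key=lambda env: env.reward)


environments = [
    CloudEnv("small", {"cpus": 8, "memory_gb": 32}, reward=0.9),
    CloudEnv("medium", {"cpus": 16, "memory_gb": 64}, reward=0.5),
    CloudEnv("large", {"cpus": 32, "memory_gb": 128}, reward=0.7),
]
best = select_environment({"cpus": 16, "memory_gb": 64}, environments)
```

In this sketch "small" has the highest reward but is filtered out before ranking, so the ranking only ever compares environments that can actually run the task.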
-
Publication number: 20250231958
Abstract: An embodiment computes a plurality of similarity scores, each similarity score in the plurality of similarity scores measuring a similarity of task data of a first data transformation task to data of a stored data transformation in a plurality of stored data transformations, wherein each of the plurality of stored data transformations comprises a stored data transformation program, each stored data transformation program comprising a data transformation from a first data format to a second data format. An embodiment generates, from a first stored data transformation program in the plurality of stored data transformations, a generated data transformation program. An embodiment performs, by modifying a plurality of records described by the first data transformation task into a second plurality of records according to the generated data transformation program, the first data transformation task.
Type: Application
Filed: January 12, 2024
Publication date: July 17, 2025
Applicant: International Business Machines Corporation
Inventors: Nitin Gupta, Shramona Chakraborty, Hima Patel, Nagarjuna Surabathina, Sameep Mehta, Ramkumar Ramalingam, Matu Agarwal
-
Patent number: 12333253
Abstract: An apparatus is disclosed which includes at least one processing device comprising a processor coupled to a memory. The at least one processing device, when executing program code, is configured to: extract one or more entities identified in a plurality of data artifacts based at least in part on one or more datasets, extract one or more entities identified in a plurality of code artifacts based at least in part on the one or more datasets, extract one or more entities identified in a plurality of user interface artifacts based at least in part on the one or more datasets, generate a set of dependency graphs each based at least in part on one or more relationships among the respective extracted one or more entities, and perform one or more of a lexical analysis and a semantic analysis on the set of dependency graphs to identify a data domain of the one or more datasets.
Type: Grant
Filed: November 18, 2021
Date of Patent: June 17, 2025
Assignee: International Business Machines Corporation
Inventors: Malolan Chetlur, Arvind Agarwal, Subhendu Dey, Sameep Mehta, Sandipan Sarkar
-
Patent number: 12326852
Abstract: Methods, systems, and computer program products for identifying anomalous transformations using lineage data are provided herein. A computer-implemented method includes generating a set of column profiles for a corresponding set of columns within one or more datasets based at least in part on lineage data and glossary data, wherein the lineage data comprises information related to transformations performed on each column in the set by a computing platform, and wherein the glossary data comprises information related to one or more terms assigned to one or more of the columns; obtaining information related to a new transformation involving at least one column in the set of columns; comparing the new transformation to the set of column profiles to determine whether the new transformation is anomalous; and in response to determining the new transformation is anomalous, outputting an alert to a user of the computing platform.
Type: Grant
Filed: April 26, 2021
Date of Patent: June 10, 2025
Assignee: International Business Machines Corporation
Inventors: Rajmohan Chandrahasan, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Andrzej Jan Wrobel
-
Publication number: 20250138920
Abstract: Various systems and methods are presented regarding implementing one or more capabilities into a dialog flow occurring at an automated interface (e.g., a chatbot). A capability can be invoked at the interface in accordance with a user's requirements, e.g., the capability is a function to review data. An application programming interface (API) can be generated from the capability, wherein the API has features, parameters, metadata, etc., generated based on those of the capability. The API can be incorporated into a dialog, wherein the dialog can be subsequently presented on the interface (e.g., as part of a dialog flow). Interaction between the user and the dialog can cause the capability to be executed. Based upon the API features, etc., the API can be incorporated into a dialog, for example, by cloning a dialog, appending a dialog with the API, replacing a pre-existing API with the API in a dialog, and suchlike.
Type: Application
Filed: October 26, 2023
Publication date: May 1, 2025
Inventors: Manish Kesarwani, Ankush Gupta, Arvind Agarwal, Binayak Dutta, Soujanya Soni, Sameep Mehta
-
Patent number: 12288044
Abstract: A computer implemented method creates microservices for an application. A number of processor units clusters programs and data structures for the application using runtime metadata to form groups of the programs and data structures. The runtime metadata is obtained from running the application. The number of processor units creates a design for the microservices for the application using the groups of the programs and the data structures.
Type: Grant
Filed: January 26, 2023
Date of Patent: April 29, 2025
Assignee: International Business Machines Corporation
Inventors: Akshar Kaul, Himanshu Gupta, Sameep Mehta, Srikanth Govindaraj Tamilselvam, Amith Singhee, Vaibhav Sudhakar Dantale, Ravi Vishnu Israni
-
Patent number: 12277507
Abstract: Methods, systems, and computer program products for factchecking artificial intelligence models using blockchain are provided herein. A computer-implemented method includes obtaining at least one artificial intelligence model and at least one set of data related to the at least one artificial intelligence model; determining a set of characteristics based at least in part on the at least one artificial intelligence model and the at least one set of data; selecting one of a plurality of networks based at least in part on a target deployment of the at least one artificial intelligence model to verify the set of characteristics; generating a report based at least in part on verifying the set of characteristics using the selected network, wherein the report establishes a threshold level of trust for the at least one artificial intelligence model; and storing the report on a blockchain.
Type: Grant
Filed: January 22, 2021
Date of Patent: April 15, 2025
Assignee: International Business Machines Corporation
Inventors: Srikanth Govindaraj Tamilselvam, Sai Koti Reddy Danda, Senthil Kumar Kumarasamy Mani, Kalapriya Kannan, Sameep Mehta
-
Publication number: 20250110970
Abstract: A processor set is configured to receive tabular data records and generate a plurality of clusters, associated with specific real-world entities, within the received tabular data records, wherein each cluster is associated with a specific real-world entity. The processor set may further identify informative features within a first cluster and mask a subset of the informative features. Based on the masked subset of informative features and using self-supervision techniques, the processor set may train a tabular foundation model.
Type: Application
Filed: September 29, 2023
Publication date: April 3, 2025
Inventors: Balaji Ganesan, Avirup Saha, Muhammed Abdul Majeed Ameen, Soma Shekar Naganna, Sameep Mehta
-
Patent number: 12248810
Abstract: The method performs at the orchestration interface at which update information, including changes to tasks of a workflow, is received from a task manager system (TMS), where the workflow includes a set of tasks, inputs to the tasks, and outputs from the tasks. The inputs and outputs determine runtime dependencies between the tasks. Based on the update information received, the orchestration interface populates a topology of nodes and edges as a directed acyclic graph (DAG) that maps nodes to tasks and edges to runtime dependencies between tasks, based on node inputs and outputs. The orchestration interface instructs the execution of the tasks and handling dependencies by interacting with a task execution system (TES) and by traversing the DAG, the orchestration interface identifies tasks that depend on completed tasks as per the runtime dependencies and instructs the TES to execute the dependent tasks identified.
Type: Grant
Filed: June 15, 2022
Date of Patent: March 11, 2025
Assignee: International Business Machines Corporation
Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
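Illustrative sketch (not taken from the patent): the abstract of patent 12248810 describes deriving a DAG from task inputs and outputs and then traversing it so that dependent tasks run once the tasks they depend on complete. A minimal Python version of that general idea, using a standard in-degree (Kahn-style) traversal, might look like the following. The function names `build_dag` and `run_workflow`, the task dictionary layout, and the `execute` callback standing in for the task execution system are all hypothetical.

```python
from collections import defaultdict, deque


def build_dag(tasks):
    """Derive dependency edges from task inputs and outputs.

    tasks maps a task name to {"inputs": [...], "outputs": [...]}; a task
    depends on whichever task produces one of its inputs.
    """
    producer = {out: name for name, task in tasks.items() for out in task["outputs"]}
    dependents = defaultdict(set)  # edge: completed task -> tasks waiting on it
    pending = {}                   # number of unfinished parents per task
    for name, task in tasks.items():
        parents = {producer[i] for i in task["inputs"] if i in producer}
        pending[name] = len(parents)
        for parent in parents:
            dependents[parent].add(name)
    return dependents, pending


def run_workflow(tasks, execute):
    """Execute tasks in dependency order; `execute` stands in for the TES."""
    dependents, pending = build_dag(tasks)
    ready = deque(sorted(name for name, count in pending.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        execute(name)
        order.append(name)
        # A completed task may unblock the tasks that consume its outputs.
        for child in sorted(dependents[name]):
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    if len(order) != len(tasks):
        raise ValueError("workflow contains a dependency cycle")
    return order


workflow = {
    "extract": {"inputs": [], "outputs": ["raw"]},
    "clean": {"inputs": ["raw"], "outputs": ["tidy"]},
    "report": {"inputs": ["tidy", "raw"], "outputs": []},
}
executed = []
order = run_workflow(workflow, executed.append)
```

Here "report" consumes outputs of both other tasks, so it is only handed to `execute` after "extract" and "clean" have both finished.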
-
Publication number: 20250077877
Abstract: A computer-implemented method for improving entity matching in a probabilistic matching engine can train a graph neural network (GNN) model on an output of a probabilistic matching engine to perform entity matching and determine counterfactual explanations for non-matches of entities. A list of data transformations can be identified by actionable recourse using the GNN model. The list of data transformations can be ranked, using the GNN model, based on computational overhead and an estimated improvement in entity matching within the probabilistic matching engine.
Type: Application
Filed: August 29, 2023
Publication date: March 6, 2025
Inventors: Balaji Ganesan, Sameep Mehta, Muhammed Abdul Majeed Ameen, Abhishek Seth, Soma Shekar Naganna
-
Publication number: 20250077537
Abstract: A method, system, and computer program product are configured to: receive a data transfer request to transfer a dataset stored on a source cloud to a destination cloud; determine a target view of the data transfer request based on one or more policies; determine, using lineage metadata, a first portion of the target view exists in one or more copies of a dataset stored on the destination cloud; extract data corresponding to the first portion from the one or more copies of the dataset stored on the destination cloud; create the target view using the extracted data; and serve the data transfer request using the created target view.
Type: Application
Filed: August 31, 2023
Publication date: March 6, 2025
Inventors: Christopher J. Giblin, Rajmohan Chandrahasan, Himanshu GUPTA, Sameep Mehta