Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240202573
    Abstract: A method, computer program product, and computer system for transforming sets of source data having different formats into respective sets of target data having a same format. N source patterns are determined and respectively describe N different formats in which N sets of source data items are formatted, where N≥1. A target format pattern is determined and describes a target format in which target data items are formatted. N graphs are generated and respectively describe transformations of the N source patterns to the target pattern. Each graph includes multiple transformation paths. Each transformation path transforms the source pattern to the target pattern in a manner that maps source strings in the source pattern to each target string in the target pattern. A single transformation path is selected from the multiple transformation paths resulting in N single transformation paths having been selected.
    Type: Application
    Filed: December 19, 2022
    Publication date: June 20, 2024
    Inventors: Nagarjuna Surabathina, Nitin Gupta, Shramona Chakraborty, Hima Patel, Sameep Mehta, Ramkumar Ramalingam, Matu Agarwal
  • Publication number: 20240152557
    Abstract: Records can be matched by a graph neural network model performing entity resolution on the records, and representing each record as a respective node in a graph. Record matching explanations can be generated, each record matching explanation indicating a first set of attributes, and a first set of corresponding values, used for matching at least two of the records. Nodes can be clustered into a plurality of clusters by aggregating the record matching explanations and, based on the record matching explanations, determining which of the records have high importance values, in the first set of values, that match. At least one cluster explanation can be generated, the cluster explanation indicating a second set of attributes, and a second set of values corresponding to the second set of attributes, used for clustering the nodes. The record matching explanation and the cluster explanation can be output.
    Type: Application
    Filed: November 3, 2022
    Publication date: May 9, 2024
    Inventors: Muhammed Abdul Majeed Ameen, Balaji Ganesan, Avirup Saha, Abhishek Seth, Devbrat Sharma, Arvind Agarwal, Soma Shekar Naganna, Sameep Mehta
  • Patent number: 11966453
    Abstract: Embodiments are disclosed for a method. The method includes receiving an annotation set for a machine learning model. The annotation set includes multiple data points relevant to a task for the machine learning model. The method also includes determining total weights corresponding to the data points. The total weights are determined based on multiple ordering constraints indicating multiple data classes and corresponding weights. The corresponding weights represent a relative priority of the data classes with respect to each other. The method further includes generating an ordered annotation set from the annotation set. The ordered annotation set includes the data points in a sequence based on the determined total weights.
    Type: Grant
    Filed: February 15, 2021
    Date of Patent: April 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Naveen Panwar, Anush Sankaran, Kuntal Dey, Hima Patel, Sameep Mehta
  • Publication number: 20240086780
    Abstract: A method, computer program, and computer system are provided for determining similar nodes in a federated learning environment. Data corresponding to a dataset associated with a node in the federated learning environment is retrieved by the node. A frequency distribution associated with the dataset is calculated and transmitted to an aggregator. One or more frequency distributions associated with one or more other nodes in the federated learning environment are received from the aggregator. Based on the received frequency distributions associated with the one or more other nodes, a similarity between the node and a subset of the one or more other nodes is identified.
    Type: Application
    Filed: September 12, 2022
    Publication date: March 14, 2024
    Inventors: Soujanya Soni, Sameep Mehta
  • Patent number: 11921861
    Abstract: Methods, systems, and computer program products for providing the status of model extraction in the presence of colluding users are provided herein. A computer-implemented method includes generating, for each of multiple users, a summary of user input to a machine learning model; comparing the generated summaries to boundaries of multiple feature classes within an input space of the machine learning model; computing correspondence metrics based at least in part on the comparisons; identifying, based at least in part on the computed metrics, one or more of the multiple users as candidates for extracting portions of the machine learning model in an adversarial manner; and generating and outputting an alert, based on the identified users, to an entity related to the machine learning model.
    Type: Grant
    Filed: May 21, 2018
    Date of Patent: March 5, 2024
    Assignee: International Business Machines Corporation
    Inventors: Manish Kesarwani, Vijay Arya, Sameep Mehta
  • Publication number: 20240070350
    Abstract: An example operation may include one or more of identifying an external system that passes an input attribute to a process based on a workflow representation of the process, building a simulator of the external system based on attributes of the external system identified from the workflow representation, simulating future values of the input attribute to be passed to the process by the external system based on the simulator of the external system and a previous simulation run of the process performed via a workflow software application, and executing a new simulation of the process via the workflow software application based on the simulated future values of the input attribute.
    Type: Application
    Filed: August 23, 2022
    Publication date: February 29, 2024
    Inventors: Rakesh Rameshrao Pimplikar, Ritwik Chaudhuri, Pranay Kumar Lohia, Ramasuri Narayanam, Sameep Mehta, Gyana Ranjan Parija
  • Publication number: 20240070519
    Abstract: A method, computer program, and computer system are provided for online fairness monitoring. A dataset having one or more entries with one or more protected attributes and data corresponding to a trained machine learning model is received. An entry having a maximum reward is selected based on a reward probability associated with the entry. A determination is made as to whether bias has developed in the trained machine learning model toward one or more of the one or more protected attributes based on a change to the reward probability or a distribution of reward probabilities exceeding a threshold value.
    Type: Application
    Filed: August 26, 2022
    Publication date: February 29, 2024
    Inventors: Manish Kesarwani, Pranay Kumar Lohia, Ramasuri Narayanam, Rakesh Rameshrao Pimplikar, Sameep Mehta
  • Publication number: 20240045896
    Abstract: Mechanisms are provided for dynamic re-resolution of entities in a knowledge graph (KG) based on streaming updates. The KG and corresponding initial clusters associated with first entities are received along with a dynamic data stream having second documents referencing second entities. Clustering on the second documents based on the set of initial clusters, and document features of the second documents, is performed to provide a set of second document clusters. For second document clusters that should be modified based on entities associated with the second document cluster, a cluster modification operation is performed. Updated clusters are generated based on the clustering and modification of clusters. Entity re-resolution is dynamically performed on the entities in the KG based on the second entities associated with the updated clusters to generate an updated knowledge graph data structure.
    Type: Application
    Filed: August 4, 2022
    Publication date: February 8, 2024
    Inventors: Avirup Saha, Balaji Ganesan, Soma Shekar Naganna, Sameep Mehta
  • Publication number: 20230409386
    Abstract: The method is performed at an orchestration interface, at which update information, including changes to tasks of a workflow, is received from a task manager system (TMS), where the workflow includes a set of tasks, inputs to the tasks, and outputs from the tasks. The inputs and outputs determine runtime dependencies between the tasks. Based on the update information received, the orchestration interface populates a topology of nodes and edges as a directed acyclic graph (DAG) that maps nodes to tasks and edges to runtime dependencies between tasks, based on node inputs and outputs. The orchestration interface instructs the execution of the tasks and handles dependencies by interacting with a task execution system (TES). By traversing the DAG, the orchestration interface identifies tasks that depend on completed tasks as per the runtime dependencies and instructs the TES to execute the dependent tasks identified.
    Type: Application
    Filed: June 15, 2022
    Publication date: December 21, 2023
    Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
  • Patent number: 11790265
    Abstract: Aspects of the present invention provide an approach for reducing bias in active learning. In an embodiment, a data point is selected from a training dataset for a current training iteration while monitoring for data bias at each addition of data to a virtual training dataset. In addition, a machine learning model is examined for bias after adding the selected data point to the virtual training dataset. When data bias and/or model bias is detected, the data point is considered for potential label modification. The selected data point is modified and, if the raw value of the modified data point is within a predefined tolerance and within a bin of a desired class, the modified data point having a label of the target class is retained. Otherwise, it can be discarded.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: October 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Kuntal Dey, Sameep Mehta, Manish Anand Bhide
  • Patent number: 11768860
    Abstract: An embodiment establishes a designated attribute value as a semantic criterion for grouping records in a bucket, identifies a first set of records having attribute values that satisfy the semantic criterion, and adds the first set of records to the bucket. The embodiment detects that the first set of records represent a first series of events that occurred in succession at respective times. The embodiment derives a temporal attribute value representative of a time pattern formed by the times of the first series of events and designates the temporal attribute value as a temporal criterion for grouping records in the bucket. The embodiment identifies a second set of records that represent a second series of events and satisfy the temporal criterion and adds the second set of records to the bucket based at least in part on the second set of records satisfying the temporal criterion.
    Type: Grant
    Filed: November 3, 2021
    Date of Patent: September 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Avirup Saha, Balaji Ganesan, Shettigar Parkala Srinivas, Sumit Bhatia, Sameep Mehta, Soma Shekar Naganna
  • Publication number: 20230289649
    Abstract: A computer-implemented method, a computer program product, and a computer system for automated model lineage inference are provided. A computer system identifies training datasets that are used to train a machine learning model. A computer system identifies parent datasets from which the training datasets are derived. A computer system identifies associated feature transformations when the training datasets are derived from the parent datasets.
    Type: Application
    Filed: March 11, 2022
    Publication date: September 14, 2023
    Inventors: Rajmohan Chandrahasan, Kriti Rajput, Nitin Gupta, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Manish Anand Bhide
  • Publication number: 20230281212
    Abstract: A computer-implemented method generates an automated data movement workflow. The method includes transforming a received request for data, which was received in a restricted natural language form, into a form suitable for accessing a metadata repository. The method further includes identifying data and data dependencies using the transformed request for data. The method further includes building a workflow using the identified data and data dependencies. The method further includes, upon applying at least one governance rule to the workflow, modifying the built workflow to be compliant with the at least one governance rule, and if no compliance with the at least one governance rule is achievable, recommending a change to the built workflow.
    Type: Application
    Filed: March 7, 2022
    Publication date: September 7, 2023
    Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
  • Publication number: 20230259401
    Abstract: Embodiments for identifying an optimal cloud computing environment for a computing task are disclosed. Embodiments comprise receiving a computing task to be executed in a cloud computing environment, wherein the computing task requires a set of cloud computing environment parameter values of the cloud computing environment, pre-selecting a set of candidate cloud computing environments, each of which meets the set of cloud computing environment parameter values, ranking the candidate cloud computing environments using reward-based ranking parameter values of the candidate cloud computing environments as an additional selection constraint, and selecting the highest ranking cloud computing environment as the optimal cloud computing environment for the computing task.
    Type: Application
    Filed: February 15, 2022
    Publication date: August 17, 2023
    Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
  • Patent number: 11720533
    Abstract: Techniques for automatically determining different data types found in databases are disclosed. In one example, a computer implemented method comprises receiving a portion of identifying information for one or more components of a database, and generating one or more descriptions for the one or more components based at least in part on the portion of the identifying information for the one or more components. The one or more descriptions are inputted to one or more machine learning models, and, using the one or more machine learning models, one or more data types associated with the one or more components are predicted. The prediction is based at least in part on the one or more descriptions.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: August 8, 2023
    Assignee: International Business Machines Corporation
    Inventors: Rajmohan Chandrahasan, Ankush Gupta, Venkata Nagaraju Pavuluri, Arvind Agarwal, Sameep Mehta
  • Patent number: 11714963
    Abstract: According to one embodiment of the present invention, a system for modifying content associated with an item comprises at least one processor. Features of interest of the item to a plurality of different groups are determined based on user comments produced by members of the plurality of different groups. The members within each group have a common characteristic. The features of interest to each group within the content associated with the item are identified, and the content associated with the item is modified by balancing the features of interest to the plurality of different groups within the content associated with the item. Embodiments of the present invention further include a method and computer program product for modifying content associated with an item in substantially the same manner described above.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: August 1, 2023
    Assignee: International Business Machines Corporation
    Inventors: Seema Nagar, Kuntal Dey, Nishtha Madaan, Manish Anand Bhide, Sameep Mehta, Diptikalyan Saha
  • Publication number: 20230185791
    Abstract: Methods, systems, and computer program products for prioritized data cleaning are provided herein. A computer-implemented method includes obtaining a dataset comprising a plurality of data issues; determining a priority of one or more features of the dataset; generating a respective model for each of a plurality of data resolution algorithms, wherein each model indicates computing costs of the corresponding data resolution algorithm for resolving at least a portion of the plurality of data issues in an order of the priority of the features; and applying one or more of the plurality of data resolution algorithms to resolve at least a portion of the data issues in the order of the priority of the features based at least in part on the generated models.
    Type: Application
    Filed: December 9, 2021
    Publication date: June 15, 2023
    Inventors: Ritwik Chaudhuri, Sameep Mehta
  • Publication number: 20230186197
    Abstract: In an approach for effective performance assessment, a processor classifies relevancy of a goal submitted by an employee. A processor classifies the goal into one of pre-defined dimensions. A processor receives feedback about the goal from a manager. A processor classifies whether the feedback is actionable with respect to the corresponding goal. A processor classifies consistency of the feedback with the corresponding dimension of the goal. A processor classifies consistency of the feedback with the corresponding position level of the employee. A processor converts the feedback along the corresponding dimension into a rating for the dimension on a pre-defined scale.
    Type: Application
    Filed: December 14, 2021
    Publication date: June 15, 2023
    Inventors: Rakesh Rameshrao Pimplikar, Sameep Mehta, Nazia Hasan, Varun Gupta, Kingshuk Banerjee
  • Publication number: 20230177383
    Abstract: Methods, systems, and computer program products for adjusting machine learning models based on simulated fairness impact are provided herein. A computer-implemented method includes obtaining, by a central simulation system, policies to be used for performing a simulation involving machine learning models, implemented on different systems, interacting with a target population; providing information for configuring simulators on the different systems, each simulator representing at least the machine learning model of a given one of the different systems; performing iterations of the simulation for the policies, wherein, for each iteration, the central simulation system: predicts a state of the target population, provides the state to the simulators, and collects metrics based on results of the simulators; and selecting and sending one of the policies to at least one of the different systems based on the collected metrics.
    Type: Application
    Filed: December 7, 2021
    Publication date: June 8, 2023
    Inventors: Pranay Kumar Lohia, Kushal Mukherjee, Rakesh Rameshrao Pimplikar, Monika Gupta, Sameep Mehta, Stacy F. Hobson
  • Publication number: 20230177355
    Abstract: Methods, systems, and computer program products for automated fairness-driven graph node label classification are provided herein. A computer-implemented method includes obtaining at least one input graph; predicting one or more node labels associated with the at least one input graph by processing at least a portion of the at least one input graph using a graph node label prediction model, wherein the graph node label prediction model includes at least one loss function; generating an updated version of the graph node label prediction model based at least in part on the one or more predicted node labels and one or more group fairness-based constraints relevant to the at least one input graph; and performing one or more automated actions using the updated version of the graph node label prediction model.
    Type: Application
    Filed: December 6, 2021
    Publication date: June 8, 2023
    Inventors: Ramasuri Narayanam, Sameep Mehta, Rakesh Rameshrao Pimplikar, Pranay Kumar Lohia