Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230289649
    Abstract: A computer-implemented method, a computer program product, and a computer system for automated model lineage inference. A computer system identifies training datasets that are used to train a machine learning model. A computer system identifies parent datasets from which the training datasets are derived. A computer system identifies associated feature transformations when the training datasets are derived from the parent datasets.
    Type: Application
    Filed: March 11, 2022
    Publication date: September 14, 2023
    Inventors: Rajmohan Chandrahasan, Kriti Rajput, Nitin Gupta, Himanshu Gupta, Sameep Mehta, Emma Rose Tucker, Manish Anand Bhide
  • Publication number: 20230281212
    Abstract: A computer-implemented method generates an automated data movement workflow. The method includes transforming a received request for data, which was received in a restricted natural language form, into a form suitable for accessing a metadata repository. The method further includes identifying data and data dependencies using the transformed request for data. The method further includes building a workflow using the identified data and data dependencies. The method further includes, upon applying at least one governance rule to the workflow, modifying the built workflow to be compliant with the at least one governance rule, and if no compliance with the at least one governance rule is achievable, recommending a change to the built workflow.
    Type: Application
    Filed: March 7, 2022
    Publication date: September 7, 2023
    Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
  • Publication number: 20230259401
    Abstract: Embodiments for identifying an optimal cloud computing environment for a computing task are disclosed. Embodiments comprise receiving a computing task to be executed in a cloud computing environment, wherein the computing task requires a set of cloud computing environment parameter values of the cloud computing environment, pre-selecting a set of candidate cloud computing environments, each of which meets the set of cloud computing environment parameter values, ranking the candidate cloud computing environments using reward-based ranking parameter values of the candidate cloud computing environments as an additional selection constraint, and selecting the highest ranking cloud computing environment as the optimal cloud computing environment for the computing task.
    Type: Application
    Filed: February 15, 2022
    Publication date: August 17, 2023
    Inventors: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta
  • Patent number: 11720533
    Abstract: Techniques for automatically determining different data types found in databases are disclosed. In one example, a computer implemented method comprises receiving a portion of identifying information for one or more components of a database, and generating one or more descriptions for the one or more components based at least in part on the portion of the identifying information for the one or more components. The one or more descriptions are inputted to one or more machine learning models, and, using the one or more machine learning models, one or more data types associated with the one or more components are predicted. The prediction is based at least in part on the one or more descriptions.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: August 8, 2023
    Assignee: International Business Machines Corporation
    Inventors: Rajmohan Chandrahasan, Ankush Gupta, Venkata Nagaraju Pavuluri, Arvind Agarwal, Sameep Mehta
  • Patent number: 11714963
    Abstract: According to one embodiment of the present invention, a system for modifying content associated with an item comprises at least one processor. Features of interest of the item to a plurality of different groups are determined based on user comments produced by members of the plurality of different groups. The members within each group have a common characteristic. The features of interest to each group within the content associated with the item are identified, and the content associated with the item is modified by balancing the features of interest to the plurality of different groups within the content associated with the item. Embodiments of the present invention further include a method and computer program product for modifying content associated with an item in substantially the same manner described above.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: August 1, 2023
    Assignee: International Business Machines Corporation
    Inventors: Seema Nagar, Kuntal Dey, Nishtha Madaan, Manish Anand Bhide, Sameep Mehta, Diptikalyan Saha
  • Publication number: 20230185791
    Abstract: Methods, systems, and computer program products for prioritized data cleaning are provided herein. A computer-implemented method includes obtaining a dataset comprising a plurality of data issues; determining a priority of one or more features of the dataset; generating a respective model for each of a plurality of data resolution algorithms, wherein each model indicates computing costs of the corresponding data resolution algorithm for resolving at least a portion of the plurality of data issues in an order of the priority of the features; and applying one or more of the plurality of data resolution algorithms to resolve at least a portion of the data issues in the order of the priority of the features based at least in part on the generated models.
    Type: Application
    Filed: December 9, 2021
    Publication date: June 15, 2023
    Inventors: Ritwik Chaudhuri, Sameep Mehta
  • Publication number: 20230186197
    Abstract: In an approach for effective performance assessment, a processor classifies relevancy of a goal submitted by an employee. A processor classifies the goal into one of pre-defined dimensions. A processor receives feedback about the goal from a manager. A processor classifies whether the feedback is actionable with respect to the corresponding goal. A processor classifies consistency of the feedback with the corresponding dimension of the goal. A processor classifies consistency of the feedback with the corresponding position level of the employee. A processor converts the feedback along the corresponding dimension into a rating for the dimension on a pre-defined scale.
    Type: Application
    Filed: December 14, 2021
    Publication date: June 15, 2023
    Inventors: Rakesh Rameshrao Pimplikar, Sameep Mehta, Nazia Hasan, Varun Gupta, Kingshuk Banerjee
  • Publication number: 20230177113
    Abstract: Methods, systems, and computer program products for privacy-preserving class label standardization in federated learning settings are provided herein. A computer-implemented method includes determining, using one or more data privacy-preserving techniques, a signature for each of one or more classes of data for each of multiple client devices within a federated learning environment; identifying one or more signature matches across at least a portion of the multiple client devices; generating one or more class labels for the one or more classes of data associated with the one or more signature matches; labeling, across the at least a portion of the multiple client devices, the one or more classes of data associated with the one or more signature matches with the one or more generated class labels; and performing one or more automated actions based at least in part on the one or more labeled classes of data.
    Type: Application
    Filed: December 2, 2021
    Publication date: June 8, 2023
    Inventors: Shonda Adena Witherspoon, Ramasuri Narayanam, Hima Patel, Sameep Mehta
  • Publication number: 20230177383
    Abstract: Methods, systems, and computer program products for adjusting machine learning models based on simulated fairness impact are provided herein. A computer-implemented method includes obtaining, by a central simulation system, policies to be used for performing a simulation involving machine learning models, implemented on different systems, interacting with a target population; providing information for configuring simulators on the different systems, each simulator representing at least the machine learning model of a given one of the different systems; performing iterations of the simulation for the policies, wherein, for each iteration, the central simulation system: predicts a state of the target population, provides the state to the simulators, and collects metrics based on results of the simulators; and selecting and sending one of the policies to at least one of the different systems based on the collected metrics.
    Type: Application
    Filed: December 7, 2021
    Publication date: June 8, 2023
    Inventors: Pranay Kumar Lohia, Kushal Mukherjee, Rakesh Rameshrao Pimplikar, Monika Gupta, Sameep Mehta, Stacy F. Hobson
  • Publication number: 20230177355
    Abstract: Methods, systems, and computer program products for automated fairness-driven graph node label classification are provided herein. A computer-implemented method includes obtaining at least one input graph; predicting one or more node labels associated with the at least one input graph by processing at least a portion of the at least one input graph using a graph node label prediction model, wherein the graph node label prediction model includes at least one loss function; generating an updated version of the graph node label prediction model based at least in part on the one or more predicted node labels and one or more group fairness-based constraints relevant to the at least one input graph; and performing one or more automated actions using the updated version of the graph node label prediction model.
    Type: Application
    Filed: December 6, 2021
    Publication date: June 8, 2023
    Inventors: Ramasuri Narayanam, Sameep Mehta, Rakesh Rameshrao Pimplikar, Pranay Kumar Lohia
  • Publication number: 20230169070
    Abstract: A computer-implemented method, computer system, and computer program product for transforming mapped data fields of enterprise applications. A number of processor units receive a matching from a source data field to a target data field. The processor units receive a number of annotated examples of transformations from a source format to a target format. Based on the annotated examples, the processor units autogenerate a query language expression for transforming data items from the source format to the target format.
    Type: Application
    Filed: November 29, 2021
    Publication date: June 1, 2023
    Inventors: Ramkumar Ramalingam, Nagarjuna Surabathina, Thanmayi Mruthyunjaya, Nitin Gupta, Pranay Kumar Lohia, Shanmukha Chaitanya Guttula, Hima Patel, Sameep Mehta, Matu Agarwal, Mudit Mehrotra
  • Publication number: 20230169050
    Abstract: Techniques for automatically determining different data types found in databases are disclosed. In one example, a computer implemented method comprises receiving a portion of identifying information for one or more components of a database, and generating one or more descriptions for the one or more components based at least in part on the portion of the identifying information for the one or more components. The one or more descriptions are inputted to one or more machine learning models, and, using the one or more machine learning models, one or more data types associated with the one or more components are predicted. The prediction is based at least in part on the one or more descriptions.
    Type: Application
    Filed: November 29, 2021
    Publication date: June 1, 2023
    Inventors: Rajmohan Chandrahasan, Ankush Gupta, Venkata Nagaraju Pavuluri, Arvind Agarwal, Sameep Mehta
  • Publication number: 20230153537
    Abstract: An apparatus is disclosed which includes at least one processing device comprising a processor coupled to a memory. The at least one processing device, when executing program code, is configured to: extract one or more entities identified in a plurality of data artifacts based at least in part on one or more datasets, extract one or more entities identified in a plurality of code artifacts based at least in part on the one or more datasets, extract one or more entities identified in a plurality of user interface artifacts based at least in part on the one or more datasets, generate a set of dependency graphs each based at least in part on one or more relationships among the respective extracted one or more entities, and perform one or more of a lexical analysis and a semantic analysis on the set of dependency graphs to identify a data domain of the one or more datasets.
    Type: Application
    Filed: November 18, 2021
    Publication date: May 18, 2023
    Inventors: Malolan Chetlur, Arvind Agarwal, Subhendu Dey, Sameep Mehta, Sandipan Sarkar
  • Publication number: 20230135407
    Abstract: An embodiment establishes a designated attribute value as a semantic criterion for grouping records in a bucket, identifies a first set of records having attribute values that satisfy the semantic criterion, and adds the first set of records to the bucket. The embodiment detects that the first set of records represent a first series of events that occurred in succession at respective times. The embodiment derives a temporal attribute value representative of a time pattern formed by the times of the first series of events and designates the temporal attribute value as a temporal criterion for grouping records in the bucket. The embodiment identifies a second set of records that represent a second series of events and satisfy the temporal criterion and adds the second set of records to the bucket based at least in part on the second set of records satisfying the temporal criterion.
    Type: Application
    Filed: November 3, 2021
    Publication date: May 4, 2023
    Applicant: International Business Machines Corporation
    Inventors: Avirup Saha, Balaji Ganesan, Shettigar Parkala Srinivas, Sumit Bhatia, Sameep Mehta, Soma Shekar Naganna
  • Publication number: 20230128548
    Abstract: One embodiment provides a method, including: receiving, at a central server, data from each of a plurality of data sources, the plurality of data sources being within a plurality of data storage locations, wherein the central server includes a validation dataset having a plurality of annotated datapoints; computing, at the central server, an influential score for each of the plurality of data sources based upon the data provided to the central server from each of the plurality of data sources, wherein an influential score of a data source identifies an influence of the data source in accurately predicting annotations of the validation dataset; selecting, at the central server and based upon the influential score of the plurality of data sources, a subset of the plurality of data sources; and generating, at the central server, the training dataset utilizing the data of the data sources included within the subset.
    Type: Application
    Filed: October 25, 2021
    Publication date: April 27, 2023
    Inventors: Ruhi Sharma Mittal, Ramasuri Narayanam, Lokesh Nagalapatti, Sameep Mehta
  • Patent number: 11636386
    Abstract: Methods, systems, and computer program products for determining data representative of bias within a model are provided herein. A computer-implemented method includes obtaining a first dataset on which a model was trained, wherein the first dataset contains protected attributes, and a second dataset on which the model was trained, wherein the protected attributes have been removed from the second dataset; identifying, for each of the one or more protected attributes in the first dataset, one or more attributes in the second dataset correlated therewith; determining bias among at least a portion of the identified correlated attributes; and outputting, to at least one user, identifying information pertaining to the one or more instances of bias.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Manish Anand Bhide, Sameep Mehta
  • Patent number: 11630833
    Abstract: One embodiment provides a computer implemented method, including: receiving, from a user, a natural language query for data contained within at least one data repository; identifying at least one concept from the natural language query, wherein the at least one concept includes an entity and an intent; identifying a plurality of datasets satisfying the natural language query by querying the at least one data repository utilizing the at least one concept; ranking the dataset based on relevance to the query; generating an extract-transform-load script that extracts, transforms, and loads a dataset selected by the user from the plurality of datasets; and retrieving data included in the dataset utilizing the extract-transform-load script, wherein the retrieving includes returning the data to the user.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: April 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Manish Kesarwani, Sumit Bhatia, Sameep Mehta
  • Publication number: 20230021563
    Abstract: Methods, systems, and computer program products for federated data standardization using data privacy techniques are provided herein. A computer-implemented method includes obtaining multiple datasets from multiple clients in accordance with one or more data privacy techniques; determining one or more similar data columns across at least a portion of the multiple datasets; generating one or more column labels for the one or more similar data columns; standardizing at least a portion of data within the one or more similar data columns by processing the one or more generated column labels using at least one federated learning technique; and performing one or more automated actions based at least in part on results of the standardizing of the at least a portion of data within the one or more similar data columns.
    Type: Application
    Filed: July 23, 2021
    Publication date: January 26, 2023
    Inventors: Ramasuri Narayanam, Hima Patel, Sameep Mehta
  • Patent number: 11551102
    Abstract: One embodiment provides a method, including: receiving a target unstructured document for determining whether the target unstructured document comprises biased information; identifying an objective of the target unstructured document by extracting, from the target unstructured document, (i) entities and (ii) relationships between the entities; creating a structured knowledge base, wherein the creating comprises (i) creating an entry in the structured knowledge base corresponding to the target unstructured document, (ii) identifying other unstructured documents having a similarity to the target unstructured document, and (iii) generating an entry in the structured knowledge base corresponding to each of the other unstructured documents; applying a bias detection technique on the structured knowledge base; and providing an indication of whether the target unstructured document comprises bias.
    Type: Grant
    Filed: April 15, 2019
    Date of Patent: January 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Pranay Kumar Lohia, Rajmohan Chandrahasan, Himanshu Gupta, Samiulla Zakir Hussain Shaikh, Sameep Mehta, Atul Kumar
  • Patent number: 11544566
    Abstract: A method, a computer system, and a computer program product for generating deep learning model insights using provenance data are provided. Embodiments of the present invention may include collecting provenance data. Embodiments of the present invention may include generating model insights based on the collected provenance data. Embodiments of the present invention may include generating a training model based on the generated model insights. Embodiments of the present invention may include reducing the training model size. Embodiments of the present invention may include creating a final trained model.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: January 3, 2023
    Assignee: International Business Machines Corporation
    Inventors: Nitin Gupta, Himanshu Gupta, Rajmohan Chandrahasan, Sameep Mehta, Pranay Kumar Lohia