Patents by Inventor Sameep Mehta

Sameep Mehta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220188567
    Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.
    Type: Application
    Filed: December 11, 2020
    Publication date: June 16, 2022
    Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta
  • Publication number: 20220164698
    Abstract: A method to automatically assess data quality of data input into a machine learning model and remediate the data includes receiving input data for an automated machine learning model. Selections for multiple data quality metrics are displayed. A selection for data quality metrics is received. The data quality metrics are determined according to the selection. Selections for data remediation strategies based on the selection of the data quality metrics are displayed. A selection for remediation recommendation strategies is received. The selected data remediation strategies are performed on the input data. Learning from the selection of the data quality metrics and the selection for the remediation strategies is performed. A new customized machine learning model is generated based on the learning.
    Type: Application
    Filed: November 25, 2020
    Publication date: May 26, 2022
    Inventors: Arunima Chaudhary, Dakuo Wang, Abel Valente, Carolina Maria Spina, Hima Patel, Nitin Gupta, Gregory Bramble, Horst Cornelius Samulowitz, Sameep Mehta, Theodoros Salonidis, Daniel M. Gruen, Chaung Gan
  • Publication number: 20220138216
    Abstract: One embodiment provides a computer implemented method, including: receiving, from a user, a natural language query for data contained within at least one data repository; identifying at least one concept from the natural language query, wherein the at least one concept includes an entity and an intent; identifying a plurality of datasets satisfying the natural language query by querying the at least one data repository utilizing the at least one concept; ranking the dataset based on relevance to the query; generating an extract-transform-load script that extracts, transforms, and loads a dataset selected by the user from the plurality of datasets; and retrieving data included in the dataset utilizing the extract-transform-load script, wherein the retrieving includes returning the data to the user.
    Type: Application
    Filed: October 29, 2020
    Publication date: May 5, 2022
    Inventors: Manish Kesarwani, Sumit Bhatia, Sameep Mehta
  • Patent number: 11321304
    Abstract: Methods, systems, and computer program products for domain aware explainable anomaly and drift detection for multi-variate raw data using a constraint repository are provided herein. A computer-implemented method includes obtaining a set of data and information indicative of a domain of said set of data; obtaining constraints from a domain-indexed constraint repository based on said set of data and said information, wherein the domain-indexed constraint repository comprises a knowledge graph having a plurality of nodes, wherein each node comprises an attribute associated with at least one of a plurality of domains and constraints corresponding to the attribute; detecting anomalies in said set of data based on whether portions of said set of data violate said retrieved constraints; generating an explanation corresponding to each of the anomalies that describe the attributes corresponding to the violated constraints; and outputting an indication of the anomalies and the corresponding explanation.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sandeep Hans, Samiulla Zakir Hussain Shaikh, Rema Ananthanarayanan, Diptikalyan Saha, Aniya Aggarwal, Gagandeep Singh, Pranay Kumar Lohia, Manish Anand Bhide, Sameep Mehta
  • Patent number: 11302096
    Abstract: Methods, systems, and computer program products for determining model-related bias associated with training data are provided herein. A computer-implemented method includes obtaining, via execution of a first model, class designations attributed to data points used to train the first model; identifying any of the data points associated with an inaccurate class designation and/or a low-confidence class designation; training a second model using the data points from the dataset, but excluding the identified data points; determining bias related to at least a portion of those data points used to train the second model by: modifying one or more of the data points used to train the second model; executing the first model using the modified data points; and identifying a change to one or more class designations attributed to the modified data points as compared to before the modifying; and outputting identifying information pertaining to the determined bias.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: April 12, 2022
    Assignee: International Business Machines Corporation
    Inventors: Pranay Kumar Lohia, Diptikalyan Saha, Manish Anand Bhide, Sameep Mehta
  • Publication number: 20220101182
    Abstract: One embodiment provides a method, including: obtaining a dataset for use in building a machine-learning model; assessing a quality of the dataset, wherein the quality is assessed in view of an effect of the dataset on a performance of the machine-learning model, wherein the assessing comprises scoring the dataset with respect to each of a plurality of attributes of the dataset; for each of the plurality of attributes having a low quality score, providing at least one recommendation for increasing the quality of the dataset with respect to the attribute having a low quality score; and for each of the plurality of attributes having a low quality score, providing an explanation explaining a cause of the low quality score for the attribute having a low quality score.
    Type: Application
    Filed: September 28, 2020
    Publication date: March 31, 2022
    Inventors: Hima Patel, Lokesh Nagalapatti, Naveen Panwar, Nitin Gupta, Ruhi Sharma Mittal, Sameep Mehta, Shanmukha Chaitanya Guttula, Shazia Afzal
  • Patent number: 11263188
    Abstract: A method for automatically generating documentation for an artificial intelligence model includes receiving, by a computing device, an artificial intelligence model. The computing device accesses a model facts policy that indicates data to be collected for artificial intelligence models. The computing device collects artificial intelligence model facts regarding the artificial intelligence model according to the model facts policy. The computing device accesses a factsheet template. The factsheet template provides a schema for an artificial intelligence model factsheet for the artificial intelligence model. The computing device populates the artificial intelligence model factsheet using the factsheet template with the artificial intelligence model facts related to the artificial intelligence model.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: March 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Matthew R. Arnold, Rachel K. E. Bellamy, Kaoutar El Maghraoui, Michael Hind, Stephanie Houde, Kalapriya Kannan, Sameep Mehta, Aleksandra Mojsilovic, Ramya Raghavendra, Darrell C. Reimer, John T. Richards, David J. Piorkowski, Jason Tsay, Kush R. Varshney, Manish Kesarwani
  • Publication number: 20220027561
    Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.
    Type: Application
    Filed: July 24, 2020
    Publication date: January 27, 2022
    Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta
  • Patent number: 11227099
    Abstract: A processor may receive a record. The record may include one or more segments of text. The processor may automatically generate a first summary of the record. The processor may determine an overall bias of the first summary. The overall bias of the first summary may be identified from one or more instances of bias in the first summary. The processor may generate a second summary of the record. The second summary of the record may include an indicator of the overall bias of the first summary. The indicator may include a description of a type of overall bias of the first summary and a numerical value of the overall bias of the first summary. The processor may determine an overall bias of the second summary. The processor may display the second summary of the record to a user.
    Type: Grant
    Filed: May 23, 2019
    Date of Patent: January 18, 2022
    Assignee: International Business Machines Corporation
    Inventors: Manish Anand Bhide, Kuntal Dey, Nishtha Madaan, Seema Nagar, Sameep Mehta
  • Patent number: 11222087
    Abstract: An apparatus for dynamically debiasing an online job application system includes a processor and a memory that stores code executable by the processor to receive a plurality of job listings and corresponding job descriptions in response to a search query on an online job listing system and to dynamically modify bias terms of the job description for each of the job listings based on profile information for a user such that each of the job descriptions conforms to the user's profile information. The apparatus includes code executable by the processor to dynamically rank each of the job listings based on the modified job descriptions and with respect to the user's profile information and the search query and to present the job listings and their corresponding modified job descriptions in order of the rank for each of the job listings.
    Type: Grant
    Filed: January 21, 2019
    Date of Patent: January 11, 2022
    Assignee: International Business Machines Corporation
    Inventors: Manish Bhide, Seema Nagar, Sameep Mehta, Kuntal Dey
  • Patent number: 11205138
    Abstract: A method, computer system, and a computer program product for utilizing provenance data to improve machine learning is provided. Embodiments of the present invention may include collecting provenance data. Embodiments of the present invention may include identifying model quality improvements based on the collected provenance data. Embodiments of the present invention may include identifying related models based on the collected provenance data. Embodiments of the present invention may include recommending model quality improvements to a user.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: December 21, 2021
    Assignee: International Business Machines Corporation
    Inventors: Samiulla Zakir Hussain Shaikh, Himanshu Gupta, Rajmohan Chandrahasan, Sameep Mehta, Manish Anand Bhide
  • Patent number: 11204953
    Abstract: One embodiment provides a method, including: generating a plurality of ontologies wherein each ontology is generated by: monitoring interactions of a user with lineage information, wherein the monitoring comprises monitoring (i) filter interactions and (ii) access interactions; aggregating the monitored interactions of the user with monitored interactions of other users having a given business role; and generating an ontology for the given business role, wherein the subset comprises (i) event types, (ii) event constraints, (iii) event metadata, and (iv) event context; and upon a user having one of the plurality of business roles accessing lineage information on the data platform, providing a subset of the lineage information.
    Type: Grant
    Filed: April 20, 2020
    Date of Patent: December 21, 2021
    Assignee: International Business Machines Corporation
    Inventors: Rajmohan Chandrahasan, Himanshu Gupta, Sameep Mehta, Bhanu Mudhireddy, Manish Anand Bhide
  • Patent number: 11200283
    Abstract: One embodiment provides a method, including: receiving a query from a user requesting assistance regarding instructions for performing a task; identifying, within steps of the instructions, words that can be visualized, wherein the identifying comprises identifying relationships between terms within the query to generate a step query; retrieving, for each of the steps, a plurality of images representing the identified words; identifying at least one object occurring within the plurality of images corresponding to more than one of the steps; selecting an image for each of the steps of the instructions, wherein the selecting an image comprises selecting an image for each step such that the identified at least one object is represented similarly in each selected image including the identified at least one object; and presenting the instructions as visualized instructions by presenting the selected images for each of the steps in order.
    Type: Grant
    Filed: October 8, 2018
    Date of Patent: December 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Shashank Mujumdar, Nitin Gupta, Sameep Mehta
  • Patent number: 11157983
    Abstract: Methods, systems, and computer program products for generating a framework for prioritizing machine learning model offerings via a platform are provided herein. A computer-implemented method includes processing, via a computing platform, a machine learning model input by a first user and metadata corresponding to the machine learning model input by the first user; automatically comparing, via the computing platform, the metadata corresponding to the machine learning model with metadata corresponding to one or more existing machine learning models stored by the computing platform; automatically calculating, via the computing platform, initial pricing information for the machine learning model based on the comparison; and outputting, via an interactive user interface of the computing platform, the machine learning model to one or more additional users for purchase in accordance with the calculated initial pricing information.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kalapriya Kannan, Samiulla Zakir Hussain Shaikh, Pranay Kumar Lohia, Vijay Arya, Sameep Mehta
  • Publication number: 20210326366
    Abstract: One embodiment provides a method, including: generating a plurality of ontologies wherein each ontology is generated by: monitoring interactions of a user with lineage information, wherein the monitoring comprises monitoring (i) filter interactions and (ii) access interactions; aggregating the monitored interactions of the user with monitored interactions of other users having a given business role; and generating an ontology for the given business role, wherein the subset comprises (i) event types, (ii) event constraints, (iii) event metadata, and (iv) event context; and upon a user having one of the plurality of business roles accessing lineage information on the data platform, providing a subset of the lineage information.
    Type: Application
    Filed: April 20, 2020
    Publication date: October 21, 2021
    Inventors: Rajmohan Chandrahasan, Himanshu Gupta, Sameep Mehta, Bhanu Mudhireddy, Manish Anand Bhide
  • Patent number: 11144569
    Abstract: One embodiment provides a method, including: receiving, from a user, (i) a dataset and (ii) an intended output from the dataset that is generated in view of a given analytical framework for the dataset, wherein the intended output identifies an output that the user wants from the dataset and wherein the dataset is related to an analytical domain; identifying a plurality of dataset functions related to the analytical domain; determining one or more dataset functions for each of one or more operations identified, wherein the one or more operations are identified using the repository to identify operations used to result in an intended output similar to the received intended output; and recommending an ordered subset of the one or more dataset functions to be used to transform the dataset to the intended output, wherein the ordered subset comprises (i) one dataset function for each of the one or more operations and (ii) an order for performing the one or more operations.
    Type: Grant
    Filed: May 14, 2019
    Date of Patent: October 12, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kalapriya Kannan, Sameep Mehta
  • Patent number: 11132500
    Abstract: One embodiment provides a method, including: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task; assigning the subset to a plurality of annotators; obtaining (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; identifying improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying discrepancies made by the annotators in view of the response time; and generating a new set of instructions, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: September 28, 2021
    Assignee: International Business Machines Corporation
    Inventors: Shashank Mujumdar, Nitin Gupta, Arvind Agarwal, Sameep Mehta
  • Publication number: 20210286945
    Abstract: According to one embodiment of the present invention, a system for modifying content associated with an item comprises at least one processor. Features of interest of the item to a plurality of different groups are determined based on user comments produced by members of the plurality of different groups. The members within each group have a common characteristic. The features of interest to each group within the content associated with the item are identified, and the content associated with the item is modified by balancing the features of interest to the plurality of different groups within the content associated with the item. Embodiments of the present invention further include a method and computer program product for modifying content associated with an item in substantially the same manner described above.
    Type: Application
    Filed: March 13, 2020
    Publication date: September 16, 2021
    Inventors: Seema Nagar, Kuntal Dey, Nishtha Madaan, Manish Anand Bhide, Sameep Mehta, Diptikalyan Saha
  • Patent number: 11120204
    Abstract: An article is automatically augmented. The article and one or more comments are received. Comment elements are extracted from the one or more comments, and article elements are extracted from the article. Alignment scores are generated for comment-article pairs based on the extracted comment and article elements. Further, it is determined that at least one comment-article pair has an alignment score at or above a threshold alignment score. At least one augmentation feature is then generated.
    Type: Grant
    Filed: July 15, 2019
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Manish Anand Bhide, Nishtha Madaan, Seema Nagar, Sameep Mehta, Kuntal Dey
  • Patent number: 11106864
    Abstract: An article is automatically augmented. The article and one or more comments are received. Comment elements are extracted from the one or more comments, and article elements are extracted from the article. Alignment scores are generated for comment-article pairs based on the extracted comment and article elements. Further, it is determined that at least one comment-article pair has an alignment score at or above a threshold alignment score. At least one augmentation feature is then generated.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: August 31, 2021
    Assignee: International Business Machines Corporation
    Inventors: Manish Anand Bhide, Nishtha Madaan, Seema Nagar, Sameep Mehta, Kuntal Dey
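
As a rough illustration of the comment-article alignment step described in the abstracts for patents 11120204 and 11106864, the sketch below scores each comment against an article and keeps the comments at or above a threshold. It is not the patented method: the abstracts do not say how elements are extracted or how alignment is scored, so the tokenizer (`extract_elements`), the Jaccard-overlap score (`alignment_score`), the helper `aligned_comments`, and the default threshold of 0.2 are all assumptions chosen for illustration.

```python
import re


def extract_elements(text):
    """Hypothetical element extractor: lowercase words longer than 3 letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def alignment_score(comment_elems, article_elems):
    """Assumed scoring rule: Jaccard overlap between the two element sets."""
    if not comment_elems or not article_elems:
        return 0.0
    shared = comment_elems & article_elems
    combined = comment_elems | article_elems
    return len(shared) / len(combined)


def aligned_comments(article, comments, threshold=0.2):
    """Return (comment, score) pairs whose score is at or above the threshold."""
    article_elems = extract_elements(article)
    results = []
    for comment in comments:
        score = alignment_score(extract_elements(comment), article_elems)
        if score >= threshold:
            results.append((comment, score))
    return results
```

Comments returned by `aligned_comments` would be the candidates from which an augmentation feature could be generated; how that feature is built is left open here, as it is in the abstracts.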