Patents by Inventor Hima Patel
Hima Patel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11966453
Abstract: Embodiments are disclosed for a method. The method includes receiving an annotation set for a machine learning model. The annotation set includes multiple data points relevant to a task for the machine learning model. The method also includes determining total weights corresponding to the data points. The total weights are determined based on multiple ordering constraints indicating multiple data classes and corresponding weights. The corresponding weights represent a relative priority of the data classes with respect to each other. The method further includes generating an ordered annotation set from the annotation set. The ordered annotation set includes the data points in a sequence based on the determined total weights.
Type: Grant
Filed: February 15, 2021
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Naveen Panwar, Anush Sankaran, Kuntal Dey, Hima Patel, Sameep Mehta
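The weighting-and-ordering step in the abstract above can be sketched as follows. This is only an illustrative reading, not the patented method: the class names, the weight values, and the choice to sum the weights of every class a data point belongs to are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical ordering constraints: each data class is mapped to a weight
# expressing its priority relative to the other classes.
CLASS_WEIGHTS: Dict[str, float] = {
    "rare_label": 3.0,
    "low_confidence": 2.0,
    "recent": 1.0,
}


def total_weight(classes: Set[str], weights: Dict[str, float]) -> float:
    """Sum the weights of all data classes a data point belongs to (assumed rule)."""
    return sum(weights.get(name, 0.0) for name in classes)


def order_annotation_set(
    annotation_set: List[Tuple[str, Set[str]]],
    weights: Dict[str, float],
) -> List[str]:
    """Return data point ids sorted by descending total weight."""
    ranked = sorted(
        annotation_set,
        key=lambda item: total_weight(item[1], weights),
        reverse=True,
    )
    return [point_id for point_id, _ in ranked]


# Each data point is (id, set of data classes it falls into).
annotation_set = [
    ("a", {"recent"}),
    ("b", {"rare_label", "low_confidence"}),
    ("c", {"low_confidence"}),
    ("d", set()),
]
ordered = order_annotation_set(annotation_set, CLASS_WEIGHTS)
```

Here `ordered` places point `b` first because it belongs to the two highest-priority classes, and point `d` last because it matches none.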
-
Patent number: 11928126
Abstract: A computer implemented method transforms data. Responsive to receiving a data transformation of an input string to an output string, a computer system identifies mappable tokens in the input string that are mappable to the output string. The computer system creates a set of initial mappings for a set of common tokens in the mappable tokens. The set of initial mappings maps the set of common tokens from the input string to the output string. The computer system creates a set of user mappings that maps the mappable tokens from input string to the output string using a user input to the set of initial mappings. The computer system generates program code that transform input strings to output strings using the set of user mappings that maps the mappable tokens from input string to the output string, wherein the program code is used to transform input strings to output strings.
Type: Grant
Filed: August 22, 2022
Date of Patent: March 12, 2024
Assignee: International Business Machines Corporation
Inventors: Shanmukha Chaitanya Guttula, Pranay Kumar Lohia, Nitin Gupta, Hima Patel
-
Publication number: 20240061858
Abstract: A computer implemented method transforms data. Responsive to receiving a data transformation of an input string to an output string, a computer system identifies mappable tokens in the input string that are mappable to the output string. The computer system creates a set of initial mappings for a set of common tokens in the mappable tokens. The set of initial mappings maps the set of common tokens from the input string to the output string. The computer system creates a set of user mappings that maps the mappable tokens from input string to the output string using a user input to the set of initial mappings. The computer system generates program code that transform input strings to output strings using the set of user mappings that maps the mappable tokens from input string to the output string, wherein the program code is used to transform input strings to output strings.
Type: Application
Filed: August 22, 2022
Publication date: February 22, 2024
Inventors: Shanmukha Chaitanya Guttula, Pranay Kumar Lohia, Nitin Gupta, Hima Patel
-
Patent number: 11836219
Abstract: One embodiment provides a method, including: receiving a sample set for training a machine-learning model, wherein the sample set includes a plurality of classes, wherein classes within the plurality of classes have an imbalance in a number of samples; creating an enlarged minority class by generating new samples from the samples within the minority class and adding the new samples to the minority class; selecting subset samples from both the samples within the enlarged minority class and the majority class; weighting each of the subset samples based upon user input defining goals for attributes of a training sample set to be used in training the machine-learning model; and generating, using the neural network, the training sample set by re-running the selecting in view of the weighting.
Type: Grant
Filed: November 3, 2021
Date of Patent: December 5, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ruhi Sharma Mittal, Lokesh Nagalapatti, Hima Patel, Nitin Gupta
-
Publication number: 20230274160
Abstract: Methods, systems, and computer program products for automatically detecting periods of normal activity by analyzing observability data in IT operations environments are provided herein. A computer-implemented method includes obtaining multiple types of data related to one or more artificial intelligence-related information technology operations; modelling at least a portion of the obtained data as time series data; automatically identifying, from the time series data, one or more time periods associated with one or more given levels of data activity; and performing one or more automated actions, in at least one artificial intelligence-related information technology operations environment, based at least in part on the data corresponding to the one or more identified time periods.
Type: Application
Filed: February 28, 2022
Publication date: August 31, 2023
Inventors: Shashank Mujumdar, Hima Patel, Sambaran Bandyopadhyay, Pooja Aggarwal, Anbang Xu, Hau-Wen Chang, Harshit Kumar, Katherine Guo, Rama Kalyani T. Akkiraju, Gargi B. Dasgupta
-
Publication number: 20230177113
Abstract: Methods, systems, and computer program products for privacy-preserving class label standardization in federated learning settings are provided herein. A computer-implemented method includes determining, using one or more data privacy-preserving techniques, a signature for each of one or more classes of data for each of multiple client devices within a federated learning environment; identifying one or more signature matches across at least a portion of the multiple client devices; generating one or more class labels for the one or more classes of data associated with the one or more signature matches; labeling, across the at least a portion of the multiple client devices, the one or more classes of data associated with the one or more signature matches with the one or more generated class labels; and performing one or more automated actions based at least in part on the one or more labeled classes of data.
Type: Application
Filed: December 2, 2021
Publication date: June 8, 2023
Inventors: Shonda Adena Witherspoon, Ramasuri Narayanam, Hima Patel, Sameep Mehta
-
Publication number: 20230169070
Abstract: A computer implemented method, computer system, and computer program product for transforming mapped data fields of enterprise applications. A number of processor units receiving a matching from a source data field to a target data field. The set of processor units receiving a number of annotated examples of transformations from a source format to a target format. Based on the annotated examples, the set of processor units autogenerating a query language expression for transforming data items from the source format to the target format.
Type: Application
Filed: November 29, 2021
Publication date: June 1, 2023
Inventors: Ramkumar Ramalingam, Nagarjuna Surabathina, Thanmayi Mruthyunjaya, Nitin Gupta, Pranay Kumar Lohia, Shanmukha Chaitanya Guttula, Hima Patel, Sameep Mehta, Matu Agarwal, Mudit Mehrotra
-
Publication number: 20230136125
Abstract: One embodiment provides a method, including: receiving a sample set for training a machine-learning model, wherein the sample set includes a plurality of classes, wherein classes within the plurality of classes have an imbalance in a number of samples; creating an enlarged minority class by generating new samples from the samples within the minority class and adding the new samples to the minority class; selecting subset samples from both the samples within the enlarged minority class and the majority class; weighting each of the subset samples based upon user input defining goals for attributes of a training sample set to be used in training the machine-learning model; and generating, using the neural network, the training sample set by re-running the selecting in view of the weighting.
Type: Application
Filed: November 3, 2021
Publication date: May 4, 2023
Inventors: Ruhi Sharma Mittal, Lokesh Nagalapatti, Hima Patel, Nitin Gupta
-
Publication number: 20230106490
Abstract: Methods, systems, and computer program products for automatically improving data annotations by processing annotation properties and user feedback are provided herein. A computer-implemented method includes obtaining data annotation pairs, each comprising an input data annotation in a first format and a corresponding output data annotation in a second format; determining, within at least a portion of the data annotation pairs, one or more non-diffs; identifying, across the at least a portion of data annotation pairs, data annotation properties associated with multiple intents by processing the non-diffs using property-related rules; modifying at least a portion of the data annotation pairs based on the identified data annotation properties; outputting the modified data annotation pairs to at least one user; and generating a final collection of data annotation pairs by processing at least a portion of the modified data annotation pairs and user feedback received in response to the outputting.
Type: Application
Filed: October 6, 2021
Publication date: April 6, 2023
Inventors: Shanmukha Chaitanya Guttula, Nitin Gupta, Pranay Kumar Lohia, Hima Patel
-
Patent number: 11580092
Abstract: A method for automatically detecting errors in at least one data entry in a database, the at least one data entry including an input string of characters that do not match at least one predefined string of characters. The method includes generating a first image map; generating at least one classification parameter by comparing the first image map to a second image map, the second image map based at least partially on the predefined string of characters; determining that the input string of characters correlates to the predefined string of characters; and modifying the at least one data entry to match the predefined string of characters in response to determining that the input string of characters correlates to the predefined string of characters. Various other methods and systems for automatically detecting errors in at least one data entry in a database are also disclosed.
Type: Grant
Filed: December 23, 2020
Date of Patent: February 14, 2023
Assignee: Visa International Service Association
Inventor: Hima Patel
-
Publication number: 20230021563
Abstract: Methods, systems, and computer program products for federated data standardization using data privacy techniques are provided herein. A computer-implemented method includes obtaining multiple datasets from multiple clients in accordance with one or more data privacy techniques; determining one or more similar data columns across at least a portion of the multiple datasets; generating one or more column labels for the one or more similar data columns; standardizing at least a portion of data within the one or more similar data columns by processing the one or more generated column labels using at least one federated learning technique; and performing one or more automated actions based at least in part on results of the standardizing of the at least a portion of data within the one or more similar data columns.
Type: Application
Filed: July 23, 2021
Publication date: January 26, 2023
Inventors: Ramasuri Narayanam, Hima Patel, Sameep Mehta
-
Publication number: 20220405631
Abstract: Techniques for qualitatively assessing unlabeled data in an unsupervised machine learning environment are disclosed. In one example, a method comprises the following steps. A dataset of unlabeled data points is converted into a graph structure. Nodes of the graph structure represent the unlabeled data points in the dataset and weighted edges between at least a portion of the nodes represent similarity between the unlabeled data points represented by the nodes. A metric is computed for each node of the graph structure. A value generated by the metric for a given node represents a measure of dissimilarity between the corresponding unlabeled data point of the given node and one or more other unlabeled data points of one or more other nodes. A subset of the dataset is generated by removing one or more unlabeled data points from the dataset based on one or more values of the computed metric.
Type: Application
Filed: June 22, 2021
Publication date: December 22, 2022
Inventors: Ramasuri Narayanam, Hima Patel, Lokesh Nagalapatti, Ruhi Sharma Mittal
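The graph-based filtering described in the abstract above can be illustrated with a small sketch. Everything specific here is an assumption chosen for the example rather than taken from the application: the similarity function (an inverse of Euclidean distance), the edge threshold `min_similarity`, the per-node metric (one minus the mean edge weight), and the removal cutoff `max_dissimilarity`.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def similarity(a: Point, b: Point) -> float:
    """Assumed similarity: 1 for identical points, approaching 0 as distance grows."""
    return 1.0 / (1.0 + math.dist(a, b))


def build_graph(points: List[Point], min_similarity: float = 0.2) -> Dict[int, Dict[int, float]]:
    """Nodes are point indices; weighted edges link sufficiently similar points."""
    graph: Dict[int, Dict[int, float]] = {i: {} for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            weight = similarity(points[i], points[j])
            if weight >= min_similarity:
                graph[i][j] = weight
                graph[j][i] = weight
    return graph


def dissimilarity(graph: Dict[int, Dict[int, float]], node: int) -> float:
    """Assumed metric: 1 minus mean edge weight; isolated nodes score 1.0."""
    edges = graph[node]
    if not edges:
        return 1.0
    return 1.0 - sum(edges.values()) / len(edges)


def prune(points: List[Point], max_dissimilarity: float = 0.6) -> List[Point]:
    """Keep only the points whose dissimilarity score is within the cutoff."""
    graph = build_graph(points)
    return [p for i, p in enumerate(points) if dissimilarity(graph, i) <= max_dissimilarity]


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0)]
kept = prune(points)
```

With these inputs the three clustered points are retained and the distant point at `(10.0, 10.0)` is dropped, since it has no edges and therefore the maximum dissimilarity score.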
-
Patent number: 11520986
Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.
Type: Grant
Filed: July 24, 2020
Date of Patent: December 6, 2022
Assignee: International Business Machines Corporation
Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta
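A minimal sketch of the concept and relation selection described in the abstract above, under several assumptions: the NER step is taken as already done (its labels are passed in as a plain set), a reference ontology is represented as a hypothetical dictionary with `concepts` and `relations` keys, label matching is exact string equality, and a relation is kept only when both of its endpoints are candidate concepts.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # (subject concept, relation name, object concept)


def generate_ontology(entity_labels: Set[str], reference_ontologies: List[Dict]) -> Dict:
    """Select matching concepts, then the reference relations that connect them."""
    candidate_concepts: Set[str] = set()
    for ontology in reference_ontologies:
        # Exact-match comparison of NER labels against reference concepts (assumed).
        candidate_concepts |= entity_labels & ontology["concepts"]

    candidate_relations: Set[Relation] = set()
    for ontology in reference_ontologies:
        for subject, name, obj in ontology["relations"]:
            # Assumed rule: keep a relation only if both ends are candidates.
            if subject in candidate_concepts and obj in candidate_concepts:
                candidate_relations.add((subject, name, obj))

    return {"concepts": candidate_concepts, "relations": candidate_relations}


# Labels as they might come out of an NER pass (illustrative values only).
labels = {"Person", "Organization", "Gadget"}
reference = [
    {
        "concepts": {"Person", "Organization", "Location"},
        "relations": [
            ("Person", "worksFor", "Organization"),
            ("Organization", "locatedIn", "Location"),
        ],
    }
]
ontology = generate_ontology(labels, reference)
```

In this example `Gadget` is discarded because no reference ontology contains it, and `locatedIn` is discarded because `Location` was never extracted as a label.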
-
Publication number: 20220261597
Abstract: Embodiments are disclosed for a method. The method includes receiving an annotation set for a machine learning model. The annotation set includes multiple data points relevant to a task for the machine learning model. The method also includes determining total weights corresponding to the data points. The total weights are determined based on multiple ordering constraints indicating multiple data classes and corresponding weights. The corresponding weights represent a relative priority of the data classes with respect to each other. The method further includes generating an ordered annotation set from the annotation set. The ordered annotation set includes the data points in a sequence based on the determined total weights.
Type: Application
Filed: February 15, 2021
Publication date: August 18, 2022
Inventors: Naveen Panwar, Anush Sankaran, Kuntal Dey, Hima Patel, Sameep Mehta
-
Patent number: 11416682
Abstract: Knowledge gaps in a chatbot are identified with reference to a domain-specific document and a set of QA pairs of the chatbot. Entities and/or entity values associated with the document are compared to the entities and/or entity values of the QA pairs. Entities of the document not associated with the QA pairs are identified as knowledge gaps. The QA pairs and knowledge gaps are ranked by relevance to the domain.
Type: Grant
Filed: July 1, 2020
Date of Patent: August 16, 2022
Assignee: International Business Machines Corporation
Inventors: Hima Patel, Jayachandu Bandlamudi, Kuntal Dey, Daivik Swarup Oggu Venkata
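The comparison in the abstract above amounts to finding document entities that no QA pair covers. The sketch below assumes entities have already been extracted into plain strings, and it uses mention frequency in the document as a stand-in for "relevance to the domain"; the patent does not state that this is the ranking it uses.

```python
from collections import Counter
from typing import List, Set, Tuple


def find_knowledge_gaps(document_entities: List[str], qa_entities: Set[str]) -> List[Tuple[str, int]]:
    """Return document entities absent from the QA pairs, most frequently mentioned first."""
    counts = Counter(document_entities)
    gaps = {entity: n for entity, n in counts.items() if entity not in qa_entities}
    # Assumed relevance proxy: mention count, with ties broken alphabetically.
    return sorted(gaps.items(), key=lambda pair: (-pair[1], pair[0]))


# Entity mentions from a hypothetical banking document, and entities the chatbot already covers.
document_entities = ["loan", "interest rate", "loan", "overdraft", "overdraft", "overdraft", "branch"]
qa_entities = {"loan", "branch"}
gaps = find_knowledge_gaps(document_entities, qa_entities)
```

Here `gaps` lists `overdraft` ahead of `interest rate`, since it appears three times in the document and is covered by none of the chatbot's QA pairs.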
-
Publication number: 20220188567
Abstract: One embodiment provides a computer implemented method, including: obtaining an information document corresponding to an entity, wherein the information document includes redacted information spans; identifying an entity type for each of the redacted information spans, wherein the entity type identifies a relationship between a redacted information span and at least one other entity within the information document; replacing the redacted information spans with replacement entities corresponding to the entity type of a given redacted information span, wherein the replacing is performed in view of a frequency distribution of actual information and wherein the replacing includes maintaining relationships of the redacted information spans; and controlling bias within the replacement entities, wherein the controlling includes detecting bias within the replacement entities.
Type: Application
Filed: December 11, 2020
Publication date: June 16, 2022
Inventors: Balaji Ganesan, Kalapriya Kannan, Neeraj Ramkrishna Singh, Shettigar Parkala Srinivas, Hima Patel, Soma Shekar Naganna, Berthold Reinwald, Sameep Mehta
-
Publication number: 20220164698
Abstract: A method to automatically assess data quality of data input into a machine learning model and remediate the data includes receiving input data for an automated machine learning model. Selections for a multiple data quality metrics are displayed. A selection for data quality metrics is received. The data quality metrics are determined according to the selection. Selections for data remediation strategies based on the selection of the data quality metrics are displayed. A selection for remediation recommendation strategies is received. The selected data remediation strategies are performed on the input data. Learning from the selection of the data quality metrics and the selection for the remediation strategies is performed. A new customized machine learning model is generated based on the learning.
Type: Application
Filed: November 25, 2020
Publication date: May 26, 2022
Inventors: Arunima Chaudhary, Dakuo Wang, Abel Valente, Carolina Maria Spina, Hima Patel, Nitin Gupta, Gregory Bramble, Horst Cornelius Samulowitz, Sameep Mehta, Theodoros Salonidis, Daniel M. Gruen, Chaung Gan
-
Publication number: 20220101182
Abstract: One embodiment provides a method, including: obtaining a dataset for use in building a machine-learning model; assessing a quality of the dataset, wherein the quality is assessed in view of an effect of the dataset on a performance of the machine-learning model, wherein the assessing comprises scoring the dataset with respect to each of a plurality of attributes of the dataset; for each of the plurality of attributes having a low quality score, providing at least one recommendation for increasing the quality of the dataset with respect to the attribute having a low quality score; and for each of the plurality of attributes having a low quality score, providing an explanation explaining a cause of the low quality score for the attribute having a low quality score.
Type: Application
Filed: September 28, 2020
Publication date: March 31, 2022
Inventors: Hima Patel, Lokesh Nagalapatti, Naveen Panwar, Nitin Gupta, Ruhi Sharma Mittal, Sameep Mehta, Shanmukha Chaitanya Guttula, Shazia Afzal
-
Publication number: 20220101186
Abstract: One embodiment provides a method, including: obtaining predictions generated by a deployed machine-learning model; generating, from the obtained predictions, a validation dataset comprising a plurality of data points, wherein the validation dataset is generated in view of user preferences related to desired performance metrics of the deployed machine-learning model; ranking the plurality of data points of the validation dataset in view of the user preferences; determining the deployed machine-learning model needs to be retrained by comparing the ranked plurality of data points to a training dataset used to train the deployed machine-learning model and identifying, based upon the comparison, a quality of the deployed machine-learning model can be increased above a predetermined threshold; and retraining the deployed machine-learning model utilizing a new training dataset being based upon the validation dataset and the ranked plurality of data points.
Type: Application
Filed: September 29, 2020
Publication date: March 31, 2022
Inventors: Ruhi Sharma Mittal, Lokesh Nagalapatti, Nitin Gupta, Hima Patel
-
Publication number: 20220027561
Abstract: Aspects of the present disclosure relate to neural-based ontology generation and refinement. A set of input data can be received. A set of entities can be extracted from the set of input data using a named-entity recognition (NER) process, each entity having a corresponding label, the corresponding labels making up a label set. The label set can be compared to concepts in a set of reference ontologies. Labels that match to concepts in the set of reference ontologies can be selected as a candidate concept set. Relations associated with the candidate concepts within the set of reference ontologies can be identified as a candidate relation set. An ontology can then be generated using the candidate concept set and candidate relation set.
Type: Application
Filed: July 24, 2020
Publication date: January 27, 2022
Inventors: Balaji Ganesan, Riddhiman Dasgupta, Akshay Parekh, Hima Patel, Berthold Reinwald, Sameep Mehta