Patents by Inventor Pallika Haridas KANANI
Pallika Haridas KANANI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250021664
Abstract: Subject level privacy attack analysis for federated learning may be performed. A request that selects an analysis of one or more inference attacks may be received to determine a presence of data of a subject in a training set of a federated machine learning model. The selected inference attacks may be performed to determine the presence of the data of the subject in the training set of the federated machine learning model. Respective success measurements may be generated for the selected inference attacks based on the performance of the selected inference attacks, which may then be provided.
Type: Application
Filed: September 27, 2024
Publication date: January 16, 2025
Inventors: Pallika Haridas Kanani, Virendra J. Marathe, Daniel Wyde Peterson, Anshuman Suri
-
Publication number: 20240394597
Abstract: Federated training of a machine learning model with enforcement of subject level privacy is implemented. Respective samples of data items from a training data set are generated at multiple nodes of a federated machine learning system. Noise values are determined for individual ones of the sampled data items according to respective counts of data items of particular subjects and the cumulative counts of the items of the subjects. Respective gradients for the data items are then determined. The gradients are then clipped and noise values are applied. Each subject's noisy clipped gradients in the sample are then aggregated. The aggregated gradients for the entire sample are then used for determining machine learning model updates.
Type: Application
Filed: March 6, 2024
Publication date: November 28, 2024
Inventors: Virendra J. Marathe, Pallika Haridas Kanani
-
Patent number: 12130929
Abstract: Subject level privacy attack analysis for federated learning may be performed. A request that selects an analysis of one or more inference attacks may be received to determine a presence of data of a subject in a training set of a federated machine learning model. The selected inference attacks may be performed to determine the presence of the data of the subject in the training set of the federated machine learning model. Respective success measurements may be generated for the selected inference attacks based on the performance of the selected inference attacks, which may then be provided.
Type: Grant
Filed: February 25, 2022
Date of Patent: October 29, 2024
Assignee: Oracle International Corporation
Inventors: Pallika Haridas Kanani, Virendra J. Marathe, Daniel Wyde Peterson, Anshuman Suri
-
Publication number: 20230394374
Abstract: Hierarchical gradient averaging is performed as part of training a machine learning model to enforce subject level privacy. A sample of data items from a training data set is identified and respective gradients for the data items are determined. The gradients are then clipped. Each subject's clipped gradients in the sample are averaged. A noise value is added to a sum of the averaged gradients of each of the subjects in the sample. An average gradient for the entire sample is determined from the averaged gradients of the individual subjects with the added noise value. This average gradient for the entire sample is used for determining machine learning model updates.
Type: Application
Filed: June 6, 2022
Publication date: December 7, 2023
Inventors: Virendra J. Marathe, Pallika Haridas Kanani
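The steps named in this abstract (clip, average per subject, sum, add noise, average over the sample) can be illustrated with a minimal Python sketch. It is not taken from the application: the function names, the use of L2-norm clipping, the Gaussian noise, and the division by the number of subjects are all assumptions made for illustration, and the noise scale is not calibrated to any privacy budget.

```python
import math
import random
from collections import defaultdict


def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm (assumed clipping rule)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def hierarchical_average(grads, subjects, max_norm, noise_std, rng=None):
    """grads: one gradient vector per data item; subjects: the subject id of each item."""
    rng = rng or random.Random()
    dim = len(grads[0])

    # Clip every per-item gradient and bucket it under its subject.
    by_subject = defaultdict(list)
    for grad, subject in zip(grads, subjects):
        by_subject[subject].append(clip(grad, max_norm))

    # Average each subject's clipped gradients.
    subject_means = [
        [sum(g[i] for g in group) / len(group) for i in range(dim)]
        for group in by_subject.values()
    ]

    # Sum the per-subject averages, then add noise to that sum
    # (Gaussian noise is an assumption; the abstract only says "a noise value").
    total = [sum(m[i] for m in subject_means) for i in range(dim)]
    noisy_total = [t + rng.gauss(0.0, noise_std) for t in total]

    # Average over the subjects present in the sample.
    return [x / len(subject_means) for x in noisy_total]
```

Because each subject contributes a single averaged gradient to the sum, a subject with many items in the sample has the same bounded influence as a subject with one item, which is the property the averaging hierarchy is meant to provide.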
-
Publication number: 20230274004
Abstract: Subject level privacy attack analysis for federated learning may be performed. A request that selects an analysis of one or more inference attacks may be received to determine a presence of data of a subject in a training set of a federated machine learning model. The selected inference attacks may be performed to determine the presence of the data of the subject in the training set of the federated machine learning model. Respective success measurements may be generated for the selected inference attacks based on the performance of the selected inference attacks, which may then be provided.
Type: Application
Filed: February 25, 2022
Publication date: August 31, 2023
Inventors: Pallika Haridas Kanani, Virendra J. Marathe, Daniel Wyde Peterson, Anshuman Suri
-
Publication number: 20230052231
Abstract: Group-level privacy preservation is implemented within federated machine learning. An aggregation server may distribute a machine learning model to multiple users each including respective private datasets. The private datasets may individually include multiple items associated with a single group. Individual users may train the model using their local, private dataset to generate one or more parameter updates and to determine a count of the largest number of items associated with any single group of a number of groups in the dataset. Parameter updates generated by the individual users may be modified by applying respective noise values to individual ones of the parameter updates according to the respective counts to ensure differential privacy for the groups of the dataset. The aggregation server may aggregate the updates into a single set of parameter updates to update the machine learning model.
Type: Application
Filed: May 11, 2022
Publication date: February 16, 2023
Inventors: Virendra J. Marathe, Pallika Haridas Kanani
-
Publication number: 20230047092
Abstract: User-level privacy preservation is implemented within federated machine learning. An aggregation server may distribute a machine learning model to multiple users each including respective private datasets. Individual users may train the model using the local, private dataset to generate one or more parameter updates. Prior to sending the generated parameter updates to the aggregation server for incorporation into the machine learning model, a user may modify the parameter updates by applying respective noise values to individual ones of the parameter updates to ensure differential privacy for the dataset private to the user. The aggregation server may then receive the respective modified parameter updates from the multiple users and aggregate the updates into a single set of parameter updates to update the machine learning model. The federated machine learning may further include iteratively performing said sending, training, modifying, receiving, aggregating and updating steps.
Type: Application
Filed: May 11, 2022
Publication date: February 16, 2023
Inventors: Virendra Marathe, Pallika Haridas Kanani, Daniel Peterson, Swetasudha Panda
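One round of the send, train, modify, receive, aggregate, update loop in this abstract might look like the Python sketch below. Everything named here is hypothetical: `train_fn` stands in for whatever local training a user performs, `mean_shift` is a toy example of it, the noise is assumed to be Gaussian, and aggregation is assumed to be a plain average. A real differentially private system would also bound each update and calibrate the noise scale, which this sketch does not attempt.

```python
import random


def noisy_update(update, noise_std, rng):
    """User side: add a noise value to each parameter update before it leaves the device."""
    return [u + rng.gauss(0.0, noise_std) for u in update]


def aggregate(updates):
    """Server side: combine the users' updates into one set (assumed here to be a mean)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def federated_round(params, client_datasets, train_fn, noise_std, rng):
    """One round: distribute params, train locally, add noise, aggregate, update the model."""
    noisy_updates = [
        noisy_update(train_fn(params, data), noise_std, rng)
        for data in client_datasets
    ]
    mean_update = aggregate(noisy_updates)
    return [p + u for p, u in zip(params, mean_update)]


def mean_shift(params, data):
    """Toy train_fn (hypothetical): propose moving the single parameter to the local data mean."""
    return [sum(data) / len(data) - params[0]]
```

Calling `federated_round` repeatedly with the returned parameters corresponds to the iterative repetition mentioned at the end of the abstract; only the noisy updates, never the private datasets, reach the server.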
-
Patent number: 11443240
Abstract: Herein are techniques for domain adaptation of a machine learning (ML) model. These techniques impose differential privacy onto federated learning by the ML model. In an embodiment, each of many client devices receives, from a server, coefficients of a general ML model. For respective new data point(s), each client device operates as follows. Based on the new data point(s), a respective private ML model is trained. Based on the new data point(s), respective gradients are calculated for the coefficients of the general ML model. Random noise is added to the gradients to generate respective noisy gradients. A combined inference may be generated based on: the private ML model, the general ML model, and one of the new data point(s). The noisy gradients are sent to the server. The server adjusts the general ML model based on the noisy gradients from the client devices. This client/server process may be repeated indefinitely.
Type: Grant
Filed: March 25, 2020
Date of Patent: September 13, 2022
Assignee: Oracle International Corporation
Inventors: Daniel Peterson, Pallika Haridas Kanani, Virendra J. Marathe
-
Patent number: 11010768
Abstract: A system is provided that extracts attribute values. The system receives data including unstructured text from a data store. The system further tokenizes the unstructured text into tokens, where a token is a character of the unstructured text. The system further annotates the tokens with attribute labels, where an attribute label for a token is determined, at least in part, based on a word that the token originates from within the unstructured text. The system further groups the tokens into text segments based on the attribute labels, where a set of tokens that are annotated with an identical attribute label are grouped into a text segment, and where the text segments define attribute values. The system further stores the attribute labels and the attribute values within the data store.
Type: Grant
Filed: April 30, 2015
Date of Patent: May 18, 2021
Assignee: Oracle International Corporation
Inventors: Pallika Haridas Kanani, Michael Louis Wick, Adam Craig Pocock
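The character-level tokenize, annotate, and group pipeline in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented system: `label_for_word` is a hypothetical stand-in for whatever model assigns attribute labels, whitespace characters are assumed to inherit the label of the preceding word, and only consecutive characters with the same label are grouped into a segment.

```python
import re
from itertools import groupby


def label_characters(text, label_for_word):
    """Tokenize text into characters and give each the label of the word it comes from."""
    labeled = []
    current = None
    for match in re.finditer(r"\S+|\s+", text):
        chunk = match.group()
        if not chunk.isspace():
            current = label_for_word(chunk)
        # Whitespace keeps the preceding word's label (an assumption of this sketch).
        labeled.extend((ch, current) for ch in chunk)
    return labeled


def group_segments(labeled_chars):
    """Group runs of identically labeled characters into (label, value) text segments."""
    segments = []
    for label, run in groupby(labeled_chars, key=lambda pair: pair[1]):
        value = "".join(ch for ch, _ in run).strip()
        if label is not None and value:
            segments.append((label, value))
    return segments


def extract_attribute_values(text, label_for_word):
    return group_segments(label_characters(text, label_for_word))


# Hypothetical lookup-table labeler, used only to exercise the sketch.
TOY_LABELS = {"dark": "color", "blue": "color", "shirt": "type"}


def toy_labeler(word):
    return TOY_LABELS.get(word.lower())
```

With the toy labeler, `extract_attribute_values("a dark blue shirt", toy_labeler)` yields a `color` segment for "dark blue" and a `type` segment for "shirt"; the unlabeled word "a" is dropped.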
-
Publication number: 20210073677
Abstract: Herein are techniques for domain adaptation of a machine learning (ML) model. These techniques impose differential privacy onto federated learning by the ML model. In an embodiment, each of many client devices receives, from a server, coefficients of a general ML model. For respective new data point(s), each client device operates as follows. Based on the new data point(s), a respective private ML model is trained. Based on the new data point(s), respective gradients are calculated for the coefficients of the general ML model. Random noise is added to the gradients to generate respective noisy gradients. A combined inference may be generated based on: the private ML model, the general ML model, and one of the new data point(s). The noisy gradients are sent to the server. The server adjusts the general ML model based on the noisy gradients from the client devices. This client/server process may be repeated indefinitely.
Type: Application
Filed: March 25, 2020
Publication date: March 11, 2021
Inventors: Daniel Peterson, Pallika Haridas Kanani, Virendra J. Marathe
-
Publication number: 20200372290
Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.
Type: Application
Filed: February 4, 2020
Publication date: November 26, 2020
Inventors: Jean-Baptiste Frederic George Tristan, Pallika Haridas Kanani, Michael Louis Wick, Swetasudha Panda, Haniyeh Mahmoudian
-
Patent number: 10410139
Abstract: A system that performs natural language processing receives a text corpus that includes a plurality of documents and receives a knowledge base. The system generates a set of document n-grams from the text corpus and considers all n-grams as candidate mentions. The system, for each candidate mention, queries the knowledge base and in response retrieves results. From the results retrieved by the queries, the system generates a search space and generates a joint model from the search space.
Type: Grant
Filed: May 31, 2016
Date of Patent: September 10, 2019
Assignee: Oracle International Corporation
Inventors: Pallika Haridas Kanani, Michael Louis Wick, Katherine Silverstein
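The first half of this abstract (generate n-grams, treat each as a candidate mention, query a knowledge base, collect the results into a search space) can be sketched as follows. The knowledge base is assumed, purely for illustration, to be a dictionary from lowercased surface strings to made-up entity identifiers, and whitespace splitting stands in for real tokenization. The joint model built from the search space is not sketched.

```python
def ngrams(tokens, max_n):
    """All contiguous n-grams of length 1..max_n, joined with spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_search_space(documents, knowledge_base, max_n=3):
    """Map each candidate mention that has knowledge-base hits to its candidate entities."""
    space = {}
    for doc in documents:
        for mention in ngrams(doc.split(), max_n):
            # Hypothetical lookup: the knowledge base is keyed by lowercased surface string.
            results = knowledge_base.get(mention.lower(), [])
            if results:
                space.setdefault(mention, set()).update(results)
    return space
```

Treating every n-gram as a candidate avoids a separate mention-detection step; the knowledge-base query is what filters the candidates, since n-grams with no results never enter the search space.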
-
Patent number: 9779085
Abstract: A natural language processing (“NLP”) manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that span a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied for data from a second of the target languages, the first and second languages being different.
Type: Grant
Filed: September 24, 2015
Date of Patent: October 3, 2017
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Michael Louis Wick, Pallika Haridas Kanani, Adam Craig Pocock
-
Publication number: 20170193396
Abstract: A system that performs natural language processing receives a text corpus that includes a plurality of documents and receives a knowledge base. The system generates a set of document n-grams from the text corpus and considers all n-grams as candidate mentions. The system, for each candidate mention, queries the knowledge base and in response retrieves results. From the results retrieved by the queries, the system generates a search space and generates a joint model from the search space.
Type: Application
Filed: May 31, 2016
Publication date: July 6, 2017
Inventors: Pallika Haridas KANANI, Michael Louis WICK, Katherine SILVERSTEIN
-
Publication number: 20160350288
Abstract: A natural language processing (“NLP”) manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that span a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied for data from a second of the target languages, the first and second languages being different.
Type: Application
Filed: September 24, 2015
Publication date: December 1, 2016
Inventors: Michael Louis WICK, Pallika Haridas KANANI, Adam Craig POCOCK
-
Publication number: 20160321358
Abstract: A system is provided that extracts attribute values. The system receives data including unstructured text from a data store. The system further tokenizes the unstructured text into tokens, where a token is a character of the unstructured text. The system further annotates the tokens with attribute labels, where an attribute label for a token is determined, at least in part, based on a word that the token originates from within the unstructured text. The system further groups the tokens into text segments based on the attribute labels, where a set of tokens that are annotated with an identical attribute label are grouped into a text segment, and where the text segments define attribute values. The system further stores the attribute labels and the attribute values within the data store.
Type: Application
Filed: April 30, 2015
Publication date: November 3, 2016
Inventors: Pallika Haridas KANANI, Michael Louis WICK, Adam Craig POCOCK