Patents by Inventor Pallika Haridas KANANI

Pallika Haridas KANANI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240394597
    Abstract: Federated training of a machine learning model with enforcement of subject level privacy is implemented. Respective samples of data items from a training data set are generated at multiple nodes of a federated machine learning system. Noise values are determined for individual ones of the sampled data items according to respective counts of data items of particular subjects and the cumulative counts of the items of the subjects. Respective gradients for the data items are the determined The gradients are then clipped and noise values are applied. Each subject's noisy clipped gradients in the sample are then aggregated. The aggregasted gradients for the entire sample are then used for determining machine learning model updates.
    Type: Application
    Filed: March 6, 2024
    Publication date: November 28, 2024
    Inventors: Virendra J. Marathe, Pallika Haridas Kanani
  • Patent number: 12130929
    Abstract: Subject level privacy attack analysis for federated learning may be performed. A request that selects an analysis of one or more inference attacks may be received to determine a presence of data of a subject in a training set of a federated machine learning model. The selected inference attacks may be performed to determine the presence of the data of subject in the training set of the federated machine learning model. Respective success measurements may be generated for the selected inference attacks based on the performance of the selected inference attacks, which may then be provided.
    Type: Grant
    Filed: February 25, 2022
    Date of Patent: October 29, 2024
    Assignee: Oracle International Corporation
    Inventors: Pallika Haridas Kanani, Virendra J. Marathe, Daniel Wyde Peterson, Anshuman Suri
  • Publication number: 20230394374
    Abstract: Hierarchical gradient averaging is performed as part of training a machine learning model to enforce subject level privacy. A sample of data items from a training data set is identified and respective gradients for the data items are determined. The gradients are then clipped. Each subject's clipped gradients in the sample are averaged. A noise value is added to a sum of the averaged gradients of each of the subjects in the sample. An average gradient for the entire sample is determined from the averaged gradients of the individual subjects with the added noise value. This average gradient for the entire sample is used for determining machine learning model updates.
    Type: Application
    Filed: June 6, 2022
    Publication date: December 7, 2023
    Inventors: Virendra J. Marathe, Pallika Haridas Kanani
  • Publication number: 20230274004
    Abstract: Subject level privacy attack analysis for federated learning may be performed. A request that selects an analysis of one or more inference attacks may be received to determine a presence of data of a subject in a training set of a federated machine learning model. The selected inference attacks may be performed to determine the presence of the data of subject in the training set of the federated machine learning model. Respective success measurements may be generated for the selected inference attacks based on the performance of the selected inference attacks, which may then be provided.
    Type: Application
    Filed: February 25, 2022
    Publication date: August 31, 2023
    Inventors: Pallika Haridas Kanani, Virendra J. Marathe, Daniel Wyde Peterson, Anshuman Suri
  • Publication number: 20230052231
    Abstract: Group-level privacy preservation is implemented within federated machine learning. An aggregation server may distribute a machine learning model to multiple users each including respective private datasets. The private datasets may individually include multiple items associated with a single group. Individual users may train the model using their local, private dataset to generate one or more parameter updates and to determine a count of the largest number of items associated with any single group of a number of groups in the dataset. Parameter updates generated by the individual users may be modified by applying respective noise values to individual ones of the parameter updates according to the respective counts to ensure differential privacy for the groups of the dataset. The aggregation server may aggregate the updates into a single set of parameter updates to update the machine learning model.
    Type: Application
    Filed: May 11, 2022
    Publication date: February 16, 2023
    Inventors: Virendra J. Marathe, Pallika Haridas Kanani
  • Publication number: 20230047092
    Abstract: User-level privacy preservation is implemented within federated machine learning. An aggregation server may distribute a machine learning model to multiple users each including respective private datasets. Individual users may train the model using the local, private dataset to generate one or more parameter updates. Prior to sending the generated parameter updates to the aggregation server for incorporation into the machine learning model, a user may modify the parameter updates by applying respective noise values to individual ones of the parameter updates to ensure differential privacy for the dataset private to the user. The aggregation server may then receive the respective modified parameter updates from the multiple users and aggregate the updates into a single set of parameter updates to update the machine learning model. The federated machine learning may further include iteratively performing said sending, training, modifying, receiving, aggregating and updating steps.
    Type: Application
    Filed: May 11, 2022
    Publication date: February 16, 2023
    Inventors: Virendra Marathe, Pallika Haridas Kanani, Daniel Peterson, Swetasudha Panda
  • Patent number: 11443240
    Abstract: Herein are techniques for domain adaptation of a machine learning (ML) model. These techniques impose differential privacy onto federated learning by the ML model. In an embodiment, each of many client devices receive, from a server, coefficients of a general ML model. For respective new data point(s), each client device operates as follows. Based on the new data point(s), a respective private ML model is trained. Based on the new data point(s), respective gradients are calculated for the coefficients of the general ML model. Random noise is added to the gradients to generate respective noisy gradients. A combined inference may be generated based on: the private ML model, the general ML model, and one of the new data point(s). The noisy gradients are sent to the server. The server adjusts the general ML model based on the noisy gradients from the client devices. This client/server process may be repeated indefinitely.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: September 13, 2022
    Assignee: Oracle International Corporation
    Inventors: Daniel Peterson, Pallika Haridas Kanani, Virendra J. Marathe
  • Patent number: 11010768
    Abstract: A system is provided that extracts attribute values. The system receives data including unstructured text from a data store. The system further tokenizes the unstructured text into tokens, where a token is a character of the unstructured text. The system further annotates the tokens with attribute labels, where an attribute label for a token is determined, in least in part, based on a word that the token originates from within the unstructured text. The system further groups the tokens into text segments based on the attribute labels, where a set of tokens that are annotated with an identical attribute label are grouped into a text segment, and where the text segments define attribute values. The system further stores the attribute labels and the attribute values within the data store.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 18, 2021
    Assignee: Oracle International Corporation
    Inventors: Pallika Haridas Kanani, Michael Louis Wick, Adam Craig Pocock
  • Publication number: 20210073677
    Abstract: Herein are techniques for domain adaptation of a machine learning (ML) model. These techniques impose differential privacy onto federated learning by the ML model. In an embodiment, each of many client devices receive, from a server, coefficients of a general ML model. For respective new data point(s), each client device operates as follows. Based on the new data point(s), a respective private ML model is trained. Based on the new data point(s), respective gradients are calculated for the coefficients of the general ML model. Random noise is added to the gradients to generate respective noisy gradients. A combined inference may be generated based on: the private ML model, the general ML model, and one of the new data point(s). The noisy gradients are sent to the server. The server adjusts the general ML model based on the noisy gradients from the client devices. This client/server process may be repeated indefinitely.
    Type: Application
    Filed: March 25, 2020
    Publication date: March 11, 2021
    Inventors: Daniel Peterson, Pallika Haridas Kanani, Virendra J. Marathe
  • Publication number: 20200372290
    Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determines likelihood of bias satisfies modification criteria for the ranking.
    Type: Application
    Filed: February 4, 2020
    Publication date: November 26, 2020
    Inventors: Jean-Baptiste Frederic George Tristan, Pallika Haridas Kanani, Michael Louis Wick, Swetasudha Panda, Haniyeh Mahmoudian
  • Patent number: 10410139
    Abstract: A system that performs natural language processing receives a text corpus that includes a plurality of documents and receives a knowledge base. The system generates a set of document n-grams from the text corpus and considers all n-grams as candidate mentions. The system, for each candidate mention, queries the knowledge base and in response retrieves results. From the results retrieved by the queries, the system generates a search space and generates a joint model from the search space.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: September 10, 2019
    Assignee: Oracle International Corporation
    Inventors: Pallika Haridas Kanani, Michael Louis Wick, Katherine Silverstein
  • Patent number: 9779085
    Abstract: A natural language processing (“NLP”) manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that span a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied for data from a second of the target languages, the first and second languages being different.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: October 3, 2017
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Michael Louis Wick, Pallika Haridas Kanani, Adam Craig Pocock
  • Publication number: 20170193396
    Abstract: A system that performs natural language processing receives a text corpus that includes a plurality of documents and receives a knowledge base. The system generates a set of document n-grams from the text corpus and considers all n-grams as candidate mentions. The system, for each candidate mention, queries the knowledge base and in response retrieves results. From the results retrieved by the queries, the system generates a search space and generates a joint model from the search space.
    Type: Application
    Filed: May 31, 2016
    Publication date: July 6, 2017
    Inventors: Pallika Haridas KANANI, Michael Louis WICK, Katherine SILVERSTEIN
  • Publication number: 20160350288
    Abstract: A natural language processing (“NLP”) manager is provided that manages NLP model training. An unlabeled corpus of multilingual documents is provided that span a plurality of target languages. A multilingual embedding is trained on the corpus of multilingual documents as input training data, the multilingual embedding being generalized across the target languages by modifying the input training data and/or transforming multilingual dictionaries into constraints in an underlying optimization problem. An NLP model is trained on training data for a first language of the target languages, using word embeddings of the trained multilingual embedding as features. The trained NLP model is applied for data from a second of the target languages, the first and second languages being different.
    Type: Application
    Filed: September 24, 2015
    Publication date: December 1, 2016
    Inventors: Michael Louis WICK, Pallika Haridas KANANI, Adam Craig POCOCK
  • Publication number: 20160321358
    Abstract: A system is provided that extracts attribute values. The system receives data including unstructured text from a data store. The system further tokenizes the unstructured text into tokens, where a token is a character of the unstructured text. The system further annotates the tokens with attribute labels, where an attribute label for a token is determined, in least in part, based on a word that the token originates from within the unstructured text. The system further groups the tokens into text segments based on the attribute labels, where a set of tokens that are annotated with an identical attribute label are grouped into a text segment, and where the text segments define attribute values. The system further stores the attribute labels and the attribute values within the data store.
    Type: Application
    Filed: April 30, 2015
    Publication date: November 3, 2016
    Inventors: Pallika Haridas KANANI, Michael Louis WICK, Adam Craig POCOCK