Patents by Inventor Abigail Goldsteen

Abigail Goldsteen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11893132
    Abstract: A method, computer system, and a computer program product for personal data discovery are provided. The present invention may include determining at least one feature used to train a target machine learning (ML) model. The present invention may also include mapping the determined at least one feature to at least one location of a data store including at least one personal data associated with the determined at least one feature. The present invention may further include retrieving a data record of the at least one personal data associated with the mapped at least one feature from the at least one location of the data store. The present invention may also include determining that the target ML model includes a trace of the retrieved data record. The present invention may further include marking the target ML model as containing the at least one personal data.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: February 6, 2024
    Assignee: International Business Machines Corporation
    Inventors: Abigail Goldsteen, Micha Gideon Moffie, Ariel Farkash
  • Patent number: 11841977
    Abstract: An example system includes a processor to receive training data and predictions on the training data of a trained machine learning model to be anonymized. The processor is to generate generalized data from training data based on the predictions of the trained machine learning model on the training data. The processor is to train an anonymized machine learning model using the generalized data.
    Type: Grant
    Filed: February 11, 2021
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Abigail Goldsteen, Ariel Farkash, Micha Gideon Moffie, Gilad Ezov, Ron Shmelkin
  • Publication number: 20220309381
    Abstract: An example system includes a processor to receive one or more target data samples from a training set used to train a machine learning model, a training data sample including a different data sample from the training set, and a forgotten model including the machine learning model with a forgetting mechanism applied on the target data sample. The processor can calculate a model uncertainty or a model similarity based on the forgotten model, the target data sample, and the training data sample. The processor can verify a removal of the target data sample from the forgotten model based on the model similarity or the model uncertainty.
    Type: Application
    Filed: March 23, 2021
    Publication date: September 29, 2022
    Inventors: Abigail Goldsteen, Ron Shmelkin
  • Publication number: 20220300837
    Abstract: A method, computer system, and a computer program product for testing a data removal are provided. Data elements are marked with a respective mark per represented entity. The marked data elements, with labels indicating the respective marks, are input into a machine learning model to form a trained machine learning model. The trained machine learning model is configured to perform a dual task that includes a main task and a secondary task that includes a classification based on the labels. A forgetting mechanism is applied to the trained machine learning model to remove a data element including a test mark of the marked data elements. A test data element marked with the test mark is input into the revised machine learning model. The classification of the secondary task of an output of the revised machine learning model is determined for the input test data element.
    Type: Application
    Filed: March 22, 2021
    Publication date: September 22, 2022
    Inventors: Ron Shmelkin, Abigail Goldsteen, Gilad Ezov, Ariel Farkash
  • Publication number: 20220300822
    Abstract: A method for forgetting data samples from a pretrained neural network (NN) model is provided. The method includes training an adversarial model to classify training data samples as members of the NN model and test data samples as non-members of the NN model. The method includes performing the following iteratively until the NN model has forgotten a specified threshold of data samples to be forgotten: (1) classifying the data samples as members or non-members using the trained adversarial model; (2) for the member data samples, determining a subset that includes data samples to be forgotten; (3) labeling the data samples within the subset as non-members and updating the NN model based on weight update techniques that cause the NN model to forget the data samples; (4) retraining the NN model without the data samples that have been forgotten; and (5) retraining the adversarial model for the next iteration.
    Type: Application
    Filed: March 17, 2021
    Publication date: September 22, 2022
    Inventors: Ron Shmelkin, Abigail Goldsteen, Ariel Farkash
  • Publication number: 20220284341
    Abstract: A method, computer system, and a computer program product for testing a data removal from a trained machine learning model trained with a training data set are provided. A new machine learning model is trained by using an altered data set that includes training data from the training data set. The altered data set is without removal data. A first forgetting mechanism is applied to the trained machine learning model to form a first revised machine learning model. The applying includes removing the removal data from the trained machine learning model. A first membership leakage quantification on the first revised machine learning model is performed to quantify a first membership leakage of the removal data and that uses the new machine learning model for comparison. A first leakage score is determined from the first membership leakage quantification to test the forgetting mechanism.
    Type: Application
    Filed: March 3, 2021
    Publication date: September 8, 2022
    Inventors: Abigail Goldsteen, Ron Shmelkin
  • Publication number: 20220269814
    Abstract: A method, computer system, and a computer program product for personal data discovery are provided. The present invention may include determining at least one feature used to train a target machine learning (ML) model. The present invention may also include mapping the determined at least one feature to at least one location of a data store including at least one personal data associated with the determined at least one feature. The present invention may further include retrieving a data record of the at least one personal data associated with the mapped at least one feature from the at least one location of the data store. The present invention may also include determining that the target ML model includes a trace of the retrieved data record. The present invention may further include marking the target ML model as containing the at least one personal data.
    Type: Application
    Filed: February 23, 2021
    Publication date: August 25, 2022
    Inventors: Abigail Goldsteen, Micha Gideon Moffie, Ariel Farkash
  • Publication number: 20220253554
    Abstract: An example system includes a processor to receive training data and predictions on the training data of a trained machine learning model to be anonymized. The processor is to generate generalized data from training data based on the predictions of the trained machine learning model on the training data. The processor is to train an anonymized machine learning model using the generalized data.
    Type: Application
    Filed: February 11, 2021
    Publication date: August 11, 2022
    Inventors: Abigail Goldsteen, Ariel Farkash, Micha Gideon Moffie, Gilad Ezov, Ron Shmelkin
  • Patent number: 11281728
    Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
  • Patent number: 11240044
    Abstract: Embodiments of the present systems and methods may provide techniques for verifying the correct application purpose for applications that serve multiple purposes and to determine the correct purpose for each requested data access. For example, in an embodiment, a method for controlling application access to data implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor may comprise: receiving an application comprising a plurality of application parts, each application part associated with a declared data access purpose and generating a cryptographic certificate for each application part to be certified by determining whether a declared data access purpose for each application part to be certified is correct and the only data access purpose for that part, wherein the declared purpose is included in purpose information associated with each application part to be certified.
    Type: Grant
    Filed: November 22, 2018
    Date of Patent: February 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ariel Farkash, Abigail Goldsteen, Micha Gideon Moffie
  • Patent number: 11182491
    Abstract: A method of limiting data usage for certified purposes by using functional encryption, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information specifies at least one declared usage for at least one data type; analyzing the application's usage of data collected by the application, to identify an actual usage of the at least one data type by a function; identifying when the actual usage is compliant with the at least one declared usage according to the analysis; in response to the identification, creating a pair of a public key and a master private key; creating a function private key for the function using the master private key; and sending the function private key to the software publisher to be used for operating the function on data which is encrypted using the public key.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Abigail Goldsteen, Ron Shmelkin, Gilad Ezov, Muhammad Barham
  • Publication number: 20210240840
    Abstract: A method of limiting data usage for certified purposes by using functional encryption, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information specifies at least one declared usage for at least one data type; analyzing the application's usage of data collected by the application, to identify an actual usage of the at least one data type by a function; identifying when the actual usage is compliant with the at least one declared usage according to the analysis; in response to the identification, creating a pair of a public key and a master private key; creating a function private key for the function using the master private key; and sending the function private key to the software publisher to be used for operating the function on data which is encrypted using the public key.
    Type: Application
    Filed: February 4, 2020
    Publication date: August 5, 2021
    Inventors: Abigail Goldsteen, Ron Shmelkin, Gilad Ezov, Muhammad Barham
  • Publication number: 20210042356
    Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.
    Type: Application
    Filed: August 6, 2019
    Publication date: February 11, 2021
    Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
  • Publication number: 20210042629
    Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: obtaining a training dataset that comprises a plurality of training instances and predicted labels thereof, wherein each training instance is a valuation of a set of features, wherein the set of features comprises a feature having a domain, wherein the predicted label of each training instance is a label predicted thereto by a predictive model; training an auxiliary model using the training dataset; based on the auxiliary model, determining an alternative set of features that is a generalization of the set of features, wherein the alternative set of features comprises a generalized feature having a generalized domain, wherein each value in the generalized domain corresponds to one or more values in the domain; obtaining a generalized instance having a valuation of the alternative set of features; and determining a label for the generalized instance.
    Type: Application
    Filed: August 6, 2019
    Publication date: February 11, 2021
    Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
  • Patent number: 10831869
    Abstract: Embodiments of the present systems and methods may provide data watermarking without reliance on error-tolerant fields, thereby providing for the incorporation of watermarks in data that was not considered suitable for watermarking. For example, in an embodiment, a computer-implemented method for watermarking data may comprise inserting watermark data into a field that requires format-preserving encryption.
    Type: Grant
    Filed: July 2, 2018
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Abigail Goldsteen, Lev Greenberg, Ariel Farkash, Boris Rozenberg, Omri Soceanu
  • Publication number: 20200320202
    Abstract: Conducting a privacy vulnerability assessment of a software application that comprises program code, by performing at least one of: (i) evaluating the program code to identify code segments presenting a potential dissemination of specified data to an unauthorized destination, (ii) detecting one or more execution paths in the software application which use the specified data for an unauthorized purpose, and (iii) analyzing the content of data flows from the software application to detect the specified data in the data flows. Then, generating one or more vulnerability summaries, based, at least in part, on the results of the evaluating, the detecting, and the analyzing.
    Type: Application
    Filed: April 4, 2019
    Publication date: October 8, 2020
    Inventors: Ariel Farkash, Abigail Goldsteen, Ron Shmelkin
  • Publication number: 20200169421
    Abstract: Embodiments of the present systems and methods may provide techniques for verifying the correct application purpose for applications that serve multiple purposes and to determine the correct purpose for each requested data access. For example, in an embodiment, a method for controlling application access to data implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor may comprise: receiving an application comprising a plurality of application parts, each application part associated with a declared data access purpose and generating a cryptographic certificate for each application part to be certified by determining whether a declared data access purpose for each application part to be certified is correct and the only data access purpose for that part, wherein the declared purpose is included in purpose information associated with each application part to be certified.
    Type: Application
    Filed: November 22, 2018
    Publication date: May 28, 2020
    Inventors: Ariel Farkash, Abigail Goldsteen, Micha Gideon Moffie
  • Patent number: 10616206
    Abstract: A method of creating an application purpose certificate, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information includes at least one allowed usage purpose for each of a plurality of data types; analyzing the application's usage of data of each of the plurality of data types; verifying the usage is compliant with the at least one allowed usage purpose according to the analysis; creating an encrypted digital purpose certificate, the digital purpose certificate is unique for the application code; and sending the digital purpose certificate to the software publisher to be bundled with the application code and a publisher authentication certificate.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: April 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Sima Nadler, Abigail Goldsteen
  • Publication number: 20200004935
    Abstract: Embodiments of the present systems and methods may provide data watermarking without reliance on error-tolerant fields, thereby providing for the incorporation of watermarks in data that was not considered suitable for watermarking. For example, in an embodiment, a computer-implemented method for watermarking data may comprise inserting watermark data into a field that requires format-preserving encryption.
    Type: Application
    Filed: July 2, 2018
    Publication date: January 2, 2020
    Inventors: Abigail Goldsteen, Lev Greenberg, Ariel Farkash, Boris Rozenberg, Omri Soceanu
  • Patent number: 10148423
    Abstract: A data security method including creating a token-including plaintext by including a predefined token into a plaintext, generating a cyphertext by encrypting the token-including plaintext using format-preserving encryption, generating a decrypted cyphertext by decrypting an input text, determining whether the decrypted cyphertext includes a first predefined token, if the decrypted cyphertext includes the first predefined token, recreating the plaintext by removing the first predefined token from the decrypted cyphertext, and if the decrypted cyphertext does not include the first predefined token, using the input text as the plaintext.
    Type: Grant
    Filed: July 20, 2015
    Date of Patent: December 4, 2018
    Assignee: International Business Machines Corporation
    Inventors: Ariel Farkash, Abigail Goldsteen, Micha Moffie
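
The token-based flow described in the abstract of patent 10148423 can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `TOKEN`, `toy_fpe_encrypt`, `toy_fpe_decrypt`, `protect`, and `recover` are hypothetical, the token is assumed to be appended as a suffix, and the cipher is a toy digit-wise construction standing in for real format-preserving encryption; it is not secure.

```python
import hashlib
import hmac

TOKEN = "9731"  # hypothetical predefined token; the abstract does not specify one


def _keystream(key: bytes, length: int) -> list:
    """Derive `length` key-dependent digits (toy construction, not secure)."""
    digits = []
    counter = 0
    while len(digits) < length:
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        digits.extend(byte % 10 for byte in block)
        counter += 1
    return digits[:length]


def _check_digits(text: str) -> None:
    """Reject anything that is not a non-empty ASCII digit string."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("this toy cipher only handles ASCII digit strings")


def toy_fpe_encrypt(text: str, key: bytes) -> str:
    """Stand-in for FPE: maps a digit string to a digit string of equal length."""
    _check_digits(text)
    stream = _keystream(key, len(text))
    return "".join(str((int(c) + k) % 10) for c, k in zip(text, stream))


def toy_fpe_decrypt(text: str, key: bytes) -> str:
    """Inverse of toy_fpe_encrypt."""
    _check_digits(text)
    stream = _keystream(key, len(text))
    return "".join(str((int(c) - k) % 10) for c, k in zip(text, stream))


def protect(plaintext: str, key: bytes) -> str:
    """Include the token in the plaintext, then encrypt the result."""
    return toy_fpe_encrypt(plaintext + TOKEN, key)


def recover(input_text: str, key: bytes) -> str:
    """Decrypt the input; strip the token if found, else treat input as plaintext."""
    decrypted = toy_fpe_decrypt(input_text, key)
    if decrypted.endswith(TOKEN):
        return decrypted[: -len(TOKEN)]
    return input_text


if __name__ == "__main__":
    key = b"demo-key"
    ciphertext = protect("12345678", key)
    print(ciphertext, recover(ciphertext, key))
```

Because format-preserving ciphertext looks like ordinary data, the token is what lets `recover` tell encrypted input from input that was never encrypted. In this sketch a four-digit token means an unencrypted value is misread as ciphertext with probability of roughly 1 in 10,000; a longer token would lower that rate at the cost of a longer ciphertext.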