Patents by Inventor Abigail Goldsteen
Abigail Goldsteen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11893132
Abstract: A method, computer system, and a computer program product for personal data discovery is provided. The present invention may include determining at least one feature used to train a target machine learning (ML) model. The present invention may also include mapping the determined at least one feature to at least one location of a data store including at least one personal data associated with the determined at least one feature. The present invention may further include retrieving a data record of the at least one personal data associated with the mapped at least one feature from the at least one location of the data store. The present invention may also include determining that the target ML model includes a trace of the retrieved data record. The present invention may further include marking the target ML model as containing the at least one personal data.
Type: Grant
Filed: February 23, 2021
Date of Patent: February 6, 2024
Assignee: International Business Machines Corporation
Inventors: Abigail Goldsteen, Micha Gideon Moffie, Ariel Farkash
-
Patent number: 11841977
Abstract: An example system includes a processor to receive training data and predictions on the training data of a trained machine learning model to be anonymized. The processor is to generate generalized data from training data based on the predictions of the trained machine learning model on the training data. The processor is to train an anonymized machine learning model using the generalized data.
Type: Grant
Filed: February 11, 2021
Date of Patent: December 12, 2023
Assignee: International Business Machines Corporation
Inventors: Abigail Goldsteen, Ariel Farkash, Micha Gideon Moffie, Gilad Ezov, Ron Shmelkin
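As a rough illustration of the abstract above, the following sketch generalizes a single numeric feature by merging neighbouring values for which the trained model gives the same prediction, then fits a trivial lookup "model" on the generalized ranges. The one-feature setup, the merge rule, the use of the original model's predictions as training labels, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

def generalize(values, predictions):
    """Merge sorted values into ranges over which the model's prediction is constant."""
    pairs = sorted(zip(values, predictions))
    ranges = []
    lo, hi, current = pairs[0][0], pairs[0][0], pairs[0][1]
    for value, pred in pairs[1:]:
        if pred == current:
            hi = value                      # extend the current range
        else:
            ranges.append((lo, hi))         # close the range, start a new one
            lo, hi, current = value, value, pred
    ranges.append((lo, hi))
    return ranges

def to_range(value, ranges):
    """Replace a raw value with the generalized range that contains it."""
    for lo, hi in ranges:
        if lo <= value <= hi:
            return (lo, hi)
    raise ValueError(f"{value} is outside every generalized range")

def train_anonymized(values, predictions, ranges):
    """Hypothetical stand-in for training: majority predicted label per range."""
    votes = {}
    for value, pred in zip(values, predictions):
        votes.setdefault(to_range(value, ranges), Counter())[pred] += 1
    return {r: counts.most_common(1)[0][0] for r, counts in votes.items()}

# Toy data: ages and the (assumed) predictions of the model to be anonymized.
ages = [22, 25, 31, 40, 47, 52]
preds = [0, 0, 0, 1, 1, 1]

ranges = generalize(ages, preds)
anonymized_model = train_anonymized(ages, preds, ranges)
```

Here the anonymized model only ever sees ranges, never exact ages, which is the general idea the abstract describes; a real system would handle many features and a proper learning algorithm.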
-
Publication number: 20220309381
Abstract: An example system includes a processor to receive one or more target data samples from a training set used to train a machine learning model, a training data sample including a different data sample from the training set, and a forgotten model including the machine learning model with a forgetting mechanism applied on the target data sample. The processor can calculate a model uncertainty or a model similarity based on the forgotten model, the target data sample, and the training data sample. The processor can verify a removal of the target data sample from the forgotten model based on the model similarity or the model uncertainty.
Type: Application
Filed: March 23, 2021
Publication date: September 29, 2022
Inventors: Abigail Goldsteen, Ron Shmelkin
-
Publication number: 20220300837
Abstract: A method, computer system, and a computer program product for testing a data removal are provided. Data elements are marked with a respective mark per represented entity. The marked data elements, with labels indicating the respective marks, are input into a machine learning model to form a trained machine learning model. The trained machine learning model is configured to perform a dual task that includes a main task and a secondary task that includes a classification based on the labels. A forgetting mechanism is applied to the trained machine learning model to remove a data element including a test mark of the marked data elements. A test data element marked with the test mark is input into the revised machine learning model. The classification of the secondary task of an output of the revised machine learning model is determined for the input test data element.
Type: Application
Filed: March 22, 2021
Publication date: September 22, 2022
Inventors: Ron Shmelkin, Abigail Goldsteen, Gilad Ezov, Ariel Farkash
-
Publication number: 20220300822
Abstract: A method for forgetting data samples from a pretrained neural network (NN) model is provided. The method includes training an adversarial model to classify training data samples as members of the NN model and test data samples as non-members of the NN model. The method includes performing the following iteratively until the NN model has forgotten a specified threshold of data samples to be forgotten: (1) classifying the data samples as members or non-members using the trained adversarial model; (2) for the member data samples, determining a subset that includes data samples to be forgotten; (3) labeling the data samples within the subset as non-members and updating the NN model based on weight update techniques that cause the NN model to forget the data samples; (4) retraining the NN model without the data samples that have been forgotten; and (5) retraining the adversarial model for the next iteration.
Type: Application
Filed: March 17, 2021
Publication date: September 22, 2022
Inventors: Ron Shmelkin, Abigail Goldsteen, Ariel Farkash
-
Publication number: 20220284341
Abstract: A method, computer system, and a computer program product for testing a data removal from a trained machine learning model trained with a training data set are provided. A new machine learning model is trained by using an altered data set that includes training data from the training data set. The altered data set is without removal data. A first forgetting mechanism is applied to the trained machine learning model to form a first revised machine learning model. The applying includes removing the removal data from the trained machine learning model. A first membership leakage quantification on the first revised machine learning model is performed to quantify a first membership leakage of the removal data and that uses the new machine learning model for comparison. A first leakage score is determined from the first membership leakage quantification to test the forgetting mechanism.
Type: Application
Filed: March 3, 2021
Publication date: September 8, 2022
Inventors: Abigail Goldsteen, Ron Shmelkin
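The comparison described in this abstract can be pictured with a small sketch. It assumes each model is a callable that returns a confidence for a record, and it uses the average confidence gap on the removal data as the leakage score. Both the scoring rule and the toy models are hypothetical; the abstract does not say how leakage is quantified.

```python
from statistics import mean

def membership_leakage(model, reference_model, removal_data):
    """Average confidence gap between a model and a reference trained without the removal data.

    A score near zero suggests the model treats the removed records like
    records it never saw (assumed scoring rule, for illustration only).
    """
    return mean(model(x) - reference_model(x) for x in removal_data)

# Records that were requested to be removed.
removal_data = ["rec-1", "rec-2", "rec-3"]

def retrained_without_removal(record):
    # The "new" model, trained on the altered data set: no memory of removal data.
    return 0.5

def before_forgetting(record):
    # The original trained model is over-confident on records it memorized.
    return 0.95 if record in removal_data else 0.5

def after_forgetting(record):
    # The revised model after a (toy) forgetting mechanism has been applied.
    return 0.55 if record in removal_data else 0.5

score_before = membership_leakage(before_forgetting, retrained_without_removal, removal_data)
score_after = membership_leakage(after_forgetting, retrained_without_removal, removal_data)
```

In this toy setup the score drops sharply once forgetting is applied, which is the kind of signal a test of the forgetting mechanism would look for.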
-
Publication number: 20220269814
Abstract: A method, computer system, and a computer program product for personal data discovery is provided. The present invention may include determining at least one feature used to train a target machine learning (ML) model. The present invention may also include mapping the determined at least one feature to at least one location of a data store including at least one personal data associated with the determined at least one feature. The present invention may further include retrieving a data record of the at least one personal data associated with the mapped at least one feature from the at least one location of the data store. The present invention may also include determining that the target ML model includes a trace of the retrieved data record. The present invention may further include marking the target ML model as containing the at least one personal data.
Type: Application
Filed: February 23, 2021
Publication date: August 25, 2022
Inventors: Abigail Goldsteen, Micha Gideon Moffie, Ariel Farkash
-
Publication number: 20220253554
Abstract: An example system includes a processor to receive training data and predictions on the training data of a trained machine learning model to be anonymized. The processor is to generate generalized data from training data based on the predictions of the trained machine learning model on the training data. The processor is to train an anonymized machine learning model using the generalized data.
Type: Application
Filed: February 11, 2021
Publication date: August 11, 2022
Inventors: Abigail Goldsteen, Ariel Farkash, Micha Gideon Moffie, Gilad Ezov, Ron Shmelkin
-
Patent number: 11281728
Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.
Type: Grant
Filed: August 6, 2019
Date of Patent: March 22, 2022
Assignee: International Business Machines Corporation
Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
-
Patent number: 11240044
Abstract: Embodiments of the present systems and methods may provide techniques for verifying the correct application purpose for applications that serve multiple purposes and to determine the correct purpose for each requested data access. For example, in an embodiment, a method for controlling application access to data implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor may comprise: receiving an application comprising a plurality of application parts, each application part associated with a declared data access purpose and generating a cryptographic certificate for each application part to be certified by determining whether a declared data access purpose for each application part to be certified is correct and the only data access purpose for that part, wherein the declared purpose is included in purpose information associated with each application part to be certified.
Type: Grant
Filed: November 22, 2018
Date of Patent: February 1, 2022
Assignee: International Business Machines Corporation
Inventors: Ariel Farkash, Abigail Goldsteen, Micha Gideon Moffie
-
Patent number: 11182491
Abstract: A method of limiting data usage for certified purposes by using functional encryption, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information specifies at least one declared usage for at least one data type; analyzing the application's usage of data collected by the application, to identify an actual usage of the at least one data type by a function; identifying when the actual usage is compliant with the at least one declared usage according to the analysis; in response to the identification, creating a pair of a public key and a master private key; creating a function private key for the function using the master private key; and sending the function private key to the software publisher to be used for operating the function on data which is encrypted using the public key.
Type: Grant
Filed: February 4, 2020
Date of Patent: November 23, 2021
Assignee: International Business Machines Corporation
Inventors: Abigail Goldsteen, Ron Shmelkin, Gilad Ezov, Muhammad Barham
-
Publication number: 20210240840
Abstract: A method of limiting data usage for certified purposes by using functional encryption, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information specifies at least one declared usage for at least one data type; analyzing the application's usage of data collected by the application, to identify an actual usage of the at least one data type by a function; identifying when the actual usage is compliant with the at least one declared usage according to the analysis; in response to the identification, creating a pair of a public key and a master private key; creating a function private key for the function using the master private key; and sending the function private key to the software publisher to be used for operating the function on data which is encrypted using the public key.
Type: Application
Filed: February 4, 2020
Publication date: August 5, 2021
Inventors: Abigail Goldsteen, Ron Shmelkin, Gilad Ezov, Muhammad Barham
-
Publication number: 20210042356
Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: based on a labeled dataset, determining a plurality of buckets, each of which has an associated label; determining a plurality of clusters, grouping similar instances in the same bucket; based on the plurality of clusters, determining an alternative set of features comprising a set of generalized features, wherein each generalized feature corresponds to a cluster of the plurality of clusters, wherein a generalized feature that corresponds to a cluster is indicative of the instance being mapped to the corresponding cluster; obtaining a second instance; determining a generalized second instance that comprises a valuation of the alternative set of features for the second instance; and based on the generalized second instance, determining a label for the second instance.
Type: Application
Filed: August 6, 2019
Publication date: February 11, 2021
Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
-
Publication number: 20210042629
Abstract: A method, apparatus and a product for data generalization for predictive models. The method comprising: obtaining a training dataset that comprises a plurality of training instances and predicted labels thereof, wherein each training instance is a valuation of a set of features, wherein the set of features comprises a feature having a domain, wherein the predicted label of each training instance is a label predicted thereto by a predictive model; training an auxiliary model using the training dataset; based on the auxiliary model, determining an alternative set of features that is a generalization of the set of features, wherein the alternative set of features comprises a generalized feature having a generalized domain, wherein each value in the generalized domain corresponds to one or more values in the domain; obtaining a generalized instance having a valuation of the alternative set of features; and determining a label for the generalized instance.
Type: Application
Filed: August 6, 2019
Publication date: February 11, 2021
Inventors: Gilad Ezov, Ariel Farkash, Abigail Goldsteen, Ron Shmelkin, Micha Gideon Moffie
-
Patent number: 10831869
Abstract: Embodiments of the present systems and methods may provide data watermarking without reliance on error-tolerant fields, thereby providing for the incorporation of watermarks in data that was not considered suitable for watermarking. For example, in an embodiment, a computer-implemented method for watermarking data may comprise inserting watermark data into a field that requires format-preserving encryption.
Type: Grant
Filed: July 2, 2018
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Abigail Goldsteen, Lev Greenberg, Ariel Farkash, Boris Rozenberg, Omri Soceanu
-
Publication number: 20200320202
Abstract: Conducting a privacy vulnerability assessment of a software application that comprises program code, by performing at least one of: (i) evaluating the program code to identify code segments presenting a potential dissemination of specified data to an unauthorized destination, (ii) detecting one or more execution paths in the software application which use the specified data for an unauthorized purpose, and (iii) analyzing the content of data flows from the software application to detect the specified data in the data flows. Then, generating one or more vulnerability summaries, based, at least in part, on the results of the evaluating, the detecting, and the analyzing.
Type: Application
Filed: April 4, 2019
Publication date: October 8, 2020
Inventors: Ariel Farkash, Abigail Goldsteen, Ron Shmelkin
-
Publication number: 20200169421
Abstract: Embodiments of the present systems and methods may provide techniques for verifying the correct application purpose for applications that serve multiple purposes and to determine the correct purpose for each requested data access. For example, in an embodiment, a method for controlling application access to data implemented in a computer comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor may comprise: receiving an application comprising a plurality of application parts, each application part associated with a declared data access purpose and generating a cryptographic certificate for each application part to be certified by determining whether a declared data access purpose for each application part to be certified is correct and the only data access purpose for that part, wherein the declared purpose is included in purpose information associated with each application part to be certified.
Type: Application
Filed: November 22, 2018
Publication date: May 28, 2020
Inventors: Ariel Farkash, Abigail Goldsteen, Micha Gideon Moffie
-
Patent number: 10616206
Abstract: A method of creating an application purpose certificate, comprising: receiving from a software publisher an application code and declared privacy information, the declared privacy information includes at least one allowed usage purpose for each of a plurality of data types; analyzing the application's usage of data of each of the plurality of data types; verifying the usage is compliant with the at least one allowed usage purpose according to the analysis; creating an encrypted digital purpose certificate, the digital purpose certificate is unique for the application code; and sending the digital purpose certificate to the software publisher to be bundled with the application code and a publisher authentication certificate.
Type: Grant
Filed: September 27, 2016
Date of Patent: April 7, 2020
Assignee: International Business Machines Corporation
Inventors: Sima Nadler, Abigail Goldsteen
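The verify-then-certify flow in this abstract can be sketched in a few lines. The sketch makes several assumptions: the usage analysis is stubbed out as a ready-made `observed` dictionary, the certificate is an HMAC-tagged JSON payload (authenticated rather than encrypted, as a simple stand-in), and the key and all names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the certifying party (demo value only).
CERTIFIER_KEY = b"demo-key-not-for-production"

def is_compliant(declared, observed):
    """Every observed purpose for a data type must be among its declared purposes."""
    return all(purposes <= declared.get(data_type, set())
               for data_type, purposes in observed.items())

def issue_certificate(app_code, declared, observed):
    """Return a certificate bound to this exact code, or None if usage is not compliant."""
    if not is_compliant(declared, observed):
        return None
    payload = json.dumps(
        {
            "code_sha256": hashlib.sha256(app_code).hexdigest(),  # ties the certificate to the code
            "purposes": {k: sorted(v) for k, v in declared.items()},
        },
        sort_keys=True,
    )
    tag = hmac.new(CERTIFIER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

declared = {"location": {"navigation"}, "email": {"login", "billing"}}
observed_ok = {"location": {"navigation"}, "email": {"login"}}
observed_bad = {"location": {"navigation", "advertising"}}

certificate = issue_certificate(b"print('app v1')", declared, observed_ok)
rejected = issue_certificate(b"print('app v1')", declared, observed_bad)
```

Because the payload contains a hash of the application code, changing the code yields a different certificate, which loosely mirrors the "unique for the application code" requirement.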
-
Publication number: 20200004935
Abstract: Embodiments of the present systems and methods may provide data watermarking without reliance on error-tolerant fields, thereby providing for the incorporation of watermarks in data that was not considered suitable for watermarking. For example, in an embodiment, a computer-implemented method for watermarking data may comprise inserting watermark data into a field that requires format-preserving encryption.
Type: Application
Filed: July 2, 2018
Publication date: January 2, 2020
Inventors: Abigail Goldsteen, Lev Greenberg, Ariel Farkash, Boris Rozenberg, Omri Soceanu
-
Patent number: 10148423
Abstract: A data security method including creating a token-including plaintext by including a predefined token into a plaintext, generating a cyphertext by encrypting the token-including plaintext using format-preserving encryption, generating a decrypted cyphertext by decrypting an input text, determining whether the decrypted cyphertext includes a first predefined token, if the decrypted cyphertext includes the first predefined token, recreating the plaintext by removing the first predefined token from the decrypted cyphertext, and if the decrypted cyphertext does not include the first predefined token, using the input text as the plaintext.
Type: Grant
Filed: July 20, 2015
Date of Patent: December 4, 2018
Assignee: International Business Machines Corporation
Inventors: Ariel Farkash, Abigail Goldsteen, Micha Moffie
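The token check in this abstract can be sketched as follows. The cipher here is a toy keyed digit shift that merely preserves the "digits in, digits out" format; it is an insecure stand-in chosen so the example runs without third-party libraries, not a real format-preserving encryption scheme. The token value, key, token placement at the start of the text, and function names are all assumptions.

```python
TOKEN = "9090"          # hypothetical predefined token
KEY = [3, 1, 4, 1, 5]   # hypothetical key for the toy cipher

def _shift(text, sign):
    """Shift each digit by the key digit at its position (mod 10); output stays all digits."""
    return "".join(str((int(ch) + sign * KEY[i % len(KEY)]) % 10)
                   for i, ch in enumerate(text))

def fpe_encrypt(text):
    return _shift(text, +1)

def fpe_decrypt(text):
    return _shift(text, -1)

def protect(plaintext):
    """Create the token-including plaintext, then encrypt it."""
    return fpe_encrypt(TOKEN + plaintext)

def recover(input_text):
    """Decrypt; if the token is present, strip it, otherwise treat the input as plaintext."""
    decrypted = fpe_decrypt(input_text)
    if decrypted.startswith(TOKEN):
        return decrypted[len(TOKEN):]
    return input_text

ciphertext = protect("123456")
round_trip = recover(ciphertext)     # encrypted input: token found and removed
passthrough = recover("987654")      # unencrypted input: returned unchanged
```

The point of the token is to let the receiver tell encrypted values apart from values that were never encrypted, since with format preservation both look alike. In a sketch like this, an unencrypted value whose decryption happens to begin with the token would be misclassified, so the token length trades off against that false-positive risk.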