Patents by Inventor Daniel Bernau
Daniel Bernau has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11501172
Abstract: A system is described that can include a machine learning model and at least one programmable processor communicatively coupled to the machine learning model. The machine learning model can receive data, generate a continuous probability distribution associated with the data, sample a latent variable from the continuous probability distribution to generate a plurality of samples, and generate reconstructed data from the plurality of samples. The at least one programmable processor can compute a reconstruction error by determining a distance between the reconstructed data and the data, and generate, based on the reconstruction error, an indication representing whether a specific record within the received data was used to train the machine learning model. Related apparatuses, methods, techniques, non-transitory computer program products, non-transitory machine-readable media, articles, and other systems are also within the scope of this disclosure.
Type: Grant
Filed: December 13, 2018
Date of Patent: November 15, 2022
Assignee: SAP SE
Inventors: Benjamin Hilprecht, Daniel Bernau, Martin Haerterich
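The membership test described in this abstract, deciding from reconstruction error whether a record was used in training, can be sketched as follows. This is an illustrative toy, not the patented implementation: `reconstruct` is a hypothetical stand-in for a trained variational autoencoder's encode/sample/decode cycle, and the decision threshold is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruct(record, n_samples=10):
    """Stand-in for a VAE: sample a latent variable from a continuous
    distribution several times and decode each sample to data space."""
    mu, sigma = record.mean(), 0.1                      # toy "encoder" output
    latents = rng.normal(mu, sigma, size=n_samples)     # sampled latent variable
    # toy "decoder": broadcast each latent value back to the record's shape
    return np.stack([np.full_like(record, z) for z in latents])

def reconstruction_error(record):
    samples = reconstruct(record)
    # distance between the reconstructed data and the original data,
    # averaged over the plurality of samples
    return np.mean(np.linalg.norm(samples - record, axis=-1))

def was_training_member(record, threshold):
    # a low reconstruction error suggests the record was seen during training
    return reconstruction_error(record) < threshold
```

In practice the threshold would be calibrated on records known to be outside the training set; here it is a free parameter.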
-
Patent number: 11449639
Abstract: Machine learning model data privacy can be maintained by training a machine learning model forming part of a data science process using data anonymized using each of two or more differential privacy mechanisms. Thereafter, a level of accuracy and a level of precision are determined for each of the two or more differential privacy mechanisms when evaluating data with known classifications. Subsequently, using the respective determined levels of precision and accuracy, a mitigation efficiency ratio is determined for each of the two or more differential privacy mechanisms. The differential privacy mechanism having the highest mitigation efficiency ratio is then incorporated into the data science process. Related apparatus, systems, techniques, and articles are also described.
Type: Grant
Filed: June 14, 2019
Date of Patent: September 20, 2022
Assignee: SAP SE
Inventors: Daniel Bernau, Jonas Robl, Philip-William Grassal, Florian Kerschbaum
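The selection step above can be sketched in a few lines. The abstract does not state the exact formula for the mitigation efficiency ratio; this sketch assumes it is the reduction in attacker precision divided by the loss in model accuracy, and all mechanism names and numbers are illustrative.

```python
# Compare DP mechanisms by an assumed "mitigation efficiency ratio":
# privacy gained (attacker-precision drop) per unit of utility lost
# (model-accuracy drop), relative to an unprotected baseline.

def mitigation_efficiency(baseline_acc, baseline_prec, mechanisms):
    """mechanisms: dict name -> (accuracy, attacker_precision), each measured
    on a model trained with data anonymized by that DP mechanism."""
    scores = {}
    for name, (acc, prec) in mechanisms.items():
        accuracy_loss = max(baseline_acc - acc, 1e-9)   # utility cost
        precision_drop = baseline_prec - prec           # privacy gain
        scores[name] = precision_drop / accuracy_loss
    # incorporate the mechanism with the highest ratio into the pipeline
    return max(scores, key=scores.get), scores

best, scores = mitigation_efficiency(
    baseline_acc=0.90, baseline_prec=0.80,
    mechanisms={"laplace": (0.85, 0.55), "gaussian": (0.80, 0.52)},
)
```

With these toy numbers the Laplace variant mitigates more attacker precision per point of accuracy given up, so it would be the mechanism chosen.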
-
Patent number: 11366982
Abstract: Various examples are directed to systems and methods for detecting training data for a generative model. A computer system may access generative model sample data and a first test sample. The computer system may determine whether a first generative model sample of the plurality of generative model samples is within a threshold distance of the first test sample and whether a second generative model sample of the plurality of generative model samples is within the threshold distance of the first test sample. The computer system may determine that a probability that the generative model was trained with the first test sample is greater than or equal to a threshold probability based at least in part on whether the first generative model sample is within the threshold distance of the first test sample, the determining also based at least in part on whether the second generative model sample is within the threshold distance of the first test sample.
Type: Grant
Filed: September 24, 2018
Date of Patent: June 21, 2022
Assignee: SAP SE
Inventors: Martin Haerterich, Benjamin Hilprecht, Daniel Bernau
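The test described above, counting how many generative-model samples fall within a threshold distance of the candidate record, amounts to a Monte Carlo density estimate around the test sample. A minimal sketch (function names and thresholds are illustrative, not the patented method):

```python
import numpy as np

def membership_probability(model_samples, test_sample, distance_threshold):
    """Fraction of generative-model samples within the threshold distance
    of the test sample (a crude density estimate at that point)."""
    d = np.linalg.norm(model_samples - test_sample, axis=1)
    return np.mean(d <= distance_threshold)

def likely_training_member(model_samples, test_sample,
                           distance_threshold, probability_threshold):
    # flag the test sample when the estimated probability meets the cutoff
    p = membership_probability(model_samples, test_sample, distance_threshold)
    return p >= probability_threshold
```

Intuitively, a generative model tends to place more probability mass near records it memorized during training, so an unusually dense neighborhood around a test sample is evidence of membership.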
-
Publication number: 20220138348
Abstract: Data is received that specifies a bound for an adversarial posterior belief ρc that corresponds to a likelihood to re-identify data points from the dataset based on a differentially private function output. Privacy parameters ε, δ are then calculated based on the received data; these parameters govern a differential privacy (DP) algorithm to be applied to a function to be evaluated over a dataset. The calculation is based on a ratio of probability distributions of different observations, which is bounded by the posterior belief ρc as applied to the dataset. The calculated privacy parameters are then used to apply the DP algorithm to the function over the dataset. Related apparatus, systems, techniques, and articles are also described.
Type: Application
Filed: October 30, 2020
Publication date: May 5, 2022
Inventors: Daniel Bernau, Philip-William Grassal, Hannah Keller, Martin Haerterich
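For intuition, one simplified special case can be worked out directly. With only two adjacent datasets, a uniform prior, and pure ε-DP (δ = 0), the likelihood ratio of any observation is bounded by e^ε, which caps the adversary's posterior belief at e^ε / (1 + e^ε). Inverting that bound gives the largest ε compatible with a desired belief bound ρc. This is a hedged sketch of that one case, not the full patented calculation:

```python
import math

def epsilon_for_posterior_bound(rho_c):
    """Largest pure-DP epsilon such that, for two adjacent datasets under a
    uniform 0.5 prior, the adversary's posterior belief stays <= rho_c.
    Inverts rho = e^eps / (1 + e^eps), i.e. eps = logit(rho)."""
    assert 0.5 < rho_c < 1.0, "bound must exceed the 0.5 uniform prior"
    return math.log(rho_c / (1.0 - rho_c))

eps = epsilon_for_posterior_bound(0.9)  # caps re-identification belief at 90%
```

Note how the mapping is strongly nonlinear: tightening the belief bound from 0.9 toward 0.5 shrinks the admissible ε toward 0.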
-
Publication number: 20200394320
Abstract: Machine learning model data privacy can be maintained by training a machine learning model forming part of a data science process using data anonymized using each of two or more differential privacy mechanisms. Thereafter, a level of accuracy and a level of precision are determined for each of the two or more differential privacy mechanisms when evaluating data with known classifications. Subsequently, using the respective determined levels of precision and accuracy, a mitigation efficiency ratio is determined for each of the two or more differential privacy mechanisms. The differential privacy mechanism having the highest mitigation efficiency ratio is then incorporated into the data science process. Related apparatus, systems, techniques, and articles are also described.
Type: Application
Filed: June 14, 2019
Publication date: December 17, 2020
Inventors: Daniel Bernau, Jonas Robl, Philip-William Grassal, Florian Kerschbaum
-
Patent number: 10746567
Abstract: Methods, systems, and computer-readable storage media for privacy-preserving metering are described herein. A resource threshold value associated with anonymizing meter data for resources metered at a first destination is received. Based on a noise scale value and the resource threshold value, an individual inference value of the first destination is computed. The individual inference value defines a probability of distinguishing the first destination as a contributor to a query result based on anonymized meter data of the first destination and other destinations according to the noise scale value. The noise scale value is defined for a processing application. Based on evaluating the individual inference value, it is determined to provide anonymized meter data for metered resources at the first destination. An activation of a communication channel for providing the anonymized meter data for metered resources is triggered.
Type: Grant
Filed: March 22, 2019
Date of Patent: August 18, 2020
Assignee: SAP SE
Inventors: Daniel Bernau, Philip-William Grassal, Florian Kerschbaum
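A toy model of the decision in this abstract: the sketch below treats the individual inference value as the worst-case posterior an adversary can place on the destination having contributed, when Laplace noise of a given scale masks a contribution bounded by the resource threshold. That modelling choice (effective privacy loss ε = threshold / scale, belief bound e^ε / (1 + e^ε)) is an assumption for illustration, not the patented computation.

```python
import math

def individual_inference_value(resource_threshold, noise_scale):
    """Assumed model: worst-case posterior probability of distinguishing the
    destination as a contributor, given Laplace noise of `noise_scale`
    masking a contribution of at most `resource_threshold`."""
    eps = resource_threshold / noise_scale        # effective privacy loss
    return math.exp(eps) / (1.0 + math.exp(eps))  # worst-case posterior belief

def should_share_meter_data(resource_threshold, noise_scale, max_inference):
    # activate the sharing channel only if the inference risk is acceptable
    return individual_inference_value(resource_threshold, noise_scale) <= max_inference
```

Larger noise scales drive the inference value down toward the uninformed 0.5, which is what makes sharing the anonymized meter data defensible.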
-
Publication number: 20200193298
Abstract: A system is described that can include a machine learning model and at least one programmable processor communicatively coupled to the machine learning model. The machine learning model can receive data, generate a continuous probability distribution associated with the data, sample a latent variable from the continuous probability distribution to generate a plurality of samples, and generate reconstructed data from the plurality of samples. The at least one programmable processor can compute a reconstruction error by determining a distance between the reconstructed data and the data, and generate, based on the reconstruction error, an indication representing whether a specific record within the received data was used to train the machine learning model. Related apparatuses, methods, techniques, non-transitory computer program products, non-transitory machine-readable media, articles, and other systems are also within the scope of this disclosure.
Type: Application
Filed: December 13, 2018
Publication date: June 18, 2020
Inventors: Benjamin Hilprecht, Daniel Bernau, Martin Haerterich
-
Patent number: 10628608
Abstract: A set of data is received for a data analysis. The set of data includes personally identifiable information. The set of data is anonymized to protect the privacy of that information. Risk rates and utility rates are determined for a number of combinations of anonymization techniques defined correspondingly for data fields from the set of data. A risk rate is related to a privacy-protection failure when defining first anonymized data through applying a combination of anonymization techniques for the data fields. A utility rate is related to the accuracy of the data analysis when applied over the anonymized data. Based on an evaluation of the risk rates and the utility rates, one or more anonymization techniques from the number of anonymization techniques are determined. The set of data is anonymized according to the determined anonymization techniques and/or a combination thereof.
Type: Grant
Filed: June 27, 2017
Date of Patent: April 21, 2020
Assignee: SAP SE
Inventors: Cedric Hebert, Daniel Bernau, Amine Lahouel
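The risk/utility trade-off above reduces to a small selection problem once each candidate combination has measured rates. The scoring rule used here (maximize utility subject to a risk cap) and all names and numbers are assumptions for illustration:

```python
# Sketch: choose among anonymization-technique combinations using a risk
# rate (probability of privacy-protection failure) and a utility rate
# (accuracy of the analysis on the anonymized data), both in [0, 1].

def choose_combination(combinations, max_risk):
    """combinations: dict name -> (risk_rate, utility_rate)."""
    # keep only combinations whose risk rate satisfies the cap
    admissible = {name: u for name, (r, u) in combinations.items() if r <= max_risk}
    if not admissible:
        raise ValueError("no combination meets the risk cap")
    # among those, take the one with the highest utility rate
    return max(admissible, key=admissible.get)

best = choose_combination(
    {"suppress+generalize": (0.05, 0.70),
     "noise-only": (0.20, 0.90),
     "generalize-only": (0.10, 0.80)},
    max_risk=0.10,
)
```

Here the highest-utility option ("noise-only") is rejected for exceeding the risk cap, so the next-best admissible combination wins.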
-
Publication number: 20200097763
Abstract: Various examples are directed to systems and methods for detecting training data for a generative model. A computer system may access generative model sample data and a first test sample. The computer system may determine whether a first generative model sample of the plurality of generative model samples is within a threshold distance of the first test sample and whether a second generative model sample of the plurality of generative model samples is within the threshold distance of the first test sample. The computer system may determine that a probability that the generative model was trained with the first test sample is greater than or equal to a threshold probability based at least in part on whether the first generative model sample is within the threshold distance of the first test sample, the determining also based at least in part on whether the second generative model sample is within the threshold distance of the first test sample.
Type: Application
Filed: September 24, 2018
Publication date: March 26, 2020
Inventors: Martin Haerterich, Benjamin Hilprecht, Daniel Bernau
-
Patent number: 10445527
Abstract: A system for differential privacy is provided. In some implementations, the system performs operations comprising receiving a plurality of indices for a plurality of perturbed data points, which are anonymized versions of a plurality of unperturbed data points, wherein the plurality of indices indicate that the plurality of unperturbed data points are identified as presumed outliers. The plurality of perturbed data points can lie around a first center point and the plurality of unperturbed data points can lie around a second center point. The operations can further comprise classifying a portion of the presumed outliers as true positives and another portion of the presumed outliers as false positives, based upon differences in distances to the respective first and second center points for the perturbed and corresponding (e.g., same index) unperturbed data points. Related systems, methods, and articles of manufacture are also described.
Type: Grant
Filed: December 21, 2016
Date of Patent: October 15, 2019
Assignee: SAP SE
Inventors: Jonas Boehler, Daniel Bernau, Florian Kerschbaum
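The classification step above can be sketched concretely. For each presumed-outlier index, compare the perturbed point's distance to the perturbed center with the same-index unperturbed point's distance to the unperturbed center. The specific decision rule below (a presumed outlier is genuine if perturbation did not inflate its distance by more than `delta`) is an assumption; the patent only states that the classification is based on such distance differences.

```python
import numpy as np

def classify_presumed_outliers(indices, perturbed, unperturbed, delta):
    c_pert = perturbed.mean(axis=0)      # first center point
    c_unpert = unperturbed.mean(axis=0)  # second center point
    true_pos, false_pos = [], []
    for i in indices:
        d_pert = np.linalg.norm(perturbed[i] - c_pert)
        d_unpert = np.linalg.norm(unperturbed[i] - c_unpert)
        # assumed rule: if the point was already far from its center before
        # perturbation (distance not inflated by more than delta), it is a
        # genuine outlier; otherwise the noise manufactured it
        if d_pert - d_unpert <= delta:
            true_pos.append(i)
        else:
            false_pos.append(i)
    return true_pos, false_pos
```

In the toy data below, index 0 only looks like an outlier because the noise pushed it outward (false positive), while index 3 was already extreme before anonymization (true positive).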
-
Patent number: 10423781
Abstract: A method is disclosed for providing sanitized log data to a threat detection system. The sanitized log data is derived from a log table with continuous columns, themselves having continuous entries with continuous values. First, a retention probability parameter and an accuracy radius parameter are selected. Next, a probability distribution function is initialized with the retention probability parameter and the accuracy radius parameter. For each continuous value, the probability distribution function is applied, resulting in perturbed continuous values of the perturbed continuous columns. Finally, the perturbed continuous columns are provided as the sanitized log.
Type: Grant
Filed: September 12, 2017
Date of Patent: September 24, 2019
Assignee: SAP SE
Inventors: Wasilij Beskorovajnov, Daniel Bernau
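The two parameters named in this abstract can be illustrated with a minimal perturbation mechanism. The concrete distribution below (keep the value with the retention probability, otherwise replace it with a uniform draw within the accuracy radius of the original) is an assumption chosen to show the parameters' roles, not the patented probability distribution function:

```python
import random

def sanitize_column(values, retention_prob, accuracy_radius,
                    rng=random.Random(0)):
    """Perturb one continuous log column: each value is retained with
    probability `retention_prob`, otherwise replaced by a uniform draw
    within +/- `accuracy_radius` of the original value."""
    out = []
    for v in values:
        if rng.random() < retention_prob:
            out.append(v)  # value retained unchanged
        else:
            out.append(rng.uniform(v - accuracy_radius, v + accuracy_radius))
    return out

def sanitize_log(table, retention_prob, accuracy_radius):
    """table: dict column_name -> list of continuous values.
    Returns the perturbed columns, i.e. the sanitized log."""
    return {col: sanitize_column(vals, retention_prob, accuracy_radius)
            for col, vals in table.items()}
```

The retention probability trades off how much of the log survives verbatim for threat detection, while the accuracy radius bounds how far any perturbed value can drift from the truth.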
-
Patent number: 10380366
Abstract: Systems and methods are provided for sending a request to register a data offer from a data owner to participate in a distributed ledger, the request including information associated with the data offer and a privacy budget for the data offer, wherein the information associated with the data offer and the privacy budget is stored in the distributed ledger and the data offer is accessible by third parties to the data owner. The systems and methods further provide for receiving a request, associated with a third-party computer, to access data associated with the data offer; processing the data request, based on determining that there is sufficient privacy budget to allow access to the data, to produce result data; anonymizing the result data; and updating the distributed ledger.
Type: Grant
Filed: April 25, 2017
Date of Patent: August 13, 2019
Assignee: SAP SE
Inventors: Daniel Bernau, Florian Hahn, Jonas Boehler
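The register/check-budget/serve/update flow described above can be sketched with a minimal in-memory stand-in for the ledger. Everything here is illustrative: the class, the budget accounting, and the placeholder `anonymize` step are assumptions, not the patented distributed-ledger protocol.

```python
# Sketch: an in-memory stand-in for ledger-tracked privacy budgets. Each data
# offer registers with a budget; a query is served only when enough budget
# remains, the result is anonymized, and the ledger entry is updated.

class DataOfferLedger:
    def __init__(self):
        self.offers = {}  # offer_id -> {"info": ..., "budget": float}

    def register_offer(self, offer_id, info, privacy_budget):
        # store the offer information and its privacy budget "on the ledger"
        self.offers[offer_id] = {"info": info, "budget": privacy_budget}

    def request_data(self, offer_id, query_cost, run_query):
        offer = self.offers[offer_id]
        if offer["budget"] < query_cost:
            raise PermissionError("insufficient privacy budget")
        result = run_query()           # produce the result data
        offer["budget"] -= query_cost  # update the ledger entry
        return anonymize(result)

def anonymize(result):
    # placeholder anonymization: coarsen the aggregate before release
    return round(result, 1)
```

Tracking the budget on a shared ledger (rather than at the data owner) is what lets mutually distrusting third parties verify that the cumulative privacy loss of an offer stays bounded.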
-
Publication number: 20180322279
Abstract: A method is disclosed for providing sanitized log data to a threat detection system. The sanitized log data is derived from a log table with continuous columns, themselves having continuous entries with continuous values. First, a retention probability parameter and an accuracy radius parameter are selected. Next, a probability distribution function is initialized with the retention probability parameter and the accuracy radius parameter. For each continuous value, the probability distribution function is applied, resulting in perturbed continuous values of the perturbed continuous columns. Finally, the perturbed continuous columns are provided as the sanitized log.
Type: Application
Filed: September 12, 2017
Publication date: November 8, 2018
Inventors: Wasilij Beskorovajnov, Daniel Bernau
-
Publication number: 20180307854
Abstract: Systems and methods are provided for sending a request to register a data offer from a data owner to participate in a distributed ledger, the request including information associated with the data offer and a privacy budget for the data offer, wherein the information associated with the data offer and the privacy budget is stored in the distributed ledger and the data offer is accessible by third parties to the data owner. The systems and methods further provide for receiving a request, associated with a third-party computer, to access data associated with the data offer; processing the data request, based on determining that there is sufficient privacy budget to allow access to the data, to produce result data; anonymizing the result data; and updating the distributed ledger.
Type: Application
Filed: April 25, 2017
Publication date: October 25, 2018
Inventors: Daniel Bernau, Florian Hahn, Jonas Boehler
-
Publication number: 20180173894
Abstract: A system for differential privacy is provided. In some implementations, the system performs operations comprising receiving a plurality of indices for a plurality of perturbed data points, which are anonymized versions of a plurality of unperturbed data points, wherein the plurality of indices indicate that the plurality of unperturbed data points are identified as presumed outliers. The plurality of perturbed data points can lie around a first center point and the plurality of unperturbed data points can lie around a second center point. The operations can further comprise classifying a portion of the presumed outliers as true positives and another portion of the presumed outliers as false positives, based upon differences in distances to the respective first and second center points for the perturbed and corresponding (e.g., same index) unperturbed data points. Related systems, methods, and articles of manufacture are also described.
Type: Application
Filed: December 21, 2016
Publication date: June 21, 2018
Inventors: Jonas Boehler, Daniel Bernau, Florian Kerschbaum
-
Publication number: 20180004978
Abstract: A set of data is received for a data analysis. The set of data includes personally identifiable information. The set of data is anonymized to protect the privacy of that information. Risk rates and utility rates are determined for a number of combinations of anonymization techniques defined correspondingly for data fields from the set of data. A risk rate is related to a privacy-protection failure when defining first anonymized data through applying a combination of anonymization techniques for the data fields. A utility rate is related to the accuracy of the data analysis when applied over the anonymized data. Based on an evaluation of the risk rates and the utility rates, one or more anonymization techniques from the number of anonymization techniques are determined. The set of data is anonymized according to the determined anonymization techniques and/or a combination thereof.
Type: Application
Filed: June 27, 2017
Publication date: January 4, 2018
Inventors: Cedric Hebert, Daniel Bernau, Amine Lahouel