Patents by Inventor Luk ARBUCKLE

Luk ARBUCKLE has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230307104
    Abstract: Methods and systems to de-identify a longitudinal dataset of personal records based on journalistic risk computed from a sample set of the personal records, including determining a similarity distribution of the sample set based on quasi-identifiers of the respective personal records, converting the similarity distribution of the sample set to an equivalence class distribution, and computing journalistic risk based on the equivalence class distribution. In an embodiment, multiple similarity measures are determined for a personal record based on comparisons with multiple combinations of other personal records of the sample set, and an average of the multiple similarity measures is rounded. In an embodiment, similarity measures are determined for a subset of the sample set and, for each similarity measure, the number of records having the similarity measure is projected to the subset of personal records. Journalistic risk may be computed for multiple types of attacks.
    Type: Application
    Filed: May 26, 2023
    Publication date: September 28, 2023
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Publication number: 20230237196
    Abstract: A data anonymization pipeline system for managing the holding and pooling of data is disclosed. The data anonymization pipeline system transforms personal data at a source and then stores the transformed data in a safe environment. Furthermore, a re-identification risk assessment is performed before a user is given access to fetch the de-identified data for secondary purposes.
    Type: Application
    Filed: March 30, 2023
    Publication date: July 27, 2023
    Inventors: Lon Michel Luk Arbuckle, Jordan Elijah Collins, Khaldoun Zine El Abidine, Khaled El Emam
  • Patent number: 11664098
    Abstract: Methods and systems to de-identify a longitudinal dataset of personal records based on journalistic risk computed from a sample set of the personal records, including determining a similarity distribution of the sample set based on quasi-identifiers of the respective personal records, converting the similarity distribution of the sample set to an equivalence class distribution, and computing journalistic risk based on the equivalence class distribution. In an embodiment, multiple similarity measures are determined for a personal record based on comparisons with multiple combinations of other personal records of the sample set, and an average of the multiple similarity measures is rounded. In an embodiment, similarity measures are determined for a subset of the sample set and, for each similarity measure, the number of records having the similarity measure is projected to the subset of personal records. Journalistic risk may be computed for multiple types of attacks.
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: May 30, 2023
    Assignee: Privacy Analytics Inc.
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Patent number: 11620408
    Abstract: A data anonymization pipeline system for managing the holding and pooling of data is disclosed. The data anonymization pipeline system transforms personal data at a source and then stores the transformed data in a safe environment. Furthermore, a re-identification risk assessment is performed before a user is given access to fetch the de-identified data for secondary purposes.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: April 4, 2023
    Assignee: Privacy Analytics Inc.
    Inventors: Lon Michel Luk Arbuckle, Jordan Elijah Collins, Khaldoun Zine El Abidine, Khaled El Emam
  • Publication number: 20230100347
    Abstract: A method includes collecting one or more datasets of information. The method also includes separating the one or more datasets into respective blocks of data. The method further includes determining whether the information within the blocks of data is consistent, or whether one or more violations occur within the blocks of data. In addition, the method includes applying a first noise function based on the determination that the information within the blocks of data is consistent, wherein the first noise function is applied when a loss of privacy and/or confidentiality exceeds a threshold. The method also includes displaying the blocks of data with the first noise function.
    Type: Application
    Filed: September 30, 2022
    Publication date: March 30, 2023
    Inventors: Lon Michel Luk Arbuckle, Devyani Biswal
  • Publication number: 20220115101
    Abstract: Methods and systems to de-identify a longitudinal dataset of personal records based on journalistic risk computed from a sample set of the personal records, including determining a similarity distribution of the sample set based on quasi-identifiers of the respective personal records, converting the similarity distribution of the sample set to an equivalence class distribution, and computing journalistic risk based on the equivalence class distribution. In an embodiment, multiple similarity measures are determined for a personal record based on comparisons with multiple combinations of other personal records of the sample set, and an average of the multiple similarity measures is rounded. In an embodiment, similarity measures are determined for a subset of the sample set and, for each similarity measure, the number of records having the similarity measure is projected to the subset of personal records. Journalistic risk may be computed for multiple types of attacks.
    Type: Application
    Filed: December 23, 2021
    Publication date: April 14, 2022
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Patent number: 11238960
    Abstract: A system, method and computer readable memory for determining journalist risk of a dataset using population equivalence class distribution estimation. The dataset may be a cross-sectional dataset or a longitudinal dataset. The risk of identification can be determined and used in a de-identification process of the dataset.
    Type: Grant
    Filed: November 27, 2015
    Date of Patent: February 1, 2022
    Assignee: Privacy Analytics Inc.
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Publication number: 20200311308
    Abstract: A data anonymization pipeline system for managing the holding and pooling of data is disclosed. The data anonymization pipeline system transforms personal data at a source and then stores the transformed data in a safe environment. Furthermore, a re-identification risk assessment is performed before a user is given access to fetch the de-identified data for secondary purposes.
    Type: Application
    Filed: March 27, 2020
    Publication date: October 1, 2020
    Inventors: Lon Michel Luk Arbuckle, Jordan Elijah Collins, Khaldoun Zine El Abidine, Khaled El Emam
  • Patent number: 10685138
    Abstract: There is provided a system and method executed by a processor for estimating re-identification risk of a single individual in a dataset. The individual, subject or patient is described by a data subject profile such as a record in the dataset. A population distribution is retrieved from a storage device; the population distribution is determined by one or more quasi-identifying fields identified in the data subject profile. An information score is then assigned to each quasi-identifying (QI) value of the one or more quasi-identifying fields associated with the data subject profile. The assigned information scores of the quasi-identifying values for the data subject profile are aggregated into an aggregated information value. An anonymity value is then calculated from the aggregated information value and a size of a population associated with the dataset. A re-identification metric for the individual is then calculated from the anonymity value.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: June 16, 2020
    Assignee: Privacy Analytics Inc.
    Inventors: Martin Scaiano, Stephen Korte, Andrew Baker, Geoffrey Green, Khaled El Emam, Luk Arbuckle
  • Patent number: 9990515
    Abstract: In longitudinal datasets, it is usually unrealistic that an adversary would know the value of every quasi-identifier. De-identifying a dataset under this assumption results in high levels of generalization and suppression as every patient is unique. Adversary power gives an upper bound on the number of values an adversary knows about a patient. Considering all subsets of quasi-identifiers with the size of the adversary power is computationally infeasible. A method is provided to assess re-identification risk by determining a representative risk which can be used as a proxy for the overall risk measurement and enable suppression of identifiable quasi-identifiers.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: June 5, 2018
    Assignee: Privacy Analytics Inc.
    Inventors: Andrew Baker, Luk Arbuckle, Khaled El Emam, Ben Eze, Stephen Korte, Sean Rose, Cristina Ilie
  • Publication number: 20180114037
    Abstract: There is provided a system and method executed by a processor for estimating re-identification risk of a single individual in a dataset. The individual, subject or patient is described by a data subject profile such as a record in the dataset. A population distribution is retrieved from a storage device; the population distribution is determined by one or more quasi-identifying fields identified in the data subject profile. An information score is then assigned to each quasi-identifying (QI) value of the one or more quasi-identifying fields associated with the data subject profile. The assigned information scores of the quasi-identifying values for the data subject profile are aggregated into an aggregated information value. An anonymity value is then calculated from the aggregated information value and a size of a population associated with the dataset. A re-identification metric for the individual is then calculated from the anonymity value.
    Type: Application
    Filed: April 1, 2016
    Publication date: April 26, 2018
    Inventors: Martin Scaiano, Stephen Korte, Andrew Baker, Geoffrey Green, Khaled El Emam, Luk Arbuckle
  • Patent number: 9773124
    Abstract: A system and method of performing date shifting with randomized intervals for the de-identification of a dataset from a source database containing information identifiable to individuals is provided. The de-identified dataset is retrieved, comprising a plurality of entries or records containing personal identifying information. Date quasi-identifiers that are potentially identifying for a patient can be identified in the dataset for the entries. Date events are consolidated in the date quasi-identifiers and connected dates in the dataset. The date events are moved relative to an anchor date in a longitudinal sequence of the date events. De-identification of the entries in the dataset including the date quasi-identifiers is performed to meet a risk metric defining the risk of re-identifying patients associated with the records.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: September 26, 2017
    Assignee: Privacy Analytics Inc.
    Inventors: Khaled El Emam, Luk Arbuckle, Ben Eze, Geoffrey Green
  • Publication number: 20160154978
    Abstract: In longitudinal datasets, it is usually unrealistic that an adversary would know the value of every quasi-identifier. De-identifying a dataset under this assumption results in high levels of generalization and suppression as every patient is unique. Adversary power gives an upper bound on the number of values an adversary knows about a patient. Considering all subsets of quasi-identifiers with the size of the adversary power is computationally infeasible. A method is provided to assess re-identification risk by determining a representative risk which can be used as a proxy for the overall risk measurement and enable suppression of identifiable quasi-identifiers.
    Type: Application
    Filed: November 30, 2015
    Publication date: June 2, 2016
    Inventors: Andrew Baker, Luk Arbuckle, Khaled El Emam, Ben Eze, Stephen Korte, Sean Rose, Cristina Ilie
  • Publication number: 20160155061
    Abstract: A system, method and computer readable memory for determining journalist risk of a dataset using population equivalence class distribution estimation. The dataset may be a cross-sectional dataset or a longitudinal dataset. The risk of identification can be determined and used in a de-identification process of the dataset.
    Type: Application
    Filed: November 27, 2015
    Publication date: June 2, 2016
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Publication number: 20150339496
    Abstract: A system and method of performing date shifting with randomized intervals for the de-identification of a dataset from a source database containing information identifiable to individuals is provided. The de-identified dataset is retrieved, comprising a plurality of entries or records containing personal identifying information. Date quasi-identifiers that are potentially identifying for a patient can be identified in the dataset for the entries. Date events are consolidated in the date quasi-identifiers and connected dates in the dataset. The date events are moved relative to an anchor date in a longitudinal sequence of the date events. De-identification of the entries in the dataset including the date quasi-identifiers is performed to meet a risk metric defining the risk of re-identifying patients associated with the records.
    Type: Application
    Filed: May 22, 2015
    Publication date: November 26, 2015
    Inventors: Khaled El Emam, Luk Arbuckle, Ben Eze, Geoffrey Green
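
Illustrative sketch (not taken from the patents): the abstract of patent 10685138 (publication 20180114037) describes a per-individual pipeline in which each quasi-identifier value receives an information score, the scores are aggregated, an anonymity value is derived from the aggregate and the population size, and a re-identification metric is computed from the anonymity value. The abstract gives no formulas, so everything concrete below is an assumption made for illustration: the score is taken to be self-information in bits, aggregation is a plain sum (which treats the quasi-identifiers as independent), anonymity is log2 of the population size minus the aggregate, and the metric is 2 raised to the negative anonymity, capped at 1. The function names and the input layout are hypothetical.

```python
import math


def information_score(probability):
    """Assumed score: self-information (bits) of one quasi-identifier value."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def reidentification_risk(profile, population_distribution, population_size):
    """Return (aggregated information, anonymity, risk) for one data subject.

    profile: dict mapping quasi-identifying field -> the subject's value
    population_distribution: dict mapping field -> {value: probability}
    population_size: number of people in the population behind the dataset
    """
    # Assign an information score to each quasi-identifying value.
    scores = [
        information_score(population_distribution[field][value])
        for field, value in profile.items()
    ]
    # Aggregate the scores (assumption: a sum, i.e. independent fields).
    aggregated = sum(scores)
    # Assumed anonymity: bits needed to single out one person in the
    # population, minus the bits the profile already reveals.
    anonymity = math.log2(population_size) - aggregated
    # Assumed metric: 1 / (expected number of matching people), capped at 1.
    risk = min(1.0, 2.0 ** (-anonymity))
    return aggregated, anonymity, risk


# Hypothetical population distribution and data subject profile.
distribution = {
    "sex": {"F": 0.5, "M": 0.5},
    "birth_year": {"1980": 0.25, "1981": 0.75},
}
subject = {"sex": "F", "birth_year": "1980"}

print(*reidentification_risk(subject, distribution, population_size=1024))
# prints: 3.0 7.0 0.0078125
```

Under these assumptions the example subject reveals 3 bits out of the 10 needed to single out one person among 1024, so about 128 people are expected to share the profile and the metric is 1/128.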