Patents Assigned to PRIVACY ANALYTICS INC.
  • Patent number: 11782956
    Abstract: Disclosed is a method for intermediary mapping and de-identification comprising steps of retrieving datasets and metadata from a data source; selecting a target standard; mapping the retrieved datasets and the metadata to the target standard, wherein the datasets and the metadata are mapped to the target standard using a schema mapping, a variable mapping, or a combination thereof; inferring one or more of variable classifications, variable connections, groupings, disclosure risk settings, and de-identification settings using the dataset mapping and metadata; and performing a de-identification propagation using the mapped datasets, the mapped metadata, the inferred variable classifications, the inferred variable connections, the inferred groupings, the inferred disclosure risk settings, the inferred de-identification settings, or a combination thereof.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: October 10, 2023
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Muhammad Oneeb Rehman Mian, David Nicholas Maurice Di Valentino, George Wesley Bradley
  • Patent number: 11748517
    Abstract: System and method to produce an anonymized cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for the anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and grouping the dataset in time into a first boundary group, a second boundary group, and one or more non-boundary groups temporally between the first boundary group and second boundary group. For each non-boundary group, the method includes calculating maximum time limits by which the non-boundary group can be time-shifted without overlapping an adjacent group, calculating a group jitter amount, capping the group jitter amount by the maximum time limits and by respective predetermined jitter limits, and jittering said non-boundary group by the capped group jitter amount to produce an anonymized dataset. The anonymized dataset is then returned.
    Type: Grant
    Filed: April 27, 2022
    Date of Patent: September 5, 2023
    Assignee: Privacy Analytics Inc.
    Inventors: Sean Rose, Weilong Song, Martin Scaiano
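The capped jitter step described in this abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Group` type, the day-offset representation, the Gaussian draw with a hypothetical `sigma` parameter, and the left-to-right processing order are all assumptions made for illustration. Input groups are assumed to be sorted in time and non-overlapping.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    start: int  # first day of the group, as a day offset
    end: int    # last day of the group, inclusive


def jitter_groups(groups, max_back, max_forward, sigma, rng):
    """Shift each non-boundary group by a capped random number of days."""
    if len(groups) < 3:
        return list(groups)  # no non-boundary groups to jitter
    out = [groups[0]]  # the first boundary group stays fixed
    for i in range(1, len(groups) - 1):
        g = groups[i]
        # Largest shifts that keep the group strictly after the (already
        # shifted) previous group and strictly before the next group.
        room_back = max(0, g.start - out[-1].end - 1)
        room_forward = max(0, groups[i + 1].start - g.end - 1)
        # Raw jitter amount (assumed Gaussian, rounded to whole days).
        raw = round(rng.gauss(0, sigma))
        # Cap by the available room and by the predetermined limits.
        low = -min(room_back, max_back)
        high = min(room_forward, max_forward)
        shift = max(low, min(high, raw))
        out.append(Group(g.start + shift, g.end + shift))
    out.append(groups[-1])  # the last boundary group stays fixed
    return out
```

Processing groups in order against the already-shifted predecessor is one simple way to guarantee that two neighbouring groups never jitter into each other; the abstract does not say how the actual method handles this.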
  • Patent number: 11664098
    Abstract: Methods and systems to de-identify a longitudinal dataset of personal records based on journalistic risk computed from a sample set of the personal records, including determining a similarity distribution of the sample set based on quasi-identifiers of the respective personal records, converting the similarity distribution of the sample set to an equivalence class distribution, and computing journalistic risk based on the equivalence class distribution. In an embodiment, multiple similarity measures are determined for a personal record based on comparisons with multiple combinations of other personal records of the sample set, and an average of the multiple similarity measures is rounded. In an embodiment, similarity measures are determined for a subset of the sample set and, for each similarity measure, the number of records having the similarity measure is projected to the subset of personal records. Journalistic risk may be computed for multiple types of attacks.
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: May 30, 2023
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Patent number: 11620408
    Abstract: A data anonymization pipeline system for managing, holding, and pooling data is disclosed. The data anonymization pipeline system transforms personal data at a source and then stores the transformed data in a safe environment. Furthermore, a re-identification risk assessment is performed before a user is given access to fetch the de-identified data for secondary purposes.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: April 4, 2023
    Assignee: Privacy Analytics Inc.
    Inventors: Lon Michel Luk Arbuckle, Jordan Elijah Collins, Khaldoun Zine El Abidine, Khaled El Emam
  • Patent number: 11380441
    Abstract: The present disclosure is related to a method of geo-clustering of data for de-identification of a dataset. The method includes generating a plurality of geoclusters based on a plurality of geocodes. The geocodes may include ZIP codes or postal codes. The method further includes identifying the geocluster having the smallest population. The geocluster having the smallest population is iteratively merged with the nearest geocluster until a minimum population threshold is met. Once the smallest geocluster meets the minimum population threshold, the plurality of geoclusters can be used to cluster the geocodes within a dataset to be de-identified.
    Type: Grant
    Filed: May 10, 2017
    Date of Patent: July 5, 2022
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Andrew Richard Baker, Khaled El Emam
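The iterative merge described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the `Cluster` type, the planar `x`/`y` coordinates, the Euclidean distance between population-weighted centroids as the meaning of "nearest", and the function names are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    geocodes: list   # geocodes (e.g. postal codes) covered by the cluster
    population: int
    x: float         # centroid coordinates (assumed planar)
    y: float


def merge(a, b):
    """Combine two clusters, using a population-weighted centroid."""
    total = a.population + b.population
    if total == 0:
        x, y = (a.x + b.x) / 2, (a.y + b.y) / 2
    else:
        x = (a.x * a.population + b.x * b.population) / total
        y = (a.y * a.population + b.y * b.population) / total
    return Cluster(a.geocodes + b.geocodes, total, x, y)


def build_geoclusters(clusters, min_population):
    """Merge the smallest cluster into its nearest neighbour until every
    cluster meets min_population (or only one cluster remains)."""
    clusters = list(clusters)
    while len(clusters) > 1:
        smallest = min(clusters, key=lambda c: c.population)
        if smallest.population >= min_population:
            break  # the smallest cluster meets the threshold, so all do
        others = [c for c in clusters if c is not smallest]
        nearest = min(
            others,
            key=lambda c: math.dist((c.x, c.y), (smallest.x, smallest.y)),
        )
        clusters = [c for c in others if c is not nearest]
        clusters.append(merge(smallest, nearest))
    return clusters
```

The resulting clusters can then serve as a lookup from each geocode to its cluster when generalizing the geocodes in a dataset.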
  • Patent number: 11334685
    Abstract: System and method to produce an anonymized cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for the anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and grouping the dataset in time into a first boundary group, a second boundary group, and one or more non-boundary groups temporally between the first boundary group and second boundary group. For each non-boundary group, the method includes calculating maximum time limits by which the non-boundary group can be time-shifted without overlapping an adjacent group, calculating a group jitter amount, capping the group jitter amount by the maximum time limits and by respective predetermined jitter limits, and jittering said non-boundary group by the capped group jitter amount to produce an anonymized dataset. The anonymized dataset is then returned.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: May 17, 2022
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Sean Rose, Weilong Song, Martin Scaiano
  • Patent number: 11238960
    Abstract: A system, method and computer readable memory for determining journalist risk of a dataset using population equivalence class distribution estimation. The dataset may be a cross-sectional dataset or a longitudinal dataset. The risk of identification can be determined and used in a de-identification process of the dataset.
    Type: Grant
    Filed: November 27, 2015
    Date of Patent: February 1, 2022
    Assignee: Privacy Analytics Inc.
    Inventors: Stephen Korte, Luk Arbuckle, Andrew Baker, Khaled El Emam, Sean Rose
  • Patent number: 10803201
    Abstract: System and method to produce an anonymized electronic data product having an individually-determined threshold of re-identification risk, and adjusting re-identification risk measurement parameters based on individual characteristics such as geographic location, in order to provide an anonymized electronic data product having a sensitivity-based reduced risk of re-identification.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: October 13, 2020
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Hazel Joyce Nicholls, Andrew Richard Baker, Yasser Jafer, Martin Scaiano
  • Patent number: 10685138
    Abstract: There is provided a system and method executed by a processor for estimating re-identification risk of a single individual in a dataset. The individual, subject or patient is described by a data subject profile such as a record in the dataset. A population distribution is retrieved from a storage device; the population distribution is determined by one or more quasi-identifying fields identified in the data subject profile. An information score is then assigned to each quasi-identifying (QI) value of the one or more quasi-identifying fields associated with the data subject profile. The assigned information scores of the quasi-identifying values for the data subject profile are aggregated into an aggregated information value. An anonymity value is then calculated from the aggregated information value and a size of a population associated with the dataset. A re-identification metric for the individual is then calculated from the anonymity value.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: June 16, 2020
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Martin Scaiano, Stephen Korte, Andrew Baker, Geoffrey Green, Khaled El Emam, Luk Arbuckle
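The abstract above does not give formulas, so the following Python sketch fills them in with explicitly assumed choices: the information score of a value is taken as its self-information, -log2(frequency); scores are aggregated by summation (which treats the quasi-identifiers as independent); the anonymity value is log2(population size) minus the aggregated information; and the risk metric is the reciprocal of the expected number of matching people. The function names and the nested-dictionary layout of the population distribution are hypothetical.

```python
import math


def information_score(frequency):
    """Self-information, in bits, of a value with the given population frequency."""
    return -math.log2(frequency)


def reidentification_risk(profile, population_distribution, population_size):
    """Estimate re-identification risk for one data subject profile.

    profile: {field: value} for the quasi-identifying fields.
    population_distribution: {field: {value: frequency}}.
    """
    # Aggregate the per-value information scores (assumes independence).
    total_information = sum(
        information_score(population_distribution[field][value])
        for field, value in profile.items()
    )
    # Anonymity: bits of "cover" left once the profile is known.
    anonymity = math.log2(population_size) - total_information
    # 2**anonymity is the expected number of people sharing the profile.
    expected_matches = 2.0 ** anonymity
    return 1.0 / max(1.0, expected_matches)
```

For example, a profile whose values have frequencies 0.5, 0.02, and 0.1 in a population of 1,000,000 is expected to match about 1,000 people under these assumptions, giving a risk of about 0.001.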
  • Patent number: 10586074
    Abstract: System and method to produce an anonymized cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for the anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and grouping the dataset in time into a first boundary group, a second boundary group, and one or more non-boundary groups temporally between the first boundary group and second boundary group. For each non-boundary group, the method includes calculating maximum time limits by which the non-boundary group can be time-shifted without overlapping an adjacent group, calculating a group jitter amount, capping the group jitter amount by the maximum time limits and by respective predetermined jitter limits, and jittering said non-boundary group by the capped group jitter amount to produce an anonymized dataset. The anonymized dataset is then returned.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: March 10, 2020
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Sean Rose, Weilong Song, Martin Scaiano
  • Patent number: 10423803
    Abstract: System and method to produce an anonymized cohort, members of the cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for an anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and calculating an anonymity histogram of the dataset. For each patient record within the dataset, the method anonymizes the dataset by calculating using a threshold selector whether a predetermined patient profile within the dataset should be perturbed, calculating using a value selector whether a value within the indicated patient profile should be perturbed, and suppressing an indicated value within the indicated patient profile. The anonymized dataset is then returned.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: September 24, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Martin Scaiano, Andrew Baker, Stephen Korte
  • Patent number: 10424406
    Abstract: A method includes receiving an initial dataset. Each record of the initial dataset comprises a set of quasi-identifier attributes and a set of non-quasi-identifier attributes. A processor assigns a link identifier to each record and replaces each set of quasi-identifier attributes with a range to form a generalized set. The processor removes duplicate records based on identical generalized sets to generate de-duplicated records. The processor generates a randomized record by replacing the generalized set of each de-duplicated record with a corresponding set of random values. The processor passes the set of random values of each randomized record through multiple hash functions to generate multiple outputs. The multiple outputs are mapped to a Bloom filter. The processor forms a dataset by combining each randomized record with one or more sets of non-quasi-identifier attributes. The set of random values is a fingerprint for a corresponding record of the dataset.
    Type: Grant
    Filed: February 12, 2017
    Date of Patent: September 24, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Yasser Jafer, Khaled El Emam
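The final steps of this abstract (passing a record's set of random values through multiple hash functions and mapping the outputs to a Bloom filter) can be sketched in Python. This is an illustrative sketch only: the `BloomFilter` class, its default sizes, the use of salted SHA-256 as the family of hash functions, and the `new_fingerprint` helper that represents the set of random values as a byte string are all assumptions, not details from the patent.

```python
import hashlib
import secrets


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit position, for simplicity rather than compactness.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        """Yield one bit position per hash function (salted SHA-256)."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


def new_fingerprint(num_values=3):
    """Random values standing in for a record's generalized set (hypothetical)."""
    return secrets.token_bytes(8 * num_values)
```

A fingerprint added with `add` is always reported as present; a fingerprint that was never added is usually, but not always, reported as absent, which is the standard Bloom filter trade-off.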
  • Patent number: 10395059
    Abstract: A computer-implemented system and method to reduce re-identification risk of a data set. The method includes the steps of retrieving, via a database-facing communication channel, a data set from a database communicatively coupled to the processor, the data set selected to include patient medical records that meet predetermined criteria; identifying, by a processor coupled to a memory, direct identifiers in the data set; identifying, by the processor, quasi-identifiers in the data set; calculating, by the processor, a first probability of re-identification from the direct identifiers; calculating, by the processor, a second probability of re-identification from the quasi-identifiers; perturbing, by the processor, the data set if one of the first probability or second probability exceeds a respective predetermined threshold, to produce a perturbed data set; and providing, via a user-facing communication channel, the perturbed data set to the requestor.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: August 27, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Martin Scaiano, Grant Middleton, Varada Kolhatkar, Khaled El Emam
  • Patent number: 10380381
    Abstract: System and method to predict risk of re-identification of a cohort if the cohort is anonymized using a de-identification strategy. An input anonymity histogram and de-identification strategy are used to predict the anonymity histogram that would result from applying the de-identification strategy to the dataset. System embodiments compute a risk of re-identification from the predicted anonymity histogram.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: August 13, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Martin Scaiano, Andrew Baker, Stephen Korte
  • Patent number: 10318763
    Abstract: System and method to produce an anonymized cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for the anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and grouping the dataset in time into a first boundary group, a second boundary group, and one or more non-boundary groups temporally between the first boundary group and second boundary group. For each non-boundary group, the method includes calculating maximum time limits by which the non-boundary group can be time-shifted without overlapping an adjacent group, calculating a group jitter amount, capping the group jitter amount by the maximum time limits and by respective predetermined jitter limits, and jittering said non-boundary group by the capped group jitter amount to produce an anonymized dataset. The anonymized dataset is then returned.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: June 11, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Sean Rose, Weilong Song, Martin Scaiano
  • Patent number: 10242213
    Abstract: System and method to produce an anonymized cohort, members of the cohort having less than a predetermined risk of re-identification. The system includes a user-facing communication interface to receive an anonymized cohort request comprising traits to include in members of the cohort; a data source-facing communication channel to query a data source, to find anonymized records that possess at least some of the requested traits; and a processor programmed to carry out the instructions of: forming a dataset from at least some of the anonymized records; calculating a risk of re-identification of the anonymized records in the dataset based upon the data query; perturbing anonymized records in the dataset that exceed a predetermined risk of re-identification, until the risk of re-identification is not greater than the predetermined threshold, to produce the anonymized cohort; and providing, via a user-facing communication channel, the anonymized cohort.
    Type: Grant
    Filed: September 21, 2016
    Date of Patent: March 26, 2019
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Martin Scaiano, Andrew Baker, Stephen Korte, Khaled El Emam
  • Patent number: 9990515
    Abstract: In longitudinal datasets, it is usually unrealistic that an adversary would know the value of every quasi-identifier. De-identifying a dataset under this assumption results in high levels of generalization and suppression as every patient is unique. Adversary power gives an upper bound on the number of values an adversary knows about a patient. Considering all subsets of quasi-identifiers with the size of the adversary power is computationally infeasible. A method is provided to assess re-identification risk by determining a representative risk which can be used as a proxy for the overall risk measurement and enable suppression of identifiable quasi-identifiers.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: June 5, 2018
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Andrew Baker, Luk Arbuckle, Khaled El Emam, Ben Eze, Stephen Korte, Sean Rose, Cristina Ilie
  • Patent number: 9773124
    Abstract: A system and method are provided for performing date shifting with randomized intervals for the de-identification of a dataset from a source database containing information identifiable to individuals. The dataset to be de-identified is retrieved and comprises a plurality of entries or records containing personal identifying information. Date quasi-identifiers that are potentially identifying for a patient are identified for the entries in the dataset. Date events are consolidated in the date quasi-identifiers and connected dates in the dataset. The date events are moved relative to an anchor date in a longitudinal sequence of the date events. De-identification of the entries in the dataset, including the date quasi-identifiers, is performed to meet a risk metric defining the risk of re-identifying patients associated with the records.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: September 26, 2017
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Khaled El Emam, Luk Arbuckle, Ben Eze, Geoffrey Green
  • Patent number: 9503432
    Abstract: A secure linkage between databases allows records of an individual in a first database to be linked to records of the same individual in a second database without disclosing or providing personal information outside of either database or system responsible for controlling access to the respective databases. As such, records of individuals may be securely linked together without compromising privacy or security of the databases.
    Type: Grant
    Filed: April 2, 2015
    Date of Patent: November 22, 2016
    Assignee: PRIVACY ANALYTICS INC.
    Inventors: Khaled El Emam, Aleksander Essex, Ben Eze, Matthew Tucciarone