
...

Presentation by Khaled El Emam on Re-identification Risk Measurement - slides, slides with annotations.

Questions for the group were how to pick a threat model, which identifiers to be concerned about, and how to establish a risk threshold for public data release.
Apply stratification principles to structured data. If you have unstructured data, structure it first.
Identity disclosure is when a person's identity is assigned to a record.
The aim is to measure the risk of re-identification for a dataset.
Quasi-identifiers are the fields assumed to be known by an attacker.
Delete or encrypt/hash direct identifiers first. What we end up with after that is pseudonymous data.
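A minimal sketch of the "hash direct identifiers first" step, assuming records are plain Python dicts. The field names, the `SECRET_KEY` value, and the `pseudonymize` helper are hypothetical and not taken from the presentation. A keyed hash (HMAC) is used here as one possible choice, since an unkeyed hash of a low-entropy identifier can be reversed by guessing.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept separate from the released data.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical direct identifiers; the notes do not list specific fields.
DIRECT_IDENTIFIERS = {"name", "health_card_number"}


def pseudonymize(record, key=SECRET_KEY):
    """Return a copy of record with direct identifiers replaced by keyed hashes."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            # HMAC-SHA256 gives a stable pseudonym for the same input and key.
            out[field] = hmac.new(
                key, str(value).encode("utf-8"), hashlib.sha256
            ).hexdigest()
        else:
            # Quasi-identifiers and other fields are left untouched at this stage.
            out[field] = value
    return out


record = {"name": "Jane Doe", "health_card_number": "1234567890",
          "age": 34, "sex": "F"}
pseudonymous = pseudonymize(record)
```

The quasi-identifiers (`age`, `sex` in this made-up record) are still present after this step, which is why the result is pseudonymous rather than anonymous.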
definition of identity disclosure
quasi-identifiers
attack in two directions - population to sample, sample to population
risk measured by group size (a group of size 1 = unique record)
generalize - group size gets bigger - risk reduces; risk metrics: maximum risk (k-anonymity) for public release, average risk for non-public release, unicity
risk denominator is not group size in sample but in population
risk threshold in identifiability spectrum
privacy-utility tradeoff
data transformations - generalization, suppression, addition of noise, microaggregation
for non-public data, can add controls (privacy, security, contractual)
motivated intruder attack
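The group-size risk measures in these notes can be sketched as follows. This is an illustration under stated assumptions, not the presenter's method: the records, the quasi-identifier names, and the helpers `risk_summary` and `generalize_age` are all hypothetical. It also uses group sizes counted in the sample as the denominator. Since a group in the sample is never larger than the same group in the population, this overestimates risk relative to the population-based denominator the notes call for.

```python
from collections import Counter


def risk_summary(records, quasi_identifiers):
    """Summarize re-identification risk from sample group sizes."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)  # group size per quasi-identifier combination
    per_record = [1 / sizes[k] for k in keys]  # risk of each record = 1 / group size
    n = len(records)
    return {
        "k": min(sizes.values()),             # smallest group (k-anonymity level)
        "max_risk": max(per_record),          # maximum risk, equal to 1 / k
        "avg_risk": sum(per_record) / n,      # average risk across records
        "unicity": sum(1 for s in sizes.values() if s == 1) / n,  # share of unique records
    }


def generalize_age(record, width=10):
    """Generalization: replace an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Made-up sample data for illustration only.
records = [
    {"age": 31, "sex": "F"}, {"age": 34, "sex": "F"}, {"age": 38, "sex": "F"},
    {"age": 52, "sex": "M"}, {"age": 55, "sex": "M"}, {"age": 57, "sex": "M"},
]
qids = ["age", "sex"]

before = risk_summary(records, qids)
after = risk_summary([generalize_age(r) for r in records], qids)
```

In this toy data every record is unique before generalization (k = 1, unicity 1.0). After banding ages into decades there are two groups of three, so k rises to 3 and no record is unique. Whether such a result meets a threshold depends on the release context discussed above (public versus non-public, and any additional controls).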

...
