Crime and privacy in open data
Wednesday, Oct 26, 2016, 02:37 AM | Source: Pursuit
By Ben Rubinstein
Anonymisation is the mathematical process of removing detail or perturbing a dataset so that individual identification is infeasible. If the anonymisation is weak or flawed, individual people might be reidentified - this is called "deanonymisation" or "reidentification".
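To make the idea concrete, here is a minimal sketch of how reidentification can work when names are removed but other details are left in place. Everything in it is hypothetical: the records, the field names, and the helper functions `key` and `reidentify` are invented for illustration and do not describe any real dataset or attack. It assumes an attacker who holds a small amount of public information (postcode, birth year, sex) and looks for released records that match exactly one person.

```python
from collections import Counter

# Hypothetical "anonymised" release: names removed, other details kept.
released = [
    {"postcode": "3052", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "3052", "birth_year": 1984, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "3053", "birth_year": 1990, "sex": "F", "diagnosis": "migraine"},
    {"postcode": "3053", "birth_year": 1990, "sex": "F", "diagnosis": "eczema"},
]

# Hypothetical auxiliary data an attacker might already hold.
public = [
    {"name": "Alice", "postcode": "3052", "birth_year": 1984, "sex": "F"},
    {"name": "Carol", "postcode": "3053", "birth_year": 1990, "sex": "F"},
]

# Fields assumed to be shared between the two datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(record):
    """Return the combination of quasi-identifier values for a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(released, public):
    """Link named people to released records whose key is unique."""
    counts = Counter(key(r) for r in released)
    unique = {key(r): r for r in released if counts[key(r)] == 1}
    return {
        p["name"]: unique[key(p)]["diagnosis"]
        for p in public
        if key(p) in unique
    }


matches = reidentify(released, public)
print(matches)
```

In this toy data Alice's combination of details is unique, so her record is exposed, while Carol shares hers with another record and cannot be singled out. Coarsening or perturbing the shared fields until no combination is unique is the kind of strengthening that testing of this sort can prompt.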
The Federal Government has recently proposed amendments to the Privacy Act 1988, criminalising reidentification. With privacy clearly in the public interest, we welcome legal protections of privacy. However, the bill in its initial form risks the very privacy the law intends to safeguard. We believe it is the use or distribution of deanonymised data that should be a crime, not the act of reidentification.
The Australian government has vast quantities of information about individual Australians. Such data has enormous potential for evidence-based policy development but is not really "government data" - it is data about people, entrusted to the government's care. The release of those datasets to policy experts, or publication as "open data", brings with it risks to individual privacy.
Testing and investigating deanonymisation is critical for protecting privacy because it allows weaknesses to be found and fixed. Not knowing about a weakness does not mean it does not exist. Criminalising the analysis and reporting of weaknesses will only stop legitimate researchers; it will do nothing to inhibit criminals or overseas entities.
The new rules won't affect us directly, because the federal Privacy Act doesn't cover public universities, except the Australian National University. But some of the best research in privacy is done outside academia, and that research should be free too.
Weakly anonymised data is a serious privacy risk. Prohibiting deanonymisation could mean that weakly anonymised datasets stay online for longer, or that weak methods are reused more widely. The best policy would be to make anonymisation and encryption methods public prior to the release of the data, to allow for a period of peer review and consultation. In any case, privacy protections would be improved by free and open research and analysis of deanonymisation. Criminalising it is a weakening of privacy protections, not a strengthening.
Effective methods of securing data and protecting privacy will be critical for the future use of data. Large companies like Apple and Google already experiment with novel techniques for deriving knowledge from encrypted or perturbed data. Innovation in this space would support the Australian economy and protect our privacy, but we won't know which methods are secure unless we can freely investigate which methods are weak.
The use or distribution of deanonymised data should be a crime, but research on deanonymisation should be encouraged.