Heightened protection for sensitive data is becoming quite trendy in privacy laws around the world. Originating in European Union (EU) data protection law and included in the EU’s General Data Protection Regulation (GDPR), sensitive data singles out certain categories of personal data for extra protection. Commonly recognized special categories of sensitive data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, health, sexual orientation and sex life, biometric data, and genetic data.
Although heightened protection for sensitive data appropriately recognizes that not all situations involving personal data should be protected uniformly, the sensitive data approach is a dead end. The sensitive data categories are arbitrary and lack any coherent theory for identifying them. The borderlines of many categories are so blurry that they are useless. Moreover, it is easy to use non-sensitive data as a proxy for certain types of sensitive data.
Personal data is akin to a grand tapestry, with different types of data interwoven to a degree that makes it impossible to separate out the strands. With Big Data and powerful machine learning algorithms, most non-sensitive data can give rise to inferences about sensitive data. In many privacy laws, data that can give rise to inferences about sensitive data is also protected as sensitive data. Arguably, then, nearly all personal data can be sensitive, and the sensitive data categories can swallow up everything. As a result, most organizations are currently processing a vast amount of data in violation of the laws.
This Article argues that the problems with the sensitive data approach make it unworkable and counterproductive — as well as expose a deeper flaw at the root of many privacy laws. These laws make a fundamental conceptual mistake — they embrace the idea that the nature of personal data is a sufficiently useful focal point for the law. But nothing meaningful for regulation can be determined solely by looking at the data itself. Data is what data does. Personal data is harmful when its use causes harm or creates a risk of harm. It is not harmful if it is not used in a way to cause harm or risk of harm.
To be effective, privacy law must focus on use, harm, and risk rather than on the nature of personal data. The implications of this point extend far beyond sensitive data provisions. In many elements of privacy laws, protections should be based on the use of personal data and proportionate to the harm and risk involved with those uses.
GW Paper Series
118 Northwestern University Law Review (Forthcoming)