“Big data” refers to extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. In essence, it means taking millions of data points, making sense of them, and learning from them. This can be extremely useful for medical research and other noble purposes. For example, the personal health data and genetic information of a large number of people could help us understand, combat, or conquer disease.
But we’re reluctant to share our health and genetic details if we can be identified individually. Is the use of anonymous or pseudonymous data the answer? Anonymous data is not unique or tied to a specific person. “Rendering anonymous” means masking personal data so that information concerning personal or material circumstances cannot be attributed to an identified or identifiable natural person. All person-related data that could allow backtracking is purged. But anonymization is not foolproof, because the data is often still attached to your demographic information, including all the “digital DNA” you leave behind in the digital universe.
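The re-identification risk described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the records, the field names (`zip`, `birth_year`, `sex`), and the helper functions are hypothetical and do not come from any real data set. The sketch counts how many "anonymous" records have a demographic combination that no other record shares, since those are the ones most easily matched to a person using outside information.

```python
from collections import Counter

# Hypothetical "anonymized" records: names are removed, demographics are kept.
records = [
    {"zip": "45202", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"zip": "45202", "birth_year": 1961, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "45202", "birth_year": 1984, "sex": "M", "diagnosis": "hypertension"},
    {"zip": "40202", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# Assumed demographic fields that could be matched against outside sources.
DEMOGRAPHIC_FIELDS = ("zip", "birth_year", "sex")


def demographic_key(record):
    """Return the combination of demographic values for one record."""
    return tuple(record[field] for field in DEMOGRAPHIC_FIELDS)


def unique_records(rows):
    """Return the records whose demographic combination appears only once."""
    counts = Counter(demographic_key(row) for row in rows)
    return [row for row in rows if counts[demographic_key(row)] == 1]


exposed = unique_records(records)
print(len(exposed), "of", len(records), "records have a unique demographic profile")
```

In this made-up sample, the two records that share a ZIP code, birth year, and sex blend together, while the other two stand alone even though no name appears anywhere in the data.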
Pseudonymous data means the data subject’s name and other identifying features have been replaced with generic identifiers, making it impossible or extremely difficult to identify the data subject. When genetic data is involved, pseudonymization often reaches its limits because genetic information is itself identifying. As a result, depersonalization cannot guarantee that data subjects’ identities will remain unrecoverable.
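As a rough illustration of replacing a name with a generic identifier, here is a minimal Python sketch. It is one possible approach, not a description of any particular organization's method: the secret key, the field names, and the `pseudonym` and `pseudonymize` helpers are all hypothetical. It uses a keyed hash (HMAC-SHA-256 from Python's standard library) so that the same person always receives the same identifier, while someone without the key cannot simply hash a list of names to reverse the mapping.

```python
import hashlib
import hmac

# Hypothetical secret key; the assumption is that it is stored apart from the data.
SECRET_KEY = b"example-key-kept-apart-from-the-data"


def pseudonym(name: str) -> str:
    """Derive a stable generic identifier from a name using a keyed hash."""
    normalized = name.strip().lower().encode("utf-8")
    digest = hmac.new(SECRET_KEY, normalized, hashlib.sha256)
    return "subject-" + digest.hexdigest()[:12]


def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with the name replaced by a pseudonym."""
    cleaned = {key: value for key, value in record.items() if key != "name"}
    cleaned["subject_id"] = pseudonym(record["name"])
    return cleaned


# Made-up example record.
patient = {"name": "Jane Doe", "birth_year": 1961, "genetic_marker": "BRCA1 variant"}
print(pseudonymize(patient))
```

Note that the genetic field passes through untouched. That is the limit the paragraph above describes: the name is gone, but the remaining genetic information may still point to one person.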
While anonymous and pseudonymous coding are part of the answer to how we can contribute to the big data needs of public health and still preserve our individual privacy, they are not without risk. And yet solutions continue to be developed in the great race between privacy and technology.
For more information, please contact Joe Dehner or any attorney in Frost Brown Todd’s Privacy and Information Security Law Industry Group.