23 Research Data Things – 10: Sharing Sensitive Data

The UKAN decision-making framework is a helpful tool for sorting through the issues of making data anonymous.

Replacing identifiable information (e.g. names, addresses, or reference numbers that may be traceable to an individual) with a random unique identifier in the dataset is one important aspect of making data anonymous, but it’s not the whole story. UKAN’s advice is very good: “know your data”, go through each field and consider the potential risks. It’s also important to think about the risks of fields in association with other ones. For example, one field, like the sex of the individual, might not mean much on its own, but it could be riskier in combination with another variable. This dataset is a good example of anonymised data: a “couple ID” field has been made for each couple so they can be identified for the sake of data analysis, but there are no personal details about them.
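To make the two ideas above concrete, here is a minimal Python sketch. It is not taken from UKAN's framework. The function names (`pseudonymise`, `rare_combinations`), the field names, the `participant_id` column and the sample records are all made up for illustration. The first function swaps direct identifiers for a random unique ID, and the second counts how many records share each combination of the remaining fields, so that very small groups stand out.

```python
import secrets
from collections import Counter


def pseudonymise(records, direct_identifiers, id_field="participant_id"):
    """Drop direct identifiers and give each record a random unique ID.

    `direct_identifiers` and `id_field` are hypothetical names chosen for
    this sketch; adapt them to the fields in your own dataset.
    """
    used_ids = set()
    cleaned = []
    for record in records:
        # Draw random IDs until we get one that has not been used yet.
        new_id = secrets.token_hex(4)
        while new_id in used_ids:
            new_id = secrets.token_hex(4)
        used_ids.add(new_id)
        kept = {k: v for k, v in record.items() if k not in direct_identifiers}
        cleaned.append({id_field: new_id, **kept})
    return cleaned


def rare_combinations(records, fields, threshold=2):
    """Return combinations of `fields` shared by fewer than `threshold` records."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return {combo: n for combo, n in counts.items() if n < threshold}


# Invented sample data, for illustration only.
people = [
    {"name": "Person A", "address": "1 Example St", "sex": "F", "town": "Bigtown"},
    {"name": "Person B", "address": "2 Example St", "sex": "F", "town": "Bigtown"},
    {"name": "Person C", "address": "3 Example St", "sex": "M", "town": "Bigtown"},
    {"name": "Person D", "address": "4 Example St", "sex": "M", "town": "Bigtown"},
    {"name": "Person E", "address": "5 Example St", "sex": "M", "town": "Smalltown"},
]

cleaned = pseudonymise(people, direct_identifiers={"name", "address"})

# On its own, "sex" does not single anyone out in this sample...
alone = rare_combinations(cleaned, ["sex"])
# ...but combined with "town", one record becomes unique.
combined = rare_combinations(cleaned, ["sex", "town"])
```

In this invented sample, `alone` is empty while `combined` flags the single male record from Smalltown. A check like this only points at fields worth a closer look. It does not by itself show that a dataset is safe to share.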


2 Comments

  1. A good example my manager (who is a statistician and a data custodian of health and hospital data) gives is that even without the names and dates of birth of babies, with something like the birth of twins in a small country town, the babies can still be identified because of small population numbers. That is why potentially identifiable data needs to be considered within a dataset as a whole. I hope that makes sense, it is a bit hard to explain in writing.
