As you analyse your data and develop insights, you will anonymise personal data. Anonymising data means removing any information that could be used to identify an individual.
Once data has been fully anonymised, and is used in your research artefacts, it is not considered personal data, and does not need to be managed in the ways described in this guidance.
Page contents
- Understanding anonymisation and pseudonymisation in user research
- Follow the DfE Data Anonymisation and Pseudonymisation Standard
- Don't store anonymised research as personal data
- Pseudonymised research outputs are still considered personal data
- Tips for anonymising user research data
Understanding anonymisation and pseudonymisation in user research
Once data has been anonymised, it is no longer possible to identify an individual person. You should anonymise data through the analysis and synthesis of your user research.
Data can also be pseudonymised. This means that personal data could still be linked to an individual person if it was joined up with additional data.
You can use pseudonymised data in your work, but you must get consent from the participant to use their data in this way, and you must be careful to ensure the person is not immediately identifiable. You must still handle and store it as personal data.
Examples of anonymised data:
- A quote from a user, attributed in a way that could not be used to identify them (e.g. "Key Stage 2 teacher, London")
- A user persona, built from insight gathered across a range of research and data
- A video of a usability task, with any details on the screen or audio that could be used to identify the participant blurred or silenced.
Examples of pseudonymised data:
- Using a unique identifier for participants (e.g. "P5R30502") in a spreadsheet of insights shared with your team, with a separate file matching identifiers to people (e.g. "P5R30502 = Joe Bloggs")
- A video showing somebody's face and voice. If this was connected to a detail about where they worked, or the village they live in, then the person could be identified.
Read the ICO definition of pseudonymisation.
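As an illustration of the unique identifier example above, here is a minimal Python sketch. It assumes your insights are held as simple rows with a participant name. The function names, the identifier format and the example data are all hypothetical, and this is not a DfE-approved tool. Both the shared rows and the key are pseudonymised data, so both must still be handled and stored as personal data.

```python
import secrets


def assign_identifiers(names):
    """Give each participant name a random identifier (hypothetical format)."""
    key = {}
    used = set()
    for name in names:
        identifier = "P" + secrets.token_hex(4).upper()
        # Regenerate in the unlikely event of a clash
        while identifier in used:
            identifier = "P" + secrets.token_hex(4).upper()
        used.add(identifier)
        key[name] = identifier
    return key


def pseudonymise_rows(rows, key):
    """Return copies of the rows with each name replaced by its identifier."""
    return [{**row, "participant": key[row["participant"]]} for row in rows]


# Made-up example data, for illustration only
rows = [
    {"participant": "Alex Example", "insight": "Found the sign-in step confusing"},
    {"participant": "Sam Sample", "insight": "Wanted a progress indicator"},
    {"participant": "Alex Example", "insight": "Preferred email reminders"},
]

# The key matches names to identifiers: keep it in a separate file
key = assign_identifiers(sorted({row["participant"] for row in rows}))

# These rows carry identifiers instead of names
shared_rows = pseudonymise_rows(rows, key)
```

The identifiers are random rather than derived from the name, so they cannot be reversed without the key. Free-text fields such as the insight itself may still contain identifying details and need checking separately.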
Follow the DfE Data Anonymisation and Pseudonymisation Standard
You must follow the DfE Data Anonymisation and Pseudonymisation Standard (DfE users only) when anonymising your data.
For qualitative data, there are four principles you should consider:
- redacting individuals' names from documents
  - also consider whether individuals could still be identified by other details contained within the document
- blurring video footage to disguise faces
  - also consider whether individuals are identifiable by other attributes, such as clothing or voice
- electronically disguising or re-recording audio material
  - also consider whether comments made or discussions in the re-recording could lead to potential identification
- changing the details in a report, such as precise place names or precise dates
  - also consider whether removing precise details loses the integrity of the research
Don't store anonymised research as personal data
Research outputs that are fully anonymous (e.g. reports, personas, journey maps) are no longer considered personal data.
These therefore no longer need to be stored in your personal data SharePoint space using the 'user research' retention label: they can be stored in your team's normal project folders or elsewhere. They should have the standard 'business operational' retention label (or another retention label, if applicable to your team's work).
Pseudonymised research outputs are still considered personal data
Any document containing pseudonymised data must continue to be managed in the same way as personal data, with the same retention label.
Tips for anonymising user research data
Anonymise as you go
Most personal data can be anonymised whilst you're collecting it. For file names and titles relating to the participant, you should use a unique identifier instead of their name.
During research sessions, participants will often mention identifiable information, for example the names of companies or colleagues, or the name of their child's school.
If you describe what they say but remove any identifiable information, this saves you time and effort later. Write your descriptions in square brackets to indicate that you're not quoting verbatim. Some examples are:
- "I'm employed by Royal Mail" could become "I'm employed by [a large postal carrier]"
- "I work really closely with Uzma" could become "I work really closely with [a colleague]"
- "My kids go to Broad Oak primary school" could become "My kids go to [local primary school]"
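If you are tidying up notes after a session, the substitutions above could be sketched in Python as follows. This is a minimal illustration, not an approved tool: the `redact` function and the `replacements` list are hypothetical, and the approach only catches terms you have listed yourself, so you still need to read through the notes for anything it misses.

```python
import re


def redact(text, replacements):
    """Replace each listed term with its description in square brackets."""
    # Handle longer terms first so a short term cannot split a longer one
    for term in sorted(replacements, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        text = pattern.sub("[" + replacements[term] + "]", text)
    return text


# Terms to remove and the descriptions to use instead
replacements = {
    "Royal Mail": "a large postal carrier",
    "Uzma": "a colleague",
    "Broad Oak primary school": "local primary school",
}

print(redact("I'm employed by Royal Mail", replacements))
# prints: I'm employed by [a large postal carrier]
```

The list of terms is itself personal data, because it contains the names you are removing, so store it in the same way as your other personal data or delete it when you have finished.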
Be careful when using quotations
Sometimes the words and language used in direct participant quotations could identify them, if they are known to the people you are presenting your research to, such as in internal research. For example, if an individual is known for using a specific turn of phrase, then seeing that phrase in a quotation could identify them to a colleague.
Use video editing effects
If using videos, you could blur or block out people's faces (faces are personal data). You could bleep out or mute specific words that they say.
When you are editing videos, ensure that you delete the working/layered file in your video editor, or move it to your personal data SharePoint library: somebody who accessed that file would be able to extract the personal data from it.