The FAIR principles

In the previous section, you considered different types of data, and how open you can be when sharing them. Given the subtleties, it is useful to have a clear set of guidelines. The FAIR principles provide this. They state that shared data should be FAIR – findable, accessible, interoperable, and reusable:

Findable: Data should be easy to find for both humans and computers. This involves using unique identifiers and metadata (information about the data).
Accessible: Once found, data should be easy to access, either openly or through an authentication or authorisation process. This means the data can be retrieved using a standardised protocol.
Interoperable: Data should be able to work with other data. This means using standardised formats and languages so that different systems can use the data together.
Reusable: Data should be well documented and organised so that it can be used again in future research, potentially by different people. This includes clear information about how the data was collected and any licences or permissions needed for its use.
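
As a rough illustration, the four principles can be read as a checklist over the metadata that describes a dataset. The Python sketch below is one way to picture this. The field names (`identifier`, `access_url`, `file_format`, `licence` and so on) and the `fair_gaps` helper are hypothetical, chosen for this example only; they are not taken from any FAIR specification or repository.

```python
# Hypothetical mapping from each FAIR principle to the metadata fields
# that would support it. The field names are illustrative assumptions.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["access_url", "access_conditions"],
    "interoperable": ["file_format"],
    "reusable": ["licence", "collection_method"],
}


def fair_gaps(record):
    """Return, per principle, the fields that are missing or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [f for f in fields if not record.get(f)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with placeholder values (not a real dataset).
record = {
    "identifier": "doi:10.1234/example",
    "title": "Interview transcripts, 2023",
    "description": "Anonymised transcripts of 20 interviews.",
    "access_url": "https://repository.example.org/datasets/42",
    "access_conditions": "restricted",
    "file_format": "text/csv",
    "licence": "",
}

print(fair_gaps(record))
# prints {'reusable': ['licence', 'collection_method']}
```

Here the record is restricted rather than open, yet it still passes the 'accessible' check because the access conditions are stated. The gaps are in reusability: no licence has been chosen and nothing is recorded about how the data was collected.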

In this video, Isabel Chadwick, a research data specialist from the Open University, talks about the FAIR principles, and how they can help researchers look after their data. As you watch the video, think about how you could follow Isabel’s advice in your own research.

Transcript: Looking after data

My name’s Isabel Chadwick. I've got a special interest in research data management. So my job involves helping researchers and research students to look after their data during their projects, thinking about all of the legal, ethical implications of looking after what is sometimes personal data, sometimes sensitive data, and sometimes just really, really big data.

The FAIR data principles are intended to give guidelines on the findability, the accessibility, the interoperability and the reusability of digital assets that are created during the course of research.

They go beyond merely saying that research data should be made open, and rather give more concrete guidance on how that data can be best exploited to enable reuse, replication, verification of results, any kind of scrutiny.

So they're really, really important because a huge amount of money and time is invested into generating research data globally.

And if that data isn't findable, if it can't be accessed, if it doesn't interoperate, then essentially it isn't reusable - and that sort of means that it's a huge financial loss for global research as well as a bit of a setback for research progress.

Sometimes there are legal, ethical or commercial reasons why research data cannot be made publicly accessible.

And FAIR data doesn't always mean open data. So even where data has to be restricted to only allow access to certain people, or maybe even no people, the really important thing is that the metadata that describes that data is made available.

Now, when we use the term metadata, what we mean is all of the information that builds up a picture about what that item or that data set might be. So a really good way of thinking about metadata is thinking about things that you use in your everyday life. So for example, you might have a record collection that you have organised according to different things.

So you might have put it in alphabetical order according to the name of the artist, or you might have put it into different sections for classical, pop, rock, for example. Or you might simply have just done it in a colour order to make it look like a pretty rainbow. And all of those are ways of organising information using different types of metadata in order to be able to find things and understand things more easily.

And we do the same things with information that includes data sets, but also would be things like publications, so every published piece of research will have rich metadata assigned to it.

When it comes to research data, that information is really, really important, because in terms of transparency, allowing people to understand how you created your data, what the data is - and pretty key to this is - how they can reuse it, is really important.

To make your data FAIR, there are a few really key steps that come at different points in your research process. So right at the beginning of your research project, before you've even started collecting your data, we would always advise that you write a data management plan. And that plan should outline how you're going to look after your data during your project, and then what's going to happen to it after your project.

In terms of what happens after your project, the best piece of advice for making your data FAIR, would be to deposit it in a trusted digital repository. And you should do this whether your data is going to be openly available to the public, or whether it's going to have a restricted access placed upon it.

The reason why we say that you should put your data into a trusted digital repository is because it will provide you with a persistent identifier like a DOI, which will ensure findability and accessibility of your data.

In terms of reusability, it will also enable you to assign an open license to your data, so that people can understand what the terms of free use are, and they know what they are and aren't allowed to do with that data when they access it.

And finally, in terms of interoperability, what's really important is that you use those DOIs or those persistent identifiers that you are provided with by your repositories, or by your publishers, to link your different outputs. So we want to see you linking your data sets with your publications, with your other outputs, with your software, for example, and your materials.

That's a really important aspect of FAIR.

But, to start from the beginning, the data management plan is a really solid way to start.

End transcript: Looking after data

A lot of effort is going into building exactly this kind of trusted data repository, to help make research data FAIR. For example, in Europe, the European Open Science Cloud covers a wide range of the sciences and social sciences, while the European Cultural Heritage Cloud aims to do something similar for cultural heritage institutions and professionals.
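
In the video, Isabel notes that a repository provides a persistent identifier such as a DOI, and that these identifiers should be used to link datasets with publications and software. A DOI can be turned into a web link by placing it after `https://doi.org/`. The Python sketch below shows this, assuming a simple dataset record whose structure, relation labels and DOIs are all made up for illustration; the pattern used is a loose sanity check rather than a full DOI validator.

```python
import re

# DOIs start with "10.", then a registrant code, a slash and a suffix.
# This pattern is a loose sanity check, not a full validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Turn a bare DOI (or 'doi:' form) into an https://doi.org link."""
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Does not look like a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOIs under a made-up prefix; these are not real records.
# The 'relation' labels are illustrative, not a controlled vocabulary.
dataset = {
    "title": "Survey responses",
    "doi": "10.1234/data.001",
    "related": [
        {"relation": "described in article", "doi": "10.1234/article.007"},
        {"relation": "analysed with software", "doi": "10.1234/software.003"},
    ],
}

print(doi_to_url(dataset["doi"]))
for link in dataset["related"]:
    print(link["relation"], "->", doi_to_url(link["doi"]))
```

Recording the related identifiers alongside the dataset is what lets a person, or a program, move from the data to the article and the software without searching for them separately.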

Allow about 10 minutes for this.

Write notes about:

Why researchers should follow the FAIR principles
What metadata is, and why it is important
The steps researchers should take at the beginning and end of their project to adhere to the FAIR principles

Discussion

Isabel explains that following the FAIR principles helps the global research community get the most out of data that cost a great deal of time and money to generate, even where the data cannot be made fully open. She explains the key concept of ‘metadata’: the information that describes a dataset or publication, which allows you and others to organise, find and understand it. She advises researchers to write a data management plan at the outset of their study, and to place the material in a trusted digital repository at the end of the study (you will learn more about this later).

Pause for thought

Data shouldn't just be FAIR for humans. It needs to be FAIR for machines as well. Take ten minutes to think about the implications of living in a world that's becoming more and more computationally intensive, and where global research data is being generated so quickly that humans struggle to keep up. How can you organise your own open data so computers are able to find it without human intervention?
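
One possible starting point is to publish your metadata in a structured, machine-readable form as well as in prose. The Python sketch below builds a description of a dataset using the schema.org `Dataset` type, written out as JSON-LD. The `dataset_jsonld` helper and all of the values passed to it are hypothetical placeholders; a real repository will usually generate a record like this for you, and may use a different or richer metadata standard.

```python
import json


def dataset_jsonld(title, description, doi, licence_url, keywords):
    """Build a schema.org 'Dataset' description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "identifier": "https://doi.org/" + doi,
        "license": licence_url,
        "keywords": keywords,
    }


# Placeholder values for illustration only (the DOI is made up).
metadata = dataset_jsonld(
    title="Survey responses",
    description="Anonymised responses to a 2023 survey on open research.",
    doi="10.1234/data.001",
    licence_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["open research", "survey"],
)

print(json.dumps(metadata, indent=2))
```

Because every field has a fixed name and a predictable value, a program can read the identifier, licence and keywords directly, without a human having to interpret a free-text description.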
