Printable page generated Thursday, 27 November 2025, 12:25 PM
Unless otherwise stated, copyright © 2025 The Open University, all rights reserved.

Transparency: As open as possible

Introduction

  

The image shows colourful, transparent glassware

  

In Week 1, you learned a bit about what transparency is, and why being transparent is important in research. Ensuring your study generates open data and materials is a good way to increase the transparency of your research.

This week, you will discover ways to increase the chances of people across the world finding your research, and how to explain the research in sufficient detail so that they know exactly how you carried out your study. You will also learn how to ensure your research is openly accessible while still protecting the anonymity of your participants.

  

Sharing data and materials

Making the outputs of your study more open means the data and materials you have gathered during your research can be used, reused, and redistributed by anyone. Data refers to the information or facts collected, observed, or generated during the course of a study or investigation. For example, in the quantitative chocolate chip cookie example from Week 1, the chocolate chip ranking would be the data.

Materials refers to any materials used in a research study. These can include (but are not limited to) the code used to run any statistical analyses, protocols outlining exactly what was done in the study, auditory and visual stimulus files shown to participants, questionnaires, documents used to obtain consent from participants, and videos of the study being run.

There are many benefits to sharing data and materials. One benefit is financial – the more products that are shared from any individual study, the more efficient the use of the funding used to conduct the study. Sharing data also allows others to check the data for quality and accuracy, reproduce the analyses reported in a research paper, and expand on the analyses through running alternative analyses. In addition, most datasets have uses beyond what is reported in a paper, including secondary data analysis that addresses different questions altogether. Sharing data may also be required by the project funder or the journal in which the article is published (see Top Factor for a list of journal requirements). Sharing materials has similar benefits to sharing data – readers can check what was done in the study, re-run the same study, or change the materials to run a slightly different study.

Open data

Data can be shared even when it is not related to a paper. However, researchers tend to share data alongside their papers, so that readers can see the structure of their data more clearly, re-run analyses from the manuscript, run additional analyses, and use the data to answer new questions.

Data can look very different depending on the research field, for example:

  • Biology: genomic data from projects like the Human Genome Project, providing sequences of human DNA and other organisms.
  • Social Sciences: Survey data on demographics, attitudes, and behaviours collected by organisations like the United Nations or national statistical agencies.
  • Medicine: Clinical trial data, including study protocols, patient demographics, treatment interventions, and health outcomes.
  • History: Archives of historical documents, such as diaries, letters, manuscripts, and government records, providing insights into past events, societies, and cultures.
  • Literature: Text datasets containing literary works, poetry, plays, and other written texts, facilitating analysis of language use, stylistic trends, and cultural themes.
  • Musicology: Musical score datasets, containing compositions from different composers, genres, and historical periods, for analysis of musical structure and style.

Even within one study, there will often be multiple levels of data. For example, in a study using interviews there might be video recordings of the interviews themselves (the source data), transcripts of the interviews (the processed data), and codes assigned to the transcript text, quantitatively or qualitatively (the coding data).

It is possible for all of these to be shared if participants have agreed to this and don’t mind being identifiable, but it is usually important to protect the anonymity of participants. This is often possible with transcript data (once any identifying information about participants has been removed), but it would be very difficult with video recordings of them.

When we talk about open data, the phrase ‘as open as possible, as closed as necessary’ is often used – meaning that researchers should strive to make their data open, but not where this would be unethical or illegal. Researchers must work within the ethical codes and laws that apply to their country and type of data collection. For example, in Europe, the General Data Protection Regulation (GDPR) sets out rules for dealing with ‘personal data’, i.e., any information related to an identifiable individual. To ensure human participants are not identifiable, we as researchers must remove all identifying information from our datasets before sharing them.

In some cases, this is obvious, simple, and doesn’t affect the usefulness of the data shared, for example removing IP addresses from data collected online. However, there are other cases where this is much more complex, and may result in the data not being possible to share at all, for example where qualitative data on a very specific topic makes participants identifiable.
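As a minimal sketch of the simple case, the Python snippet below drops direct identifiers such as IP addresses from survey records before they are shared. The column names (`participant_id`, `ip_address`, `email`, `rating`) and the records themselves are hypothetical. Removing such columns is only a first step: it does not guarantee anonymity, because combinations of the remaining values may still identify someone.

```python
# Hypothetical column names; adjust to the identifiers present in your own data.
DIRECT_IDENTIFIERS = {"ip_address", "email"}


def strip_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the listed identifier columns."""
    return [
        {key: value for key, value in record.items() if key not in identifiers}
        for record in records
    ]


# Made-up example records, for illustration only.
raw = [
    {"participant_id": "P01", "ip_address": "192.0.2.10", "email": "a@example.org", "rating": 4},
    {"participant_id": "P02", "ip_address": "192.0.2.11", "email": "b@example.org", "rating": 2},
]

# The original records are left untouched; only the copies are shared.
shareable = strip_identifiers(raw)
print(shareable)
```

Working on copies keeps the raw data intact, so the full records can stay in secure storage while only the stripped version goes to a repository.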

The FAIR principles

In the previous section, you considered different types of data, and how open you can be when sharing them. Given the subtleties, it is useful to have a clear set of guidelines. The FAIR principles provide this. They state that shared data should be FAIR – findable, accessible, interoperable, and reusable:

  

  • Findable: Data should be easy to find for both humans and computers. This involves using unique identifiers and metadata (information about the data).
  • Accessible: Once found, data should be easy to access, either openly or through an authentication or authorisation process. This relies on the data being retrievable in a standardised way.
  • Interoperable: Data should be able to work with other data. This means using standardised formats and languages so that different systems can use the data together.
  • Reusable: Data should be well-documented and organised so that they can be used again in future research, potentially by different people. This includes clear information about how the data were collected and any licenses or permissions needed for their use.
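To make the idea of metadata more concrete, here is a small Python sketch of a machine-readable metadata record, with a check that the fields a person or a computer would need are filled in. The field names and all the values are illustrative assumptions, not a formal metadata standard; a repository will normally have its own metadata form that serves the same purpose.

```python
import json

# Illustrative field names (an assumption, not a formal metadata schema).
REQUIRED_FIELDS = ("identifier", "title", "creators", "description", "license", "format")

# Hypothetical record; the identifier is a placeholder, not a real DOI.
metadata = {
    "identifier": "https://doi.org/10.1234/placeholder",
    "title": "Chocolate chip cookie rankings",
    "creators": ["Example Researcher"],
    "description": "Rankings collected in a hypothetical cookie study.",
    "license": "CC BY-NC 4.0",
    "format": "text/csv",
}


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# JSON is a standard, widely supported format that other systems can read.
print(json.dumps(metadata, indent=2))
print("Missing fields:", missing_fields(metadata))
```

The identifier supports findability, the license supports reusability, and storing the record in a standard format such as JSON helps with interoperability.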

  

In this video, Isabel Chadwick, a research data specialist from the Open University, talks about the FAIR principles, and how they can help researchers look after their data. As you watch the video, think about how you could follow Isabel’s advice in your own research.

Video: Looking after data

There's a lot of effort going into building exactly these trusted data repositories to make your data FAIR. For example, in Europe, the European Open Science Cloud covers a very wide range of science and social sciences, while the European Cultural Heritage Cloud aims to do something similar for cultural heritage institutions and professionals.

Allow about 10 minutes for this.

Use this box to write notes about:

  1. Why researchers should follow the FAIR principles
  2. What metadata is, and why it is important
  3. The steps researchers should take at the beginning and end of their project to adhere to the FAIR principles

Discussion

Isabel explains that following the FAIR principles makes the best use of global research findings that were expensive to acquire, within the limits to openness. She introduces the key concept of ‘metadata’: information about your data that allows you to organise your research data and publications. She advises researchers to write a data management plan at the outset of their study, and to place the material in a trusted digital repository at the end of the study (you will learn more about this later).

Pause for thought

Data shouldn't just be FAIR for humans. It needs to be FAIR for machines as well. Take ten minutes to think about the implications of living in a world that's becoming more and more computationally intensive, and where global research data is being generated so quickly that humans struggle to keep up. How can you organise your own open data so computers are able to find it without human intervention?

Licensing data

In the video, Isabel Chadwick recommended that when researchers share their data, they should choose a license to apply to the data. A license is a set of rules and permissions that tells you how you can use someone else's data. It is like an agreement between the person who created the data (the data owner) and the person who wants to use it (the data user).

A license specifies what you can and cannot do with the data, whether or not you need to give attribution to the data owner, and whether or not you can further share the data. For example, a common open license used for research data is CC BY-NC 4.0 which allows the person using your data to share and adapt the data, but only if they give attribution to you, the data owner, and don’t use the data for commercial purposes.

There are other types of license, which allow you to specify different levels of openness. You can choose to give your work over to the public domain, so people can do whatever they like with it. Alternatively, you can choose a type of license which prevents users from adapting your work. You can find out more about licensing by referring to this helpful list on the Creative Commons website.

Activity 1:

Allow about 10 minutes

In this activity, you can test your understanding of the importance of considering anonymity when it comes to data sharing.


Where to share data

In the video, Isabel Chadwick explained that it is best practice to archive data and materials in an open access repository to make your research accessible. Whether you choose an institutional repository or a discipline’s repository, these trusted digital platforms provide a safe way to store research materials and data, and link to related content held elsewhere.

There are different repositories for different research fields and different types of data, but some examples are the Open Science Framework, the Qualitative Data Repository, and Zenodo. Here is an example of how to share data and materials on the Open Science Framework. The Open University also has its own repository (ORDO) where data and materials from researchers at The Open University can be shared.

  

Case studies:

  • Chemistry – This study by Lia et al. (2020) investigates the structure and mechanisms of one enzyme involved in the chain of reactions through which our bodies metabolise glucose. All the data in this study are openly available with the paper, which directs readers to RCSB Protein Data Bank and Zenodo for underlying data and extended data.
  • Psychology – PLAY (Play & Learning Across a Year) is a project spanning 50 universities in the United States that aims to explore the natural behaviours of infants and their mothers in their homes. All materials, home visit protocols, and the video and questionnaire data collected are openly available on the project website.
  • Art – Quantitative and qualitative data and software related to an Open University PhD thesis by Kanter (2024) have been shared openly on ORDO. The thesis is about British portraiture from 1900 to 1960.

Activity 2:

Allow about 15 minutes

In this activity, you will get the opportunity to explore an open access repository.

Have a look online for some open data and materials, preferably in your field of research. One way of doing this is to use keywords to search for projects on Zenodo. Use the ‘search records’ box at the top to enter your keywords, and use ‘resource types’ to filter your search so that it only includes datasets. Think of ways you could use the research products you find to answer a question that interests you.


Discussion

Your response will depend on your discipline and interests, and those of the researchers whose work you found. You might decide to use the data to generate new knowledge by analysing the original researchers’ datasets in new ways, or by running a related study based on their materials.

Here’s a real-life example of how someone used data from the Open Science Framework platform for a secondary data analysis.

Prinzing (2024) reused data from an experience sampling study (where participants are repeatedly asked about their daily experiences related to a particular topic) on pro-environmental, sustainable behaviour. Prinzing used these data to investigate whether engaging in sustainable behaviour increased a person’s wellbeing. They shared some of the original data that they used and their analysis code on a separate OSF project. There are a few other things they could have done: they did not share a data dictionary, and they didn’t apply a license to the materials on their OSF project. Nevertheless, the author of this course, Silverstein (2020), was still able to reproduce their analyses using their data and code. Interestingly, in the process of conducting this reproduction, Silverstein found a typo in one of the values in the paper! The authors have now updated this.

Reproducibility in quantitative research

Open data is key to understanding one of the big concerns in quantitative research: reproducibility. Assessing reproducibility means assessing the value or accuracy of a scientific claim based on the original methods, data, and code. So, when you run the same analyses on the same data, do you get the same results?

Running the same analyses on the same data can mean different things depending on what materials the reproducer has access to. Investigating the reproducibility of a study can mean taking the original data and:

  • Following the description of analyses in the paper.
  • Following an analysis plan created by the original authors.
  • Re-running the analysis code that has been shared with the data.
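Whichever route is taken, the final step is the same: compare the values you obtain with the values reported. The Python sketch below illustrates that comparison under stated assumptions. The data, the statistic names, and the tolerance (chosen here to allow for rounding to two decimal places) are all hypothetical, and `check_reproduction` is an illustrative helper rather than part of any library.

```python
import math
from statistics import mean


def check_reproduction(reported, reproduced, tolerance=0.005):
    """Return the statistics whose reproduced value differs from the reported one.

    The result maps each mismatching name to (reported value, reproduced value).
    """
    mismatches = {}
    for name, reported_value in reported.items():
        value = reproduced.get(name)
        if value is None or not math.isclose(reported_value, value, abs_tol=tolerance):
            mismatches[name] = (reported_value, value)
    return mismatches


# Made-up shared data and made-up values "reported in the paper".
ratings = [4, 2, 5, 3, 4]
reproduced = {"mean_rating": mean(ratings), "n": len(ratings)}
reported = {"mean_rating": 3.6, "n": 5}

print(check_reproduction(reported, reproduced))  # an empty dict means everything matched
```

A non-empty result does not say who is right. It only flags where the paper and the re-run disagree, which is the point at which a typo or an undocumented analysis step might be discovered.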

  

As you can imagine, it’s easier to get the same results as the original researchers if there is less uncertainty around what they did. So, re-running the analysis code will be more likely to produce the same results than following the description of analyses in the paper. Going back to our baking analogy, it would usually be easier to produce the same cake as a professional chef if they shared the recipe they used than if they just described what they did, and the more detail they provided in the recipe, the easier it would be. However, even if a professional chef shared both a detailed recipe and a description of what they did, your cake might end up with a soggy bottom! Similarly, in research, when we have both the code and the data, it can still be difficult to reproduce results.

Here are a few tips for making it more likely that others will be able to reproduce the results of a study:

  • Share a data dictionary – list all the variables in your dataset, what they mean, how they were manipulated, and how they’re structured
  • Annotate your code – make notes of what you did at each stage of the data pre-processing and analysis and why
  • Make a note of software versions – analyses might stop working with future versions
  • Make sure your data and code are suitably licensed – for example, a CC BY-NC 4.0 license means that anyone can share or adapt the material as long as they give you appropriate credit and do not use the materials for commercial purposes.
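Two of these tips, the data dictionary and the note of software versions, can be partly automated. The Python sketch below is one possible approach, not a prescribed method: it drafts a data dictionary from a small dataset and records the Python version used. The variable names, descriptions, and records are hypothetical, and the descriptions still have to be written by a person who knows the study.

```python
import json
import platform


def build_data_dictionary(records, descriptions):
    """List every variable with its observed type(s) and a human-written description."""
    draft = {}
    for record in records:
        for name, value in record.items():
            entry = draft.setdefault(
                name,
                {"types": set(), "description": descriptions.get(name, "TODO: describe")},
            )
            entry["types"].add(type(value).__name__)
    # Convert the sets to sorted lists so the result can be saved as JSON.
    return {
        name: {"types": sorted(entry["types"]), "description": entry["description"]}
        for name, entry in draft.items()
    }


def software_versions():
    """Record the interpreter version and operating system used for the analysis."""
    return {"python": platform.python_version(), "system": platform.system()}


# Made-up records and descriptions, for illustration only.
records = [
    {"participant_id": "P01", "rating": 4},
    {"participant_id": "P02", "rating": 2},
]
descriptions = {"rating": "Hypothetical cookie rating from 1 (worst) to 5 (best)"}

data_dictionary = build_data_dictionary(records, descriptions)
print(json.dumps(data_dictionary, indent=2))
print(json.dumps(software_versions(), indent=2))
```

Variables without a description are marked `TODO: describe`, which makes gaps easy to spot before the files are uploaded alongside the data and code.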

  

Activity 3:

Allow about 10 minutes

This activity will allow you to test your understanding of reproducibility.


Open data in qualitative research

In the previous section we considered transparency in a quantitative study. To recap, in quantitative studies your data will usually be numerical. You might measure how quickly people respond to stimuli on a computer, or how much people would be willing to pay for a certain item.

Open data and materials can mean something quite different in qualitative research. This type of research focuses on patterns and themes in non-numerical data such as words, images, or observations. Imagine you are taking part in a qualitative study and are being interviewed about something close to your heart or your experiences. Try to imagine a topic that feels personal or emotive. Your data – instead of being a number – would be the actual words you said.

  • How would you feel if you took part in an interview study about an emotive topic and your data was made open and accessible?
  • Are there any situations where you would be happy for your data to be open?
  • Are there any situations where you definitely wouldn’t want your data to be open?

  

Use the text box below to write down your thoughts.


Activity 4:

Allow about 20 minutes

Now read the vignette below, about a qualitative researcher considering sharing their data. Consider the benefits of making the data open, and the ethical issues that the researcher should consider. Make sure you work out your own responses before revealing our notes.

A researcher is conducting a qualitative study with LGBTQ+ students about their experiences of mental health problems. The students that participated took part in in-depth interviews, which were video recorded, transcribed and analysed using thematic analysis. They gave consent for their data to be used in this study. The researcher is trying to work out whether or not to make the data from this study open.

  1. What would be the benefits of making this data open?
  2. What issues should the researcher consider when making this decision?

Discussion

Benefits:

  • Promotes transparency as others can see exactly how the research conclusions were derived.
  • The data can be used in future studies, maximising the usefulness of the data and meaning further insights can be gained from the same data.
  • The data can be used by a broader audience, including policymakers, practitioners, and researchers outside of the original researcher’s team.

 

Issues:

  • There could be risks to participants if they are identifiable, e.g., they might not be ‘out’ as LGBTQ+ or want people beyond the research study to know this information.
  • Thematic analysis usually uses short quotes from interviews. Making the full interviews open and accessible can increase the likelihood of participants being identified.
  • Participants only gave consent for their data to be used in this study, and did not have the opportunity to consent to their data being shared openly.
  • Video recordings were made, but these are even more likely to make participants identifiable so likely shouldn’t be shared.
  • Will participants be less likely to discuss their experiences if they know the data will be open?

How can qualitative researchers overcome some of these challenges when planning their studies?

We hope the suggestions in this section have helped you think about this. Qualitative researchers should consider writing a data management plan at research inception, carefully anonymising their data, licensing the data or only making them available to other researchers on request, and getting consent from participants for open data up-front.

Just as quantitative researchers aspire to make their research reproducible, qualitative researchers can use a bit of forward planning to make their studies as transparent as they can be.

Quiz

The image shows an abstract pattern which reminds you of a brain or a maze.

Throughout the course, we offer you self-test quizzes to help you test your understanding of the course concepts. These quizzes are there to help you consolidate your knowledge.

This week’s quiz covers concepts underlying the principle of transparency. The feedback we have given is important: you will learn more by engaging with this feedback, which explains why the answers are correct.

Answer the following questions about key terms:

  Question 1


  Question 2


  Question 3


  Question 4


  Question 5


Summary

This week you learned about transparency in research, particularly focusing on open data and materials. You learned about the benefits of sharing data and materials, and practical ways you can share your materials. You learned about some of the nuances of different data across disciplines, and the importance of protecting participant anonymity and complying with legal regulations.

In Week 3 you will learn about integrity. You will discover the ‘replication crisis’ which is gripping parts of the research community, explore some questionable research practices, and learn how to find out whether the results of your research can be applicable to a wider context.

References

Center for Open Science (2024): Open Science Framework
Available at: https://osf.io/

Center for Qualitative and Multi-Method Inquiry, Maxwell School of Citizenship and Public Affairs, Syracuse University (2023): The Qualitative Data Repository
Available at: https://qdr.syr.edu/

Creative Commons (2024): Share your work
Available at: https://creativecommons.org/share-your-work

European Organization for Nuclear Research and OpenAIRE (2024): Zenodo
Available at: https://zenodo.org

FORRT (2024): Lesson plan 8: open data and qualitative research (lesson template with a CC-By Attribution 4.0 licence).
Available at: https://osf.io/nyfqx

Kanter, D (2024): Open data relating to 'Collecting and connecting portrait sittings: a re-evaluation of portrait-sitting accounts in enhancing knowledge and understanding of British portraiture 1900-1960'.
Available at: https://doi.org/10.21954/ou.rd.c.6693558

Lia, A, Dowle, A, Taylor, C, Santino, A, Roversi, P (2020): Partial catalytic Cys oxidation of human GAPDH to Cys sulfonic acid
Available at: https://wellcomeopenresearch.org/articles/5-114/v2

The Open University (2024): Open Research Data Online (ORDO)
Available at: https://ordo.open.ac.uk/

Prinzing, M (2023): Pro-environmental behavior increases subjective well-being: evidence from an experience sampling study and a randomized experiment
Available at: https://osf.io/preprints/psyarxiv/ac89k

Play and Learning across a Year (PLAY)
Available at: https://play-project.org/index.html

RCSB (2024): Protein Data Bank
Available at: https://www.rcsb.org/

 


Glossary

Accessible
Research data is accessible if it can be accessed by anyone in the world, either openly or through an authentication or authorisation process. This requires metadata describing it in a standardised format.
Data
The information or facts collected, observed, or generated during the course of a study or investigation.
Data dictionary
A data dictionary is a collection of names, definitions, and attributes about data elements being used in shared data.
Findable
Research data is more findable by interested parties if it is stored in a well-organised repository with detailed metadata and persistent identifiers.
Interoperable
Datasets are interoperable if they can be used together to generate new studies. The metadata uses standard vocabularies, which are consistent across lots of different datasets.
Open data and materials
Data and materials from an open study are freely available to be used, reused, and redistributed by anyone.
Materials
Anything used in a study, e.g., questionnaires, consent forms, protocols outlining what was done, the code used to run any statistical analyses, etc.
Metadata
Information that accompanies a piece of research, organising the materials, data and publications.
Reproducibility
A study is reproducible if, when you run the same analyses on the same data, you get the same results.
Open access repository
An open access repository is a digital platform that holds research output and provides free, immediate and permanent access to research data and materials.
Qualitative
A qualitative method is used to identify, analyse and report patterns (themes) in non-numerical data.
Quantitative
Quantitative methods deal with numbers, aiming to quantify phenomena and establish patterns or relationships.
Reusable
Research data is reusable if it can be used, modified or analysed, potentially by other researchers, to generate new knowledge. It needs to include clear information to facilitate re-use.
Secondary data analysis
A type of study where you use existing data to answer new questions.