Sabine Barthold Post 1

10 November 2019, 9:04 AM Edited by the author on 10 November 2019, 9:04 AM

What are your concerns about sharing data?

In this week's forum discussion, we want you to think about some concerns that you have, or that you have heard from other researchers, about making data open or FAIR.

Now use the 3 resources below to find the responses to those concerns and copy and paste them under your own or another participant's post. And you can always add your own original responses!

  1. http://tiny.cc/GoogleDoc 
  2. http://tiny.cc/lseblog  
  3. http://tiny.cc/UKDA


Gunnar De Winter Post 2 in reply to 1

11 November 2019, 9:35 AM

My main concern would be, I think, potential misinterpretation. The suggested way to counter it is to provide enough information to prevent it, e.g.:

People will misinterpret the data

C&A suggest this: “Document how it should be interpreted. Be prepared to help and correct such people; those that misinterpret it by accident will be grateful for the help.” From the UK Data Archive: “Producing good documentation and providing contextual information for your research project should enable other researchers to correctly use and understand your data.”

It’s worth mentioning, however, a second point C&A make: “Publishing may actually be useful to counter willful misrepresentation (e.g. of data acquired through Freedom of Information legislation), as one can quickly point to the real data on the web to refute the wrong interpretation.”

However, I still think this remains a challenge. How much information to include? What if people who misinterpret do not reach out for help? ...

Beatrice Gini Post 3 in reply to 2

11 November 2019, 12:15 PM

Misinterpretation is a concern I have heard from a lot of researchers. In answer to your question 'What if people who misinterpret do not reach out for help?', well, then it's on them, right? You have the original data and documentation to show that the mistake was theirs, not yours. 


As to 'How much information to include?', how about showing your data and documentation to someone who is not directly involved in the project and seeing if that's enough for them to understand it? I know it would take a bit of time, but perhaps it could be a reciprocal exchange of 'data reviewing services'?

Deirdre Winrow Post 4 in reply to 3

11 November 2019, 10:33 PM

This sounds like a good idea. I always struggle with providing, at first, too much data and then, after paring it down, too little! The question of "exactly how much do they need to know?" is a tricky one! Perhaps bringing in a fresh set of eyes is the solution!

Charlotte Coales Post 17 in reply to 4

2 December 2019, 10:57 AM

I agree, a fresh pair of eyes is a good idea and should reduce the chance of misinterpretation.

Sam Groves Post 5 in reply to 1

12 November 2019, 12:57 PM

 

I think my main concern would be accidentally leaking information that isn't meant to be shared for reasons I am unaware of. This is especially the case because I will primarily be working with either people or animals. That would hurt both my career and the institution I work for, as well as setting a bad example by suggesting that making data open does not work. I suppose the following address most of my concerns:

We’re worried about the Data Protection Act (UK Law)

  • Mirror what’s published in a non-machine-readable way.

  • Strip out, aggregate or anonymise the bits that contain personal data.

  • Seek permission from data subjects to publish data about them (opt-in).

I don’t mind making it open, but I worry someone else might object

  • This implies that the person is nervous about being blamed for making an error.

  • Go up the management chain to find someone who can reassure them that they won’t get in trouble.

Some of what you asked for is confidential

  • Which bits? Can they be excluded, leaving something that’s still useful?

Essentially, if I go through the proper protocols to anonymise the data and/or seek permission from subjects, and check higher up the management chain, I shouldn't need to worry.


Veronica Phillips Post 6 in reply to 1

12 November 2019, 2:43 PM

The most common concerns that researchers raise with me when I provide training/support in research data management are worry over the size of the data and fear about sharing personal data. While I think a lot of the responses provided in the three resources linked above are useful, they work only at the early stages of planning a new research project, whereas I often find researchers come to me with concerns long after research is underway and they've already collected their data. It's hard to provide advice about anonymisation and/or getting participant consent for the sharing of personal data if the data collection phase is already finished, or if the researchers are making use of personal data collected by a previous research group when open sharing of data was not a requirement.

Joanne Bakker Post 7 in reply to 1

12 November 2019, 5:41 PM

My main concerns would be:

I don’t mind making it open, but I worry someone else might object

  • This is a common deflection, rather than a genuine issue

  • This implies that the person is nervous about being blamed for making an error

  • Go up the management chain to find someone who can reassure them that they won’t get in trouble

  • Ask for a less controversial subset

I don’t own the data, so can’t give you permission

  • Sometimes it’s as easy as just finding out who does own the data

  • Sometimes nobody knows who owns the data. This often seems to occur when someone has moved into a post and isn’t aware that they are now the data owner. 

  • Going up the management chain can help. If you can find someone who clearly has management over the area the dataset belongs to they can either assign an owner or give permission.

  • Get someone very senior to appoint someone who can make decisions about apparently “orphaned” data.

Both would be quite easy to get around, by just talking to people higher up the chain.

Deirdre Winrow Post 8 in reply to 1

12 November 2019, 9:10 PM

I would be worried about people drawing the wrong conclusions from my data. It's hard to relinquish control! I suppose this is the best response to that:

People will misinterpret the data

Document how it should be interpreted

Be prepared to help and correct such people; those that misinterpret it by accident will be grateful for the help

Publishing may actually be useful to counter willful misrepresentation (e.g. of data acquired through Freedom of Information legislation), as one can quickly point to the real data on the web to refute the wrong interpretation.

 

Jennifer Leggat Post 9 in reply to 1

13 November 2019, 12:12 PM

I think the argument I hear most against making data open or FAIR is that researchers are afraid they will be 'scooped' by others if they give open access to their data prior to publishing. On the websites listed above, these are the responses:

"Data sharing will not stand in the way of you first using your data for your publications. Most research funders allow you some period of sole use, but also want timely sharing. Also remember that you have already been working with your data for some time so you undoubtedly know the data better than anyone coming to use them afresh."

"One option is to have an automatic or optional embargo; require people to archive their data at the time of creation but it becomes public after X months. You could even give the option to renew the embargo so only things that are no longer cared about become published, but nothing is lost and eventually everything can become open."

"As the original collector of the data, you are at a huge advantage compared to others that might want to use your dataset. You have knowledge about your system, the conditions during collection, the nuances of your methods etc. that could never be fully described in the best metadata."

I think these are all very valid points and have convinced me, but I think it will take a lot of work to persuade some researchers who have been in the game longer and have the fear of being scooped ingrained in them!

Sara Hanson Post 10 in reply to 1

13 November 2019, 2:41 PM

This concern, listed in the Google Doc, summarizes my greatest worry: releasing data prior to publication. The embargo option is a solution I have used in the past.

We might want to use it in a research paper

I’ve heard this about datasets produced in crystallography

One option is to have an automatic or optional embargo; require people to archive their data at the time of creation but it becomes public after X months. You could even give the option to renew the embargo so only things that are no longer cared about become published, but nothing is lost and eventually everything can become open.


Svenja Steinfelder Post 11 in reply to 10

14 November 2019, 3:33 PM

I wish I had done that with data that now sits in a drawer. It was meant to be used for a paper which, due to too many researchers and an "unfavourable result", will probably never see the light of day.

Ciara Lynch Post 12 in reply to 1

16 November 2019, 8:12 AM

I have heard from other researchers that the added workload of having to openly publish everything they collect is a concern. Their answer was:

  • By making open data replace a current business process, rather than adding to the load of busy staff, it can make producing it neutral or even a saving.

I agree that adding open publishing from the get-go will reduce the amount of work to be completed in the long run. 

Subrat Behera Post 13 in reply to 1

17 November 2019, 7:03 AM

My concerns about opening up data are:

1) We’re not sure that we own it (when working with a big team or being a part-time member of an assignment), and
2) People will misinterpret the data.

Mary Anderson-Glenna Post 14 in reply to 1

17 November 2019, 11:22 PM

I think one of the biggest fears of the researchers at my university is this:

I might want to use it in a research paper BUT someone else has beaten me to it and used my data before I got that far.


Olivia Tort Post 15 in reply to 14

19 November 2019, 11:25 AM

I agree with Mary that one of the biggest concerns of the researchers around me is LOSING NOVELTY, which is needed to publish in high-impact journals. I know this is not FAIR, but we are all still evaluated by means of high-impact journals, and it is well known that if you lose novelty (even during the revision period of your paper) the paper may be rejected. I think that a change in the evaluation systems (and this nonsensical hypercompetitiveness) is needed to overcome this burden.


Erika Cerrolaza Post 16 in reply to 1

25 November 2019, 1:58 PM

My main concerns would be:

-personal data protection, as the Data Protection Law and procedures involve many aspects that, as a researcher, I'm not an expert on. I'm concerned about making restricted data accessible by mistake.

We’re worried about the Data Protection Act

The DPA only covers data on people. If the data doesn’t contain anything to do with people, the DPA does not apply.

Mirror what’s published in a non-machine-readable way.

Strip out, aggregate or anonymise the bits that contain personal data

Seek permission from data subjects to publish data about them (opt-in)

-another concern is the protection of the people behind that data: making the data accessible might put them at risk, as the information contained in the data may make a target of them. On another level, informants might not feel comfortable with the possibility that information provided by them, even though anonymized, might be made accessible to others. Trust is hard to gain in certain research contexts.

Some of what you asked for is confidential

Which bits? Can they be excluded, leaving something that’s still useful?

Is it actually confidential? (Is it already published on the web in some disaggregated way?)




Emma Dorris Post 18 in reply to 1

4 December 2019, 10:01 AM

I work in biomedical research. We hear a lot about compliance with GDPR/confidentiality. Our government also introduced more stringent health data protection regulations. What this really highlighted was how poor (occasionally non-existent) the data stewardship between hospitals and research sites had been.

The new research data rules were also to be applied retrospectively, and many older studies, and hence their datasets, did not meet them. The whole debacle was handled really poorly from the leadership down, and led to huge fear, misconceptions and misunderstandings. Health research that used any archived data or biological material was halted for upwards of a year. This has left huge fear around data rules and regulations.

While I understand the logic in what is said in the recommended responses, I think more work needs to be done to make people feel more comfortable with data again. There was such panic, cost and delay to research that I think we need a more philosophical approach to understanding the importance of it before we use purely practical reasoning such as that outlined in response to "We’re worried about the Data Protection Act (UK Law)":

The DPA only covers data on people. If the data doesn’t contain anything to do with people, the DPA does not apply.

Mirror what’s published in a non-machine-readable way.

Strip out, aggregate or anonymise the bits that contain personal data

Seek permission from data subjects to publish data about them (opt-in)