[open-science] Brief persuasive case for open science/data sharing?
cameron.neylon at stfc.ac.uk
Tue Sep 14 15:26:58 BST 2010
I think it is probably worth separating quite strongly the issues of how
much we publish (i.e. how much of the "research compendium", to borrow
Victoria Stodden's terminology, is made available) vs when it is published. I
think it is a given that expectations are rising over how much of the
relevant material, ideas, materials, data, software, etc will be made
available. We are gradually winning that battle, which is fundamentally
about the _quality_ of publication.
The timeframe is a separate issue, and I think fighting the battles above is
made easier by punting it somewhat to one side for the minute. However, I think
there are both practical and philosophical arguments for publishing more
1) It's a pain to go back and publish things. Much easier to publish them in
some form as they come out. Formal publication can then be kept for those
ideas and stories that deserve them.
2) Your interim results may be useful to others in ways that you don't
expect. This might particularly be the case e.g. for software but equally
for data and materials.
3) Finding collaborators: We have good evidence that publicly being "out
there" doing a particular piece of work can bring both academic and
commercial collaborators to the table who are willing to help. Avoiding
unnecessary replication is part of this.
4) Engagement: The attitude shift involved in putting stuff online
automatically makes you think harder about how people see it and make
some effort to make it comprehensible. This can be low level engagement on
its own and can also lead to more direct engagement as opportunities arise
through community contributions.
5) Accessibility: This is a core argument for me. If it's accessible to all
it means it's also accessible to you. This is a non-trivial argument in
favour of radical openness. Pretty much all the stuff I've lost in the last
five years involved collaborations where it wasn't possible to make things
Essentially these arguments boil down to two types, the selfish and the
community minded. The argument for the selfish is:
"Well these funders are going to want me to make all of this data available
and to publish more. Putting all of this stuff out immediately both keeps
them off my back and means I don't have to put any effort into making it
available later. Plus it means I'm sure of where all my students' stuff is!"
The community minded version:
"Well science is about replication and re-use and through making things
available I make it easier to re-use. By making things available I'm
allowing anyone to do all the things I haven't thought of. This will lead to
more citations, more publications, and more collaborations to take forward
Hope that's helpful. Just off the top of my head as I try and upload some
(now _properly_ analysed) data.
On 14/09/2010 14:53, "Chris Rusbridge" <c.rusbridge at googlemail.com> wrote:
> I've gone back to this, as my FAQ develops. The problem is, I think, that
> there is a significant difference between making data open once the research
> is published (or as part of publication), and making data either open or even
> available during a project, in advance of publication. FoI does bring the risk
> that researchers may be forced to make their data available before they have
> finished the research, even to their rivals. (Of course there are exemptions
> which may be able to be invoked in some circumstances.)
> The "Open Science" approach such as Cameron Neylon advocates is closer to
> this, if perhaps a bit more extreme. (By this I mean that I understand Open
> Science to want to put all data in the open as it is gathered, rather than
> making some data identified by a requester available under FoI when asked.)
> So is there a persuasive case for making data available during your research,
> before publication?
> Chris Rusbridge
> Mobile: +44 791 7423828
> Email: c.rusbridge at gmail.com
> On 3 Sep 2010, at 22:11, Heather Piwowar wrote:
>> Thanks Dorothea!
>> That study was also published in PLoS ONE, in case you prefer a
>> non-dissertation citation:
>> Similar, but earlier and with a way cooler title, is Gleditsch and Strand's
>> "Posting Your Data: Will You Be Scooped or Will You Be Famous?"
>> Chris, another argument I've often heard: publicly archive your data so that
>> you can find it again later, yourself :)
>> There's also lots to be said about "being the change you want to see,"
>> supplemented with stats on the frequency and implications of data
>> withholding, etc. Let me know if you want refs, or you can brave a mongo
>> list of refs on data sharing/withholding at Mendeley. I have many relevant
>> papers tagged with "motivation" or similar.
>> You might also find something useful in the latter part of these
>> presentations? (1, 2)
>> Let me know if I can be of more help....
>> Heather Piwowar
>> DataONE postdoc with NESCent and Dryad
>> remote from Dept of Zoology, UBC, Vancouver Canada
>> hpiwowar at nescent.org
>> On Fri, Sep 3, 2010 at 12:52 PM, Dorothea Salo <dorothea.salo at gmail.com>
>> Maybe try this:
>> "Publicly available data was significantly (p=0.006) associated with a
>> 69% increase in citations, independently of journal impact factor,
>> date of publication, and author country of origin." Piwowar, Heather.
>> "Foundational studies for measuring the impact, prevalence, and
>> patterns of publicly sharing biomedical research data." Dissertation,
>> University of Pittsburgh, 2010.
>> I just popped it into a slideshow of mine. I've also seen people use
>> the recent NYT story about data-sharing and Alzheimer's, though it's
>> not quite a paradigm case because the data there weren't fully open.