How to Store and Manage Your Data

September 16, 2020 stran

Get publication-ready and ensure future reproducibility through good data management

How you store your data matters. Even after you publish your article, your data needs to be accessible and usable for the long term so that other researchers can continue building on your work. Good data management practices make your data discoverable and easy to use, promote a strong foundation for reproducibility, and increase your likelihood of being cited.

Watch this quick video for an overview of data sharing and repository use, and read on for detailed advice and strategies you can use to achieve data-sharing success.

https://vimeo.com/865126608

What are you looking for?

Making a Data Management Plan

Choosing a Repository

Reproducibility Checklist

Have a Data Management Plan

Before you begin your study, be sure to have a thorough and robust data management plan. Knowing what you’ll need to do to optimize the storage and sharing of your data before you begin collecting it will set you up for better reproducibility and demonstrate the rigor of your study.

To start, check to see if your funder or institution offers a template that you can adapt. If you’re starting from scratch, the Digital Curation Centre provides a thorough overview of data management plans. The Gurdon Institute also offers a great resource if you don’t have a template, or just want a second opinion. It suggests asking yourself:

What types of data will be created?
How will these data be processed?
How will they be stored and backed up?
How will they be documented (including naming conventions, directory structures, etc.)?
How will these data be of benefit to the broader scientific community?
How will they be archived and will they comply with any data/metadata standards?
How will they be made available and discoverable to the broader community?
What are the policies for sharing, reuse, etc.?
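One way to make the documentation question concrete is to write your naming convention down as code, so that every file in the project can be checked against it. The sketch below is only an illustration: the pattern (project, data type, collection date, version), the helper names build_filename and follows_convention, and the sample project are all assumptions, not a convention taken from any funder or institution.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>
# Words inside a part are joined with hyphens; parts are joined with underscores.
NAME_PATTERN = re.compile(
    r"^[a-z0-9]+(?:-[a-z0-9]+)*"   # project
    r"_[a-z0-9]+(?:-[a-z0-9]+)*"   # data type
    r"_\d{8}"                      # collection date, YYYYMMDD
    r"_v\d{2}"                     # two-digit version
    r"\.[a-z0-9]+$"                # file extension
)


def slug(text):
    """Lowercase the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, datatype, collected, version, ext):
    """Compose a file name that follows the hypothetical convention above."""
    return (
        f"{slug(project)}_{slug(datatype)}_"
        f"{collected:%Y%m%d}_v{version:02d}.{ext.lower()}"
    )


def follows_convention(filename):
    """Return True if the file name matches the hypothetical convention."""
    return NAME_PATTERN.match(filename) is not None


name = build_filename("Zebrafish Imaging", "raw counts", date(2020, 9, 16), 1, "csv")
print(name)                                       # zebrafish-imaging_raw-counts_20200916_v01.csv
print(follows_convention(name))                   # True
print(follows_convention("Final data (2).xlsx"))  # False
```

Whatever convention you choose, recording it in your data management plan (and in a README alongside the data) matters more than the exact pattern.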

Download Free Data Management Templates and Resources

If your institution or funder doesn’t already offer a Data Management Plan template or requirements, consider starting with one of these open resources:

DMPonline helps you to create, review, and share data management plans that meet institutional and funder requirements. It is provided by the Digital Curation Centre (DCC).

Choosing a repository

There are a number of repositories to choose from, each with unique pros and cons. Centralized, Open Access repositories make it easier for a broad audience to discover, analyze, and reuse your data, though a specialized repository may be the best solution for you if you require unique formatting.

Having more options available enables you to choose a repository that fits your specific needs, but more sources can also make it difficult for other researchers to find your data if they aren’t looking in the right place.

Before you decide on a repository, consider the following:

1. Your Audience

Your data should be accessible and easy to find by the people (or machines) most likely to use it. This might include:

Other researchers in your field and the data analysis, search, and retrieval software they rely on to find and reuse your datasets. If this is your primary audience, you might consider a centralized, field-specific repository.
Other researchers outside of your field or professional groups. This is especially important for highly interdisciplinary work. You might consider a public repository that is widely known and serves many disciplines.
Funding agencies that you may wish to apply to for future grants to continue the work. Choosing an Open Access data repository helps you avoid future issues in accessing and sharing your data. Some funders may even require you to do so.
Editors and reviewers at the journal. Even if your data is confidential or proprietary, you will need a way to make it accessible to key stakeholders in the evaluation of your manuscript.

2. Long-term Accessibility

Can those who need access to your data find it easily? For example, if you’ve used a field-specific repository for interdisciplinary work, other researchers may not know where to look.
Is your data optimized for machine readability? Your data should be structured in a simple, consistent format that makes it easier for researchers to access and reuse without manual intervention.
Will you be able to access your own data when you leave your institution? If your institution provides a repository but restricts public access, you may lose access if you move on to a different organization.
Should access to your data be restricted? If your data contains sensitive information about a vulnerable population or other sensitive topic that cannot be made widely available, have you provided a clear path for those who do need verified access to be able to obtain it?
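To illustrate what a simple, consistent, machine-readable format can look like, here is a minimal sketch using Python’s standard csv module. The records, column names, and the "NA" missing-value marker are hypothetical choices made for this example. The point is that one row per observation, units in the column names, ISO 8601 dates, and an explicit missing-value marker let another researcher’s software read the file back without manual cleanup.

```python
import csv
import io

# Hypothetical measurements; field names and values are illustrative only.
records = [
    {"sample_id": "S001", "collected_on": "2020-09-16", "temperature_c": 21.5},
    {"sample_id": "S002", "collected_on": "2020-09-17", "temperature_c": None},
]
FIELDS = ["sample_id", "collected_on", "temperature_c"]
MISSING = "NA"  # assumed marker for missing values; document it in your README


def to_csv_text(rows):
    """Write one observation per row, with an explicit missing-value marker."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({k: MISSING if row[k] is None else row[k] for k in FIELDS})
    return buffer.getvalue()


def from_csv_text(text):
    """Read the file back, restoring numbers and missing values."""
    parsed = []
    for row in csv.DictReader(io.StringIO(text)):
        value = row["temperature_c"]
        row["temperature_c"] = None if value == MISSING else float(value)
        parsed.append(row)
    return parsed


csv_text = to_csv_text(records)
print(csv_text)

# The data survives a full write/read cycle with no manual intervention.
round_trip = from_csv_text(csv_text)
print(round_trip == records)  # True
```

The same idea applies to other open formats; what matters is that the structure is documented and stays the same from file to file.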

3. Compliance

Other key stakeholders in the publication of your research may have specific policies in place for ensuring the long-term accessibility and use of data. Before choosing a repository, be sure to check the following:

Institutional requirements. Check with your institution to see if they offer a data management plan, institutional repository, or other guidelines for sharing your data.
Funder requirements. Some funders require a data management plan to be submitted along with any grant proposal. Many publicly funded projects require open and accessible data.
Journal requirements. Even if it is not required by your institution or funder, the journal you’ve chosen may require you to make all available data Open, and/or provide a data availability statement indicating how your data can be accessed.

Proprietary data

You may find yourself in a situation where your ideal sharing method or repository is at odds with one of these requirements. For example, your institution or funder may insist the data you’ve collected is proprietary, which could limit you from publishing in journals where Open Data is required. In any case, you should make sure your data is available to editors and reviewers at your selected journal so they can properly evaluate the work.

Reproducibility Checklist

Making your data easy to follow ensures that other researchers will be able to confirm your results. This is the first step in building a reliable foundation for future research.

Data follows the FAIR Principles (see below)

Data is Open and accessible for broad use

There is a plan for long-term accessibility

Data and metadata are clearly labeled and interpretable

Data and metadata are formatted to allow maximum use by both humans and machines

Datasets are complete, including negative and null results

Commit to Open Data

Your impact goes further when it reaches a broader audience. You can help advance progress in your field more swiftly by removing barriers to accessing and reusing your published work. By making all of your data available in open repositories, you help increase our collective knowledge and make it easier for other researchers to build upon your study.

70% of surveyed researchers say they’re likely to use open datasets for their future research.

79% of surveyed researchers support mandates for making primary research openly available.

Papers linking to open data can have up to 25.36% higher citation impact.

FAIR Principles

The FAIR data principles (standing for Findable, Accessible, Interoperable, and Reusable) are a set of community-designed guidelines to provide measurable, consistent data standards for data sharing and increase data reusability.

Findable: Metadata and data should be easy to find for both humans and computers.
Accessible: There is a clear path for a user to retrieve your data and obtain any necessary authentication and authorization.
Interoperable: The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable: To optimize the reuse of data, metadata and data should be well described so that they can be replicated and/or combined in different settings.

Read more: https://www.go-fair.org/fair-principles/
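As a rough illustration of how the four principles can translate into a metadata record, the sketch below checks a dataset description for a small set of fields before deposit. The field names, the example values (including the placeholder DOI and URL), and the missing_fields helper are assumptions made for this example; the FAIR principles do not prescribe these exact fields, and your chosen repository will have its own metadata schema.

```python
import json

# Hypothetical minimum field set, loosely mapped to the four principles.
# FAIR itself does not mandate these names; check your repository's schema.
REQUIRED_FIELDS = {
    "identifier": "Findable: a persistent identifier such as a DOI",
    "title": "Findable: a descriptive title",
    "access_url": "Accessible: where the data can be retrieved",
    "format": "Interoperable: an open, documented file format",
    "license": "Reusable: the terms under which others may reuse the data",
}


def missing_fields(metadata):
    """Return the required fields that are absent or blank, in sorted order."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return sorted(missing)


# Placeholder values only; none of these point to a real dataset.
example = {
    "identifier": "doi:10.0000/example",
    "title": "Example dataset",
    "access_url": "https://repository.example.org/datasets/example",
    "format": "text/csv",
    "license": "",
}

print(json.dumps(example, indent=2))
for field in missing_fields(example):
    print(f"Missing {field} ({REQUIRED_FIELDS[field]})")
```

Here the record is flagged only for its blank license field. A check like this is no substitute for the full principles, but it can catch obvious gaps before you deposit.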

Sample repositories

FAIRsharing provides a comprehensive list of repositories filterable by discipline, journal recommendation, and region.

For quick access to some of the most common repositories, browse the list below.

OA Repositories (by discipline)

PLOS Recommended Repositories

Open Access Directory

OA Cross-Disciplinary Repositories

Dryad Digital Repository

figshare

Harvard Dataverse Network

Kaggle

Network Data Exchange (NDEx)

Open Science Framework

Zenodo
