Research Data Management Plan

adapted from the UK Data Archive

In a nutshell, an RDM plan is a strategic document that outlines how data will be handled throughout its lifecycle, covering measures during and after your research project. It is a useful tool for determining and documenting actions and for supporting overall project planning. As a basis for the reusability of data, it can prevent many data management issues and help to handle others. There are many ways to draft an RDM plan, depending mostly on the research field and on the complexity and size of the project. Typically, however, researchers are asked to cover the following topics:

Data description

Characterize the data you are working with. Categorize the source (observational, experimental, simulated, derived, compiled), form (text, numeric, audiovisual, models, computer code), format, data stability (fixed, constantly growing, revisable) and the expected volume of the data (file size, amount of data).
For accessibility and reusability, it makes sense to choose common, open data formats, or to make sure that the data can later be converted into a more usable format for archiving. You will find more information at the Library of Congress: Sustainability of Digital Formats

Data organization

When working in a team, it is crucial to agree on ordering principles and to document them. This means you need to agree on conventions for files (e.g. naming, versioning, the directory structure). Here is some general advice regarding file naming:

  • Use descriptive names that indicate what the file/folder contains
  • Use short names (fewer than 50 characters)
  • Use simple names that are easy to understand
  • Use alphanumeric characters
  • Use underscores (_) or dashes (-) rather than spaces
  • Avoid special characters such as: \ ‘= /,<>^:;()#*?%,”@!+{}~`[]
  • Avoid using internal project codes or acronyms that individuals outside of your laboratory or research group would not understand
  • Incorporate the temporal or spatial information when applicable
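
A team can check such conventions automatically. The following is a minimal Python sketch, not part of the original guidance: the function name check_name is hypothetical, "alphanumeric" is interpreted as ASCII letters and digits, and the dot is allowed so that file extensions pass. Both choices are assumptions that a project should adapt to its own agreed rules.

```python
import re

MAX_LENGTH = 50  # names should have fewer than 50 characters

# Assumption: ASCII letters and digits, underscore, dash, and dot (for extensions).
ALLOWED = re.compile(r"[A-Za-z0-9_.-]+")


def check_name(name: str) -> list[str]:
    """Return a list of problems found in a file or folder name (empty if none)."""
    if not name:
        return ["name is empty"]
    problems = []
    if len(name) >= MAX_LENGTH:
        problems.append(
            f"name has {len(name)} characters (limit: fewer than {MAX_LENGTH})"
        )
    if " " in name:
        problems.append("name contains spaces; use underscores or dashes")
    # Collect every character outside the allowed set (spaces are reported above).
    bad = sorted({ch for ch in name if ch != " " and not ALLOWED.fullmatch(ch)})
    if bad:
        problems.append("name contains special characters: " + "".join(bad))
    return problems


if __name__ == "__main__":
    # Hypothetical example names, for illustration only.
    for candidate in ["2019-05-14_soil_temperature_plotA_v02.csv", "final data (new)!.csv"]:
        print(candidate, "->", check_name(candidate) or "ok")
```

The check does not judge whether a name is descriptive or understandable to outsiders; those points still need to be agreed on and reviewed by the team.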

Metadata and documentation

Metadata – data about the data – helps to find, understand, analyze and reuse data. Metadata is the structured part of the documentation: it follows fixed formal criteria and contains only the most important contextual information, which also makes it machine-readable.
In general, data documentation describes the who, how, what, where, when and why of the data set. Depending on the characteristics of the data, documentation varies in length and structure and often has no standardized form. The procedure differs from one research area to another (e.g. a lab notebook in experimental research, a description of data processing in computational research). Documentation should be easy to read and understand (clear, short, familiar words) so that data operations can be traced and reproduced, even by researchers from other fields. Here are some issues that need to be addressed:

  • The scientific reason why data were collected
  • What data parameters were collected, including units and formats
  • What instruments/platforms were used to collect/generate the data
  • A list of the data files that make up the dataset
  • Codes used in the dataset and definitions of what each code means
  • When and how frequently the data were collected
  • How each parameter was measured or produced (methods)
  • For each parameter, the units of measure and the format
  • For each parameter, the precision and accuracy (if known)
  • Data processing that was performed
  • Standards or calibrations used, if applicable
  • Software used to open and manipulate the data
  • Quality assurance and quality control methods
  • Date when the dataset was last modified
  • Known problems that limit data use
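
Part of this information can be captured as a small machine-readable record stored next to the data. The Python sketch below is only an illustration: the field names, the list of required fields and all values are hypothetical placeholders, not taken from any metadata standard. A real project would normally follow the schema of its discipline or of the chosen repository.

```python
import json

# Hypothetical field names chosen for this sketch, not a metadata standard.
REQUIRED_FIELDS = [
    "title", "creator", "purpose", "parameters",
    "files", "collected", "last_modified",
]


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values for illustration only.
record = {
    "title": "Example dataset (placeholder)",
    "creator": "Name of the researcher",
    "purpose": "Scientific reason why the data were collected",
    "parameters": [
        {"name": "temperature", "unit": "degC", "format": "float",
         "method": "how the parameter was measured"},
    ],
    "files": ["2019-05-14_temperature_v01.csv"],
    "collected": {"start": "2019-05-14", "frequency": "hourly"},
    "software": ["software used to open and manipulate the data"],
    "last_modified": "2019-06-01",
    "known_problems": [],
}

if __name__ == "__main__":
    print("missing fields:", missing_fields(record))
    print(json.dumps(record, indent=2))
```

Keeping the record in a plain format such as JSON means it can be read by people and validated by scripts, while longer explanations stay in the free-text documentation.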

Ethics and intellectual property

When it comes to research data, the following regulations could be of importance:

Privacy / data protection: applies if personal data is collected during the research (especially in the life sciences). For storage and further use, a declaration of consent is needed and the data has to be anonymized so that individuals cannot be identified.

Copyright: Primary research data (in particular measurement data) are in many cases not protected by copyright and are therefore in the public domain. As soon as they are compiled in a specific order or processed, they are protected (e.g. by database right). If you use external data that is protected by copyright, you can obtain usage rights from the copyright holder by contractual agreement.

Clarifying ownership of and rights relating to research data is recommended before a project starts. If you plan to share your data, provide clear guidance on what re-users can do with it. One way of clarifying the terms of use is to license your data. The traditional methods are data sharing agreements or collaborations. Another option is to use open licenses, which grant rights to anyone.

For sharing and reusing data, CC0 public domain dedication is the recommended license. For explanation and further information see the DCC’s guide How to License Research Data.

Data access and sharing

For data access during the project, you need to consider data security, authentication, access rights, and data synchronization.

Data sharing becomes an issue mainly after the project. When preparing your data for sharing, formats, documentation, ownership and confidentiality are the main factors to consider. There are various ways to share data. The most common are sharing via email or a physical device after an individual request, putting the data online on a personal webpage, adding it as supplementary material to the publication on the journal’s platform, depositing it in an open repository, and publishing a data paper.

A number of funding agencies (EC, FWF) and science publishers (Nature Publishing Group, BMC, PLOS) require that the data underlying published research be shared via an open repository. Nevertheless, there can be reasons to restrict access to certain data or parts of it (e.g. sensitive data, copyright-protected data). Present a strong case for any restrictions on sharing, such as embargo periods or restricted access, and ensure these are properly justified.

At IST Austria, you have the opportunity to deposit your data in the institutional data repository IST Austria Research Explorer. A DOI will be assigned to the datasets, and they will be publicly accessible and stored for the long term. In your RDM plan you may insert the following statement to describe your intentions regarding storage and sharing:
At the Institute of Science and Technology Austria, an institutional and publicly accessible data repository (IST Austria Research Explorer) is provided for data publication and sharing. Furthermore, deposited data is registered with DataCite and therefore assigned a DOI, which enables data citation.
(Name of the project/researcher/etc.) commit(s) to data deposit in IST Austria Research Explorer within the period agreed, for data which are appropriate to share.

If you find a suitable subject repository, we recommend that you choose it over the institutional repository, as subject repositories can provide services tailored to subject-specific requirements.

Here are some issues you might deal with:

  • Risks to data security and strategies to avoid them
  • Handling of sensitive data
  • Risk of data being procured illegally and manipulated
  • Safe transfer of field data into the main storage
  • Access rights of project members and collaborators
  • Providing password protection for data access
  • Funder’s requirements regarding data sharing
  • Repository and its requirements for data deposit

Storage and long-term archiving

How to store and back up data during the project needs to be considered. Besides taking into account the resources already available, it is important to evaluate and decide on further measures if necessary.

After the project, data might have to be retained for different reasons (Austrian data protection law DSG, funder requirements, long-term value of the data). Preparing data to the standards expected for archiving is a time-consuming process, for which you should allocate significant resources. Data that underpin publications should be extracted, captured in machine-readable form and deposited in a repository so they remain accessible. Make sure you know about any repository policies that might affect your data (e.g. accepted data, preferred formats, normalization processes).

Recurring topics regarding storage and long-term archiving might be:

  • Availability of storage space
  • Charges for additional services
  • Data backup strategy
  • Responsibilities for backup and recovery
  • Data recovery in the event of data loss or corruption
  • Location to archive data for the long term
  • Availability of discipline-specific repositories
  • Archiving software or tools necessary to use the data
  • Data retention period
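
One common way to detect data loss or corruption in a backup or archive copy is to compare checksums of the original files with those of the copy. The following Python sketch assumes that both directory trees are reachable as local paths; the function names are hypothetical and the choice of SHA-256 is an assumption, not a requirement stated in this guide.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file below root (by relative path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(original: dict[str, str], copy: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing from the copy or whose content differs."""
    return {
        "missing": sorted(set(original) - set(copy)),
        "changed": sorted(
            name for name in original.keys() & copy.keys()
            if original[name] != copy[name]
        ),
    }
```

A manifest built at deposit time can be stored with the data and recomputed later, for example with compare(build_manifest(Path("project_data")), build_manifest(Path("backup/project_data"))), where both paths are placeholders. Files that exist only in the copy are not reported by this sketch.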

Below you can find links to material that can assist you in creating an RDM plan:

General Guidance
Data Management General Guidance (DCC)

Guidelines on FAIR Data Management in Horizon 2020

Guidelines on Open Access in Horizon 2020 to Publications and Data for IST Austria affiliates
Framework for creating a data management plan

Checklist for RDM planning

Elements of an RDM plan

Examples for an RDM Plan
Horizon 2020
Biology and Chemistry

Templates for RDM Plans
DMP Online Tool
E-infrastructures Austria: Template for Data Management Plan
