Research Data Management Plan (RDMP)
Researchers invest time in data management (DM), but the more effort is spent on DM upfront, the easier and more efficient it becomes to work with the data (for yourself and others) in the long term. Have a look at our information on RDM.
An RDMP is a living document that has to be checked and updated on a regular basis (this is also a funder requirement). Especially if your work is based on a collaboration between different institutions, it is essential to have a common basis regarding naming conventions, the ISO standards used, and shared file organization (and perhaps shared reference management), all summarized in a joint RDMP.
On this information page, we focus on the RDMP templates of the funders FWF and EC H2020. To support you further, IST has set up the tool RDMO (IST login required), where you can create, store, manage, and export your RDMPs.
Depending on your discipline and the kind of research you do, these are the main topics to cover in an RDMP:
Briefly outline your data creation by answering questions like:
- Who is responsible for RDM in your project? (We recommend adding the ORCID iD of the contact person.)
- Are you re-using existing datasets? (Where are they from? Is there a usage agreement, waiver, or open license? See also further details on re-use.)
- Are you creating new datasets?
- What kind of data will be generated/collected/re-used (observational, experimental, simulated, derived or compiled)?
- What is the data stability (fixed, constantly growing, revisable)?
- What is the expected volume of the data (file size, amount of data)?
- Data utility: who will benefit? Is there a target audience?
- Depending on your chosen data storage, the costs for long-term preservation should be estimated and calculated. Describe different cost categories (server storage, backup solution, etc.) and how you plan to cover these costs.
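The cost estimate asked for in the last point is simple arithmetic: volume times yearly rate times preservation period, per cost category. The following is only a minimal sketch under stated assumptions. The function name `estimate_storage_cost`, the category names, and all rates are hypothetical placeholders (not IST or funder prices), and the data volume is assumed to stay constant over the whole period.

```python
def estimate_storage_cost(volume_tb: float, years: int,
                          yearly_rate_per_tb: dict[str, float]) -> dict[str, float]:
    """Return the cost per category plus a total, assuming a constant volume."""
    costs = {category: volume_tb * rate * years
             for category, rate in yearly_rate_per_tb.items()}
    costs["total"] = sum(costs.values())
    return costs


# Placeholder rates in currency units per TB per year. These are NOT real
# prices; ask your IT department or repository for actual figures.
rates = {"server storage": 100.0, "backup": 50.0}
print(estimate_storage_cost(volume_tb=2.0, years=10, yearly_rate_per_tb=rates))
```

For constantly growing datasets, the calculation would have to be repeated per year with the expected volume for that year.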
Everyone benefits from precise and thorough documentation. Without adequate documentation, research data is worthless (and difficult to defend if the research is not reproducible because documentation is lacking). We recommend writing the documentation of research data in clear and simple language, so that data operations remain traceable and reproducible even for researchers outside your field. Therefore, keep track of data parameters (including units and formats), the instruments/platforms used to collect or generate the data, code including definitions of variables, methods used, standards or calibrations, etc.
Funders will ask how FAIR your data is. We have summarized the concept on our RDM page.
Making data findable
- Metadata: Metadata are “data about data” in the sense that they provide information about the data in a highly structured digital form. They are human- and machine-readable. The better the metadata, the easier it is for other researchers (and search engines) to find your research data.
Funders may ask which metadata standard/metadata schema will be used. We recommend using Dublin Core, a basic, domain-agnostic, and widely used standard. There are also many discipline-specific standards, which may be mandatory for certain data repositories.
- Persistent identifier/unique identifier: Plan on using persistent identifiers for your datasets (like a DOI), so that your data can be cited and found easily via search engines. Consider using unique identifiers for researchers as well. The most common one (in part mandated by funders and journals) is the ORCID iD, a permanent iD for researchers that makes it possible to clearly identify who is responsible for the research.
- Naming conventions: Be precise about how you label your files. Include enough information to identify immediately what is inside. File names should be descriptive, consistent, short, and free of special characters. Agree on a standardized date convention (like ISO 8601), and avoid spaces (use underscores or dashes instead).
- File versioning: If you work on a paper and/or a complex analysis, save the files frequently with a new version number like “Research_Concept_v1”, “Research_Concept_v7”… “Research_Concept_final”. Disadvantage of this system: you cannot track the changes between the versions within this system (but this could be perfectly traced in a Readme file!). However, if you work on research code, consider using a tool like Git where you can track every change and revert to earlier versions easily. Learn more about our IST Git (IST login required) on the IT website.
- Standards: Standards define the allowable values on a particular topic, such as ISO 8601 governing date formats (YYYY-MM-DD or YYYYMMDD) or ISO 6709 for latitude and longitude.
- Search keywords: Provide specific search keywords to make discovery easier (e.g. by using discipline-specific thesauri).
- File Format: Try to use open, standardized, well documented, and widely used formats, especially for long-term preservation. Remember to allocate enough time for converting proprietary software formats to standardized ones! For example, use .txt files for text documents (instead of proprietary file types), .csv for tabular data, or .wav for audio files.
- File organization: This sounds simpler than it actually is. You can organize your files by project, researcher, date, research book number, sample number, or any other field that seems reasonable to you. However, be aware that many researchers have to manage not only digital information but a combination of analog and digital information. The easiest way is therefore to choose one schema that covers both. A common file organization is critical for a collaborative project.
- Readme files: One of the most important tasks in good data management is describing your data. However, research data and research documentation are often saved in different file locations. The easiest way to keep data and description together is to create a simple readme file (a plain text file) that is stored directly alongside the research data. Use several readme files at different levels of the structure (e.g. one to explain the folder structure and how to use it, and another to explain the data structure). Readme files take very little time to create but provide an easy way to keep files organized and documented.
- Physical research notebooks/electronic lab notebooks: Remember to write legibly. Use notebooks with acid-free paper (for preservation reasons), and agree on the minimum amount of information that has to be included. Discuss the advantages and disadvantages of using an electronic lab notebook (IST offers different solutions; please get in contact with us for further information). Be aware that both physical and electronic notebooks have to be preserved for at least 10 years after finishing your research project and that both kinds of notebooks need a backup plan.
- Templates: Consider using templates by creating a list of information to record for every experiment/report. This can help a great deal in adding structure to your notes.
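The naming and date conventions from the list above can be enforced with a small helper. This is a minimal sketch, not an IST tool: the function name `build_file_name` and the pattern (date, project, description, version) are assumptions chosen for illustration; agree on your own pattern within the project.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, day: date,
                    version: int, extension: str) -> str:
    """Build a name following the hypothetical pattern
    <ISO date>_<project>_<description>_v<NN>.<extension>."""
    def clean(text: str) -> str:
        # Replace every run of characters other than letters and digits
        # (spaces, special characters) with a single dash.
        return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower()

    # date.isoformat() yields the ISO 8601 calendar form YYYY-MM-DD.
    return (f"{day.isoformat()}_{clean(project)}_{clean(description)}"
            f"_v{version:02d}.{extension.lstrip('.').lower()}")


print(build_file_name("Project X", "raw counts (plate 3)",
                      date(2024, 3, 5), 2, ".CSV"))
```

Putting the ISO 8601 date first means that an alphabetical listing of the folder is also a chronological one.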
Making data accessible
Here are some questions and actions that you might consider to make your data accessible:
- How and where will you make your data accessible? (Data repository, project website…)
- Specify what methods or software tools are used or needed to access the data.
- Is all data openly available? If not, explain the reason.
- Where will the data be deposited?
- Think about the possibility of writing data papers. These papers describe the dataset but do not include scientific analysis or conclusions drawn from it. Data papers are published in journals, while their datasets are stored in a separate repository and linked via a persistent identifier (e.g. a DOI). A data paper offers several advantages: it provides more thorough documentation of an important dataset, it goes through the peer-review process, and it enhances the re-use of the dataset. One important (Open Access) data journal is Scientific Data from Nature.
Making data interoperable
Here, the focus lies on standardized vocabularies and standards that enable data exchange and re-use between researchers, institutions, or countries.
Which metadata standards and metadata vocabularies/methodologies are you using?
If no standard is used, will you provide a mapping to discipline ontologies?
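To illustrate what a record in a domain-agnostic metadata standard can look like, the following sketch serializes a few Dublin Core elements as XML using Python's standard library. It is an illustration under assumptions, not a repository's required format: the function name `dublin_core_record`, the wrapper element `metadata`, and all field values (including the DOI) are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields: dict[str, list[str]]) -> str:
    """Serialize Dublin Core elements (title, creator, ...) as an XML string."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for element, values in fields.items():
        for value in values:  # Dublin Core elements may be repeated
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; replace them with your dataset's details.
record = dublin_core_record({
    "title": ["Example dataset title"],
    "creator": ["Surname, Given name"],
    "date": ["2024-03-05"],                 # ISO 8601
    "format": ["text/csv"],
    "identifier": ["doi:10.1234/example"],  # hypothetical DOI
    "rights": ["CC0 1.0"],
})
print(record)
```

Many repositories generate such records from a submission form, so in practice you may only need to know which fields to prepare.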
Making data re-usable
One of the advantages of sharing research data is that you can build on the work of others and do not have to start from scratch. Here, the most important information is how and where to find re-usable data. Look in data portals like Re3data, fairsharing.org, DataCite, European Union Open Data Portal, or OpenAIRE to find discipline-specific data repositories and datasets. If you need multidisciplinary repositories, try searching in Zenodo, Dryad, Figshare, or in institutional repositories like IST Research Explorer. Remember to check the re-use license and the conditions of the repository before re-using data!
For your own data, think about licensing it to permit re-use and clarify the terms. If you share data, describe your data quality processes (e.g. repeated samples or measurements, peer review of data…). Furthermore, before you decide to share your data, consider copyrights, licenses, and contracts, as well as intellectual property rights/patents. Do you need an embargo for your data, and if so, why (e.g. journal requirement, funder requirement…)? Keep in mind, especially in collaborative international teams, that national laws and funder requirements may differ!
Data licenses you can choose from:
- CC0 https://creativecommons.org/share-your-work/public-domain/cc0/
- ODC-By https://opendatacommons.org/licenses/by/1-0/
- PDDL https://opendatacommons.org/licenses/pddl/1-0/
- ODbL https://opendatacommons.org/licenses/odbl/
Data storage, security and preservation
Take your time to think through your IT environment: You will generate research data before and during your project, and you will have to take care of IT questions after finishing your project as well.
Discuss within the team who needs access to which data (access/permission level, role), determine whether a collaboration partner needs external access, and decide where to store shared data (for example, do not use commercial cloud storage providers but use our IST Cloud solution). Keep in mind that data security also includes destroying data once the archival period has ended, and sensitive data once it is no longer needed. IST IT solutions created a Guideline for storing research data at IST (IST login is required to access).
Think about the file format (see also RDMP/Data documentation) and try to use open, standardized, and well-documented formats whenever possible. Prepare your research data and convert data in proprietary formats into open file formats before final preservation. Plan to check your data periodically (for example, every second year) to see whether file formats, hardware, or documentation need updating. Try to cover these questions: Has the data become corrupt? Are the backups working correctly? Is a hardware update necessary? Can you still understand the documentation?
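The question of whether data has become corrupt can be answered by recording checksums when the data is archived and comparing them at each periodic check. The following is a minimal sketch of this idea using SHA-256 from Python's standard library; the function names `sha256_of`, `create_manifest`, and `find_changes` are hypothetical, and this is not a description of how IST storage verifies data.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*")) if p.is_file()
    }


def find_changes(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries that are missing or no longer match."""
    current = create_manifest(data_dir)
    return sorted(name for name, checksum in manifest.items()
                  if current.get(name) != checksum)
```

A typical use would be to save the manifest (for example as a JSON file next to the readme) when the data is deposited, and to run `find_changes` against the archive and its backups at every scheduled check; an empty list means all recorded files are still present and unchanged.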
General Guidance and RDMP templates
RDMO, a tool to manage your RDMP (IST login required)
FWF RDMP template
EU Horizon 2020 RDMP template