Breakout group: best practice on how to structure & publish open data

The ideas expressed are:


Options for storing data:

  1. Git LFS:
    https://git-lfs.github.com/
    “Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise.”

  2. Zenodo:
    https://zenodo.org/
    "Zenodo is a research data repository. It was created by OpenAIRE and CERN to provide a place for researchers to deposit datasets. It launched in 2013, allowing researchers in any subject area to upload files up to 50 GB. Zenodo has integration with GitHub to make code hosted in GitHub citable."

  3. Open Energy Platform:
    http://oep.iks.cs.ovgu.de/
    Data can be stored here and documented using factsheets, with the possibility to add metadata. The scenario factsheet documents the assumptions used to generate the data, and the model factsheet documents the model that was used.

  4. re3data.org:
    ‘The Registry of Research Data Repositories (re3data.org) is an Open Science tool that offers researchers, funding organizations, libraries and publishers an overview of existing international repositories for research data.’

Some of this information is also collected on the openmod wiki, which could be updated with the results of this discussion:
https://wiki.openmod-initiative.org/wiki/Data#Data_sharing_techniques
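
To make the Git LFS option more concrete: the "text pointer" that Git LFS commits in place of a large file is a small text file holding a spec version, a SHA-256 hash of the content, and the size in bytes. The sketch below only illustrates that pointer layout. In practice the `git lfs` command-line client writes these pointers for you (for example after `git lfs track "*.csv"`), and the function name and sample data here are hypothetical.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Return the text of a Git LFS pointer file for the given file content.

    Illustration only: the real pointer is generated by the git-lfs client.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Hypothetical dataset content standing in for a large CSV file.
    data = b"timestamp,load_mw\n2017-01-01T00:00,42.0\n"
    print(lfs_pointer(data), end="")
```

The repository then only versions these few lines, while the actual file content lives on the LFS server.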


Best practices for open data publishing:

  • http://frictionlessdata.io
    "Frictionless Data is about removing the friction in working with data. We are doing this by developing a set of tools, standards, and best practices for publishing data. The heart of Frictionless Data is the Data Package standard, a containerization format for any kind of data based on existing practices for publishing open-source software."

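As a rough illustration of the Data Package idea: a data package is a folder of data files plus a `datapackage.json` descriptor that lists the files and describes their columns. The sketch below builds such a descriptor with only the Python standard library. The package name, file path, column names and license are hypothetical, and the official Frictionless Data tooling offers validation and more fields than shown here.

```python
import json


def make_datapackage(name, csv_path, fields, license_name="CC-BY-4.0"):
    """Build a minimal Data Package descriptor for one CSV resource.

    `fields` is a list of (column name, Table Schema type) pairs.
    All concrete values passed in below are hypothetical examples.
    """
    return {
        "name": name,
        "licenses": [{"name": license_name}],
        "resources": [
            {
                "name": name + "-table",
                "path": csv_path,
                "format": "csv",
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


if __name__ == "__main__":
    descriptor = make_datapackage(
        "example-load-data",
        "data/load.csv",
        [("timestamp", "datetime"), ("load_mw", "number")],
    )
    # Write the descriptor next to the data so both are published together.
    with open("datapackage.json", "w", encoding="utf-8") as f:
        json.dump(descriptor, f, indent=2)
```

Keeping the descriptor next to the data means the column types and license travel with the files, wherever they are hosted.
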
Some guidelines from the European Commission for FAIR Data Management:

FAIR: findable, accessible, interoperable and re-usable

From the document:

The Research Data Alliance provides a Metadata Standards Directory that can be
searched for discipline-specific standards and associated tools.
The EUDAT B2SHARE tool includes a built-in license wizard that facilitates the
selection of an adequate license for research data.
Useful listings of repositories include:

  • Registry of Research Data Repositories

  • Some repositories, like Zenodo (an OpenAIRE and CERN collaboration), allow
    researchers to deposit both publications and data, while providing tools to link
    them.

  • Other useful tools include DMP online and platforms for making individual
    scientific observations available such as ScienceMatters.


Suggestions from the breakout group:
It would be great if many of us could fill out the factsheets at http://oep.iks.cs.ovgu.de/factsheets/overview/ to make the models easy to find and to let them refer to the data (independent of where the data is documented).
For the openmod community, it would be great to store as much useful data as possible at http://oep.iks.cs.ovgu.de/dataedit/ so that the data is in one place and therefore easy for others in the openmod community to find, sort, select and edit.