@robbie.morrison took the liberty of transferring a recent posting on data hacking by @tom_brown across from the openmod mailing list to this forum:
- title : the unfulfilled potential of the openmod community
- posting : https://groups.google.com/g/openmod-initiative/c/pUtZ1lHRkOM/m/uqjGJCx5AwAJ
- timestamp : Tue, 2 Mar 2021 13:30:18 +0100
- original poster : @tom_brown
The text has been lightly copy‑edited to better suit markdown and improve clarity, but no substantial changes were made to the content.
Introduction
The open energy modeling world has changed since openmod was founded in September 2014. There are more open modeling frameworks than anyone can count. Research journals and funding bodies increasingly demand open models for studies. In some senses, the initial vision of openmod has been realized, but at the same time I think the community is under‑performing relative to its potential.
In particular, the community could do more on data and data quality, which could be called “the new frontier”, and specifically:
- catalog on the openmod wiki what data is available to help us better discover data
- crowdsource currently unavailable data
- catalog model implementations for different regions, to avoid people constantly building new models from scratch
More details below, along with concrete action steps.
Obviously this posting is a personal view by @tom_brown and reflects his own biases and interests.
Cataloging data on the openmod wiki
There is still a need to catalog data and make known what is available. This is distinct from, and not in competition with, projects like the Open Energy Platform (OEP), which host data. We’re still missing the first step of identifying what is available and linking to it. Hosting platforms like OEP can then be used in a second step to make the data uniformly accessible.
We have data pages on the wiki which need filling!
This has worked well for “transmission network datasets”, but there are big gaps elsewhere, including industry, detailed demand data, and gas networks.
There are virtuous network effects here: if the wiki pages become a standard reference, meaning the first place to look for data, then everyone will want to list their data there.
We can also use the wiki pages to identify gaps in the datasphere.
Concrete action
Take a few minutes to look at pages like the one below and add links to any databases (open or commercial) that you know about:
Crowdsourcing data
We should also be crowdsourcing data.
For example, datasets such as worldwide steel and cement plants typically have 2000 entries. This is more than a single person can collect, but doable for a team of 10–20.
Openmod could be organizing data hackathons.
We could be creating open datasets where every data point is referenced from a press release or other official source.
Or even better, contributing to existing open projects, like the Global Steel Plant Tracker:
We seem to spend 90% of our time talking about metadata and licensing and only 10% talking about the data itself, so let’s reverse this ratio!
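To make the idea of a fully referenced dataset concrete, here is a minimal sketch of how a hackathon team might check that every crowdsourced entry carries a source. The record layout, the field names (`capacity_mt_per_year`, `source_url`), the helper `find_unreferenced`, and the entries themselves are all hypothetical placeholders, not an agreed openmod schema or real plant data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlantRecord:
    """One crowdsourced data point (all field names are hypothetical)."""
    name: str
    country: str
    capacity_mt_per_year: Optional[float]
    source_url: Optional[str]  # press release or other official source


def find_unreferenced(records: List[PlantRecord]) -> List[PlantRecord]:
    """Return the records that lack a usable source link."""
    return [
        r for r in records
        if not r.source_url
        or not r.source_url.startswith(("http://", "https://"))
    ]


# Placeholder entries for illustration only, not real plants
records = [
    PlantRecord("Example Plant A", "XX", 1.5, "https://example.org/press-release-a"),
    PlantRecord("Example Plant B", "YY", 0.8, None),
    PlantRecord("Example Plant C", "ZZ", None, "see company website"),
]

missing = find_unreferenced(records)
for r in missing:
    print(f"needs a source: {r.name}")
```

A check like this could be run before contributions are merged, so that reviewers only need to look at the entries it flags.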
Concrete action
Volunteer to organize a hackathon! Identify some missing data that could be collected in a coordinated one‑day hackathon and recruit volunteers over the mailing list. Make sure it’s doable, and try to include some social element, such as regular coffee breaks and maybe a Zoom dinner afterwards, so it’s not all hard work.
Cataloging model implementations for different regions
I have seen or heard of three different open models of the Chinese power system in the past few days. I’m not sure these projects know about each other. I’ve started a page here:
There is also some overlap with the OEP factsheets here:
But the idea would be to organize by region.
Concrete action
Contribute to the above wiki page, which lists the open model implementations for each region, with links. For example, it should cover all the different regional implementations in SWITCH, TIMES, calliope, oemof, PyPSA, and so on.
Thanks! All comments and feedback welcome!