1. What does the energy modelling community look like? What kinds of researchers are involved in this work (e.g. disciplinary and organizational affiliations), how do they collaborate, and what kinds of formal structures have been established to organize them?
The community arose in Berlin, Germany in September 2014. Most people involved are completing their higher education or are early‑stage researchers. A few are mid‑stage researchers or beyond, and some work for consultancies, companies, start‑ups, or government agencies.
Geographically, the community started in the German‑speaking DACH world, later spread to the United States, and is now making inroads into the United Kingdom. Other participants are sprinkled throughout the planet, including the Russian Federation, India, and the Global South. English has always been the working language.
The community has no formal structures. Its ethos derives from open source software development. By common understanding, those running the various online services or twice‑annual physical workshops are accorded complete dominion. The mailing list is the principal place for making community decisions.
Much of the discussion that follows below centers on European law — in part, because Europe provides a more restrictive legal context for data than that found in the United States. But this focus is equally a reflection of our roots.
2. How and what kinds of data are typically incorporated into energy modelling?
Modelers do not generally deal with personal information (as defined under EU law). If such information is required for numerical models, it can normally be anonymized from real data, generated using estimated statistics, or otherwise synthesized — the key issue is that the information remains representative but need not be exact.
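As a minimal sketch of the "generated using estimated statistics" route, the following Python draws a synthetic hourly household demand profile from an assumed mean and spread for each hour. The function name `synthetic_profile`, the statistics, and the choice of a normal distribution are all illustrative assumptions; none of them comes from a real dataset or from a method documented by the community.

```python
import random

def synthetic_profile(hourly_mean_kw, hourly_std_kw, seed=None):
    """Draw one synthetic daily demand profile (kW per hour).

    Each hour is sampled independently from a normal distribution and
    clipped at zero, so the result is representative rather than exact.
    """
    if len(hourly_mean_kw) != len(hourly_std_kw):
        raise ValueError("mean and std lists must have the same length")
    rng = random.Random(seed)  # seeded generator for reproducible draws
    return [max(0.0, rng.gauss(m, s))
            for m, s in zip(hourly_mean_kw, hourly_std_kw)]

# Hypothetical statistics: low overnight demand, morning and evening peaks.
MEAN_KW = [0.3] * 6 + [0.8] * 3 + [0.5] * 8 + [1.2] * 4 + [0.4] * 3
STD_KW = [0.1] * 24

profile = synthetic_profile(MEAN_KW, STD_KW, seed=42)
```

Because no real household is reproduced, a profile of this kind carries no personal information, yet it can still follow the broad shape that a model needs.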
Energy system models require general information about component technologies, such as windfarms, coal‑fired electricity generation, and high‑voltage transmission lines, together with their engineering and cost characteristics. Cost information is necessarily estimated in most cases because the underlying figures are normally commercially sensitive. Nonetheless, the European Commission could collect cost and performance information under a public interest rationale and make key statistics available in generic form. Future cost and performance projections, sometimes also subject to technological learning, are necessarily speculative.
Energy system models require specific details about the system being modeled, including the location, age, and connectivity of all represented assets. That includes information about the networks under investigation, usually the electricity grid but perhaps also gas and district heat infrastructure. Current and potential future demand profiles are needed. Locational resource potentials are needed too, including solar and wind assessments and land availability. Depending on the scope of the model, information concerning the built environment and mobility may also be needed. Some models may also require historical market clearing information or information on how households and firms make short‑ and long‑run decisions.
The bulk of models capture national and supra‑national systems but some research groups investigate municipal systems, islanded microgrids, and standalone systems.
Some of the information indicated above is subject to statutory reporting. But the processes for harvesting and publishing that information are often archaic and error‑prone, leading to poor‑quality disclosure. Projects within the openmod community assemble and curate this information so it can be more readily utilized by modelers and analysts. One such project is the Open Power System Data (OPSD) portal.
Information on future climate patterns is sometimes required but this information can be readily sourced from the climate science literature and is not legally encumbered.
Most of the modeling within the community is intended to inform public policy options for our rapid trajectory to net‑zero carbon. Research either concentrates on methodologies or seeks to provide policy‑usable results and insights.
3. What infrastructure is currently available to facilitate the sharing of this data among researchers?
Within the orbit of the openmod, the Open Energy Platform (OEP) is the primary resource. This platform is specifically designed to handle the needs of energy system modeling and, in particular, scenario analysis. Energy system modeling differs from other forms of computational science in that testable outcomes are not possible and a range of speculative scenarios, each with its own explicit objectives, constraints, and assumptions, must instead be analyzed and traded off against one another.
In addition, there are initiatives specifically aimed at allowing data to be transferred between different modeling projects in order to facilitate cross‑model comparisons. Each model has necessarily evolved its own data interface and internal semantics.
4. Why is open data sharing important to energy modelling? What are the typical positions on this issue among stakeholders engaged with energy modelling?
We adopt the European Commission description for open data (EU Directive 2019/1024, recital 16):
Open data as a concept is generally understood to denote data in an open format that can be freely used, re‑used and shared by anyone for any purpose
Data sharing reduces duplicated work, improves data quality and coverage, and facilitates cross‑model comparisons — that last point being necessary for strengthening confidence in both the direct results and subsequent interpretations.
5. What challenges or barriers to widespread data sharing are unique to research involving energy modelling?
Open data in the sense defined above depends on suitable open licensing. In most cases, such licenses do not grant binding permissions but rather confer certainty. That certainty matters particularly given Directive 96/9/EC database protection within the European Economic Area (EEA), under which one cannot know whether a data extraction from a public portal was insignificant or not.
The power exchanges that run the wholesale electricity markets are particularly resistant to providing disclosed information in any kind of usable form — and deploy techniques like serving data that cannot be highlighted and copied to evade recovery. This is certainly against the spirit of the legislation, even if technically compliant.
Another emerging problem is the proliferation of national open data licenses — such as the recent German Government dl‑de/by‑2‑0. Such licenses could well lead to legally siloed data when not inbound compatible with the CC‑BY‑4.0 license, even if only on some trivial legal point.
6. What are the most important supports needed in order to cultivate a thriving data community among energy modelers?
Recognition by science funding organizations of three needs would help: first, to require suitable licenses on all appropriate outputs; second, to support ongoing maintenance once the underlying data projects have been completed; and third, to provide stable online archiving for non‑deliverable artifacts such as project websites, wikis, public mailing lists, and code repositories.
But beyond that, most solutions have to come from within the modeling community.
7. How is openmod working to address the open data sharing needs of the energy modelling community? Who else is doing important work in this area and what else is on the horizon?
For the openmod, the concept of genuinely open data was central from day one. But maturity has brought forward two vitally important related agendas:
- a community ontology — a shared worldview
- agreement on collection protocols and metadata — that being data about data
Both initiatives are interconnected, both require deep buy‑in from within the community, and both will take significant effort to bed in. The Open Energy Ontology is addressing the first and the EERAdata initiative is pursuing the second. The EERA and openmod communities have begun to work together on the latter.
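To make "data about data" concrete, here is a minimal sketch of a metadata record attached to a dataset and serialized as JSON. The class `DatasetMetadata` and every field name in it are hypothetical, chosen only for illustration; this is not the schema used by the Open Energy Platform, the Open Energy Ontology, or EERAdata.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record; field names are illustrative."""
    title: str
    description: str
    license_id: str    # e.g. an SPDX identifier such as "CC-BY-4.0"
    source: str        # where the data was collected from
    collected_on: str  # ISO 8601 date
    units: dict = field(default_factory=dict)  # column name -> unit

record = DatasetMetadata(
    title="Example plant list",
    description="Illustrative generation assets for a fictional region",
    license_id="CC-BY-4.0",
    source="hypothetical statutory register",
    collected_on="2020-01-31",
    units={"capacity": "MW", "commissioned": "year"},
)

# Serialize with sorted keys so the output is stable across runs.
metadata_json = json.dumps(asdict(record), indent=2, sort_keys=True)
```

Even a record this small answers the questions a downstream modeler asks first: what the data is, where it came from, what units it uses, and under which license it may be reused.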
Open is not the only paradigm for energy system modeling. Another is the closed consortium model that is effectively only available to government ministries and multilateral agencies. How that paradigm evolves in an increasingly open world remains to be seen. In any case, there is no crossover between these two realms at present. A third paradigm is the single‑institution closed model. Again, the future of such models in the face of open development remains unclear.
An upcoming challenge is the tracking of both data provenance and data versioning at scale — taken together these represent active research questions in computer science and are certainly not unique to the domain of energy analysis.
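One simple way to picture the two concerns together is a content hash used as a version identifier, with each derived dataset recording the version it came from. The sketch below is only an assumption about how this could look at small scale; the helper names `content_version` and `derive` and the sample rows are hypothetical, and the approach does not address the at-scale research questions mentioned above.

```python
import hashlib
import json

def content_version(rows):
    """Return a SHA-256 digest of the rows, used here as a version identifier."""
    # Canonical JSON (sorted keys, fixed separators) keeps the hash stable.
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def derive(parent_version, rows, note):
    """Build a provenance entry linking a new dataset state to its parent."""
    return {"version": content_version(rows),
            "parent": parent_version,
            "note": note}

# Hypothetical data: a raw extract and a cleaned derivative.
raw = [{"plant": "A", "capacity_mw": 100}, {"plant": "B", "capacity_mw": None}]
cleaned = [{"plant": "A", "capacity_mw": 100}]

v_raw = derive(None, raw, "as published by hypothetical source")
v_clean = derive(v_raw["version"], cleaned, "dropped rows with missing capacity")
```

Following the `parent` links from `v_clean` back to `v_raw` recovers the chain of processing steps, while any change to the rows yields a different `version`.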
The prospect of supporting and using linked open data (LOD) is now surfacing. Among other initiatives, our community has connections with the DBpedia Databus project.
Finally, members from within the openmod community make written and oral submissions to European Union public consultations on law reform and science policy. Making one's voice heard in such processes is an important and necessary activity.