A review of the following paper:
- Oberle, Stella and Rainer Elsland (1 November 2019). “Are open access models able to assess today’s energy scenarios?”. Energy Strategy Reviews. 26: 100396. ISSN 2211-467X. doi:10.1016/j.esr.2019.100396. Licensed under Creative Commons CC‑BY‑NC‑ND‑4.0 (attribution, non‑commercial, no derivatives).
Oberle and Elsland (2019), published in November, should be of interest to this community — it covers quite a number of projects (17 by my count) that have developers active within the openmod. Rather than use open source and open data as touchstones (tagged in the paper as “open source models”), the authors instead rely on open access (tagged “open access models”). Open access, in this context, is essentially a measure of the ease of access to a particular energy system model by any independent researcher. This includes the effort necessary to establish a suitable runtime environment, with an unstated presumption that everyone uses Microsoft Windows (I last did so around 2005). Note that I use the term “model” throughout, whereas “model framework” might be more accurate. For disclosure, my 2018 paper (doi:10.1016/j.esr.2017.12.010) on open energy system modeling is cited by the authors.
The open access metric means that the closed source EnergyPLAN (maintained by Aalborg University, Denmark) rates above, for instance, the open source PyPSA (which resides on GitHub). My associating PyPSA with GitHub, and not Karlsruhe Institute of Technology, may seem provocative, but anyone is free to clone and fork the PyPSA codebase and use and develop the code as they see fit, unencumbered by the wishes and thoughts of the principal developers. Models that have proprietary dependencies, such as the GAMS environment, are regarded as the least accessible. This leads to Balmorel (Denmark) being rated below PyPSA. I am not suggesting the typology employed is wrong in any way, just intrinsically different from that derived from an open source development perspective. I guess it depends on which particular Kool‑Aid one normally drinks.
The paper locates 36 of the 40 models under consideration on various two dimensional planes and these plots are quite illuminating. But I should add that some of the open source projects are under rapid development and their coordinates will doubtless shift, if they haven’t already.
The paper then compares the DESSTinEE (Imperial College London) and PRIMES (National Technical University of Athens) models (section 3.2). PRIMES is not accessible and is not described as such; rather, the European Commission’s widespread historical reliance on PRIMES for analyzing low carbon and market liberalization policies underpins its selection. PRIMES, with other tools, was also used to establish the 2016 EU reference scenario (Capros et al 2016). The choice of PRIMES is somewhat odd in my view because PRIMES is currently being retired by the Commission, due to a desire to improve policy transparency. Indeed, I’ve never seen a policy‑support model so regularly and stridently criticized for opaqueness in technical meetings as was PRIMES.
Anecdotally, the 2016 EU reference scenario, based on 80–95% decarbonization, is now stale. I spoke to a European Commission CGE modeler recently and only 100% decarbonization scenarios are of any interest. This is a problem for all researchers, not just the authors of this paper, because the adopted policy and associated targets have lagged well behind the political debate. My preference would be to work from a newer policy document, such as European Commission (2018) which, although not ratified by member states, is clearly more current.
I contributed the Wikipedia entry for DESSTinEE so I have some awareness of the project. The model is implemented in Excel/VBA and carries the not-recommended-for-software-or‑data Creative Commons CC‑BY‑SA‑3.0 license. The latest release that I could locate was dated 24 October 2015, although I presume newer versions have been developed as there are associated peer‑reviewed publications as recently as December 2018. A screenshot of my Google search (yes, Google!) trying to access DESSTinEE is shown below:
The Oberle and Elsland paper does not provide more current information (see reference 15) than that shown above either, which means the accessible version is over five years old. In contrast, many open source energy system projects use test-driven development and continuous integration with potentially multiple updates, or at least pull requests, per day. Both practices (TDD+CI) were the subject of a well attended break‑out track at the recent Berlin 2020 openmod workshop.
Figure 8 is worth studying. It appears that DESSTinEE is short on measured 2010 electricity production by about 15%, mostly in relation to oil (or did I misread that?). The work-in-progress open energy ontology (OEO), plus better communal data from projects like OPSD, should go some way to resolving these kinds of discrepancies and to improving confidence in cross‑model scenario comparisons more generally.
The paper compares “state-of-the-art commercial models” with open access models in the conclusion (p7) but there is no treatment in the main text, nor are the commercial models named — I’m just curious more than anything because I rarely deal with proprietary models. I also wonder how that particular passage cleared review.
The tables in the appendix provide useful comparative information between models, small errors aside.
I believe there are several other model comparison exercises underway at present (DIW, RLI, and ETHZ might like to comment below). So it will be interesting to see what these other projects reveal in due course.
It is useful to examine other key criteria in light of accessibility. Accessibility alone is not sufficient for either policy transparency or scientific reproducibility — both of which are largely failing today. Moreover being able to study, build, and run a computer program is a prerequisite for reproducibility — and only open source software licenses guarantee access to the source code and provide the necessary permissions (Stodden 2009). Therefore, an open source license on an energy system model, rather than ease‑of-establishment criteria, remains the necessary condition for open science.
I have not really traversed the question of open data here — and neither do Oberle and Elsland. But both transparency and reproducibility require the underlying datasets, suitably open licensed, be available as well.
Current open source energy system models, to the best of my knowledge, admittedly fall short of low barrier scientific reproducibility. But most teams are working hard to improve and automate their workflow and their publicly accessible dataset storage protocols. Moreover, there is very considerable effort within the openmod community to improve shared data infrastructure and interoperability (these developments to be the subject of a future blog). Closed source projects often rely on non‑free datasets and are consequently less well placed to capitalize on these emerging common pool assets. And certainly less able to contribute to open science more generally.
Looking ahead, it will be interesting to see how cloud computing impacts on accessibility. Historically the odd project has provided remote accounts to their stand‑alone servers (deeco, for instance, via SSH) and some projects currently run web‑interfaces to stripped‑down educational versions (including PyPSA). Well‑resourced projects could well offer students and other individuals cloud time on maintained environments with fast hardware and high‑performance proprietary solvers, leaving only the data preparation and analysis local. That would certainly improve access in the sense of Oberle and Elsland.
Personally, I try to avoid the term “open access” wherever possible. Unlike “open source” and “open data”, open access has no consensus definition and can range from no paywall and full copyright (Cambridge University described Stephen Hawking’s unlicensed PhD thesis thus) through to the very good 2003 Berlin Declaration. Further background is available in Morrison and contributors (2019). More recently, a number of qualifiers have been added by commercial publishing houses, including “green”, “gold”, and “platinum”, and these colors seem destined to increase confusion. I, for one, ignore them and instead check the deployed license. And, as noted four paragraphs back, it makes relatively little sense to talk of “open access models” absent the legal context because genuine open licensing is a prerequisite for open science.
A minor point perhaps, but the Paris Agreement states “well below 2°C” and not simply “below 2°C” (p1). International lawyers have suggested that a court would interpret that threshold as something like 1.85°C.
Readers should note the Open Energy Modelling Initiative rates a mention on page 1 and inclusion as a milestone in figure 1.
If corrections are required, either respond below (preferred) or contact me directly and I’ll make the necessary changes.
It subsequently dawned on me (two days later) that the focus of the Oberle and Elsland paper on model accessibility may well derive from the FAIR principles for scientific data: the “A” in “FAIR” being “accessibility” (Wilkinson et al 2016, Collins et al 2018).
Generally speaking though, code and data have very different characteristics. For instance, data provenance is vital and necessary when assessing quality. Whereas code can be refactored endlessly and its version history, except when debugging, is of little consequence in terms of quality assurance. And even less so with test‑driven development (TDD) (also discussed earlier).
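To make the point about refactoring concrete, here is a minimal sketch in Python. The function names and the numbers are hypothetical and chosen purely for illustration (the 15% echoes my reading of figure 8). Two differently written implementations satisfy the same test, so which one came first matters little for quality assurance.

```python
def shortfall_v1(modeled, measured):
    """Original implementation: relative shortfall of modeled versus measured output."""
    return (measured - modeled) / measured


def shortfall_v2(modeled, measured):
    """Refactored implementation: the same quantity written as one minus a ratio."""
    return 1.0 - modeled / measured


def check(shortfall):
    """The test pins down behavior, not the history of the implementation."""
    # Hypothetical figures: 85 units modeled against 100 units measured.
    assert abs(shortfall(85.0, 100.0) - 0.15) < 1e-12
    # No shortfall when the model matches the measurement.
    assert shortfall(100.0, 100.0) == 0.0
    return True


# Both versions pass the identical test.
results = [check(shortfall_v1), check(shortfall_v2)]
```

Data offers no equivalent: no test can recover where a dataset came from, which is why provenance has to be recorded.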
To continue, data should only be licensed under a Creative Commons CC‑BY‑4.0 license or assigned a public domain waiver, whereas code can be licensed under a range of dedicated software licenses, each with different characteristics and a different location on a quite intricate legal compatibility graph. Moreover, selecting a software license is both tactical and strategic and your choice of license goes quite some way to defining the ethos of your project and who might contribute. There is no such analog with data.
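As a rough illustration of what a location on the compatibility graph means, the following Python sketch encodes a deliberately tiny and simplified subset of that graph. The license identifiers follow SPDX naming, the edges reflect the commonly cited one-way compatibility of permissive licenses into copyleft ones, and the helper name `can_incorporate` is my own invention. A real assessment involves many more licenses and a good deal more nuance.

```python
# Simplified, illustrative subset of the software license compatibility graph.
# An entry A -> {B, ...} means code under A can generally be incorporated
# into a combined work distributed under B. This is an assumption-laden
# sketch, not a complete or authoritative map.
CAN_FLOW_INTO = {
    "MIT": {"Apache-2.0", "GPL-2.0-only", "GPL-3.0-only"},
    "Apache-2.0": {"GPL-3.0-only"},  # not GPL-2.0-only
    "GPL-2.0-only": set(),
    "GPL-3.0-only": set(),
}


def can_incorporate(source, target):
    """Return True if code under `source` can go into a work under `target`."""
    return source == target or target in CAN_FLOW_INTO[source]


# Compatibility is directional: permissive code can flow into copyleft
# projects, but not the other way around.
permissive_into_copyleft = can_incorporate("MIT", "GPL-3.0-only")
copyleft_into_permissive = can_incorporate("GPL-3.0-only", "MIT")
```

The asymmetry is the point: choosing a license fixes which code a project can draw on and which projects can draw on it in turn.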
Finally, the concept of accessibility is not nearly as significant an issue for code as it is for data. And for software libraries at least, code has a tendency to converge very substantially over time. For example, the number of linear programming solvers in use in this community is probably just two and certainly not more than four.
These considerations may well explain why I had evident difficulty relating to the paper by Oberle and Elsland. Nonetheless, I see no particular reason to modify my original thoughts.
Collins, Sandra, Françoise Genova, Natalie Harrower, Simon Hodson, Sarah Jones, Leif Laaksonen, Daniel Mietchen, Rūta Petrauskaité, and Peter Wittenburg (November 2018). Turning FAIR into reality: final report and action plan from the European Commission expert group on FAIR data (KI-06-18-206-EN-N). Luxembourg: Publications Office of the European Union. ISBN 978-92-79-96546-3. doi:10.2777/1524. Directorate-General for Research and Innovation.
Capros, P, A De Vita, N Tasios, P Siskos, M Kannavou, A Petropoulos, S Evangelopoulou, M Zampara, D Papadopoulos, L Paroussos, K Fragiadakis, S Tsani, P Fragkos, N Kouvaritakis, L Höglund-Isaksson, W Winiwarter, P Purohit, A Gomez-Sanabria, S Frank, N Forsell, M Gusti, P Havlík, M Obersteiner, HP Witzke, and Monika Kesting (20 July 2016). EU reference scenario 2016: energy, transport and GHG emissions trends to 2050. Brussels, Belgium: European Commission.
European Commission (28 November 2018). Communication from the Commission to the European Parliament, the European Council, the Council, the European Economic and Social Committee, the Committee of the Regions and the European Investment Bank. A clean planet for all — a European strategic long-term vision for a prosperous, modern, competitive and climate neutral economy — COM (2018) 773 final. Brussels, Belgium: European Commission.
Stodden, Victoria (January 2009). “The legal framework for reproducible scientific research: licensing and copyright”. Computing in Science & Engineering. 11 (1): 35–40. ISSN 1521-9615. doi:10.1109/MCSE.2009.19.
Wilkinson, Mark D et al (15 March 2016). “The FAIR Guiding Principles for scientific data management and stewardship — Comment”. Scientific Data. 3: 160018. doi:10.1038/sdata.2016.18. 53 authors in total.
Morrison, Robbie and contributors (22 February 2019). Definitions for open. openmod forum.