Memory error in regional simple dispatch model

Hello everyone,

I am building a regional, hourly simple-dispatch model of three countries, where one country is connected to the other two (but those two are not connected to each other). I used the simple_dispatch example as a basis and added hourly electricity load, renewable production from a csv file (as nominal values), and some more power plants, by country. I was able to run the model after connecting two countries, but when connecting the third one I got a MemoryError:


in models.py, at line 171 (the assignment of the pre-optimized flow value in the code below):
# ######################### FLOW VARIABLE #############################

    # non-negative pyomo variable for all existing flows in energysystem
    self.flow = po.Var(self.FLOWS, self.TIMESTEPS,
                       within=po.NonNegativeReals)

    # loop over all flows and timesteps to set flow bounds / values
    for (o, i) in self.FLOWS:
        for t in self.TIMESTEPS:
            if self.flows[o, i].actual_value[t] is not None and (
                    self.flows[o, i].nominal_value is not None):
                # pre- optimized value of flow variable
                self.flow[o, i, t].value = (
                    self.flows[o, i].actual_value[t] *
                    self.flows[o, i].nominal_value)
                # fix variable if flow is fixed
                if self.flows[o, i].fixed:
                    self.flow[o, i, t].fix()

            if self.flows[o, i].nominal_value is not None and (
                    self.flows[o, i].binary is None):
                # upper bound of flow variable
                self.flow[o, i, t].setub(self.flows[o, i].max[t] *
                                         self.flows[o, i].nominal_value)
                # lower bound of flow variable
                self.flow[o, i, t].setlb(self.flows[o, i].min[t] *
                                         self.flows[o, i].nominal_value)

    self.positive_flow_gradient = po.Var(self.POSITIVE_GRADIENT_FLOWS,
                                         self.TIMESTEPS,
                                         within=po.NonNegativeReals)

    self.negative_flow_gradient = po.Var(self.NEGATIVE_GRADIENT_FLOWS,
                                         self.TIMESTEPS,
                                         within=po.NonNegativeReals)

I had a similar MemoryError previously (when I had only one country in the model) and I managed to solve it by shortening the hourly data in the csv file (fewer characters). That is not helping any more: the run goes further (takes more time), but in the end I get a MemoryError again.

Is it possible that three countries’ electricity data is too much for my computer with 8 GB RAM? Or what else could be the problem? My memory really is fully utilized when I run the model, which is unusual, and I have the feeling that it shouldn’t be.

Hello Safian. That python message is not very explicit. I had a quick look on google but couldn’t see anything useful. It could mean that your operating system refused to allocate memory when requested. It probably doesn’t mean that your python process abused its memory space, because that would result in an abrupt termination and a core dump.

Are you on Linux? You could run your program under memusage to determine its memory consumption, perhaps with different sized models. You could also run your program under valgrind and turn on the appropriate warnings. Valgrind sits between your program and the operating system and intercepts all system calls; as a result your program runs an order of magnitude more slowly. (Valgrind is a godsend for debugging and validating C/C++ code.)

You could try the python debugger and insert breakpoints to examine the runtime state at various locations (sorry, I have no experience with that tool). But you have a file:line-number to work with (which is a really good start). A more basic approach is to put print statements in your code and report the sizes of the objects you think are causing the problem using sys.getsizeof(objectname) (that is a bit clumsy but it will work nonetheless). Or you could find another high-spec machine and see if that solves the problem.

You just need to be really methodical and work through all the options. Finally, you might need to redesign or refactor your code to use less memory. HTH, Robbie.

PS: is this an oemof question? If so, can you tag it as such so others can find it please.
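As a small illustration of the sys.getsizeof suggestion (a generic sketch, not oemof-specific; note that getsizeof reports only the shallow size of a container, not of the objects it references):

```python
import sys

# getsizeof reports only the container's own overhead, not its contents.
data = list(range(100_000))
shallow = sys.getsizeof(data)

# A rough "deep" estimate for a flat list: add the element sizes too.
deep = shallow + sum(sys.getsizeof(x) for x in data)

print(f"shallow: {shallow} bytes, with elements: {deep} bytes")
```

For nested structures (dicts of lists, Pyomo components), the shallow number can be wildly misleading, so walk the contained objects as well.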

The memory usage does not depend on the number of regions but on the number of nodes and the type of nodes. A storage, for example, needs more memory than a sink because it has more constraints. So it is difficult to say if this behaviour is normal or unusual.

Furthermore, the memory usage of the system before you start the optimisation problem is important, too.

Dear Robbie,

Thank you for your answer! I am using Win10, and python really does use all my memory: it runs at 99-100% after I start my model. I don’t know in detail how memory works, but after a few seconds with a high share of “Used”, most of my memory becomes “Modified” (like here: http://www.techsupportforum.com/forums/f217/runaway-modified-memory-issue-1066778.html). This goes on for about half a minute, then I get the memory error. I tried to read up on it, but all I found is that virtual memory can help. I already use that; I tried increasing it and also letting Windows handle it fully automatically, but the memory error did not change.

So I still don’t know if my code is wrong, or something with my system.

Thanks,

Fanni

Hello @Fanni_HU . The bulk of my experience is C++ on Linux. But let me try some hints. Generally speaking, the data structures your basic model uses should not consume much memory. But your solver can demand a lot of memory, and space complexity is certainly an issue in solver design. Indeed, you could profile your runtime memory usage and determine which calls are eating your dynamically allocated heap memory: for instance with massif, a part of valgrind. But that is hardcore debugging. Alternatively, your solver might have an option to report its memory consumption on completion, given that your model run gets that far.

Moreover, there may be some python tools for doing memory profiling too? I guess you are using Pyomo? I have heard that Pyomo eats memory, even before submitting the problem to the solver. (I’ve written this stuff in C++ and these kinds of issues simply don’t apply.) Also, I don’t know anything about memory management under Windows, so I cannot comment on the used/modified/virtual memory classifications.

It is also possible that your oemof code has some ugly bug that is chewing through memory, outside of the Pyomo library and the solver call. All you can do is repeatedly partition the problem until you isolate the cause. (Salami tactics in German, I believe.) And as @uwe.krien suggests, shut everything else down before you start your model run. And of course, your model might just be too big for your hardware. Good luck. HTH Robbie
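On the question of Python memory profiling tools: the standard library’s tracemalloc module can report current and peak allocations without any external tooling. A minimal sketch (the list comprehension is a placeholder standing in for the model build):

```python
import tracemalloc

tracemalloc.start()

# Placeholder workload; in practice, build the energy system model here.
big = [list(range(1_000)) for _ in range(1_000)]

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")

# Show the five source lines that allocated the most memory.
for stat in tracemalloc.take_snapshot().statistics("lineno")[:5]:
    print(stat)

tracemalloc.stop()
```

Running this around the model-construction step versus the solver call should tell you which side is eating the memory.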

Dear All,

Thank you for your time, tips and recommendations!

The issue is solved: after trying a lot of directions, I finally checked the input data file and found missing data in it. This means that the MemoryError notification was really misleading - the program should check the integrity of the input data first. However, my model is now working with three countries!

oemof is a generic framework and it is used in totally different models, so it might be difficult to find a generic way to check the input data. But you should do it within your application. If you think your check is generic, we are open to integrating it into oemof.

Searching for N.A. / N/A / #VALUE! could be a generic one. I bet a lot of people use energy production data in csv files, which can potentially contain these values and then end up in a memory error.

The other way could be that, in case of a memory error, a hint is given that the cause of the error may also be in the csv file / data input (e.g. “memory / data input error”).
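A sketch of what such a pre-check could look like with pandas (the csv content, token list, and function name are illustrative assumptions, not oemof code):

```python
import io
import pandas as pd

# Sentinel strings that spreadsheets often leave in exported csv files.
NA_TOKENS = ["N.A.", "N/A", "#VALUE!"]

def find_bad_rows(csv_source):
    """Read a csv, treating the tokens above as missing values,
    and return the frame plus any rows containing missing data."""
    df = pd.read_csv(csv_source, na_values=NA_TOKENS)
    return df, df[df.isnull().any(axis=1)]

# Illustrative input with two broken rows:
csv_text = "hour,load\n1,100\n2,N.A.\n3,#VALUE!\n4,120\n"
df, bad = find_bad_rows(io.StringIO(csv_text))
if not bad.empty:
    print("Invalid input rows found:")
    print(bad)
```

Failing fast here, before handing the data to the solver, gives a clear message instead of an opaque MemoryError deep inside the model build.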

If you would like to bring that forward, please open an issue on GitHub - oemof/oemof: Open Energy Modelling Framework - Python toolbox for energy system modelling and optimisation. But I still think it is better to solve that at the application level, because there you have knowledge of the data structure that oemof will never have. Therefore you will be able to adapt the test to your structure.

This is a Python message and we cannot change it.

Hello @uwe.krien. But you should be able to catch the exception, analyze it, report to console or file, and re-raise as appropriate. Another thought: does oemof provide logging? I am a big fan of debug logging with user-determined verbosity. Robbie

Thank you for your ideas but I cannot say whether they become part of oemof and when.

Until then I recommend using pandas to handle your input data, because this package is very powerful for reading, writing, handling, analysing, manipulating and checking your data. I use it myself in all my oemof applications.

Checking for nan-values: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.isnull.html
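For completeness, a two-line example of that check (assumed toy data, just to show the API):

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])
print(pd.isnull(s))       # element-wise mask: False, True, False
print(s.isnull().any())   # True - at least one value is missing
```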