stack bounds checking

There seems to be a recurring problem of sketches misbehaving due to running out of RAM, or people worrying that this might be happening and trying to optimise their code to avoid the possibility.

We all know that premature optimisation is the root of all evil, and I'd like to have a way to detect when the stack is exhausted. I would think this was easy enough to do using a guard word, and perhaps it's already been done, but I can't find anything to suggest it has been.

Is there any practical way to find out how much stack is available now, and the lowest stack availability (i.e. when stack use was at its peak) during the current execution?

This seems like the sort of thing that would be very easy to do within the runtime library and much more difficult to do outside it.

Statically allocated memory (string constants and global or static variables) sits at the bottom of RAM. It ends at the address of this symbol:

extern unsigned int __bss_end;

The heap (dynamically allocated memory) grows up from __bss_end. If any heap memory has been allocated, the heap currently ends at the address held in:

extern void *__brkval; // Set to 0 if no heap memory has been allocated. If 0, use __bss_end

The stack grows down from the top of memory.

The only way to detect a collision is to check for it on every stack push (function call, interrupt) and every heap allocation (malloc).

You could put guard values between the heap and stack and check for them in loop() but that might not check often enough to catch a collision before the program crashes.
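
A minimal sketch of the guard-value idea, written against a simulated block of RAM so it can be tried on a desktop compiler. All the names here (GUARD, ram, place_guard, guard_intact, simulate_stack_use) and the 64-byte layout are made up for illustration. On a real board the guard would sit just above the end of the heap, and guard_intact() would be called from loop().

```cpp
#include <stdint.h>
#include <string.h>

const uint16_t GUARD = 0xA55A;     // arbitrary pattern, unlikely to occur by chance
const int RAM_SIZE = 64;           // hypothetical size of the heap/stack gap
const int GUARD_OFFSET = 16;       // hypothetical position just above the heap

uint8_t ram[RAM_SIZE];             // stand-in for the gap between heap and stack

// Write the guard word once, at startup.
void place_guard() {
  memcpy(&ram[GUARD_OFFSET], &GUARD, sizeof GUARD);
}

// True while nothing has overwritten the guard word.
bool guard_intact() {
  uint16_t value;
  memcpy(&value, &ram[GUARD_OFFSET], sizeof value);
  return value == GUARD;
}

// Simulate the stack growing down from the top of 'ram' by 'depth' bytes.
void simulate_stack_use(int depth) {
  if (depth < 0) depth = 0;
  if (depth > RAM_SIZE) depth = RAM_SIZE;
  memset(&ram[RAM_SIZE - depth], 0, depth);
}
```

This only reports a collision after it has happened, and only if the overwriting data differs from the guard pattern, so it is a development aid rather than protection.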

Even if it did, what would you do about it?

PaulS: Even if it did, what would you do about it?

Good question; not sure. We don't have exception support (as far as I know) but we've got setjmp/longjmp support, so we could just bail into a top-level error handler? Or we could have an error handler somewhere which can stick a flag in the EEPROM so that a subsequent incarnation can detect the problem, and then either hang or reset the chip. I don't think there are any really good options, but almost anything is better than just ignoring the problem and carrying on regardless.

Since there isn't any way to alert the user of a problem, and not a thing that the user can do about the problem, it still seems like the best approach is to understand the hardware and the architecture, and write code that respects those limits.

Obviously, this means that silly stuff like

const int size = 50;
long array[size][size][8];

is to be avoided.
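
To put a number on it: assuming a 4-byte long (as on AVR) and the 2K of RAM mentioned later in this thread, the arithmetic can be checked at compile time. The names below are made up for illustration, and int32_t stands in for the AVR long so the figures come out the same on a desktop compiler, where long may be 8 bytes.

```cpp
#include <stddef.h>
#include <stdint.h>

// Stand-in for a 4-byte AVR 'long' (assumption: 'long' is 4 bytes on the target).
typedef int32_t avr_long;

const int kSize = 50;   // called 'size' in the post above

// Bytes needed by: long array[size][size][8];
constexpr size_t kArrayBytes = sizeof(avr_long) * kSize * kSize * 8;

// Assumed amount of RAM on the board (2K).
constexpr size_t kSramBytes = 2048;

static_assert(kArrayBytes == 80000, "50 * 50 * 8 elements of 4 bytes each");
static_assert(kArrayBytes > kSramBytes, "the array alone is far larger than RAM");
```

Under those assumptions the array needs 80,000 bytes, roughly 39 times the RAM available.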

While that's true, non-trivial programs can easily approach the memory limit without using poor coding practices, and there doesn't seem to be any simple way for a developer to know when they are straying into dangerous territory. I am new to this forum but even in the brief time I've been here there have been several threads where developers were advised to reduce memory consumption and see whether that makes problems 'go away'. This is a very inefficient way to deal with the problem. A better approach would be to provide some way for developers to detect whether they are near or at the limit of available memory. It seems clear to me that this is a recurring issue which has caught a lot of people out and will continue to do so.

That is one way to look at the problem. The other is that the people who seem to be caught unaware really are unaware of how much (or how little) memory is available, and how that memory is used. They see a message that the sketch is using x bytes out of y bytes, see that x is considerably less than y, and assume that they are not bumping up against memory issues.

That, in my opinion, is not understanding the hardware and the architecture.

Even the compiler is constrained in how much it can do to help. It knows how much data is stored in SRAM. If the amount of data that the sketch tries to store in SRAM exceeds the size of SRAM, the compiler should certainly be able to detect that.

What it can't detect is how much of the heap will be used for function calls. Is that function with 27 arguments going to be called recursively, or is it unlikely to ever be called, because it will only be called under circumstances that are never expected to occur?

How many functions are called and how many arguments there are for those functions determines how much stack space is used. Local variables in those functions determine how much heap space will be needed for that function.

Since the order of the function calls is dynamic, the compiler cannot tell whether there will be a problem at run time.

So, adding additional code to detect a situation, when there is nothing to be done if that situation occurs, would simply make the situation more likely to occur.

Local variables in functions consume stack space, not heap, unless they are explicitly allocated with malloc.

My bad.

I don't agree that there's nothing to be done. I've suggested a couple of ways that the problem could be recorded so that the developer can tell it has happened.

I hope you will agree that it is very important not to run out of memory. Poor programmers may use more memory than necessary, but even good programmers following best practice can easily be at risk of running out of memory. It is also difficult for a developer to predict exactly how much memory a given sketch will use and also very difficult to tell whether the sketch has, in fact, run out of memory at any point.

Suggesting that all developers should become better programmers and thereby use less memory does not really address the issue.

One common thing to do is paint the RAM with a known pattern then check the pattern periodically or just after you’ve exercised most program paths as part of a test suite.

Of course this won’t prevent a crash but it can be used while developing to keep an eye on how things are going.


Rob
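
The painting approach can be sketched roughly as follows, using a simulated region so it runs on a desktop compiler. The names (PAINT, region, paint_region, simulate_stack_growth, untouched_bytes) and the 128-byte size are made up for illustration. On a board the painted region would be the free RAM between the end of the heap and the current stack pointer.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

const uint8_t PAINT = 0xC5;         // arbitrary fill pattern
const size_t REGION_SIZE = 128;     // hypothetical amount of free RAM
uint8_t region[REGION_SIZE];        // stand-in for the free RAM below the stack

// Fill the whole region with the pattern (done once, early in startup).
void paint_region() {
  memset(region, PAINT, REGION_SIZE);
}

// Simulate the stack growing down from the top of the region by 'depth' bytes.
void simulate_stack_growth(size_t depth) {
  if (depth > REGION_SIZE) depth = REGION_SIZE;
  memset(region + REGION_SIZE - depth, 0x00, depth);
}

// Count the bytes from the bottom that still hold the pattern.
// This is the smallest amount of free space seen since painting.
size_t untouched_bytes() {
  size_t n = 0;
  while (n < REGION_SIZE && region[n] == PAINT) n++;
  return n;
}
```

Because overwritten bytes stay overwritten, the count reflects peak stack use rather than current use, which is the "lowest availability" figure asked for at the start of the thread.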

PeterH: I've suggested a couple of ways that the problem could be recorded so that the developer can tell it has happened.

What good would a notification after the fact be? If you are going to check stuff, I would think that the reason for additional code/checking would be to enable the program to do something useful about the lack of memory. I can't really see that there is anything that the program can do about insufficient resources.

I seem to recall this came up a little while back.

Really, there is nothing that can be done on the architecture. Within the constraints of the underlying instruction set, if you call a function you cause a few things to happen. One is that the stack space is consumed by the return address, and the other is that auto variables (local variables) are also allocated stack space. So for a moment forgetting about dynamic memory allocation, the processor has no way of knowing if a function call is going to hit the "top end" of the statically allocated variables (and if you use dynamic memory, the dynamically allocated variables AKA the heap).

Some sort of check when doing a malloc doesn't solve the problem, because the problem can rear its head a moment later when you do a function call.

You could try allocating a "safety region" of X bytes, but when you only start with 2K bytes, the amount you have left over for this purpose would be small.

Even then, what do you do? Stop working? Well it will probably do that anyway of its own accord. Do a "blue screen of death"? Hardly. Put a message out the serial port? What if the port is connected to a motor?

Plus, the memory (program and RAM) you consume implementing such a scheme may itself push the program over the edge. In other words, the debugging stuff might actually cause the problem it is trying to solve.

PeterH: Is there any practical way to find out how much stack is available now, and the lowest stack availability (i.e. when stack use was at its peak) during the current execution?

This seems like the sort of thing that would be very easy to do within the runtime library and much more difficult to do outside it.

No, because no runtime library calls are involved in invoking a function.

I really think the best you can do is good design. For example, you leave yourself a margin. Put things into program memory (eg. string constants) where possible. Be aware of how much your variables take. Be aware of the likely depth of function calls. Be aware that local variables consume stack space, and go easy on their use. For example, none of this:

void foo ()
  {
  char buf [500];

  ...

  }

If you use dynamic memory be aware of fragmentation issues. Make sure you free things no longer needed.

PeterH: It is also difficult for a developer to predict exactly how much memory a given sketch will use and also very difficult to tell whether the sketch has, in fact, run out of memory at any point.

The "free memory" function, or whatever it is called, used judiciously during development, should give you an idea of how much you have used up.

PaulS: What good would a notification after the fact be?

There are several questions open right now along the lines of: my system is doing something weird; has it run out of memory? There should be an easy way for developers to answer the question, but at the moment the only answer we seem to have is "I don't know; try reducing the amount of memory you're using and see whether that makes any difference".

Even if it was as crude as providing a function that could be called to find out how much stack space was free at the time of the call, that would be a start. If I called it where I suspected there was a problem and it came back with 500 bytes free, then that tells me to look for other causes. But if it says 50 bytes free, I know I'm in dangerous territory.

But I think there are less crude things to be done. Using setjmp/longjmp it would be possible to unwind the whole stack and return execution to a user-defined error handler. The system would be in an undefined state and would need to be restarted, but a carefully coded error handler would still be capable of doing something to tell the user it had been called, whether it's flashing an LED or writing a flag to EEPROM where the next incarnation can find it.

I don't think a perfect solution is feasible, but anything would be better than what we have now.
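
A rough sketch of the setjmp/longjmp idea, using only the standard setjmp.h calls. Everything else is hypothetical: check_stack, deep_work, run_guarded, and the 64-byte margin are invented for illustration, and the free-byte figure is passed in as a parameter rather than measured so that the example runs on a desktop compiler.

```cpp
#include <setjmp.h>

static jmp_buf recovery_point;               // top-level recovery point
static volatile bool fault_recorded = false;

// Hypothetical check: bail out if free stack has dropped below the margin.
void check_stack(int free_bytes, int margin) {
  if (free_bytes < margin) {
    longjmp(recovery_point, 1);              // abandon the current call chain
  }
}

// Stand-in for code somewhere deep in the sketch.
void deep_work(int free_bytes) {
  check_stack(free_bytes, 64);
  // normal work would continue here
}

// Returns true if the work completed, false if the error handler ran.
bool run_guarded(int free_bytes) {
  if (setjmp(recovery_point) != 0) {
    // Arrived here via longjmp: record the fault (flash an LED, set an EEPROM flag).
    fault_recorded = true;
    return false;
  }
  deep_work(free_bytes);
  return true;
}
```

The check still has to be called explicitly at suspect points, so this catches only the overruns that happen to pass through a check.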

PeterH: Even if it was as crude as providing a function that could be called to find out how much stack space was free at the time of the call, that would be a start. If I called it where I suspected there was a problem and it came back with 500 bytes free, then that tells me to look for other causes. But if it says 50 bytes free, I know I'm in dangerous territory.

That much exists today, as Johnwasser referred to slightly obliquely in reply #1:

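// Returns the number of bytes currently free between the heap and the stack.
extern int __bss_end;   // linker symbol: end of statically allocated data
extern int *__brkval;   // current end of the heap, or 0 if no heap memory has been allocated

int freemem()
{
  int free_memory;      // a local variable, so its address is near the top of the stack

  if ((int)__brkval == 0)
    free_memory = ((int)&free_memory) - ((int)&__bss_end);   // heap not used yet
  else
    free_memory = ((int)&free_memory) - ((int)__brkval);     // heap in use
  return free_memory;
}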
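
The original question also asked for the lowest availability during the run. One possible approach, assuming readings are taken at a few suspect points, is to keep the minimum of those readings. The names below (note_free_memory, memory_is_tight, lowest_free) are hypothetical. On a board the reading would come from freemem() above, but here it is passed in so the sketch runs anywhere.

```cpp
#include <limits.h>

static int lowest_free = INT_MAX;   // smallest reading seen so far

// Record one free-memory reading, e.g. note_free_memory(freemem()) on a board.
// Returns the lowest reading recorded so far.
int note_free_memory(int free_now) {
  if (free_now < lowest_free) lowest_free = free_now;
  return lowest_free;
}

// True when the lowest reading has dropped below the chosen safety margin.
bool memory_is_tight(int margin) {
  return lowest_free < margin;
}
```

The result is only as good as the sampling points: stack use inside an interrupt handler, or in a function that never takes a reading, will not show up.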