Topic: Out-of-memory check
Ontario

I have seen a few problems recently that are, or look to be, situations where the application has simply run out of memory and clobbered itself.

I would like to propose an out-of-memory check that could be built into the core wiring code.  What I have in mind is for the timer0 interrupt handler to include a couple of instructions that test whether the stack pointer (SP) and __brkval have crossed.  If they have, control would branch to a function that blinks the pin 13 LED in a characteristic pattern.
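
A minimal sketch of the comparison itself, written as plain C so it can be tried on a desktop compiler. The helper name `stack_heap_crossed` and the `oom_halt` call in the comment are hypothetical, not part of the Arduino core. It assumes avr-libc's layout: the heap grows up from `__heap_start`, `__brkval` stays 0 until the first `malloc()`, and the stack grows down.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Arduino core. On an AVR target the
 * arguments would come from the SP register, avr-libc's __brkval, and the
 * linker symbol __heap_start. */
static bool stack_heap_crossed(uintptr_t sp, uintptr_t brkval,
                               uintptr_t heap_start)
{
    /* Assumption: __brkval is still 0 before the first malloc(), so fall
     * back to the start of the heap region in that case. */
    uintptr_t heap_top = (brkval != 0) ? brkval : heap_start;

    /* The stack grows down and the heap grows up, so they have met or
     * crossed once the stack pointer is at or below the top of the heap. */
    return sp <= heap_top;
}

/* Inside the timer0 overflow handler, the call might look like this
 * (oom_halt is a made-up name for the blink-and-halt routine):
 *
 *   extern char *__brkval;
 *   extern char __heap_start;
 *   if (stack_heap_crossed(SP, (uintptr_t)__brkval, (uintptr_t)&__heap_start))
 *       oom_halt();
 */
```

The test is only a compare and a branch, which keeps the cost per interrupt small, but it only sees the state at the instant the handler runs.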

From the user's point of view, an OOM condition that persists for more than about 1,000 microseconds (roughly one timer0 overflow period) would halt the program and produce a diagnostic blink pattern, akin to a PC POST code.  Maybe this could be generalized into a lightweight monitor that handles other catastrophic bugs more gracefully, in a way that beginners could deal with -- though I can't think what else would qualify right now.

At some point I will have a crack at implementing this -- I don't think it would be very difficult.  My question is: what do folks think about the idea?  Has it been tried before and shown to be a waste of time?  My worry is that the timer0 interrupt doesn't run all that often, and there are a million and one ways a program could bork itself long before the timer had a chance to see and respond to the situation.

nr Bundaberg, Australia

I've got a simple monitor that runs in the background on the watchdog timer and I had thought of adding this memory test, but it only runs every 16 ms and, as you say, a lot of shit can hit the fan in that time.

______
Rob
 

Rob Gray aka the GRAYnomad www.robgray.com

Global Moderator
Dallas


An effective strategy for detecting this kind of failure is to fill the void between the heap and the stack with a known value (e.g. Microsoft uses 0xCC; I prefer to use a histogram of the target application).  If the first byte past __brkval (on the stack side) is not the expected value, the stack and heap have crossed.  The problem with checking only that one byte is that the stack sometimes grows in chunks, essentially skipping sections of memory.  To compensate, you might require several bytes past __brkval to hold the expected value.  This, unfortunately, takes memory away from the application.
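
A rough sketch of the fill-and-probe idea, run against a simulated RAM image so it works on a desktop compiler. `paint_gap`, `guard_intact`, `FILL_BYTE` and `GUARD_BYTES` are names made up for illustration; on a real AVR the offsets would come from `__brkval` and SP rather than being passed in, and the guard is assumed to sit just past the top of the heap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_BYTE   0xCCu  /* known value painted into the unused gap */
#define GUARD_BYTES 4u     /* hypothetical: bytes that must stay intact */

/* Paint the unused region [heap_top, stack_ptr) of a simulated RAM image. */
static void paint_gap(uint8_t *ram, size_t heap_top, size_t stack_ptr)
{
    if (stack_ptr > heap_top)
        memset(ram + heap_top, FILL_BYTE, stack_ptr - heap_top);
}

/* True if the GUARD_BYTES bytes just past the heap still hold the fill
 * value. Checking several bytes catches a stack that grew in a chunk and
 * skipped over the first byte. */
static bool guard_intact(const uint8_t *ram, size_t heap_top, size_t ram_size)
{
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (heap_top + i >= ram_size || ram[heap_top + i] != FILL_BYTE)
            return false;
    }
    return true;
}
```

A larger `GUARD_BYTES` makes a skipped write less likely to go unnoticed, at the price of RAM the application can no longer use.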

The problem with periodic probes is that they are postmortem.  By the time the failure has been detected, the offending code is very likely not running.  I suspect a much more effective strategy would be to modify malloc (and its ilk) to check for a failure.
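
To illustrate checking at allocation time instead of probing later, here is a toy bump allocator over a fixed pool. It is not avr-libc's malloc; `checked_alloc`, `pool`, `brk_offset` and `oom_count` are invented for the sketch, and `stack_limit` stands in for wherever the stack currently ends. The request is refused and the failure recorded at the moment it happens, while the offending caller is still on the stack.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 64u           /* stand-in for the RAM shared by heap and stack */

static uint8_t  pool[POOL_SIZE];
static size_t   brk_offset = 0; /* plays the role of __brkval */
static unsigned oom_count  = 0; /* hypothetical hook: counts refused requests */

/* Bump allocator that refuses any request that would push the break past
 * stack_limit. No alignment handling, which is tolerable on an 8-bit AVR
 * but not in general. */
static void *checked_alloc(size_t size, size_t stack_limit)
{
    if (stack_limit > POOL_SIZE)
        stack_limit = POOL_SIZE;

    /* Written as a subtraction so the test cannot overflow. */
    if (brk_offset > stack_limit || size > stack_limit - brk_offset) {
        oom_count++;            /* a real build might halt or call a handler */
        return NULL;
    }

    void *p = &pool[brk_offset];
    brk_offset += size;
    return p;
}
```

This only covers heap growth; a stack that grows down into the heap still needs one of the other checks.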

Finally, providing default behaviour (e.g. blinking an LED) is a good idea, but to be widely accepted there needs to be a way to override it.  For example, someone controlling a large motor may want to turn the motor off if a failure occurs.
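
One way the override could work is a replaceable handler pointer, sketched below in plain C. All names here (`set_oom_handler`, `report_oom`, `default_oom_handler`, `motor_off_handler`) are hypothetical, and the two counters merely stand in for the LED and the motor. A weak symbol via GCC's `__attribute__((weak))` would be another option.

```c
#include <stddef.h>

typedef void (*oom_handler_t)(void);

static int led_blinks = 0;  /* stand-in for blinking the pin 13 LED */
static int motor_on   = 1;  /* stand-in for the large-motor example */

/* Default behaviour: blink the LED. */
static void default_oom_handler(void) { led_blinks++; }

/* Example user override: shut the motor off instead. */
static void motor_off_handler(void) { motor_on = 0; }

static oom_handler_t oom_handler = default_oom_handler;

/* Install a custom handler; passing NULL restores the default. */
static void set_oom_handler(oom_handler_t h)
{
    oom_handler = (h != NULL) ? h : default_oom_handler;
}

/* Called by whichever check detects the failure. */
static void report_oom(void) { oom_handler(); }
```

A sketch that drives a motor would call `set_oom_handler(motor_off_handler)` once in its setup code; everyone else gets the blink by default.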

0

This is kind of reminiscent of some Corewars programs I wrote a long time ago...

Phoenix, Arizona USA

Quote: This is kind of reminiscent of some Corewars programs I wrote a long time ago...

LOL - now you're really showing our age...

