I don’t think that this is a frequently answered question, but my apologies if it turns out to be a canard.
I’m testing some code here on an unbadged “Uno compatible”. I fully accept that it might not be manufactured to the same standard as a genuine board, and that the problem I’ll describe might not occur on better hardware.
The program I’m running outputs, among other things, the maximum time that the main loop takes to run and the SRAM usage. The latter uses code from Measuring Memory Usage | Memories of an Arduino | Adafruit Learning System. In general both values are stable:
SERIAL.print(F(" < "));
SERIAL.print(avgLoop);
SERIAL.print(F(" < "));
SERIAL.print(busyLoop);
SERIAL.print(F(" 0x"));
SERIAL.print(busyLoop, HEX);
SERIAL.print(F(" uSec"));
SERIAL.print(F(", SRAM: "));
int zz = freeMemory();
SERIAL.print(zz /* freeMemory() */ );
SERIAL.print(F(" 0x"));
SERIAL.print(zz, HEX);
Output normally looks like this:
Loop: 216 < 237 < 18728 0x4928 uSec, SRAM: 2285 0x8ED
In addition, the test version of the program outputs cryptic messages showing the transitions of a state machine, where those messages do not use F() so they come from RAM rather than being pulled directly from Flash.
What I’m seeing is that after roughly a week of operation (10 million lines of logged output), some of the state machine messages are overwritten with garbage. busyLoop, which is rarely recomputed, is corrupted, and freeMemory() starts returning an implausible value, but avgLoop, which is updated regularly, is not corrupted. Specimen output looks like this, with ... placeholders for stuff I believe to be irrelevant:
D0b R->L R0 ... Loop: 216 < 237 < 18728 0x4928 uSec, SRAM: 2285 0x8ED ...
...
D0b �a>L�bYU bYU ... Loop: 216 < 244 < 621043328 0x25045E80 uSec, SRAM: 8960 0x2300
Once the suspect numbers and messages appear, they remain unchanged. The program starts over with sensible output if the reset button is pressed, which suggests that Flash is not corrupted. Since both decimal and hex values are affected, it’s probably not some weird output library problem.
Has anybody looked at protecting areas of memory with canaries or checksums that could warn that something’s wrong and trigger a restart?
If not, what symbolic information is available defining the extent of rarely-changing messages etc. in RAM, so that I know what should be checksummed?
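To make the idea concrete, here is a minimal sketch of the kind of protection I have in mind. It is only an illustration, not something taken from the Arduino core: Guarded, CANARY, seal() and intact() are names I made up, and I picked Fletcher-16 as the checksum purely because it is cheap on an 8-bit part. It sidesteps the symbol question by gathering the rarely-changing data into one struct, so that the extent to be checksummed is known at compile time.

```cpp
#include <stdint.h>
#include <stddef.h>

// All names below are hypothetical, chosen for illustration only.
static const uint32_t CANARY = 0xA55AC33CUL;

// Rarely-changing data grouped into one struct, bracketed by canary words.
// Members are ordered so that no padding falls inside the checksummed range.
struct Guarded {
    uint32_t head;         // canary in front of the payload
    uint32_t busyLoop;     // rarely recomputed maximum loop time
    char     stateMsg[8];  // a state machine message held in RAM
    uint32_t tail;         // canary behind the payload
    uint16_t sum;          // checksum over every byte before this member
};

// Fletcher-16 over a byte range.
static uint16_t fletcher16(const uint8_t *p, size_t n) {
    uint16_t a = 0, b = 0;
    while (n--) {
        a = (uint16_t)((a + *p++) % 255);
        b = (uint16_t)((b + a) % 255);
    }
    return (uint16_t)((b << 8) | a);
}

// Checksum of everything in the struct up to, but not including, sum.
static uint16_t guardedSum(const Guarded &g) {
    return fletcher16((const uint8_t *)&g, offsetof(Guarded, sum));
}

// Call after every legitimate write to the payload.
static void seal(Guarded &g) {
    g.head = CANARY;
    g.tail = CANARY;
    g.sum  = guardedSum(g);
}

// True while both canaries and the checksum still match.
static bool intact(const Guarded &g) {
    return g.head == CANARY && g.tail == CANARY && g.sum == guardedSum(g);
}
```

The intent would be to call seal() whenever busyLoop or a message is deliberately updated, and intact() once per pass through loop(). If intact() fails, I assume the simplest restart is to enable the watchdog with a short timeout (wdt_enable(WDTO_15MS) from <avr/wdt.h>) and spin until it fires, although I have not tried that on this board.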