String corrupts the heap and crashes

This example, which crashes before printing "Done", demonstrates a serious bug in String.

#include <MemoryFree.h>
void setup() {
  Serial.begin(9600);
}
void loop() {
  int n = 0;
  while(n <=100) {
    int u = 20*n;
    int v = 30*n;
    String str = String(n) + ", " + String(u) + ", " + String(v) + ", " + String(freeMemory());
    Serial.println(str);
    delay(500);
    n++;
  }
  Serial.println("Done");  
}

The output looks like this. The value of freeMemory() in the last column jumps wildly and even goes negative, which indicates a corrupt heap.

0, 0, 0, 1740
1, 20, 30, 1738
2, 40, 60, 13048
3, 60, 90, 24358
4, 80, 120, -29870
5, 100, 150, -17536
6, 120, 180, -3921
7, 140, 210, 10462
8, 160, 240, 23054
9, 180, 270, -29122
-----crash and restart here---

The bug is from a larger program here http://arduino.cc/forum/index.php/topic,115451.0.html

It would be interesting to modify the String library to print a message whenever a constructor or destructor is called. That code calls the constructor a lot, including the copy constructor. It would be nice to know that everything created is properly deleted.
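A minimal sketch of that kind of instrumentation, written against a hypothetical stand-in class rather than the real String source (TracedString and its counters are assumptions made for illustration, not part of the Arduino core). It counts every constructor and destructor call so the two totals can be compared:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for String: owns a heap buffer and counts lifetimes.
class TracedString {
public:
  static long constructed;  // every constructor call, including copies
  static long destroyed;    // every destructor call

  TracedString(const char* s = "") : buf_(dup(s)) { ++constructed; }
  TracedString(const TracedString& other) : buf_(dup(other.c_str())) { ++constructed; }

  TracedString& operator=(const TracedString& other) {
    if (this != &other) {
      char* fresh = dup(other.c_str());  // copy first, then release the old buffer
      std::free(buf_);
      buf_ = fresh;
    }
    return *this;
  }

  ~TracedString() {
    std::free(buf_);
    ++destroyed;
  }

  const char* c_str() const { return buf_ ? buf_ : ""; }

  // Objects currently alive; returns to 0 when every object was destroyed.
  static long live() { return constructed - destroyed; }

private:
  // Heap copy of a C string; returns a null pointer if malloc fails.
  static char* dup(const char* s) {
    std::size_t n = std::strlen(s) + 1;
    char* p = static_cast<char*>(std::malloc(n));
    if (p) std::memcpy(p, s, n);
    return p;
  }

  char* buf_;
};

long TracedString::constructed = 0;
long TracedString::destroyed = 0;
```

On a board, the counters could be printed with Serial.println() at the end of each pass through loop(); a live() value that keeps growing would point at a missing destructor call. Note that balanced counts only show that objects are paired up; they say nothing about whether the allocator underneath is behaving.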

The bug is really ugly since the String calls complete successfully but the heap has been destroyed. The crash often happens in some other part of the program much later.

I looked at String and thought about modifying it with traps like PaulS suggested but soon realized there are lots more things to check.

I decided finding bugs in String is above my pay grade.

Besides, I was trained to never use the heap/malloc in embedded systems. Most coding standards for critical aerospace systems http://www2.research.att.com/~bs/JSF-AV-rules.pdf and motor vehicle coding standards like MISRA have statements like this:

MISRA C++ rule 18-4-1, dynamic heap memory allocation cannot be used.

I looked through the source code a bit and I think the answer is to modify the changeBuffer method to allocate more than what's needed at that point in time. A common desire for new programmers is to read a stream and keep adding one extra character to a String. If the changeBuffer method added something like 10 or 20 extra bytes each time, there would be much less fragmentation.

On the other hand, it uses realloc() -- how much of this padding does realloc() do?
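The padding idea can be sketched as follows. This is not the actual changeBuffer code; paddedCapacity and countReallocs are hypothetical helpers, and the chunk size of 16 is an arbitrary assumption. The sketch only shows how rounding the requested capacity up to a chunk boundary cuts the number of reallocations when a string grows one character at a time. It does not address any bug in the allocator itself.

```cpp
#include <cstddef>

// Hypothetical helper: round a requested capacity up to the next multiple
// of `chunk`, so one-character appends reallocate only once per chunk.
std::size_t paddedCapacity(std::size_t needed, std::size_t chunk = 16) {
  if (chunk == 0) return needed;  // no padding requested
  return ((needed + chunk - 1) / chunk) * chunk;
}

// Simulate growing a string one character at a time up to finalLength and
// count how often the buffer would have to be resized.
unsigned countReallocs(std::size_t finalLength, std::size_t chunk) {
  std::size_t capacity = 0;
  unsigned reallocs = 0;
  for (std::size_t len = 1; len <= finalLength; ++len) {
    if (len > capacity) {
      capacity = paddedCapacity(len, chunk);
      ++reallocs;
    }
  }
  return reallocs;
}
```

With these definitions, countReallocs(100, 1) gives 100 resizes (one per character), while countReallocs(100, 16) gives 7.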

Paul Stoffregen made a post a while back about patches he made to the String library to fix these problems that were never adopted by the 'team'. Perhaps he would be willing to share those changes?

In my opinion shot-in-dark things like allocate 10 or 20 extra bytes are the wrong way to fix bugs like this.

You then stand the chance of making it still happen, but less often.

There are lots of cases where this type of fix was used in safety-critical systems with horrible outcomes. You then trust the system because it seems to work, but the bug is still there.

Paul usually tries to get the Arduino group to accept his fixes. I hope the Arduino group would apply a fix if Paul submitted it.

This is directly related to the problem with free, as discussed at some length in this thread:

http://arduino.cc/forum/index.php/topic,95914

Reported as a bug here:

http://code.google.com/p/arduino/issues/detail?id=857

Still has status of "open"

Adding the "free fix" to your code, fat16lib, it runs to completion without crashing:

0, 0, 0, 1707
1, 20, 30, 1704
2, 40, 60, 1704
3, 60, 90, 1704
4, 80, 120, 1703
5, 100, 150, 1701
6, 120, 180, 1701
7, 140, 210, 1701
8, 160, 240, 1701
9, 180, 270, 1701
10, 200, 300, 1698
...
88, 1760, 2640, 1695
89, 1780, 2670, 1695
90, 1800, 2700, 1695
91, 1820, 2730, 1695
92, 1840, 2760, 1695
93, 1860, 2790, 1695
94, 1880, 2820, 1695
95, 1900, 2850, 1695
96, 1920, 2880, 1695
97, 1940, 2910, 1695
98, 1960, 2940, 1695
99, 1980, 2970, 1695
100, 2000, 3000, 1692
Done

MISRA C++ rule 18-4-1, dynamic heap memory allocation cannot be used.

MISRA does not allow any heap work at all, but many "safety" systems allow it as long as it's only done in the setup phase.

From the JSF document

Allocation/deallocation from/to the free store (heap) shall not occur after initialization.

Which I think is a better approach.


Rob

Now you see why malloc/heap is not used in serious embedded projects. This sort of thing happens over and over.

I agree with Graynomad, one could consider allocation of memory at start-up as in the JSF standard.

I never use dynamic memory in my libraries or applications, but I waste lots of time chasing bugs reported in my software that turn out to be due to dynamic memory.

In the future I simply won't look for bugs if an application uses dynamic memory. Well, maybe I will look for obvious stuff.

This was good for a laugh, failure of free, wow.

I have more respect for the decision of the real embedded system pros when they forbid use of dynamic memory in standards for critical systems.

However as String is supplied as part of the core library, it should work, as should the supporting libraries: malloc/realloc/free.

There is no excuse for them not to.

Sure, we know about fragmentation, but an egregious bug is something else.

Paul Stoffregen made a post a while back about patches he made to the String library to fix these problems that were never adopted by the 'team'. Perhaps he would be willing to share those changes?

The original contribution I tried to make is still on Arduino's issue tracker. It was a complete rewrite of String which I created for Teensyduino. It had the fixes to the memory allocator. I didn't want anyone using a Teensy board to suffer a buggy String library.

Unfortunately, the Arduino Team didn't use all of my code. They made many changes, some small, but some that really hurt efficiency. They also didn't use my fixes for the memory allocator, which is probably why this crashes on Arduino. They ignored the special compiler options, despite numerous messages I wrote on the developer mail list to explain what a tremendous improvement they provide.

If you want my latest code, it's available as a free download from my website. Here's the exact link:

http://www.pjrc.com/teensy/td_download.html

Just run the installer, and then grab the files from hardware/teensy/cores/teensy. You don't need to buy a Teensy board..... but of course I would prefer if you do. Sales of Teensy are what funds all my work and the many contributions I (try to) make back to Arduino.

I've been running the example code on a Teensy board for the last 30 minutes without any problems. It's reached "Done" many times. I changed the delay to only 5ms, so it runs the entire test quickly. This bug absolutely does not happen with my code in Teensyduino.

The installer patches your Arduino IDE, but I'm very careful to never change behavior for non-Teensy boards. The modified compiler settings and other stuff are only used when you upload to a Teensy board. The source code for those changes is installed to a "src" directory within your arduino directory.

Sure, we know about fragmentation, but an egregious bug is something else.

Indeed. I completely rewrote String in Fall/Winter 2010. While developing this, I used lots of code to log all malloc/realloc/free usage. I spent MANY long hours carefully analyzing and tweaking so realloc would be used to best effect (depends on a couple compiler options, which they never accepted into Compiler.java). I found and fixed these terrible memory allocator bugs, almost TWO YEARS AGO.

They simply didn't want to use my code as-is. They decided to accept it piecewise, making changes. Some parts, like the special compiler option to elide constructors (saves you from all sorts of memory fragmentation by avoiding unnecessary copies) were never used. That's really a shame, and one that really sours my attitude about contributing to Arduino, because I spent so many long hours carefully studying disassembly of the generated constructor/destructor code and very long logs of every malloc/realloc/free operation for so many test cases.

They never used the memory allocator fix (which was more-or-less just code I lifted from a newer version of avr-libc). That's probably why this code crashes on Arduino. The result is a bad combination of inefficient runtime performance, which only hastens the inevitable crash due to the allocator bugs. It's really very sad, when I put so much work into avoiding those problems.

I'm still very unhappy at how poorly String turned out for all non-Teensy users. I tried to contribute a good String implementation back to all Arduino users. Really, I tried. I wrote lots of messages explaining the issues. Much discussion was held. In particular, Alan Burlison was very unhappy with me and my code (he had planned to do a String rewrite, but never did). Much heated discussion occurred. Nobody seemed to appreciate my effort. The entire process of trying to contribute this back to Arduino was incredibly difficult and painful. The code sat unused for many months (but of course I shipped it for Teensy boards). When it finally was partially used, all my suggestions were pretty much ignored.

I fixed all these problems, and if you use a Teensy board you'll get my code as it should be. If you have a Teensy board, please try running any problematic String examples. I'm sure you'll see it works quite well (unless you run out of memory, in which case you get empty strings but never a crash).

There's just nothing I can do about Arduino's buggy code. I really, really did try. Sorry.

As I was saying above, Paul, I guessed the problem wasn't in your library, and my test proved that.

Thanks for the extra work you have done, pity it wasn't incorporated.

All I can suggest is that anyone who wants String to work properly (to say nothing of malloc/free/realloc which are used, amongst other things by the STL) add a comment to the bug report linked above, indicating that you believe this fix should be expedited into the soonest possible release of the IDE.

To me, it's amazing that there is an identified bug with what seems to be a documented fix, and yet the fix has neither been accepted nor have the reasons for rejecting it been explained, so that others don't make the same fix.

Here's the bug report:

http://code.google.com/p/arduino/issues/detail?id=857

You may want to comment there about your belief that it should be urgently fixed. Maybe it'll move to "implemented" in version 1.0.2.

I added a comment to issue 857, linking back to the old issue where I posted the String stuff (after it had been on my website and discussed publicly for months on the Arduino developer mail list).

While posting a "please fix this" comment might do some good, posting an "I installed the malloc.c and/or java patch and tested sketch XYZ and my results were [insert results]" is FAR more useful.

Admittedly, testing the java patch is difficult, because you need a working JDK, ant (plus lots of other stuff if using Windows) to recompile the Arduino IDE. There are instructions, however.....

http://code.google.com/p/arduino/wiki/BuildingArduino

But testing the malloc.c file is simply a matter of copying the file to the right directory within your arduino-1.0.1 directory. Even if you're a novice Arduino user with only just enough coding skill to blink an LED and use String, you can certainly contribute by merely copying a file to the correct location and writing up a detailed report of what problems it did & didn't fix (or if it caused any new troubles).

Relatively few people contribute actual code and bug fixes to Arduino, but also very few people contribute by actually testing and writing up a detailed report on the few contributions and fixes that do exist. Even if you're not a developer capable of fixing bugs, just testing proposed fixes and writing up results is a great way to contribute to the project.

The bug report has a link to this thread:

http://arduino.cc/forum/index.php/topic,95914

On page 3 of that thread I posted the fixed code for 'free', showed how to temporarily incorporate it in the current IDE, and showed that it fixed the problem.

I just thought I'd point out (now that I'm back from vacation and have a real keyboard) that there is a significant logistical problem here. The Arduino team doesn't "support" the C compiler or library provided with the Arduino environment. They don't even "pick and choose" particular versions of gcc/avr-libc/etc. What is included in avr-gcc is a compiler package that someone else has put together (WinAVR for Windows, CrossPack for Mac.) And yes, it's a fairly old set of utilities, mainly because it's been a while since the WinAVR folks have put together a new package (CrossPack was based on WinAVR, so the Mac and PC versions of the Arduino code are supposed to match.) It's also been a while since a newer C compiler hasn't had "known bugs" that were critically unacceptable to Arduino (understand that Arduino is one of the few "consumers" of the gcc C++ support for AVRs.)

Applying a patch to avr-libc's free() means that the Arduino team would need to maintain their own version of the gcc tools distribution. Updating the tools in the absence of a pre-packaged set (i.e. upgrading avr-libc without getting a new WinAVR) is nearly as bad (equivalent to maintaining a WinAVR package.) All of this is HARD. Even upgrading to a new WinAVR (which is supposed to come out "real soon now") is likely to be a testing and compatibility nightmare. (For instance, the gcc folk have decided that the way that avr-libc implements most of the "pgmspace" library (for storing constants in flash memory instead of RAM) is "wrong and has never been supported." There's a replacement scheme, but it IS likely to be different.)

It's a tough problem. It's very common for large projects (commercial and otherwise) to be several years behind "current" on compiler tools, just because upgrading the compiler is such a pain in the neck...

(And I noticed that the "official" avr-libc response to the bug in free was a complete rewrite of the memory allocator. I can't tell you how little confidence that inspires :( )

OK, well my work-around solution of adding "myfree" to somewhere that is likely to be included in a normal compile (eg. wiring.c) and then defining free to be myfree, might be the least intrusive.

This avoids any issues of what other changes might be made to the toolchain, it simply replaces one function with another that fixes a particular bug. And the linker should keep the code sizes the same, by using the one that is actually invoked.

I don't particularly like it for various reasons, but it is better than having a version of free that basically can't be used with confidence.

Oh come on Bill, this really isn't so hard. If it were, how would you explain String working for the last 2 years on Teensy?

The free() bug explains another problem I have searched for in SD.h.

Often users of the SD.h library open and close files frequently. Some of the SD.h examples open a file, write a line, and close the file each time through loop. These programs sometimes crash in strange ways.

SD.h allocates and frees memory for the SdFat file object and file name. I got really close but never suspected that free could be the cause.

I did a search for free in 1.0.1 and found these lines in the core and libraries:

D:\arduino-1.0.1\hardware\arduino\cores\arduino\new.cpp
10: free(ptr);
D:\arduino-1.0.1\hardware\arduino\cores\arduino\WString.cpp
104: free(buffer);
121: if (buffer) free(buffer);
172: free(buffer);
D:\arduino-1.0.1\libraries\Firmata\Firmata.cpp
391: free(tmpArray);
D:\arduino-1.0.1\libraries\SD\File.cpp
134: free(_file);

Looks like String and SD.h are the big problems. I suspect new() and Firmata are used very little.