The HATRED for String objects - "To String, or not to String"

For example:

void loop()
{
  char c[10];
  int n;
  // what exactly happens here?
}

With each iteration of the loop, do the local variables (c and n) get created and recreated over and over again? At the time I did not know, so I simply made c and n global variables (under whatever names I used in my project). That way I knew I was never going to fill the memory with undestroyed pointers and unused memory allocations (just to be on the safe side).

So with each loop, are all the local variables destroyed along with any memory allocation?

In your example the stack is incremented* by 12 (10 plus 2 for the int) to make room for those variables. They are thus uninitialized because they have whatever was on the stack. When loop exits that stack space is reclaimed.

If you make them global the stack doesn't get altered but you have 12 bytes less available for the stack because they have to go somewhere.

The stack and heap share the same piece of memory (RAM) starting at wherever your global variables end. The heap grows upwards and the stack grows downwards from the top of memory. If they happen to collide: trouble!

* decremented really, because the stack grows downwards.

I don't know if there's any time savings, but a global gets allocated once, while making non-static locals or passing a load of parameters to a function happens over and over.

True, a stack-based variable will take a couple of machine cycles to reserve the extra stack. If you only ever need the one copy, a global (or static) variable would suit your needs.
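A minimal sketch of the static option, written as plain C++ so it can be tried on a PC as well. The function name countCalls and the scratch buffer are made up for illustration; the point is only that the static variable is reserved once and keeps its value, while the buffer takes fresh stack space on every call.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts how often it has been called.
unsigned int countCalls()
{
    static unsigned int calls = 0;  // reserved once, keeps its value between calls
    char scratch[10];               // stack space, reserved again on every call

    // Automatic locals hold whatever was on the stack, so clear before use.
    std::memset(scratch, 0, sizeof scratch);

    ++calls;
    return calls;
}
```

Calling countCalls() three times in a row returns 1, 2 and 3, because calls is not re-created on each call the way scratch is.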

Ok, guys. For your reference, let me clear up how memory works :wink: (for those who know already, this is for those who don't.)

There are basically two types of "dynamic" memory: the "stack", and the "heap".

The stack can be thought of like a pile of coins. Coins are put on to the pile in order, and then removed afterwards in reverse order.

Any "local" variables, along with any variables passed to functions, are placed on the stack, used within the function they are defined in, and then removed from the stack again.

The stack starts at the top of the memory space, and grows downwards.

Then there is the "heap". This is more like a mound of coins. Coins can be put into the mound wherever they will fit. If there is no room inside the mound for them, then they are dropped on the top of the mound. Coins can be pulled out from anywhere in the mound. The same goes for variables. These are all "dynamically allocated" variables: anything created with "malloc()" and released with "free()", including any classes or functions which call these internally.

The heap is located at the bottom of the memory area.

The two main issues with the heap are memory leaks and memory fragmentation.

Memory leaks occur when some function places coins into the mound, and then forgets that it put them there, so they never get taken out again. Variables created in the heap must be removed after use, or the heap will just grow and grow and grow.

Memory fragmentation occurs when small variables are removed from the heap to be replaced by larger variables. The space left by the small variable is too small to accept the larger variable, so the larger variable is instead placed on the top of the heap, causing the heap to grow. This most often occurs when working with strings and you want to add something to the end of a string (concatenate) or join two strings together.
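To make that concrete, here is a toy model of the situation in plain C++. It is only an illustration under simple assumptions (first-fit, no splitting or merging of blocks), not how avr-libc's malloc() is actually written, and the names ToyHeap and growAfterConcat are invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// One block in the toy heap: its size in bytes and whether it is in use.
struct Block {
    std::size_t size;
    bool used;
};

struct ToyHeap {
    std::vector<Block> blocks;

    // Bytes between the bottom of the heap and its current top.
    std::size_t top() const {
        std::size_t total = 0;
        for (const Block& b : blocks) total += b.size;
        return total;
    }

    // First fit: reuse the first free hole that is big enough,
    // otherwise extend the heap. Returns the index of the block used.
    std::size_t allocate(std::size_t size) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (!blocks[i].used && blocks[i].size >= size) {
                blocks[i].used = true;
                return i;
            }
        }
        blocks.push_back(Block{size, true});
        return blocks.size() - 1;
    }

    // Mark a block as free again, leaving a hole of its size.
    void release(std::size_t index) { blocks[index].used = false; }
};

// A small string is freed and a larger one allocated in its place.
// Returns the heap top afterwards.
std::size_t growAfterConcat() {
    ToyHeap heap;
    std::size_t small = heap.allocate(4);  // e.g. "abc" plus terminator
    heap.allocate(8);                      // some other object above it
    heap.release(small);                   // leaves a 4-byte hole
    heap.allocate(6);                      // too big for the hole
    return heap.top();
}
```

In this model the heap top ends up at 18 bytes even though only 14 are in use: the 4-byte hole is still there, but nothing larger than 4 bytes can ever go into it.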

As the heap gets more and more fragmented and grows bigger and bigger over time, and as more and more local variables get pushed into the stack, there is a big risk that the heap and the stack will both grow so big that they meet in the middle. When this happens you have problems. Big problems. This is when crashes occur.

You can get "memory de-fragmentation" programs for the PC which effectively re-arrange the heap to remove the holes in it. This makes programs run faster, as they can allocate bigger chunks of memory, and helps to reduce the overall size of the heap. Quite useful. There is nothing (as far as I have seen) like it for micro-controllers - mainly due to the lack of space in the first place :wink:

So, the issue with the String class isn't with the String class itself - it's a very useful class (in the right situations). The issue is with dynamically allocated memory causing fragmentation in the heap, which overflows and crashes into the stack.

So yes, a fix to "free()" will improve things, but on a small micro-controller with very limited RAM space dynamic memory allocation in itself is something to be shunned whenever possible.

Part of the problem IS with the String class itself. When you want to append one character to an existing String, the length of the new String is the length of the old String plus the length of the String to append. That amount of space is allocated, and the old String is copied there, then the String to be appended is tacked on, then the old String's space is freed.

Append another character, and the whole process is repeated, because the String has no room to grow.

On other platforms, extra space is allocated, so that there is some room to grow. Perhaps 10 extra bytes, so you can add 10 characters, before a malloc/copy/free operation is required again.

Well, yes the String class could be written better, but even written better it's still not a good class in the context of the Arduino - even with extra space it is still using dynamic allocation, and thus is bad.

PaulS is quite right, and I think the String library initially allowed for that. However, with only 2 KB of RAM to play with, they decided to eliminate (possibly) 9 extra bytes which may or may not be used. This gives an advantage in some situations and a disadvantage in others.

cjdelphi:
with each iteration of the loop does a local variable get created (c/n) and recreated over and over again?

Yes. Automatic local variables are allocated on the stack and thrown away when the stack is unwound at the end of the call. In general, the more tightly you restrict the scope of the data the less chance there is for dependencies and unintended interactions between different parts of your code. This is why use of global data is best avoided where possible, and use of local data and arguments is generally preferred.

I think the String library initially allowed for that.

Paul Stoffregen's implementation did. The Arduino one removed that.

The option to allocate data in other-than-one-character larger increments needs to be part of the String class.

Typically, the user of the class has an idea of how much data is going to be stored in the String instance. Setting the minimum increment to some small value for small Strings and a larger value for larger Strings (say 5 and 15) would drastically reduce the number of malloc/copy/free operations performed, which would have a big impact on the fragmentation issue. The default minimum increment should, of course, be 1.
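A rough sketch of what a configurable minimum increment could look like, in plain C++. GrowBuffer and reallocationsFor are hypothetical names for this illustration; this is not the Arduino String class and not Paul Stoffregen's implementation, just a counter to show how the increment changes the number of reallocations.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical append-only text buffer with a configurable growth step.
class GrowBuffer {
public:
    // `increment` is the minimum number of characters added per reallocation.
    explicit GrowBuffer(std::size_t increment)
        : increment_(increment ? increment : 1) {}
    ~GrowBuffer() { std::free(data_); }
    GrowBuffer(const GrowBuffer&) = delete;
    GrowBuffer& operator=(const GrowBuffer&) = delete;

    // Append one character. Returns false if memory ran out.
    bool append(char c) {
        if (length_ == capacity_) {
            std::size_t newCapacity = capacity_ + increment_;
            // One extra byte for the terminating '\0'.
            void* p = std::realloc(data_, newCapacity + 1);
            if (p == nullptr) return false;  // old data is still valid
            data_ = static_cast<char*>(p);
            capacity_ = newCapacity;
            ++reallocations_;
        }
        data_[length_++] = c;
        data_[length_] = '\0';
        return true;
    }

    std::size_t reallocations() const { return reallocations_; }
    const char* c_str() const { return data_ ? data_ : ""; }

private:
    char* data_ = nullptr;
    std::size_t length_ = 0;
    std::size_t capacity_ = 0;      // characters that fit, excluding '\0'
    std::size_t increment_;
    std::size_t reallocations_ = 0;
};

// Count the reallocations needed to append `count` characters one at a time.
std::size_t reallocationsFor(std::size_t increment, std::size_t count) {
    GrowBuffer buffer(increment);
    for (std::size_t i = 0; i < count; ++i) {
        if (!buffer.append('x')) break;
    }
    return buffer.reallocations();
}
```

Appending 20 characters one at a time costs 20 reallocations with an increment of 1, 4 with an increment of 5, and 2 with an increment of 10. The price is up to increment minus one unused bytes per buffer, which is the trade-off being argued about here.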

With 2k RAM I prefer globals and keeping control of my code. A few locals don't hurt, but whole buffers as locals in short, often-called functions are something to be aware of, because some day you might be writing code that needs to be fast rather than 'do it this way every time'.

awareness != mandatory practice

I've likened using C++ Strings on Arduino, just because C++ Strings are okay on a PC, to putting a bathtub on a bicycle just because your house has one. I still feel the same. There are a lot of coding practices I've happily used on PCs that I wouldn't think of using on Arduino.

I too like the use of global variables in my Arduino projects. I like that I can define them all at the top of the sketch with comments on what they represent. I mostly limit my use of local variables to things like for statements. I know this goes against the good variable-scoping practice that C/C++ encourages, and that practice does make sense for larger projects where teams of programmers are working on the same application and must not step on each other's code. But with the limited SRAM space and me being the only one writing the code, global variables work better for me, and I haven't heard a good reason why I should avoid them in my Arduino sketches. Sometimes I just have to be an outlaw at heart, I guess. :wink:

Lefty

Instead of removing strings, they just have to make about a 1-line change to the library code in free() that has a bug in it. Then some, at least, of the problems will go away.

So what is this one-line change? Can it be posted so I can fix my library? I'd rather fix things than go hangar flying on the subject. It might eliminate some people's constant String bashing and the frequent misdiagnosis of problems people are having.

The problem is that the fix is not to source that is built during normal sketch compilation, but to the libraries provided pre-compiled in Arduino.

Have a look at this bug report (Google Code Archive)
and this thread: http://arduino.cc/forum/index.php/topic,95914.30.html

Personally, I am not a fan of String, even if some of the problems people run into are due to this bug. Anything that smells of dynamic allocation has no place, in my opinion, on such a memory-constrained platform. The problem is that statically allocated strings and all the string.h stuff are complicated for a beginner. String seems easy, so you can get something going faster than if you have to grok pointers and the in-memory layout of strings and crap just to print hello world. But I think folks do not appreciate the limits of what you can do with String (and with lots of other things like arrays and C strings too) with so little memory, and wind up being forced to learn the details anyhow.

PaulS:
Part of the problem IS with the String class itself. When you want to append one character to an existing String, the length of the new String is the length of the old String plus the length of the String to append. That amount of space is allocated, and the old String is copied there, then the String to be appended is tacked on, then the old String's space is freed.

No, this is not true. String uses realloc(), which (if you'd care to read the implementation in realloc.c) will always attempt an in-situ block expansion. In most cases this results in no pointer change at all and a simple adjustment of the allocated length, usually out into free space in most use cases on a small system like this. A block copy is only required if the original allocation came from the free list and that space is now exhausted, or if a new block has been allocated in front of the one that needs to be expanded.
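As a sketch of the realloc() pattern being described, in plain C++: appendWithRealloc is a hypothetical helper, not code from the String class. Whether the block is extended in place or moved is up to the allocator, so the caller has to use the returned pointer either way.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Append `suffix` to a heap-allocated C string using realloc().
// `text` may be nullptr, in which case a new string is created.
// On success returns the (possibly moved) string. On failure returns
// nullptr and leaves `text` untouched, so the caller still owns it.
char* appendWithRealloc(char* text, const char* suffix)
{
    std::size_t oldLen = text ? std::strlen(text) : 0;
    std::size_t addLen = std::strlen(suffix);

    // realloc() may grow the block where it is or move it elsewhere.
    char* grown = static_cast<char*>(std::realloc(text, oldLen + addLen + 1));
    if (grown == nullptr) return nullptr;

    std::memcpy(grown + oldLen, suffix, addLen + 1);  // copies the '\0' too
    return grown;
}
```

Starting from nullptr, appending "Hello" and then ", world" gives "Hello, world". In real code, keep the old pointer until the call has succeeded, since a failed realloc() does not free the original block.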

majenko:
Memory fragmentation occurs when small variables are removed from the heap to be replaced by larger variables. The space left by the small variable is too small to accept the larger variable, so the larger variable is instead placed on the top of the heap, causing the heap to grow. This most often occurs when working with strings and you want to add something to the end of a string (concatenate) or join two strings together.
[...] dynamic memory allocation in itself is something to be shunned whenever possible.

It's actually quite hard to cause significant memory fragmentation to occur in the avr-libc implementation of the heap. free() will always coalesce blocks above and below the one it's being asked to release to create one larger block in the free list so even a poor String concat implementation that used malloc()+free() instead of realloc() would generally only end up with one block in the free list due to every second free() being coalesced with the previous one and then being big enough to satisfy the next malloc().
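The coalescing step can be pictured with a toy free list in plain C++. This is a simplified model under my own assumptions (an address-ordered vector of offset/size pairs), not the avr-libc source; FreeBlock and releaseBlock are names invented for the sketch.

```cpp
#include <cstddef>
#include <vector>

// A free region of the toy heap: where it starts and how long it is.
struct FreeBlock {
    std::size_t offset;
    std::size_t size;
};

// Insert a released block into an address-ordered free list and merge it
// with any free neighbour that touches it.
void releaseBlock(std::vector<FreeBlock>& freeList,
                  std::size_t offset, std::size_t size)
{
    // Find the insertion point that keeps the list sorted by offset.
    std::size_t i = 0;
    while (i < freeList.size() && freeList[i].offset < offset) ++i;
    freeList.insert(freeList.begin() + static_cast<std::ptrdiff_t>(i),
                    FreeBlock{offset, size});

    // Merge with the block above (higher address) if they are adjacent.
    if (i + 1 < freeList.size() &&
        freeList[i].offset + freeList[i].size == freeList[i + 1].offset) {
        freeList[i].size += freeList[i + 1].size;
        freeList.erase(freeList.begin() + static_cast<std::ptrdiff_t>(i + 1));
    }

    // Merge with the block below (lower address) if they are adjacent.
    if (i > 0 &&
        freeList[i - 1].offset + freeList[i - 1].size == freeList[i].offset) {
        freeList[i - 1].size += freeList[i].size;
        freeList.erase(freeList.begin() + static_cast<std::ptrdiff_t>(i));
    }
}
```

Releasing the regions (0, 4) and (10, 6) leaves two separate free blocks. Releasing (4, 6) afterwards bridges the gap, and the list collapses to a single 16-byte block at offset 0, which can then satisfy a request that none of the three pieces could have handled alone.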

If you understand it, use it. If you don't, don't.

Assuming the String is at the top of the heap in 'most cases'. Otherwise not.

Andy Brown, I think you will find that Nick knows more about the code than you give him credit for. And he isn't given to making unfounded claims.

So is there anything in the external programming that can be done to free the no longer used allocated memory space?

Perhaps enable the WDT, let it time out, and have the sketch start all over? :smiley:

Close to what my magic 8-Ball suggested. :slight_smile: