Understanding the evil String()

I have a working wood boiler Aquastat, and I am now trying to add Ethernet to it so it will report the temperature of the boiler to my website. I have it working, regulating and logging the temperatures as I want. However...

I was reading about the evils of the Arduino String, and I'm hoping I haven't caused myself problems down the road.

Every 5 minutes my code calls the httpRequest() function. This function uses String() extensively to convert numbers and concatenate strings.

My questions are these:
Because this is called from a function, does the Arduino release the memory after the function is finished?

Is there a better way to build the request string, or should I not be using String() at all?

Thank you

Below is the function:

//My variables
float mtemp = 0.0;
long mintemp = 165;
long maxtemp = 180;
long overheat = 200;
float maxhitemp = 0.0;
float maxlowtemp = 250.0;
long overheatcnt = 0;
boolean overheatswtch = false;
boolean fanstate = true;

void httpRequest() {
  client.stop();

  String PutString;
  PutString = "GET /boilerset.cfm";
  PutString = PutString + "?curtemp=" + String(mtemp);
  PutString = PutString + "&mintemp=" + String(maxlowtemp);
  PutString = PutString + "&maxtemp=" + String(maxhitemp);
  PutString = PutString + "&overcnt=" + String(overheatcnt);
  PutString = PutString + "&fanstate=" + String(fanstate);
  PutString = PutString + "&isoverheat=" + String(overheatswtch);
  PutString = PutString + "&SetFanOn=" + String(mintemp);
  PutString = PutString + "&SetFanOff=" + String(maxtemp);
  PutString = PutString + "&SetOverHeat=" + String(overheat);
  PutString = PutString + " HTTP/1.1";

  Serial.println(PutString);
  
  if (client.connect(server, 80)) {
    client.println(PutString);
    client.println("Host: www.mydomain.com");
    client.println("User-Agent: arduino-ethernet");
    client.println("Connection: close");
    client.println();
  }
  lastConnectionTime = millis();
}

Most of us recommend not using Strings at all. They cause memory problems and program crashes on Arduinos.

The sprintf() function is easy to use and far more versatile than using Strings to format numbers.

An exception is that by default sprintf() does not support floating point numbers on Arduino. You can enable that option at cost of program memory, or use dtostrf() to format them instead (or not use floats at all, the preferred option for sensor data).
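As a rough sketch of the "no floats" option, assuming the temperature is kept as a whole number of tenths of a degree: plain integer formats are enough, and those work with the default sprintf(). The helper name formatTenths and the buffer size are made up for illustration. snprintf() is used instead of sprintf() so the buffer size is always checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: writes a temperature held as tenths of a degree
// (1725 means 172.5) into buf as text, using only integer formats.
// Returns the number of characters written, or -1 if buf is too small.
int formatTenths(char *buf, std::size_t size, long tenths)
{
  assert(buf != nullptr && size > 0);
  const char *sign = (tenths < 0) ? "-" : "";
  long magnitude = std::labs(tenths);
  int n = std::snprintf(buf, size, "%s%ld.%ld", sign, magnitude / 10, magnitude % 10);
  if (n < 0 || static_cast<std::size_t>(n) >= size) {
    return -1;   // output failed or would have been truncated
  }
  return n;
}
```

A call such as formatTenths(buf, sizeof buf, 1725) leaves "172.5" in buf.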

There's no reason to assemble one big String or one big char array from sprintf. Just print the pieces:

 if (client.connect(server, 80)) {
    client.print( F("GET /boilerset.cfm") );
    client.print( F("?curtemp=") );
    client.print( mtemp );
    client.print( F("&mintemp=") );
    client.print( maxlowtemp );
    client.print( F("&maxtemp=") );
    client.print( maxhitemp );
       ...
    client.println( F(" HTTP/1.1") );
    client.println("Host: www.mydomain.com");

This saves a mess o' RAM, both for the double-quoted strings (F macro places them in FLASH) and for the sprintf buffer. These prints all go into the ethernet RAM buffer that is already allocated!

Since you are printing the same thing to two output streams (client and Serial), make a routine that takes a Stream argument:

void printGET( Stream & s )
{
  s.print( F("GET /boilerset.cfm") );
  s.print( F("?curtemp=") );
  s.print( mtemp );
  s.print( F("&mintemp=") );
  s.print( maxlowtemp );
  s.print( F("&maxtemp=") );
  s.print( maxhitemp );
       ...
  s.println( F(" HTTP/1.1") );
}

... and call it twice:

  printGET( Serial ); // debug output

  if (client.connect(server, 80)) {
    printGET( client );  // network output
    client.println("Host: www.mydomain.com");

Easy-peasy! You'll save 1600 bytes of program space, too.

Cheers,
/dev

Which processor are you using?
Some of this depends on which processor you are using.
The AVR is the most limited in terms of resources and has the most issues with its limited RAM.
Also, none of the other processors used on Arduino boards require jumping through hoops to keep your string constants from eating up RAM.

--- bill

jmanatee:
My questions are this:
Because this is called from a function does the arduino release the memory after the function is finished?

No, not always. It depends on how the memory is allocated inside the function. You can have a function malloc() an area of memory and then pass it back to the caller so that the caller can do something with that memory area. This is a good way to make a memory leak - C provides you all the tools necessary to shoot yourself in the foot, metaphorically.
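To make the difference concrete, here is a small desktop C++ sketch (not Arduino code; TextBuffer, makeLeakyCopy, releaseCopy and the liveBlocks counter are all invented for the illustration). An object that frees its memory in its destructor, which is what the String class does with its heap buffer, gives the memory back when the function returns. A raw malloc() pointer handed back to the caller stays allocated until somebody calls free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

static int liveBlocks = 0;   // heap blocks currently allocated by this sketch

// Owns a heap copy of some text and frees it in the destructor.
class TextBuffer {
public:
  explicit TextBuffer(const char *text)
    : data_(static_cast<char *>(std::malloc(std::strlen(text) + 1)))
  {
    assert(data_ != nullptr);
    std::strcpy(data_, text);
    ++liveBlocks;
  }
  ~TextBuffer() { std::free(data_); --liveBlocks; }
  TextBuffer(const TextBuffer &) = delete;
  TextBuffer &operator=(const TextBuffer &) = delete;
  const char *c_str() const { return data_; }
private:
  char *data_;
};

// The local object is destroyed on return, so its memory is released.
std::size_t useLocalBuffer()
{
  TextBuffer request("GET /boilerset.cfm");
  return std::strlen(request.c_str());
}

// Returns malloc()ed memory: the caller must release it or it leaks.
char *makeLeakyCopy(const char *text)
{
  char *copy = static_cast<char *>(std::malloc(std::strlen(text) + 1));
  assert(copy != nullptr);
  std::strcpy(copy, text);
  ++liveBlocks;
  return copy;
}

// The matching release call that the caller must not forget.
void releaseCopy(char *copy)
{
  std::free(copy);
  --liveBlocks;
}
```

After useLocalBuffer() returns, liveBlocks is back to 0. After makeLeakyCopy() it stays at 1 until releaseCopy() is called.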

I have a working wood boiler

What are you boiling the wood for?

PaulS:
What are you boiling the wood for?

So you can bend it. When it dries (while held bent) it stays.

/dev:
There's no reason to assemble one big String or one big char array from sprintf. Just print the pieces:

  if (client.connect(server, 80)) {
    client.print( F("GET /boilerset.cfm") );
    client.print( F("?curtemp=") );
    client.print( mtemp );
    client.print( F("&mintemp=") );
    client.print( maxlowtemp );
    client.print( F("&maxtemp=") );
    client.print( maxhitemp );
       ...
    client.println( F(" HTTP/1.1") );
    client.println("Host: www.mydomain.com");

This saves a mess o' RAM, both for the double-quoted strings (F macro places them in FLASH) and for the sprintf buffer. These prints all go into the ethernet RAM buffer that is already allocated!

I like this solution the best.

Can you explain what F(" ") does?
For example:
client.print( F("?curtemp=") );
client.print( mtemp );

/dev:
Since you are printing the same thing to two output streams (client and Serial), make a routine that takes a Stream argument:

I don't understand this statement... As far as I knew, I was only printing to client. Can you elaborate?

Thank you for all the responses.

jremington:
Most of us recommend to not use Strings at all. They cause memory problems and program crashes with Arduinos.

The sprintf() function is easy to use and far more versatile than using Strings to format numbers.

An exception is that by default sprintf() does not support floating point numbers on Arduino. You can enable that option at cost of program memory, or use dtostrf() to format them instead (or not use floats at all, the preferred option for sensor data).

The String type used to have a memory leak bug that would crash programs. That was properly fixed, AFAIK.
It is memory hungry compared to char*, but String is more convenient for text processing. I believe it's unsafe to use in ISRs too.

bperrybap:
Which processor are you using?

It's an Arduino Uno R3 (ATmega328).

PaulS:
What are you boiling the wood for?

GoForSmoke:
So you can bend it. When it dries (while held bent) it stays.

Sorry, a wood-fired boiler for heating a home.

but String is more convenient for text processing.

For those too lazy to use the same underlying string functions that the String class uses.

it's unsafe to use in ISRs too.

Even some handwaving "proof" of this ought to keep this thread going for another week. There is NOTHING that the String class does that shouldn't be done in an ISR. If you are going to piss away resources using it anyway, that is.

/dev:
There's no reason to assemble one big String or one big char array from sprintf. Just print the pieces:

Karma for you, you saved me a lot of typing!

Really jmanatee, serial data goes out 1 char at a time. The time in Arduino terms is slow to very, very slow.

Your print statements put chars into the Serial output buffer (normally 64 chars), from where they get sent as time goes by. You don't need to assemble a whole message to go out instantly; it can't. Your program can add chars to the buffer fast enough to fill it, and then it has to wait until there's room (Serial does this for you: your code is made to wait, and you don't have to write anything extra unless that wait is undesirable).

An Uno or other ATmega328-based board only has 2048 bytes of RAM for everything, including the stack.
When your code calls a function, the passed data, the function's local variables, and the return address ALL go on the stack and get taken off when the function is done. If a function calls a function that calls a function, etc., the stack can grow quite a bit. That should be all the dynamic RAM use you need... emphasis on need. The stack starts at the top of memory and 'grows' down; it shrinks back up as the added data goes out of scope.
All of your global variables (yours and every library's) sit at the bottom of memory, and the heap, which holds dynamically allocated space, starts just above them and grows up. If the stack and heap cross each other, one will be overwritten with unpredictable results. The best case is a crash, so you notice; the worst is a bug that you spend a long time trying to fix.

With only 2048 bytes RAM, do you want to use C++ Strings instead of C strings?
Every time you add a char to a C++ String it copies itself (costs cpu cycles) then deletes the old copy leaving a hole in the heap. Add another char, it copies itself bigger again, unable to use the too small hole but next time the hole is bigger so the 3rd, 5th, etc, don't push heap top up but 2nd, 4th, etc do.
With C strings you make a char array big enough to hold any final text your code will allow, you keep track of how many chars are in the array (or use a function, strlen(), if speed and code size are not priority) and copy chars to your easily accessible char array string.

It is dead easy to directly get at your own data with C strings, you don't really need to use string.h functions to do good work on C strings. Just make sure to have a final char == 0 to let print know where the end of your C string is and never write past the end of the array.
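A minimal sketch of that approach, assuming a 32-byte buffer is big enough for the final text (FixedText, clearText, appendText and TEXT_CAPACITY are hypothetical names, not from any library). The length is tracked alongside the array, the terminating 0 is always kept, and an append that would run past the end is refused.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t TEXT_CAPACITY = 32;   // array size, including the final 0

// Hypothetical fixed-size text buffer: the size is chosen once, up front.
struct FixedText {
  char text[TEXT_CAPACITY];
  std::size_t length;                   // characters stored, not counting the 0
};

void clearText(FixedText &t)
{
  t.length = 0;
  t.text[0] = '\0';
}

// Appends src if it fits entirely; otherwise leaves t unchanged.
// Returns true on success.
bool appendText(FixedText &t, const char *src)
{
  assert(src != nullptr);
  std::size_t n = 0;
  while (src[n] != '\0') {              // count the chars to add
    ++n;
  }
  if (t.length + n >= TEXT_CAPACITY) {
    return false;                       // would write past the end of the array
  }
  for (std::size_t i = 0; i < n; ++i) {
    t.text[t.length + i] = src[i];
  }
  t.length += n;
  t.text[t.length] = '\0';              // keep the final char == 0
  return true;
}
```

Nothing here touches the heap, so the RAM cost is fixed and known at compile time.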

I have code that reads serial data and tries to match words in the stream to key words stored in AVR flash (where the program is kept is also for stored constant data) one char at a time as each arrives. The function 'walks' through the stored table fast enough to keep up with 25000 chars per second serial.
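This is not that code, only a small desktop C++ sketch of the same idea under some assumptions: words end at a space or line ending, and the keyword table (KEYWORDS, made up here) has fewer than 32 entries. On an AVR the table would be placed in flash with PROGMEM and read with pgm_read_byte(); here it is an ordinary array. Each incoming char rules out the keywords that no longer match, so nothing is buffered.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical keyword table; on an AVR this would live in flash.
static const char *const KEYWORDS[] = { "fan", "fanoff", "temp" };
static const std::size_t KEYWORD_COUNT = sizeof(KEYWORDS) / sizeof(KEYWORDS[0]);

struct WordMatcher {
  std::size_t pos;     // chars of the current word seen so far
  unsigned alive;      // bit i set: KEYWORDS[i] still matches the word so far
};

void resetMatcher(WordMatcher &m)
{
  m.pos = 0;
  m.alive = (1u << KEYWORD_COUNT) - 1u;
}

// Feed one char. Returns the keyword index when a delimiter ends a word
// that equals a keyword, otherwise -1.
int feedChar(WordMatcher &m, char c)
{
  const bool delimiter = (c == ' ' || c == '\r' || c == '\n');
  int found = -1;
  for (std::size_t i = 0; i < KEYWORD_COUNT; ++i) {
    const unsigned bit = 1u << i;
    if ((m.alive & bit) == 0) {
      continue;                                // already ruled out
    }
    const char expected = KEYWORDS[i][m.pos];  // in range while keyword i is alive
    if (delimiter) {
      if (expected == '\0' && m.pos > 0) {
        found = static_cast<int>(i);           // the whole keyword matched
      }
    } else if (expected == '\0' || expected != c) {
      m.alive &= ~bit;                         // keyword too short, or mismatch
    }
  }
  if (delimiter) {
    resetMatcher(m);                           // start fresh for the next word
  } else {
    ++m.pos;
  }
  return found;
}
```

Call resetMatcher() once, then feedChar() for every char as it arrives. The work per char is one comparison per keyword still in play.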

I tell you this to try and impress on you the single char at a time (really, 10 single bits per char; start bit, 8 data bits, and stop bit makes 10) and the fast/slow nature of Arduino/Serial. It should change how you view sending and receiving serial data. You can do many small things in between even fast serial.
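The arithmetic behind that, as a small sketch. It assumes 250000 baud (which gives the 25000 chars per second mentioned above at 10 bits per char) and the Uno's 16 MHz clock; the function names are made up.

```cpp
#include <cstdint>

// Bits on the wire per char: start bit + 8 data bits + stop bit.
constexpr std::uint32_t BITS_PER_CHAR = 10;

constexpr std::uint32_t charsPerSecond(std::uint32_t baud)
{
  return baud / BITS_PER_CHAR;
}

// CPU clock cycles that pass while one char is on the wire (rounded down).
constexpr std::uint32_t cyclesPerChar(std::uint32_t cpuHz, std::uint32_t baud)
{
  return cpuHz / charsPerSecond(baud);
}

static_assert(charsPerSecond(250000) == 25000, "250000 baud is 25000 chars/s");
static_assert(cyclesPerChar(16000000, 250000) == 640, "640 cycles per char at 16 MHz");
static_assert(cyclesPerChar(16000000, 9600) == 16666, "over 16000 cycles per char at 9600 baud");
```

So even at that fast rate there are about 640 clock cycles between chars, and at 9600 baud there are over 16000.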

What /dev wrote is a very good start. You can pack your code to fit smaller, cheaper devices with just that, rather than having to go the bigger hardware route. If there will be many end-products, it becomes a matter of code written once for all of them. Whether it will be a $5 thing or a $10 thing really adds up.

What you practice is what you learn well, what may serve your future efforts. A bit harder the first few times but then you're not burning oil which is much easier than wood or coal.

PaulS:
There is NOTHING that the String class does that shouldn't be done in an ISR.

Actually, malloc and free do not disable interrupts. You could be in the middle of a malloc or free when an interrupt occurs. If the ISR tries to do a malloc or free, the linked list pointers for the memory chunks in the heap could be corrupted. Bad juju. Related Arduino Issue.

jmanatee:
Can you explain what the F(" ") does
IE:
client.print( F("?curtemp=") );
client.print( mtemp );

The UNO has two kinds of memory: RAM (program execution can change it whenever) and FLASH (it cannot be changed). When you use a "double-quoted" string constant by itself, your program copies the characters from FLASH (a "secret" part of the uploaded executable) to RAM, and the characters are accessed from that RAM location. It seems silly, but these string constants use modifiable RAM.

When you wrap a "double-quoted" string in the F macro, as in F("double-quoted"), the string constant is guaranteed to be accessed directly from FLASH memory. It will not be copied to RAM. The "double-quoted" strings are actually character arrays, of this type:

    char []

This is a RAM address, while the F macro is like a function that returns a

    const __FlashStringHelper *

This is a FLASH memory address of the constant character array. This forced the Arduino authors to provide two Serial.print functions. You can see them here.

NOTE: This is called overloading. There are lots of print routines: one takes a char (e.g., 'A'), one takes an int (e.g., 7), one takes a float (e.g., 2.59)... They're all called "print", they just take different types of arguments.
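Here is a desktop C++ sketch of how a pointer type can pick the overload. It is only an analogy: FlashText, FLASH_TEXT and Printer are invented stand-ins for __FlashStringHelper, F() and the Arduino print routines, and on a PC there is no separate flash, so only the type changes.

```cpp
#include <cassert>
#include <string>

// Stand-in for __FlashStringHelper: a type that is never defined. Only
// pointers to it exist, and only so the compiler can tell them apart.
class FlashText;

// Stand-in for F(): relabels the literal's address with the tag type.
#define FLASH_TEXT(literal) (reinterpret_cast<const FlashText *>(literal))

struct Printer {
  std::string log;   // records which overload ran, followed by the text

  void print(const char *s)
  {
    log += "ram:";
    log += s;
  }
  void print(const FlashText *s)
  {
    log += "flash:";
    log += reinterpret_cast<const char *>(s);   // AVR code would read flash here
  }
  void print(int n)
  {
    log += "int:";
    log += std::to_string(n);
  }
};
```

With a Printer p, the calls p.print("a"), p.print(FLASH_TEXT("b")) and p.print(7) each land in a different routine, chosen purely by the argument type.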

Cheers,
/dev

Actually, malloc and free do not disable interrupts. You could be in the middle of a malloc or free when an interrupt occurs.

In an ISR, interrupts are disabled, so it is very unlikely that an interrupt will happen while malloc() or free() are doing their thing.

Now, the problem COULD be that some code is trying to append to a String when an interrupt happens, and in the ISR, it tries to append to a String, too. I can see how that could create an issue, IF malloc() and free() do not disable interrupts.

I find it hard to believe, though, that malloc() and free() do not disable interrupts for the critical sections. I'll admit that I did not bother trying to verify this.

PaulS:
the problem COULD be that some code is trying to append to a String when an interrupt happens, and in the ISR, it tries to append to a String

That's what I tried to say... :slight_smile:

PaulS:
I find it hard to believe, though, that malloc() and free() do not disable interrupts for the critical sections.

Believe it. Some OS's require that they are thread-safe, but not necessarily interrupt safe.

One fundamental reason malloc/free don't disable interrupts: determinism and predictability. malloc is actually a search for a memory block, and that is non-deterministic. You can never predict or guarantee how long it takes to find a block, so you can't be sure how long interrupts would be disabled.

This would be a problem when called from the foreground (i.e., non-interrupt context), because ISRs would be blocked during the search. This would also be a problem when called from a background ISR (i.e., interrupt context), because it would add an unknown search time to the ISR doing the malloc.

jmanatee:
Quote from /dev: Since you are printing the same thing to two output streams (client and Serial), make a routine that takes a Stream argument:

I don't understand this statement.... As far as I knew I was only printing to client, Can you elaborate?

Your original function has this code:

  Serial.println(PutString);
  
  if (client.connect(server, 80)) {
    client.println(PutString);

You are printing the same thing (PutString) to two different places (client and Serial). If you still want to do that, make a subroutine that takes client or Serial as an argument and prints the pieces to it. Then call that subroutine twice, once for each destination:

  printGET( Serial ); // print pieces to Serial
  
  if (client.connect(server, 80)) {
    printGET( client ); // print the same pieces to client

client and Serial are different types of variables, but their types have a common base class, the Print class (uppercase matters). You can treat them both like a Print object inside the printGET routine (see subroutine in reply #2).

However, you can only use the methods defined in the base class Print. printGET only needs the print and println methods (overloaded on char [], char, int, const __FlashStringHelper *, float, etc.). C++ allows you to "pretend" they're both the same type of variable.
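The same idea can be tried on a PC, where std::ostream plays the part of the common base class. This is an analogy under that assumption, not the Arduino code: the temperature values are made-up examples and only the first three fields are printed.

```cpp
#include <cassert>
#include <iomanip>
#include <iostream>
#include <sstream>

float mtemp = 172.5f;        // example values, not from a real boiler
float maxlowtemp = 165.0f;
float maxhitemp = 181.25f;

// Takes the base class by reference, so any derived stream will do.
void printGET(std::ostream &s)
{
  s << std::fixed << std::setprecision(2);   // two decimals, like Arduino's print(float)
  s << "GET /boilerset.cfm";
  s << "?curtemp=" << mtemp;
  s << "&mintemp=" << maxlowtemp;
  s << "&maxtemp=" << maxhitemp;
  s << " HTTP/1.1\r\n";
}
```

printGET(std::cout) writes the line to the console, and printGET(out) with a std::ostringstream out writes exactly the same characters into a string. One routine, two destinations.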

Cheers,
/dev

One fundamental reason malloc/free don't disable interrupts: determinism and predictability. malloc is actually a search for a memory block, and that is non-deterministic. You can never predict or guarantee how long it takes to find a block, so you can't be sure how long interrupts would be disabled.

I haven't looked at how malloc() maintains information about allocated memory, but I'd expect a doubly linked list. I'd expect that walking the list, determining if the amount of memory between two nodes was sufficient, or not, would be pretty quick. Maybe not.

Not playing with dynamic allocation is a good foundation for solid, fast, dependable code.

I've used pre-allocated buffers and pointers into those with my code controlling the lot to make solid code. I didn't let unvetted data in either, chased too many data bugs down for that. My last big data collector/analyzer ran literally months at a time before being interrupted by power out or user command. It didn't crash mainly because it didn't stray from the constraints built in.

BTW, over on the AVR-LibC site they don't much like dynamic memory use on AVR's either.