Large strings and concatenation with strcat and F()

I'm using an ESP as a WiFi AT slave for another Arduino.
When I send data to be transmitted I need to build large strings (~300-400 bytes) and count their length.
This involves a lot of concatenation.

I want to avoid the bad practices that come with strings so that I don't waste memory.

How can I combine the benefits of a fixed sized string buffer (to avoid memory fragmentation) with something like the F("abc") macro?

The form of the string is similar to "ABC=1.24&DEF=9&GHI=24.5" .... etc

More specifically this is the code I want to make more efficient:

SendESP8266CommandSet(
      "GET /grafana.php?DHT11_Temperature=" + String(DHT.temperature)
    + F("&DHT11_Humidity=") + DHT.humidity
    + F("&BMP085_Temperature=") + BMP085.readTemperature()
    + F("&BMP085_Pressure=") + BMP085.readPressure()
    + F("&TSL2591_IR=") + (TSL2591_IR)
    + F("&TSL2591_Full=") + (TSL2591_Full)
    + F("&TSL2591_Visible=") + (TSL2591_Visible)
    + F("&TSL2591_Lux=") + (TSL2591_Lux)
    + F("&TSL2561_Visible=") + (TSL2561_Visible)
    + F("&TSL2561_IR=") + (TSL2561_IR)
    + F("&TSL2561_Lux=") + (TSL2561_Lux)
    + F("&MQ7=") + (analogRead(3)) //Carbon Monoxide
    + F("&MQ135=") + (analogRead(2)) //Air Quality (CO, Ammonia, Benzene, Alcohol, smoke)
    + F("&MQ5=") + (analogRead(1)) //Natural gas, LPG
    + F("&MQ135_Acetone=") + MQ135_Acetone
    + F("&MQ135_CO=") + MQ135_CO
    + F("&MQ135_Alcohol=") + MQ135_Alcohol
    + F("&MQ135_CO2=") + MQ135_CO2
    + F("&MQ135_Tolueno=") + MQ135_Tolueno
    + F("&MQ135_NH4=") + MQ135_NH4
    );

strcat will not work with F(""), but does with regular strings.

All of the data has to be sent in ONE packet to the ESP.

Look at strcat_P() for strings in PROGMEM.

A link to above - strcat_P

You may well also have to create an array of pointers to the target strings, stored in PROGMEM.

More reading - avr-libc: Data in Program Space
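To make the suggestion above concrete, here is a minimal sketch of strcat_P() combined with a pointer table in PROGMEM. The names key_temp, key_hum, keys and appendKey are made up for illustration, and the #else branch is only an assumed host fallback so the code can also be compiled off-target.

```cpp
#include <string.h>
#ifdef __AVR__
#include <avr/pgmspace.h>
#else
// Host fallback (assumption): flash and RAM share one address space here.
#define PROGMEM
#define strcat_P strcat
#define pgm_read_ptr(addr) ((const void *)*(addr))
#endif

// Key strings stored in flash (illustrative names).
const char key_temp[] PROGMEM = "DHT11_Temperature=";
const char key_hum[]  PROGMEM = "&DHT11_Humidity=";

// Table of pointers to the keys; the table itself also lives in flash.
const char *const keys[] PROGMEM = { key_temp, key_hum };

// Append key number `index` to the NUL-terminated string in `dest`.
// The caller must make sure `dest` has enough room left.
void appendKey(char *dest, unsigned char index) {
  // The pointer has to be fetched from flash before it can be used.
  const char *key = (const char *)pgm_read_ptr(&keys[index]);
  strcat_P(dest, key);
}
```

The pointer table only pays off if the keys are accessed by index in a loop; for a handful of fixed keys, passing each PROGMEM array directly to strcat_P() is enough.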

Thanks for the suggestion, I'm working through this now:

At some point I need to concatenate floats (like DHT.temperature), which means they have to be converted to strings, which AFAIK will mean temporary allocations in RAM, so I'm back to square one.
But I think that's unavoidable anyway, as the values change and PROGMEM is supposed to be static?

So this is what I have so far for one key/value pair in the string that will be sent over WiFi.
Apart from avoiding memory fragmentation, I'm not sure I've saved anything over using F() when the value has to be processed in RAM anyway.

    //Allocate these buffers once at program start
    char send_buffer[512]; //No specific size for now
    const char DHT11_Temperature_Key[] PROGMEM = "DHT11_Temperature="; //Stored in PROGMEM (must be const)
    char temp_float_buffer[12]; //Stored in RAM; room for sign, digits, '.', 4 decimals and the terminating NUL

    //Run this every time you are creating the string
    strcpy_P(send_buffer, DHT11_Temperature_Key); //Copy the key string from PROGMEM into the main RAM buffer (pgm_read_word is only needed when reading from a pointer table)
    dtostrf(DHT.temperature,6,4,temp_float_buffer); //Convert the float value into the pre-allocated char buffer
    strcat(send_buffer, temp_float_buffer); //Append the value to the end of the main RAM buffer
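One way to repeat this for every key/value pair without overrunning the buffer is a small helper that tracks the write position. This is only a sketch: appendPair and send_len are hypothetical names, the range check is an assumption about sensible sensor values, and the #else branch is a host stand-in for the AVR-only functions.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef __AVR__
#include <avr/pgmspace.h>
#else
// Host fallbacks (assumption): only needed to try the sketch off-target.
#define PROGMEM
#define strlen_P strlen
#define memcpy_P memcpy
static char *dtostrf(double v, signed char width, unsigned char prec, char *out) {
  sprintf(out, "%*.*f", width, prec, v);
  return out;
}
#endif

char send_buffer[512];   // allocated once, reused for every request
size_t send_len = 0;     // number of characters currently in send_buffer

const char DHT11_Temperature_Key[] PROGMEM = "DHT11_Temperature=";

// Append a flash-resident key followed by a float value.
// Returns false (and leaves the buffer untouched) if the pair would not fit
// or the value is outside the range the scratch buffer can hold.
bool appendPair(const char *key_P, float value) {
  if (!(value > -1e9f && value < 1e9f)) return false;  // also rejects NaN
  char num[16];                    // scratch space for the converted number
  dtostrf(value, 1, 2, num);       // minimum width 1, 2 decimals: no padding spaces
  size_t klen = strlen_P(key_P);
  size_t vlen = strlen(num);
  if (send_len + klen + vlen + 1 > sizeof(send_buffer)) return false;
  memcpy_P(send_buffer + send_len, key_P, klen);
  memcpy(send_buffer + send_len + klen, num, vlen + 1);  // copies the NUL too
  send_len += klen + vlen;
  return true;
}
```

Reset send_len to 0 before building each request. After the last pair, send_len is already the total length, so no separate strlen() pass over the buffer is needed.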

As you are using an ESP8266 are you actually short of memory ?

UKHeliBob:
As you are using an ESP8266 are you actually short of memory ?

The ESP8266/ESP-01 is a slave and is only being used for WiFi processing via AT commands.
The master/main logic is running on a 328 based Arduino.
There's legacy reasons for this beyond the scope of my question.

RAM is getting low and I want to prepare before I hit a wall.

Can you keep the constant strings on the ESP, transfer the data to be published to it and do the building of the messages there before publishing it ?

UKHeliBob:
Can you keep the constant strings on the ESP, transfer the data to be published to it and do the building of the messages there before publishing it ?

Unfortunately not with the way the project has been set up (the legacy thing). All business logic has to be on the 328 Arduino.

Do you even need to concatenate strings before transmitting them? Usually it is not necessary because they all end up in the same place in the same order anyway, and asynchronously too.

If you must send this as a single string all at once, there is no way you can avoid using the RAM needed to store the entire string. Have a look at the SendESP8266CommandSet() function to see if the single string can be avoided, otherwise you are already in trouble because using String will require enough free memory to hold the entire text, and that will not show up in the compiler's estimate of memory usage.

All of the data has to be sent in ONE packet to the ESP.

That seems unlikely, for several reasons. (TCP is a stream protocol. It doesn't know about packets. The usual interface to an ESP is a UART, which doesn't know about packets...)

Look for a different function...

(or is SendESP8266CommandSet() your own creation? I can't find it mentioned on the WWW. If so, re-write it.)

aarg:
Do you even need to concatenate strings before transmitting them? Usually it is not necessary because they all end up in the same place in the same order anyway, and asynchronously too.

I have to concatenate in order to calculate the complete string length, which has to be known before the string is sent.
I could send it as several smaller packets, but that ends up with the same number of allocations overall.

david_2018:
If you must send this as a single string all at once, there is no way you can avoid using the RAM needed to store the entire string. Have a look at the SendESP8266CommandSet() function to see if the single string can be avoided, otherwise you are already in trouble because using String will require enough free memory to hold the entire text, and that will not show up in the compiler's estimate of memory usage.

I want to try and avoid sending multiple separate TCP messages for logistical reasons. But in the long run I may have to. This will introduce other issues.
Holding the entire string in RAM was never really the issue. It's that different ways of concatenating can cause a lot of memory fragmentation, while pre-allocating avoids a lot of that. Storing the constituent constants in PROGMEM helps too.

westfw:
That seems unlikely, for several reasons. (TCP is a stream protocol. It doesn't know about packets. The usual interface to an ESP is a UART, which doesn't know about packets...)

Look for a different function...

(or is SendESP8266CommandSet() your own creation? I can't find it mentioned on the WWW. If so, re-write it.)

The reasons why have nothing to do with TCP or UART as you've speculated.
I already have an overload of SendESP8266CommandSet() that uses multiple smaller strings sequentially, but I'd like to avoid using it for reasons that are way outside the scope of my question, which is specifically about concatenation.

Unfortunately not with the way the project has been set up (the legacy thing). All business logic has to be on the 328 Arduino.

You should really change your point of view. Program the ESP with all the business logic and use the Arduino only for things that the ESP can't do. And by the way, this isn't much. If you just need more GPIOs, consider using I2C port expanders (or the 328).

There is no way to concatenate 400 bytes' worth of separate PROGMEM strings without using 400 bytes of RAM.

Well, I suppose you could implement a python-esque "generator" scheme where you pass a bunch of printf-like descriptors and pointers to a "create" function, and then use an access function to read it back a character at a time. I guess fprintf_P() itself is sort-of like that, and you might be able to use its underlying support functions (vfprintf()?)

Write a function (get_str_part) which returns the individual sub-sections of the string by index into a fixed char[] buffer. That buffer needs only be as large as the largest sub-section of the string.

Write a couple of loops...

int len = 0;
for (int i = 0; i < n; i++) {
  get_str_part(buffer, i);
  len += strlen(buffer);
}
// Now you have the length without having to hold the entire string in memory at once

for (int i = 0; i < n; i++) {
  get_str_part(buffer, i);
  write_str(buffer);
}
// Now you have sent the string without having to have the entire string in memory at once

You could also make the get_str_part function return a zero length string as the last element. Then you can loop until you detect that. This saves knowing n in advance.
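As a rough illustration of what get_str_part could look like, here is a sketch under a few assumptions: the Field struct, the sample values and the extra size parameter are not from the post above, even indices return a key and odd indices return the matching value, and an empty string marks the end. The keys sit in RAM here for simplicity; on the 328 they could live in PROGMEM and be copied with strncpy_P().

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical field list; in a real sketch the values would be the stored sensor readings.
struct Field { const char *key; int value; };

Field fields[] = {
  { "GET /grafana.php?MQ7=", 512 },
  { "&MQ135=", 87 },
  { "&MQ5=", 1023 },
};
const int FIELD_COUNT = sizeof(fields) / sizeof(fields[0]);

// Copy sub-section `index` into `buffer` (capacity `size`).
// Even indices yield a key, odd indices the value that follows it.
// Past the last part the buffer is left as an empty string.
void get_str_part(char *buffer, size_t size, int index) {
  if (size == 0) return;
  buffer[0] = '\0';
  if (index < 0 || index >= 2 * FIELD_COUNT) return;
  const Field &f = fields[index / 2];
  if (index % 2 == 0) {
    strncpy(buffer, f.key, size - 1);
    buffer[size - 1] = '\0';            // strncpy does not always terminate
  } else {
    snprintf(buffer, size, "%d", f.value);
  }
}

// First pass: total length, using only a small scratch buffer.
int total_length() {
  char part[32];                        // must hold the longest single part
  int len = 0;
  for (int i = 0; ; i++) {
    get_str_part(part, sizeof(part), i);
    if (part[0] == '\0') break;         // empty part marks the end
    len += (int)strlen(part);
  }
  return len;
}
```

The sending pass is the same loop with the strlen() call replaced by whatever writes the part to the ESP.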

Edit: You can extend this scheme to the ultimate extreme where you don’t have an external buffer at all, and the function just returns a single character. Internally the function needs to remember which sub-string it is currently returning, and which character offset in that substring it needs to return next. When it gets to the end of one substring it advances substrings and resets the character index to zero so the next call returns the first character of the next string.

I have this scheme for sending variable length HTTP requests from a PIC with only 1K of SRAM. Works perfectly.
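The single-character variant described in the edit above might look something like the sketch below. CharGenerator and the parts table are hypothetical names, and the sub-strings are plain RAM literals here; a real version would mix PROGMEM keys with converted readings.

```cpp
#include <stddef.h>

// Hypothetical list of sub-strings that make up the message.
const char *const parts[] = { "ABC=", "1.24", "&DEF=", "9" };
const size_t PART_COUNT = sizeof(parts) / sizeof(parts[0]);

struct CharGenerator {
  size_t part = 0;    // which sub-string is currently being returned
  size_t offset = 0;  // next character within that sub-string

  // Returns the next character of the message, or '\0' once it is exhausted.
  char next() {
    while (part < PART_COUNT) {
      char c = parts[part][offset];
      if (c != '\0') {
        offset++;
        return c;
      }
      part++;         // end of this sub-string: move on to the next one
      offset = 0;
    }
    return '\0';
  }

  // Start again from the first character, e.g. for the sending pass.
  void reset() { part = 0; offset = 0; }
};
```

Counting the length is then one loop over next(), and sending is a second loop after reset(), so no buffer is needed at all.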

Unfortunately not with the way the project has been set up (the legacy thing). All business logic has to be on the 328 Arduino.

Even if you can't change the way the project has been set up, in the sense that business logic occurs on the 328 and communication with the Web uses an 8266, is there no prospect of changing the 328 for a processor with more memory ?

It is very likely that no peripherals (the DHT11 for one) or the interface to them need be changed and the majority, if not all of the existing code would run on the new processor, but it is difficult to know without more details

Having said that I would still favour integrating everything on to say an ESP32

I would use a char array instead of String; there are far fewer problems that way:

  char buff[totalLength() + 1]; //create buffer of sufficient length to hold complete string

  strcpy_P(buff, PSTR("GET /grafana.php?DHT11_Temperature="));
  dtostrf(DHTtemperature, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&DHT11_Humidity="));
  dtostrf(DHThumidity, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&BMP085_Temperature="));
  dtostrf(BMP085readTemperature, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&BMP085_Pressure="));
  dtostrf(BMP085readPressure, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2591_IR="));
  dtostrf((TSL2591_IR), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2591_Full="));
  dtostrf((TSL2591_Full), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2591_Visible="));
  dtostrf((TSL2591_Visible), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2591_Lux="));
  dtostrf((TSL2591_Lux), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2561_Visible="));
  dtostrf((TSL2561_Visible), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2561_IR="));
  dtostrf((TSL2561_IR), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&TSL2561_Lux="));
  dtostrf((TSL2561_Lux), 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ7="));
  itoa(analogRead(3), (buff + strlen(buff)), DEC); //Carbon Monoxide
  strcat_P(buff, PSTR("&MQ135="));
  itoa(analogRead(2), (buff + strlen(buff)), DEC); //Air Quality (CO, Ammonia, Benzene, Alcohol, smoke)
  strcat_P(buff, PSTR("&MQ5="));
  itoa(analogRead(1), (buff + strlen(buff)), DEC); //Natural gas, LPG
  strcat_P(buff, PSTR("&MQ135_Acetone="));
  dtostrf(MQ135_Acetone, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ135_CO="));
  dtostrf(MQ135_CO, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ135_Alcohol="));
  dtostrf(MQ135_Alcohol, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ135_CO2="));
  dtostrf(MQ135_CO2, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ135_Tolueno="));
  dtostrf(MQ135_Tolueno, 4, 2, (buff + strlen(buff)));
  strcat_P(buff, PSTR("&MQ135_NH4="));
  dtostrf(MQ135_NH4, 4, 2, (buff + strlen(buff)));
  Serial.println(buff);

The total length of the string is calculated in a function:

int totalLength() {
  int sum = 280; //start with total length of all fixed text
  char nbuff[15]; //needs to be long enough to hold longest float
  //all floats are converted to text with a minimum length of 4 and 2 decimal places
  sum += strlen(dtostrf(DHTtemperature, 4, 2, nbuff));
  sum += strlen(dtostrf(DHThumidity, 4, 2, nbuff));
  sum += strlen(dtostrf(BMP085readTemperature, 4, 2, nbuff));
  sum += strlen(dtostrf(BMP085readPressure, 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2591_IR), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2591_Full), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2591_Visible), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2591_Lux), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2561_Visible), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2561_IR), 4, 2, nbuff));
  sum += strlen(dtostrf((TSL2561_Lux), 4, 2, nbuff));
  sum += strlen(itoa(analogRead(3), nbuff, DEC));
  sum += strlen(itoa(analogRead(2), nbuff, DEC));
  sum += strlen(itoa(analogRead(1), nbuff, DEC));
  sum += strlen(dtostrf(MQ135_Acetone, 4, 2, nbuff));
  sum += strlen(dtostrf(MQ135_CO, 4, 2, nbuff));
  sum += strlen(dtostrf(MQ135_Alcohol, 4, 2, nbuff));
  sum += strlen(dtostrf(MQ135_CO2, 4, 2, nbuff));
  sum += strlen(dtostrf(MQ135_Tolueno, 4, 2, nbuff));
  sum += strlen(dtostrf(MQ135_NH4, 4, 2, nbuff));
  return sum;
}

It's best, though, to avoid the large buffer and print everything individually. It might be better to use dtostrf() and itoa() instead of letting print() convert the numbers, just in case there is a length difference.

  Serial.print(F("total length of string: "));
  Serial.println(totalLength());
  Serial.print(F("GET /grafana.php?DHT11_Temperature="));
  Serial.print(DHTtemperature, 2);
  Serial.print(F("&DHT11_Humidity="));
  Serial.print(DHThumidity, 2);
  Serial.print(F("&BMP085_Temperature="));
  Serial.print(BMP085readTemperature, 2);
  Serial.print(F("&BMP085_Pressure="));
  Serial.print(BMP085readPressure, 2);
  Serial.print(F("&TSL2591_IR="));
  Serial.print((TSL2591_IR), 2);
  Serial.print(F("&TSL2591_Full="));
  Serial.print((TSL2591_Full), 2);
  Serial.print(F("&TSL2591_Visible="));
  Serial.print((TSL2591_Visible), 2);
  Serial.print(F("&TSL2591_Lux="));
  Serial.print((TSL2591_Lux), 2);
  Serial.print(F("&TSL2561_Visible="));
  Serial.print((TSL2561_Visible), 2);
  Serial.print(F("&TSL2561_IR="));
  Serial.print((TSL2561_IR), 2);
  Serial.print(F("&TSL2561_Lux="));
  Serial.print((TSL2561_Lux), 2);
  Serial.print(F("&MQ7="));
  Serial.print(analogRead(3)); //Carbon Monoxide
  Serial.print(F("&MQ135="));
  Serial.print(analogRead(2)); //Air Quality (CO, Ammonia, Benzene, Alcohol, smoke)
  Serial.print(F("&MQ5="));
  Serial.print(analogRead(1)); //Natural gas, LPG
  Serial.print(F("&MQ135_Acetone="));
  Serial.print(MQ135_Acetone, 2);
  Serial.print(F("&MQ135_CO="));
  Serial.print(MQ135_CO, 2);
  Serial.print(F("&MQ135_Alcohol="));
  Serial.print(MQ135_Alcohol, 2);
  Serial.print(F("&MQ135_CO2="));
  Serial.print(MQ135_CO2, 2);
  Serial.print(F("&MQ135_Tolueno="));
  Serial.print(MQ135_Tolueno, 2);
  Serial.print(F("&MQ135_NH4="));
  Serial.print(MQ135_NH4, 2);
  Serial.println();

Note that the DHT and BMP085 readings have been stored in variables. This is necessary if you want to calculate the length separately from the printing; otherwise the values may change between the two steps and alter the length of the text. You would also need to do this for the readings from the analog inputs.
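One way to guarantee that the counted length and the printed text always agree is to describe the message once and run that description through two different output sinks. The sketch below is only an illustration under assumptions: Readings, PutFn, emitRequest and requestLength are invented names, and only the three analog readings are shown.

```cpp
#include <stdio.h>
#include <string.h>

// Snapshot of the readings, taken once so both passes see identical values.
struct Readings { int mq7; int mq135; int mq5; };

// A sink receives one output character at a time.
typedef void (*PutFn)(char c, void *ctx);

static void putText(const char *s, PutFn put, void *ctx) {
  while (*s) put(*s++, ctx);
}

static void putInt(int v, PutFn put, void *ctx) {
  char num[12];                         // enough for a 32-bit int with sign
  snprintf(num, sizeof(num), "%d", v);
  putText(num, put, ctx);
}

// Single description of the message, used for both counting and sending.
void emitRequest(const Readings &r, PutFn put, void *ctx) {
  putText("GET /grafana.php?MQ7=", put, ctx);
  putInt(r.mq7, put, ctx);
  putText("&MQ135=", put, ctx);
  putInt(r.mq135, put, ctx);
  putText("&MQ5=", put, ctx);
  putInt(r.mq5, put, ctx);
}

// Sink that discards the character and only counts it.
static void countChar(char, void *ctx) { (*(size_t *)ctx)++; }

// First pass: length of the message without storing it anywhere.
size_t requestLength(const Readings &r) {
  size_t n = 0;
  emitRequest(r, countChar, &n);
  return n;
}
```

For the second pass, a sink that forwards each character to Serial.write() would send exactly the text that was counted. Fill the Readings struct from analogRead() once, before either pass.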

noiasca:
You should really change your point of view. Program the ESP with all the business logic and use the Arduino only for things that the ESP can't do. And by the way, this isn't much. If you just need more GPIOs, consider using I2C port expanders (or the 328).

We don't have this opportunity at this point, as mentioned, for legacy reasons.