A Safe Alternative to using Strings in Arduino

This topic is for discussion of the pros and cons of having the SafeString library at all.
If you have a question on how to apply SafeString to a particular case then please post under the topic
Using the SafeString library

There have been a number of posts on the evils of using Strings in Arduino.
Both Sparkfun and Adafruit advise against using Strings because of the heap fragmentation they cause and the penalty of creating lots of small Strings and then copying the results.

The new SafeString library (available via the Arduino Library Manager) solves all these problems and is suitable for complete beginners to use.

  • SafeStrings are easy to debug. SafeStrings provide detailed error messages, including the name of the SafeString in question, to help you get your program running correctly.
  • SafeStrings are safe and robust. SafeStrings never cause reboots and are always in a valid usable state, even if your code passes null pointers or '\0' arguments or exceeds the available capacity of the SafeString.
  • SafeString programs run forever. SafeStrings completely avoid memory fragmentation, which would eventually cause your program to fail, and never make extra copies when passed as arguments.
  • SafeStrings are faster. SafeStrings do not create multiple copies or short lived objects nor do they do unnecessary copying of the data.

There is a detailed tutorial at
https://www.forward.com.au/pfod/ArduinoProgramming/SafeString/index.html
and there are extensive examples included with the library. SafeString also fixes/avoids the errors in the String library.

Almost all the String functions are reproduced in the SafeString library so converting programs to SafeStrings is easy.

Here is a simple first example

#include "SafeString.h"
createSafeString(msgStr, 5); // create an empty string called msgStr with a capacity of 5 chars
void setup() {
  // Open serial communications
  Serial.begin(9600);
  SafeString::setOutput(Serial); // enable error messages and debug() output to be sent to Serial
}
void loop() {
  msgStr = F("A0");
  msgStr += " = ";
  msgStr += analogRead(A0);  // add the reading
  msgStr.debug(F(" After adding analogRead "));  // print debug info
  Serial.println(msgStr);  // output the result
  while (1) {} // stop here
}

SafeStrings have integrated debugging, extensive error checking and (optional) error messages so you can quickly find and fix your problems.

Running the above sketch produces this output

Error: msgStr.concat() needs capacity of 8 for the first 3 chars of the input.
        Input arg was '598'
        msgStr cap:5 len:5 'A0 = '
 After adding analogRead  msgStr cap:5 len:5 'A0 = '
A0 =

As you can see, the error message tells you that you need to increase the capacity of msgStr. Note: the sketch did not reboot. If an addition would exceed the capacity it is just ignored, and an error message is output if setOutput() has been called.
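The "ignore the addition and report it" behaviour can be illustrated with a few lines of plain C++. This is only a minimal sketch of the idea, not the SafeString implementation: FixedStr and append() are hypothetical names, and the message goes to stderr here, where a sketch would print to Serial.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical fixed-capacity string, for illustration only.
// It is NOT how the SafeString library is implemented.
template <std::size_t CAP>
struct FixedStr {
    char buf[CAP + 1] = {0};  // CAP chars plus the terminating '\0'
    std::size_t len = 0;      // number of chars currently stored

    // Append text only if it fits completely. Otherwise leave the
    // contents unchanged, report the problem and return false.
    bool append(const char *text) {
        assert(len <= CAP);  // invariant: never more than CAP chars stored
        if (text == nullptr) {
            std::fprintf(stderr, "Error: append() was passed a null pointer\n");
            return false;
        }
        std::size_t add = std::strlen(text);
        if (len + add > CAP) {
            std::fprintf(stderr, "Error: append() needs capacity of %lu, cap is %lu\n",
                         static_cast<unsigned long>(len + add),
                         static_cast<unsigned long>(CAP));
            return false;
        }
        std::memcpy(buf + len, text, add + 1);  // add + 1 copies the '\0' too
        len += add;
        return true;
    }
};
```

With a FixedStr<5>, appending "A0" and then " = " fills the buffer exactly, and a further append of "598" is refused and leaves the contents as "A0 = ", which mirrors the output shown above.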

SafeString also includes simple and effective parsing/tokenizing methods for processing user commands without blocking the rest of the sketch from running at maximum speed.
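To show what "without blocking" means in practice, here is a rough sketch of a reader that is fed one character at a time and only reports a token once a delimiter arrives. It does not use the SafeString API; TokenReader, feed() and the 16-char limit are all hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical non-blocking token collector (not part of SafeString).
struct TokenReader {
    static constexpr std::size_t CAP = 16;  // assumed longest command
    char token[CAP + 1] = {0};
    std::size_t len = 0;
    bool overflow = false;  // current token was too long and will be dropped
    bool ready = false;     // token[] holds a finished token

    // Feed one incoming character and return immediately.
    // Returns true when a delimiter completes a non-empty token;
    // token[] stays valid until the next call to feed().
    bool feed(char c) {
        assert(len <= CAP);  // invariant: never more than CAP chars stored
        if (ready) {         // previous token was consumed, start a new one
            len = 0;
            token[0] = '\0';
            ready = false;
        }
        bool isDelim = (c == ',' || c == ' ' || c == '\n' || c == '\r');
        if (!isDelim) {
            if (len < CAP) {
                token[len++] = c;
                token[len] = '\0';
            } else {
                overflow = true;  // too long: remember to discard it
            }
            return false;
        }
        if (overflow) {  // drop the over-long token and start again
            overflow = false;
            len = 0;
            token[0] = '\0';
            return false;
        }
        if (len == 0) return false;  // skip empty tokens
        ready = true;
        return true;
    }
};
```

In a sketch, loop() would call feed() with each character returned by Serial.read() while Serial.available() is greater than zero, then carry on with its other work, so nothing ever waits for a complete line.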

Very good. That's a big step back towards newbie-friendliness.
Thank you very much for giving a heads-up on this.

Debug can be turned off by commenting out a #define in the library. That means all programs then lose the debug feature unless you duplicate the library. It needs a better way (a template?).

In the doc you state

The major difference is that the capacity of a SafeString is fixed at creation and does not change,

SafeStrings do not support the + operator, because it creates temporary Strings

One of the values of the String class is its ability to expand, which indeed brings the risk of fragmentation (and if you use reserve() with the String class you preallocate a chunk of memory to limit that risk). Another is its support for common operations like +, which newbies learn in other languages.

I'm not a fan of the String class, but with such restrictions for newbies, are they better off just using cStrings and the standard library cString functions, and learning to be careful?

Nice attempt at an alternative way though, thanks for sharing.

The SafeString initial CAPACITY is not necessarily its LENGTH.
And the class DOES support appending to strings ( "S += n;" )
It just doesn't support S = S + n + ";";
(which would require an intermediate string in the middle of the computation...)

westfw:
The SafeString initial CAPACITY is not necessarily its LENGTH.

So same as char s[30] = "Hello"; // 30 bytes, 6 used

And the class DOES support appending to strings ( "S += n;" )

OK, so just a wrapper around a sizeof()-based boundary check and strcat() or strncat()
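For comparison, a hand-rolled version of such a check might look like the following. This is only a sketch of the idea, not what the library actually does: appendChecked is a hypothetical helper, and it takes the array by reference so that its size is known to the function (a plain char* parameter would lose the sizeof() information).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to dest only if the result, including
// the terminating '\0', fits in the N-byte array. N is deduced from the
// array itself, which is the template equivalent of a sizeof() check.
template <std::size_t N>
bool appendChecked(char (&dest)[N], const char *src) {
    if (src == nullptr) return false;
    std::size_t used = std::strlen(dest);
    assert(used < N);  // dest must already hold a terminated string
    std::size_t add = std::strlen(src);
    if (used + add + 1 > N) return false;  // would overflow: dest untouched
    std::strncat(dest, src, N - used - 1); // bounded append
    return true;
}
```

With char s[30] = "Hello"; a call such as appendChecked(s, ", world") succeeds, while appending "598" to a 6-byte array that already holds "A0 = " returns false and leaves the array unchanged.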

Don't get me wrong, it's a nice submission / attempt to address recurring issues.

Sometimes I just wonder how much sugar coating one needs to bring to standard library functions rather than have newbies understand what's going on really and what checks need to happen to prevent overflow.

In my experience it's just hiding for a little longer the naked truth that when memory gets scarce you'd better understand what's going on at byte level and so being acquainted with / exposed to the bare memory access in all its glory. Learn once, use all your life :)

And if memory is not scarce, most of the time the existing String class is just fine (most alloc/dealloc would happen in LIFO mode in most programs) and reserve() helps too.

I've seen other attempts that just use a C library of functions on top of cStrings, which brings extra safety too.

Well, yeah. It's not clear to me that it's that much better than always "reserving" a reasonable length with the existing String library. But people don't, so this is an "interesting" experiment.
(Also, it seems to have added a lot of error checking, with debugging. That seems worthwhile...)

Consider this (is there some benefit to constructing a string using the "+" operator?)

A0 = 521, After adding analogRead
char s [100];

void setup() {
    Serial.begin(9600);

    sprintf (s, " A0 = %d, After adding analogRead", analogRead(A0));
    Serial.println (s);
}

void loop()
{
}
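A bounded variant of the sprintf() call above can use snprintf(), which never writes more than the given buffer size and returns the length the full text would have needed, so truncation can be detected. A minimal sketch follows; formatReading is a hypothetical helper name, and the reading is passed in as a plain int so the function does not depend on analogRead().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: format a reading into buf without overrunning it.
// Returns true if the whole message fitted, false if it was truncated
// or a formatting error occurred.
bool formatReading(char *buf, std::size_t size, int reading) {
    assert(buf != nullptr && size > 0);
    int n = std::snprintf(buf, size, " A0 = %d, After adding analogRead", reading);
    // n is the length the full text needs, excluding the '\0'
    return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

In the sketch above this would be called as formatReading(s, sizeof(s), analogRead(A0)), and the return value says whether s holds the complete message.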

Am I correct to think that this SafeString library is using cstrings under the hood and is, essentially, a collection of utility functions to make using cstrings easier?

If so that is a welcome development.

However IMHO it is unfortunate that the name "SafeString" gives the impression that it is using the String class. What about "EasyCstrings" ?

...R

If so that is a welcome development.

I dunno, i think it is only going to muddle up things for us on the Forum. It is just another avenue a 'Newbie' can take or be pointed towards, which i suspect will only add to the confusion. Sure it is solving some issues, but in doing so creating new ones.

It's not clear to me that it's that much better than always "reserving" a reasonable length with the existing string library.

Technically that is only required when using 'Strings' outside of any local scope, though on an AVR it seems that above a certain length the String class malfunctions anyway.

In my experience it's just hiding for a little longer the naked truth that when memory gets scarce you'd better understand what's going on at byte level and so being acquainted with / exposed to the bare memory access in all its glory. Learn once, use all your life

!!! Exactly this.
The truth is that here on the forum we are often confronted with programs that are a concatenation of example sketches found in a library or posted elsewhere on the web. In the data transfer between those parts, 'Strings' sometimes get used, which may pose a problem for long-term running. The author of the final program comes for help and gets told to remove the 'String-class' and use c-strings instead, though many times this is not the cause of the issue. I actually think it would be better to have a really good tutorial on "How to move from 'String' to 'c-strings'" which includes all the ins and outs of both, and which we can simply reference. Using c-strings explains so much about memory usage that it should be used as an educational tool.

The difficulty with such a thing is that it would need to become popular before it's really useful. Otherwise, people looking for help with their code using the new library will mostly be met with a shrug.

If you had enough support and if you were using the Arduino for a single project I can see that it might be handy. If you plan to do multiple projects, you're better off learning how to use C strings right from the start.

I have not studied the library carefully so I have no idea whether it contains "bad things". If it does I'm sure they can be fixed.

But the concept of having some new utility functions that make cstrings easier to use seems to me to be an idea that we should support strongly.

I don't think we should dismiss it because it is different or because it is not widely used at the moment. If we encourage its use then it will probably become widely used.

In a sense the use of this concept for Arduino programming is no different from the use of other specialised libraries that we take for granted - such as Serial and digitalRead() / digitalWrite()

...R

wildbill:
The difficulty with such a thing is that it would need to become popular before it's really useful. Otherwise, people looking for help with their code using the new library will mostly be met with a shrug.

If you had enough support and if you were using the Arduino for a single project I can see that it might be handy. If you plan to do multiple projects, you're better off learning how to use C strings right from the start.

My estimate is this: in contrast to 95% of all libraries and example code available on GitHub, the website that contains the tutorial for SafeString is extraordinarily detailed and easy to understand. It covers a lot of details and problems. That is why my guess is that SafeString will become popular. Can you show me a c-string tutorial that wins the competition in:

  • ease of understanding,
  • lots of detailed examples and
  • pointing out possible problems?

There might be tutorials dozens of pages long. But are they easy to understand?
If you can point me to such a good c-string tutorial I will revise my opinion.
Best regards, Stefan

I see the intent, there is value (interesting attempt in the parser, although one could question whether it has to be in that class) and clearly lots of effort has been put into making this well documented and robust, and it's anchored in reality.

But… in my humble opinion — unless it’s advertised heavily and embedded — the default String class will remain the preferred destination as this is the one documented as part of the IDE, used by tons of existing sample code out there and in many libraries (especially targeting the ESP platform). Also this is the default type kids learn when using Python or Java or Javascript.

Also, the lack of full-fledged simplicity (automatic memory expansion when needed) still means the newbie has to be careful about what he does. The program won't crash but still won't work. It will spit out some debugging info, unless that has been deactivated (once) in the library.

Most of the time the automatic realloc of memory would not fail and would not create holes (LIFO), so most of the time a newbie's code using the standard String class would simply work, whereas using SafeString would require code modification and extra memory allocation without knowing exactly how much is enough. So the newbie will likely be just one byte away from a failing program again (or will end up allocating tons of memory for each SafeString and running into other issues).

There is clearly value in this education, but if you need to understand memory space you are just one line of code away from handling it yourself with range checks etc. and acquiring skills that are useful for any C programming on any platform.

my 2 cts

J-M-L:
But... in my humble opinion — unless it's advertised heavily and embedded — the default String class will remain the preferred destination as this is the one documented as part of the IDE, used by tons of existing sample code out there and in many libraries (especially targeting the ESP platform). Also this is the default type kids learn when using Python or Java or Javascript.

That's probably correct.

But what do we do if a newbie presents a program for which using the String class is a problem? Up to now we have just pointed out the need to use cstrings and give them a link to the cplusplus website

Doesn't this new library offer a better (easier to use) solution for a newbie - even if it is not at the moment widely used?

...R

But what do we do if a newbie presents a program for which using the String class is a problem? Up to now we have just pointed out the need to use cstrings and give them a link to the cplusplus website

Usually we will point in that direction whether the 'String-class' is the problem or not, and many a time without any investigation into other possible causes, which can be experienced as discouraging. With 'Strings' one can also reserve() space on the heap, which would actually fix many of the issues that do come up, but does leave the coder to do their own error checking.

and in many libraries (especially targeting the ESP platform).

The 'String-class' on an ESP is not quite the same as it is on an AVR, and yes not only is it used by many libraries, but it is also used as a return variable from a function in many cases (which with a char* always seems a little more tricky to me). Of course on an ESP the memory is so much bigger that fragmentation is not going to cause issues very soon, but the rules to prevent fragmentation in any case are still quite simple - Do not use 'String' as a global variable type, unless you know up front what the maximum size will be and you reserve sufficient space for it, before declaring anything else on the heap.

Doesn't this new library offer a better (easier to use) solution for a newbie - even if it is not at the moment widely used?

Is it the ideal solution for me? Am I going to support it? Maybe yes: if the debugging function is enabled it will show whether the issues that the 'Newbie' encounters are caused by inaccurate use, but I will probably not point to it as a solution. The issues that are encountered by a 'Newbie' usually have the same underlying cause: attempting to complete a somewhat more advanced project by combining parts found on the internet, in combination with a lack of willingness to educate oneself in the basics of programming.
Working with 'Arrays' is a basic programming skill, and a character array is one of the most common types of array. So if a 'Newbie' wants to skip chapter 3 to do the project in chapter 7 but somehow can't get it to work properly because of that, I think it is OK to reference chapter 3, also because chapter 3 is important. I am still in favor of explaining it better, though.

Deva_Rishi:
Usually we will point that direction whether the 'String-class' is the problem or not, and many a time without any investigation into other possible causes, which can be experienced as discouraging.

It shouldn't matter whether it is the cause of some particular problem. It's like, someone goes to the hospital with a broken arm. The doctor sees first that they have cancer. Don't they tell the patient?

It's like, someone goes to the hospital with a broken arm. The doctor sees first that they have cancer. Don't they tell the patient?

If the patient complains about pain in the arm, the doctor should look at the arm. If you come to the garage with a flat tire, you want the tire fixed. If the engine has a blown head gasket and needs a major overhaul, that is a different matter. If on the other hand a piece of steel has been cutting through the tire, causing the flat, then i would expect the mechanic to make sure i don't puncture the tire straight away again. Back to the Arm, if bone cancer is the cause of it breaking, it is relevant. Still the arm should first go in a cast, and for the cancer, a different appointment with a specialist is the proper approach.
Again, I am not saying that the 'String' issue should not be addressed; I am only stating that many times I have seen replies that state "You are using Strings, change that to c-strings" when the cause of the program's malfunction (and the reason for seeking help) is something different and completely unrelated.

Deva_Rishi:
Again, I am not saying that the 'String' issue should not be addressed; I am only stating that many times I have seen replies that state "You are using Strings, change that to c-strings" when the cause of the program's malfunction (and the reason for seeking help) is something different and completely unrelated.

I reckon it is essential to take into account the amount of time available to assess a problem. With unlimited time (and the right hardware) one could upload the OP's code and find the actual cause of the problem - which, as you say, might not be a String problem.

However when time is limited (i.e. because I am lazy) it seems to me a good strategy to eliminate options that might be the cause of the problem. Or, put another way, if the problem is still there after the Strings have been removed it narrows the focus of enquiry.

There is also the issue that, even if the Strings are not now causing a problem they may do so in the future. An analogy would be discovering badly worn brake pads while you had the wheel off to repair the puncture.

However all this discussion seems to avoid the simple question of whether this new SafeString library would be a benefit for inexperienced programmers - and I think it would.

...R

However when time is limited (i.e. because I am lazy)

I was saying that, but if I had phrased it like that, I think I would have gotten a lot of negative feedback.

whether this new SafeString library would be a benefit for inexperienced programmers

I think maybe it would, but it may postpone an educationally important path. In other words, it is a remedy for the 'lazy', with possible unwanted side effects.
Maybe it should be called ‘LazyString’ instead.

Deva_Rishi:
Maybe it should be called 'LazyString' instead.

I don't like the name "SafeString" (as already mentioned) but I don't see it as "lazy" either. That's a bit like saying multiplication is a lazy way to do addition.

Think about it this way - wouldn't it be great if the normal C++ cstring functions were extended to include the capabilities of this library?

And as the library is using regular cstrings presumably the standard functions can also be used.

...R