Difference between char array and string

Hello everyone.

I have this question. What is the difference between a char array and a string?
From what I understand so far, a char is a different data type than a string, but I get confused when I read that a char array that is null terminated is a string of chars.
So what is the real difference?
Is a null terminated char array the same thing as a data type string?

I have read that strings consume a lot more memory than char arrays, but I fail to see the difference between the two...

Also, I have read that the compiler will automatically terminate a char array if there is enough space for a /0. If I manually null terminate my char array and there is still space left, will the compiler fill the "empty" space with /0 characters? For example, if I write this:
char testchars [10] = {'t', 'e', 's', 't', '/0'};

Also, is this considered manually terminated, or will the compiler read the /0 as two different bytes in the code below and automatically add the /0 termination after?

char testchars [10] = {"test/0"} ;

Thank you in advance !!!

There is no string data type. The term "string" refers to a null terminated char array. There is a data type named String. I know it's confusing, but that upper case S is really important. Strings are indeed very different from strings.

Here's some information on string:

Here's some information on String:

The C++ string type (std::string) is the equivalent of the Arduino String type. When newbies search for c++ string, they will most likely get references to string instead of String.

Referring to OP's question, a String is a class type, while a char array (called a cstring or C-style string) is an array of a primitive type. The cstring comes from the C language and String comes from C++.

"Is a null terminated char array the same thing with a data type string ?"

In a word, no.

It took me a while to unlearn using the String class, but thanks to some very patient experts here, I finally grok.

I had no choice because I was running out of RAM in a project and it was full of Strings (note the upper-case "S"). Strings (String class) are a memory hog, but switching to c-strings (lower-case "s") saved my project.

In general, when you Google or ask questions, the upper-case "String" indicates the String class, and lower-case "string" indicates c-string data types.

Here's a forum post that explains it better than I can.

There are other explanations on other forums. Google "Arduino evil strings"

Hope this helps.

I will attempt to do an overview of C style string and the Arduino String class.

This is a literal C string

"ABCDEF"

Similar to a literal integer like this
1

int n = 5; // Create an integer variable and initialise it with the value 5

// Create a string variable and initialise it with the string "ABC".
// The end of the string is denoted by a null terminator character which has the integer value 0.
char cstr1[20] = {'A', 'B', 'C', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

// You can also do exactly the same as above but using the appropriate integer values for 'A', 'B' and 'C' from the ASCII table. 
char cstr2[20] = {65, 66, 67, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};


// Integer addition
n = n + 5;
n += 5; // Shorthand way of doing the above;

// String addition or more precisely concatenation
strcat(cstr1, "DEF"); // After this cstr1 will contain the string "ABCDEF"

// But you need to make sure that cstr1 has enough empty 'slots' (containing a null terminator) into
// which the characters from literal string "DEF" can be slotted. Otherwise a run time memory overwrite
// error will occur. And these can be very difficult to track down and fix.

// If you declared a smaller array like this
char cstr3[5] = {'A', 'B', 'C', 0, 0};

// And then tried this
strcat(cstr3, "DEF");

// the write would run past the end of the array, because "ABCDEF" plus its null terminator needs 7 bytes
// and cstr3 only has 5. (With the size given as 20, the unlisted elements are zero-filled, so there would still be room.)
// The function strcat ASSUMES you have provided enough room in the destination string variable. The overflow overwrites
// whatever sits after the array in memory, so the result is unpredictable: if you output cstr3 to the serial monitor it may
// appear to contain a whole lot of garbage characters until some random 0 in memory is landed upon, or the sketch may crash.

The Arduino String class is different.

It allows you to do this sort of thing...

String str1;

str1 = "ABC"; // str1 contains "ABC"
str1 = str1 + "DEF"; // Concatenate "DEF", str1 contains "ABCDEF"
str1 += "GHI"; // Concatenate "GHI", str1 contains "ABCDEFGHI"

There is no need to worry about having enough slots in str1 in order to concatenate.

The number of character slots available in str1 automatically expands and contracts as you do various string manipulations.

C was invented first when big computers had not so much RAM. It is tight and "close to the metal" as my EE friends did say.

C++ came along later when even PC's had RAM to waste. It has features that make organizing code, especially big code a lot easier and it has conveniences that frankly make me shake my head because the costs of learning and using them is waste-y and limiting.

The C++ String class is one of those. It wraps a C string array up with extra bytes and not just the C string.h functions but a + operator and a bunch more including one that gives you the address of the text inside that you're not supposed to use but may really be needed which is why it's there.

Arduino RAM is limited. If you use a String variable and do not use the reserve() function, then every time you add or remove a char, the String copies itself with the added or removed char (which takes up CPU cycles) and then erases the old version. That leaves a hole in your heap (in RAM) that may get fully filled later but otherwise "shotguns" the heap, raising it closer to a possible crash with the stack (the heap grows from the bottom of RAM up while the stack grows from the top down; if they cross, the result is usually a crash very soon).

Even if you do reserve space in the String, it takes more RAM than a char array that can hold the same text. The conveniences it has make it easier for beginners or people who use other languages to use String, and those have a hidden cost: black box ignorance of what the code really does, reinforced over time and use. A good programmer will find out and likely quit using String variables.
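
The effect of reserving space up front can be illustrated off the board. The sketch below uses std::string from desktop C++ as a stand-in, on the assumption that it behaves like Arduino String's reserve() in this one respect; it is not Arduino code, and the function name is made up.

```cpp
#include <cstddef>
#include <string>

// Returns true if appending `count` chars after reserving room for them
// never moved the underlying buffer, i.e. no reallocation (and so no
// copy-and-free of the old text) happened along the way.
bool staysPutAfterReserve(std::size_t count) {
    std::string s;
    s.reserve(count);                // ask for all the room once
    const char *before = s.data();   // where the text lives now
    for (std::size_t i = 0; i < count; ++i) {
        s += 'x';                    // grow one char at a time
    }
    return s.data() == before && s.size() == count;
}
```

Without the reserve() call the buffer may be reallocated several times as the text grows, which is the copy-and-free churn described above.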

With C char array strings the programmer is -responsible- for "coloring inside the lines" and may find string.h (text and memory manipulation) functions to be cryptic until learned. You have to set up your arrays to always be able to hold the text you would put in them, but pre-planning how you fill strings and using const variables to hold that max length can guarantee that you don't write past the end of the array (into the next variable) as long as you code to keep inside of limits or my fave (and a bit advanced), write code that never can screw it up.

C char arrays give you direct, fast access to every char in the string. Char is a variable, you can get at ANY of them in a number of ways. For most text work I don't use string.h at all, just work the text char by char.

The string.h library has many functions and they about all have cryptic names but they are made of patterns of abbreviations that are not so hard to discern just looking at a table of the function names. So mem___ functions like memmove are memory functions and str___ functions like strstr (string-string finds a substring inside of a longer string) are string/text functions. A letter n in those denotes a count in the use and a few other letters signify alternate versions of simpler string.h functions. IMO if you can roll your own to work the array, it will teach you more.
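
To make the naming pattern concrete, here is a small sketch using one str function, one n variant and one mem function. The wrapper names (findOffset, startsWith, eraseChars) are hypothetical; the calls inside them are standard string.h functions.

```cpp
#include <cstddef>
#include <cstring>

// strstr: offset of the first occurrence of needle in haystack, or -1.
int findOffset(const char *haystack, const char *needle) {
    const char *hit = std::strstr(haystack, needle);
    return hit ? static_cast<int>(hit - haystack) : -1;
}

// strncmp: the n limits the comparison to the length of the prefix.
bool startsWith(const char *s, const char *prefix) {
    return std::strncmp(s, prefix, std::strlen(prefix)) == 0;
}

// memmove: remove `count` chars starting at `pos` by sliding the tail
// (terminator included) down over them.
// Assumes pos + count <= strlen(s).
void eraseChars(char *s, std::size_t pos, std::size_t count) {
    std::size_t len = std::strlen(s);
    std::memmove(s + pos, s + pos + count, len - pos - count + 1);
}
```

memmove is used rather than memcpy because the source and destination overlap inside the same array.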

Lastly, Arduino can store and use const char arrays (they do not change) in flash memory through PROGMEM functions whereas C++ Strings simply cannot work from flash. An Uno has 32K of flash and 2K of RAM, keeping text tables in flash can greatly expand how much that Uno can be made to do. C strings and C++ Strings don't work and play -well- together, it's easier and faster to stick with all char array strings.

Char arrays are basically byte arrays with added functions that you need to go back and forth from characters. The size is generally set.

A string's length can be set dynamically during runtime much more easily. This sounds good and makes things easy, but it's very difficult for little Arduino processors to manage memory for the changing length, so char arrays are recommended if you care about efficiency.

Anyway, each index in a char array holds a number from 0 to 255. You can look up the possible values in an ASCII chart. I don't think /0 is one of them!

As far as "null terminated"... with arrays at no point is there a reason to check indexes larger than your length. This should be known or prevented in your code ahead of time.
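
On the /0 question from the first post: the null terminator is written with a backslash, '\0', and it is simply the value 0 (NUL, the first entry in the ASCII chart). With a forward slash, "test/0" just holds the two ordinary characters '/' and '0'. The sketch below, in plain C++, shows what each initialiser produces; the array names and the tailIsZero helper are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Explicit terminator; the 5 unlisted elements are zero-initialised too.
char manual[10] = {'t', 'e', 's', 't', '\0'};

// String literal; the compiler appends the terminator and zero-fills
// the rest of the array.
char literal[10] = "test";

// Forward slash by mistake: the text is "test/0" (6 chars), followed by
// the terminator that the compiler adds automatically.
char slashed[10] = "test/0";

// True when every element from index `from` to the end of the array is 0.
bool tailIsZero(const char (&a)[10], std::size_t from) {
    for (std::size_t i = from; i < 10; ++i) {
        if (a[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Functions such as strlen and strcat, and printing a char array, find the end of the text by looking for that 0, which is why the terminator matters even when the array size is known.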

SteveMann:
It took me a while to unlearn using the String class, but thanks to some very patient experts here, I finally grok.

Karma to you for making the effort!

Now, about pointers... didjaknow you probably already use them? Pointers are C power tools that you can start simple and work up.

Taterking (any relation to "Tater Salad" aka Ron White?), just one thing.

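On AVR-based Arduinos, type byte is unsigned 8 bit (0 to 255) and type char is signed 8 bit (-128 to 127). (The C and C++ standards leave the signedness of plain char up to the compiler.) Be careful if you ever mix the two, the compiled results may not do what you think.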
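
A small sketch of how mixing the two can bite. It uses signed char and unsigned char explicitly as stand-ins for a signed char and a byte, so it behaves the same on any board or PC; the function names are invented, and the -23 assumes the usual two's complement conversion.

```cpp
// The same bit pattern 0xE9 stored in a signed and an unsigned 8 bit type.
// In a comparison both are promoted to int first, so the signed one
// becomes -23 while the unsigned one stays 233.

bool signedMatches() {
    signed char c = static_cast<signed char>(0xE9);
    return c == 0xE9;   // compares -23 with 233
}

bool unsignedMatches() {
    unsigned char b = 0xE9;
    return b == 0xE9;   // compares 233 with 233
}

// Widening a signed char keeps its sign.
int asInt(signed char c) {
    return c;
}
```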

An alternative is to design a String class that has most of the conveniences of the Arduino String class and the Visual C++ MFC CString class, but does not use dynamic memory.

See attached CString.cpp and CString.h

class CBuff is a template class that is a wrapper for a char array. You specify its length at compile time like so

CBuff<100> buff;

The CBuff constructor fills the char array with null terminators and has a few other convenient functions.

These are meant to be declared on the stack so that they take up memory only while the function is active.

You declare one of my CString objects like so...

CBuff<100> buff;
CString str(buff);

And you can do with my string class the vast majority of what you can do with Arduino String, plus extra.

The caveat is that you need to make sure that your CBuff object has a large enough char array for all the string manipulation.

These classes resulted from the same advice that I got from this forum about not using Arduino String over a year ago.
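
To give a feel for the general idea, here is a minimal sketch of a string wrapper over a fixed-size char array with no dynamic memory. It is not the attached CBuff/CString code; the class name FixedText and its members are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity text holder: N chars plus a terminator,
// sized at compile time, never touching the heap.
template <std::size_t N>
class FixedText {
public:
    // Fill the whole array with null terminators, so it starts out empty.
    FixedText() { std::memset(buf_, 0, sizeof(buf_)); }

    // Append src; returns false (and changes nothing) if it would not fit.
    bool append(const char *src) {
        std::size_t extra = std::strlen(src);
        if (len_ + extra > N) {
            return false;
        }
        std::memcpy(buf_ + len_, src, extra + 1);  // terminator included
        len_ += extra;
        return true;
    }

    // Convenience operator; text that does not fit is silently dropped.
    FixedText &operator+=(const char *src) {
        append(src);
        return *this;
    }

    const char *c_str() const { return buf_; }
    std::size_t length() const { return len_; }

private:
    char buf_[N + 1];
    std::size_t len_ = 0;
};
```

Declared as a local variable, for example FixedText<100> text; inside a function, it only takes up stack space while that function is running.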

I was inspired by the PString library.

CString.cpp (28.3 KB)

CString.h (10.1 KB)

boylesg:
The Arduino String class is different.

It allows you to do this sort of thing...

String str1;

str1 = "ABC"; // str1 contains "ABC"
str1 = str1 + "DEF"; // Concatenate "DEF", str1 contains "ABCDEF"
str1 += "GHI"; // Concatenate "GHI", str1 contains "ABCDEFGHI"

And it is stuff like this that makes the String Class so bad to use. Leads to fragmentation of memory, and over time it will cause issues in your code.

Romonaga:
And it is stuff like this that makes the String Class so bad to use. Leads to fragmentation of memory, and over time it will cause issues in your code.

But it is possible to do all this without using dynamically allocated memory.

For complex string manipulation my CString class (modeled on Arduino String, Visual C++ MFC CString and Arduino PString) HUGELY simplifies your code.

All the benefits of Arduino String with none of the memory fragmentation problems.

each index in a char array holds a number from 0 to 255.

On AVR boards the char data type is signed 8 bit, which holds -128 to 127. The above would be true of a byte (unsigned 8 bit) array.

boylesg:
But it is possible to do all this without using dynamically allocated memory.

Arduino String class has the reserve(n) function.

All the benefits of Arduino String

Bad habits included, plus it takes more code to work with text in PROGMEM.

The "conveniences" of String come at a price of ignorance of the machine itself.

On SMALL hardware it is far better to code as close to the native ways it runs. You will end with smaller and faster code.

I've "lost it" somewhere in the posts :)

I think I got the point. As mentioned in the Arduino reference, String gives you more functionality at the cost of more memory.

Bottom line for me is that it is not wise to use Strings unless it is absolutely necessary.
Also, from what I can tell, C-style char arrays (or strings) do have a pretty good "arsenal" of functions.
Again, I'm just a "newborn" when it comes to programming. I try to learn, but with Google and YouTube as the only teachers, things are tough...

String gives no added functionality, it's just more convenient to those who are used to handling text with more abstract, to-hell-with-the-hardware syntax. Letting you use + to concatenate strings is not more functional than using strcat() and the same goes for the rest. Combining functions as "new" names does not add more that isn't already there.

Convenience has a cost in speed and RAM and code size.

But I tell ya, you don't really need string.h to manipulate text in C. Whether string.h or String class, none of that compiles to anything but low level machine code, none of it is magical or beyond coding with simpler means. When you explore the code and see that, it becomes easier to avoid needing either and to know when string.h will save time in writing, but that's all you save.

If your line of code generates a few lines worth of compile, you're only saving on typing. Reliance on black box code-wheelchairs means you get stuck in wheelchair lanes, forced to buffer-then-process as an approach when quite often there's no need.

Knowing more about what I'm doing lets me think and code in simpler, tighter ways. On Arduino, tight pays off.

That is not to say never use string.h functions. Just when you have more choices you can approach a sketch with that much more insight, more freedom to arrange your data and code.... and know that String is a fat black box of a choice.

HellasT:
Again, I'm just a "newborn" when it comes to programming. I try to learn, but with Google and YouTube as the only teachers, things are tough...

When I was a beginner programmer, I had a copy of BASIC BASIC the boss gave me and there was no internet.

There is a whole manual worth of material plus tutorials on the Arduino main site and in your IDE. There are links at the top of the forum to get you to those but you gotta click em to see em.

Also, from what I can tell, C-style char arrays (or strings) do have a pretty good "arsenal" of functions.

And you mostly never need to use them but that does take a while to find out by yourself.

GoForSmoke:
When I was a beginner programmer, I had a copy of BASIC BASIC the boss gave me and there was no internet.

There is a whole manual worth of material plus tutorials on the Arduino main site and in your IDE. There are links at the top of the forum to get you to those but you gotta click em to see em.

And you mostly never need to use them but that does take a while to find out by yourself.

In my view at least, the standard string functions, like strcpy and strcat, do not make for easily readable code when you are doing more than just rudimentary string operations.

It often takes several lines of code to do an operation using strcat etc. when a string class can do the same with one function call.

Obviously it doesn't reduce the total number of lines of code - the string manipulation code is simply hidden in the string class.

But what a string class does do is to make debugging your code a lot easier through the compartmentalization of complexity that object oriented programming provides.

It means a lot more on a PC that has room to waste. It works on Arduino but you run out of RAM before necessary.

I have a parse&lex word-match function that easily keeps up with on-the-fly high speed serial. It does not require a buffer or any string.h functions. When I feed it text from RAM, it averages 9us per char matched while walking through a word list in PROGMEM, the list can be 1000's of words long (2000 words with average taking 8 chars fills a bit more than half the flash on an Uno, a Mega can support loads more list) and not bog this code down.

There's no way I would think to write that kind of code if I thought in terms of String or string.h.
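
For anyone wondering what matching text without a buffer can look like, below is a very reduced sketch in plain C++. It is not the function described above: the word list, the class name WordMatcher and its feed() method are all made up, the list sits in ordinary memory rather than PROGMEM, and it only handles up to 32 whitespace-delimited words.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical word list; on an Arduino this could live in flash instead.
static const char *const kWords[] = {"on", "off", "status"};
static const std::size_t kWordCount = sizeof(kWords) / sizeof(kWords[0]);

// Matches whitespace-delimited words one char at a time. The only state
// kept is a position and one bit per word, never the incoming text.
class WordMatcher {
public:
    // Feed one incoming char. Returns the index of the matched word when
    // a delimiter ends a word that is in the list, otherwise -1.
    int feed(char c) {
        if (c == ' ' || c == '\n' || c == '\r' || c == '\0') {
            int result = -1;
            for (std::size_t i = 0; i < kWordCount; ++i) {
                // Still a candidate and fully consumed: exact match.
                if (pos_ > 0 && (alive_ & (1u << i)) &&
                    kWords[i][pos_] == '\0') {
                    result = static_cast<int>(i);
                    break;
                }
            }
            pos_ = 0;        // start over for the next word
            alive_ = ~0u;
            return result;
        }
        for (std::size_t i = 0; i < kWordCount; ++i) {
            // Drop candidates whose next char differs. Words that have
            // already ended drop out too, since their '\0' never equals c.
            if ((alive_ & (1u << i)) && kWords[i][pos_] != c) {
                alive_ &= ~(1u << i);
            }
        }
        ++pos_;
        return -1;
    }

private:
    std::size_t pos_ = 0;          // chars seen so far in the current word
    std::uint32_t alive_ = ~0u;    // bit i set: word i still matches
};
```

Each char read from serial would be passed straight to feed(), so the text is processed as it arrives instead of being collected first.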

For most things text, I don't need or want an approach where string.h or String functions need to be used.

Buffer then process requires more RAM and that's just the start when using those approach/technique-limiting functions.