Beginner Question: Char Array Vs Strings

I'm quite new to the ecosystem and to C++, and at the moment I'm a little confused about the differences between char arrays and Strings. I've read that char arrays are basically arrays of single characters with a terminating zero, and that that's "the way the language does strings".

But there seem to be some sort of "real" strings too? What is the difference? There seem to be moments where I can mix them, and sometimes the compiler won't let me do that. Some functions accept Strings while others only accept char arrays?

void setup()
{
  Serial.begin(9600);
}

void loop() {
  String myString= "Hello ";
  char myChar[] = "World";
  char myCharSetLength[5] = "World";
  short testNum = 5;
  String stringtoArr = "1,2"; 
  char charToArr[] = "1,2"; 

  char *firstChar  = strtok (charToArr, ","); //what does the asterisk mean???
  // COMMENTED BECAUSE OF ERROR // String *firstChar  = strtok (stringtoArr, ","); //Drops Error: cannot convert 'String' to 'char*' for argument '1' to 'char* strtok(char*, const char*)'

  String combinedString = myString+testNum; // creates a working string "Hello 5"
  // COMMENTED BECAUSE OF ERROR // char combinedChar[] = myChar+testNum; // drops Errors: error: initializer fails to determine size of 'combinedChar' //  error: array must be initialized with a brace-enclosed initializer
  // COMMENTED BECAUSE OF ERROR // char combinedChar[] = myString+testNum; // drops an Error "cannot convert 'StringSumHelper' to 'char' in initialization"

  //myString += myChar; // creates a working string "Hello World"
  myString += myCharSetLength; // working too with a fixed size char

  Serial.println(myString); // outputs "Hello World"
  Serial.println(combinedString); // outputs "Hello 5"
  Serial.println(firstChar); // outputs 1
}

Correct. A C string is an array of characters followed by a byte that is zero (also called the "zero-terminator").
Those are the good old 'C' and 'C++' strings.

That also means that this is a bug:

char myCharSetLength[5] = "World";

It needs 6 bytes in memory: 5 characters plus the zero-terminator makes 6.
Because of that bug, your test code might go haywire. No outcome of that code should be taken seriously.
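To make the byte count concrete, here is a small sketch in plain C++ (the names `world`, `storageBytes` and `visibleLength` are made up for this example). It assumes you leave the array size out and let the compiler count for you, which avoids the off-by-one entirely:

```cpp
#include <stddef.h>
#include <string.h>

// No size given: the compiler reserves room for the 5 letters
// AND the zero-terminator.
char world[] = "World";

// Number of bytes the array occupies in memory (terminator included).
size_t storageBytes() { return sizeof(world); }

// Number of characters before the zero-terminator.
size_t visibleLength() { return strlen(world); }
```

Here `storageBytes()` gives 6 while `visibleLength()` gives 5. The difference is exactly the terminating zero that `char myCharSetLength[5]` has no room for.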

Please don't use the strtok() function. It is not reentrant, so it is not safe in a multitasking operating system, such as the ESP32 with FreeRTOS.

Now that we have that out of the way, let's move forward.
The 'C++' language was extended with standard libraries. One of those is std::string.
https://cplusplus.com/reference/string/string/
Those standard libraries are too much for an Arduino Uno, so you cannot use them on simple Arduino boards.

To make it possible to use strings in a more modern way, the 'String' class was developed: https://www.arduino.cc/reference/en/language/variables/data-types/stringobject/
If you read all the functions on the reference page, then you know what it can do.

The Arduino 'String' class is a C++ class that is added to the Arduino software: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/WString.h
It can do a lot, but not everything.
Since it can use the heap intensively, it is better not to use it in an interrupt. It might even cause a heap problem: https://learn.adafruit.com/memories-of-an-arduino/measuring-free-memory#sram-370031

Under the hood (bonnet), Strings use c-strings internally.
That should say something in itself.
Strings are ‘nice’, but quite inefficient.

They bring BASIC-like simplicity by hiding the realities of memory usage.

Wait.. UNIX, c, strtok() all grew up together in the same neighborhood. Where did you hear this stuff about strtok() not being safe in a multitasking operating system?

-jim lee

The function strtok() keeps data stored internally, to be used the next time it is called. When it is used both in an interrupt and in the normal code, it can go wrong. When it is used at the same time in different tasks, it can go wrong.
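A minimal sketch of how that goes wrong, in plain C++. There is no real interrupt here: the second strtok() call just stands in for "some other code that ran in between", and the function and buffer names are made up for the example:

```cpp
#include <string.h>

// Simulates main code and "other" code (an interrupt or a second task)
// tokenizing two different buffers in an interleaved order.
// Returns the first character of the token the main code gets
// on its second call.
char secondTokenSeenByMain() {
    char mainBuf[] = "1,2";
    char otherBuf[] = "x,y";

    strtok(mainBuf, ",");                 // main code: gets "1"
    strtok(otherBuf, ",");                // other code: gets "x" and
                                          // overwrites the hidden state
    char *second = strtok(nullptr, ",");  // main code continues, expecting "2"
    return second ? second[0] : '\0';
}
```

The main code expects "2" but receives "y", because the single hidden pointer inside strtok() now points into the other buffer.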

It is not something I heard. As far as I know it is common knowledge in the embedded systems world.

Maybe the term "Operating System" is misleading for FreeRTOS. Perhaps "embedded pre-emptive multitasking system" is better.
On a real operating system, such as Unix, each program runs as its own process with its own memory, so there it is no problem.

Correct indeed.

There is a reentrant version of the strtok() function, strtok_r() , which does not use any internal static storage, and can be used in place of the strtok() function.
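As a sketch of how strtok_r() might be used to split a comma-separated line into numbers: `parseCommaSeparated` is a made-up helper, and the example assumes your toolchain provides strtok_r() (it is a POSIX function, not standard C; avr-libc and glibc both have it). The extra third argument is a pointer you own, which replaces the hidden state of strtok():

```cpp
#include <stdlib.h>
#include <string.h>

// Splits `line` in place on commas and stores up to `maxValues`
// integers in `out`. Returns how many values were stored.
// Note: like strtok(), this overwrites the commas in `line` with zeros.
int parseCommaSeparated(char *line, int *out, int maxValues) {
    int count = 0;
    char *savePtr = nullptr;  // our own bookkeeping instead of hidden state
    char *tok = strtok_r(line, ",", &savePtr);
    while (tok != nullptr && count < maxValues) {
        out[count++] = atoi(tok);
        tok = strtok_r(nullptr, ",", &savePtr);
    }
    return count;
}
```

Called on a buffer holding "1,2,30" with room for four values, it stores 1, 2 and 30 and returns 3. Because every caller has its own `savePtr`, two tokenizations can be interleaved without disturbing each other.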

On UNIX if your app is multithreaded you might need to think about it too.

That's a lot of knowledge to digest for me :smiley: , thanks a lot for the responses! I think I get it now.
Concerning the multitasking/memory concerns: I'm using a Mega 2560 with a rather simple sketch atm, so I think I should be safe for this project. But it's nevertheless good to know that I should be careful regarding those points for future projects.

Concerning strtok(): I'll be using it to split a string of commands, sent over serial with delimiters, into variables/array elements, so the hint about strtok_r() is highly appreciated. Could you enlighten me why I need an asterisk in the variable name in that case? Most of the time that means a wildcard in some way, so what is it here?

char *savePtr;
char *firstChar = strtok_r(charToArr, ",", &savePtr);

In C/C++ the asterisk in a declaration is the symbol that makes the variable a pointer. A pointer does not hold the value itself, but the address in memory where the value can be found. Here, strtok() returns the address of the first character of the token inside your char array.

It is useful for allowing a function to directly change a variable's value, as if it were a global variable. Now that you know what it is called, look up its exact use on any C/C++ tutorial site.
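A small sketch of both uses, in plain C++ (`addTen` and `afterFirstComma` are made-up names for illustration). The first function changes the caller's variable through its address; the second returns a pointer into the caller's buffer, which is the same kind of thing strtok() hands back:

```cpp
#include <string.h>

// Receives the ADDRESS of an int and changes the int stored there.
void addTen(int *value) {
    *value = *value + 10;  // *value means "the int at this address"
}

// Returns a pointer to the text that follows the first comma in `text`,
// or nullptr if there is no comma. Nothing is copied: the result points
// into the same array the caller passed in.
char *afterFirstComma(char *text) {
    char *comma = strchr(text, ',');
    return comma ? comma + 1 : nullptr;
}
```

Calling `addTen(&n)` with `n` equal to 5 leaves 15 in `n`; the `&` takes the address of the variable. For a buffer holding "1,2", `afterFirstComma(buf)` points at the '2' inside that same buffer.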

You are safe, as long as you don't use strtok() both in an interrupt routine (which would not be advised anyway) and in the main code.