I've never run into any problems using it. However, I have recently seen various threads discussing that the function "strtok()" is not safe to use, as it isn't "thread-safe". Serial input basics uses this same function.
Here's one thread I read discussing these problems with strtok; Strtok() bites me again, though I'm not quite advanced enough to understand what the specific problem(s) with strtok() actually are.
Here is the usage of strtok() from example 5 of the SIB tutorial;
void parseData() { // split the data into its parts
char * strtokIndx; // this is used by strtok() as an index
strtokIndx = strtok(tempChars,","); // get the first part - the string
strcpy(messageFromPC, strtokIndx); // copy it to messageFromPC
strtokIndx = strtok(NULL, ","); // this continues where the previous call left off
integerFromPC = atoi(strtokIndx); // convert this part to an integer
strtokIndx = strtok(NULL, ",");
floatFromPC = atof(strtokIndx); // convert this part to a float
}
Is the author's usage of strtok() in this example problematic? If so, what would be a better way of parsing multiple variables out of a single char array?
FFI, I run nearly all my projects on ESP32-WROOM-32U boards these days.
The function strtok() stores a pointer locally. That means if strtok() is used in a interrupt and in the loop(), then it goes wrong. Also when used from multiple tasks in a multitasking system then it goes wrong.
You won't notice any problem, unless you use a ESP32 and decide to start another FreeRTOS task which also uses strtok(). The chance that you do that and also have forgotten about the risks of strtok() might be 1%. Then why take a risk of 1% that your project will fail ? Just don't use strtok() anymore.
Thanks. Your explanation makes sense to me. Even if it's such a rare occurrence that I'd ever use strtok() on 2 cores, it seems trivial to just use a different method.
I will see if I can modify the example with a different function/method. I will post my results to this thread.
No, strtol() can store the end pointer. You just chose to discard it by making the second argument NULL. If you keep the end pointer, after the conversion of an unknown number of digits, you have a pointer to the remaining string,
" If endptr is not NULL, strtoul() stores the address of the first invalid character in *endptr. If there were no digits at all, however, strtoul() stores the original value of nptr in endptr. (Thus, if *nptr is not '\0' but **endptr is '\0' on return, the entire string was valid.)"
An unknown number and type of delimiters is a different matter, e.g. '<space>,<space>' so that would require some extra footwork. But I've used this technique to collect a series of <space> separated numeric arguments. Because also:
" The string may begin with an arbitrary amount of white space (as determined by isspace())"
I have modified the parseData() function of the Serial Input Basics - updated - #3 by Robin2 tutorial to use the strtok_r function instead. The link provided in Post #4 of this thread seems to indicate that this is a thread-safe alternative to strtok. I tested it and it seems to work fine.
void parseData() {
char *str;
char *p = tempChars; //example "hello,123,3.14"
byte counter = 0;
while ((str = strtok_r(p, ",", &p)) != NULL) // Don't use \n here it fails
{
switch(counter){
case 0:
strcpy(messageFromPC, str);
break;
case 1:
integerFromPC = atoi(str);
break;
case 2:
floatFromPC = atof(str);
break;
}
counter++;
}
}
The following may be a bit over the top, but when tucked away in a library it could be a convenient way to parse strings.
The idea is to let the parseLine function decide how to parse a field based on the types of the parameters it receives. New types can be added by overloading the parse function.
In the example above, "," is used as the delimiter. In the link to the simulator the delimiter is ", ". For machine to machine communication I would not use text at all, so I assume that the incoming data is generated by a spreadsheet program or something like that.
strtok() captures the initial ptr it is given, which means subsequent calls with a NULL ptr are based on it and calls with a new non-NULL ptr overwriting its internally saved ptr and possibly interfering with the original string being further processed.
with strtok_r(), a ptr variable is passed to it to use, so that it doesn't need an internal ptr, allowing multiple calls to strtok_r() where each is passed a unique ptr variable
Have you checked that all the other functions your code uses are also reentrant-safe? All your functions? All the functions in the libraries you're using?
If not, why are you obsessing over strtok() specifically?
Another scenario where it goes wrong is if you're parsing something like 12=1,14=0,6=55.
I strtok on the comma first and get 12=1; next I strtok that result on the equal sign before doing the next strtok on the comma.