Help with strtok function?

Hello,

I'm attempting to use the strtok (string tokenizer) function but keep getting a "strtok not declared in this scope" error.

Can anyone tell me what I'm missing here?

Thanks!

#include <string.h>

char *record = "name:bob";
char *p;

void setup() {

   Serial.begin(9600);
   Serial.println("Starting..");
   
    //  First strtok iteration
    p = strtok(record,":");
    Serial.print(p);
    Serial.print(" = ");

    //  Second strtok iteration
    p = strtok(NULL,":");
    Serial.println(p);

}

void loop () {
}

Xtalker, it looks like avr-libc's string.h provides strtok_r (the re-entrant version) instead of strtok. See

http://www.mkssoftware.com/docs/man3/strtok_r.3.asp

Mikal

Aha! Thanks so much! The following code works as expected:

#include <string.h>

char record[] = "name:bob";   // writable array: strtok_r modifies the string in place
char *p, *i;

void setup() {

   Serial.begin(9600);
   Serial.println("Starting..");
   
    //  First strtok iteration
    p = strtok_r(record,":",&i);
    Serial.print(p);
    Serial.print(" = ");

    //  Second strtok iteration
    p = strtok_r(NULL,":",&i);
    Serial.print(p);
    Serial.println("");

}

void loop () {
}

Output:

Starting..
name = bob

OK, with that problem solved it's on to the next mystery!

I'd like the subStr function below to return the sub-string at a given (1-based) index, such that

subStr("this is a test", " ", 2)

would return "is"

This seems to work but not consistently. I suspect some problem with my string handling here but just not sure what.

Thanks for any help!

//   strtok test

#include <string.h>

#define MAX_STRING_LEN  20

char *record1 = "one two three";
char *record2 = "Hello there friend";
char *p, *i;

void setup() {

   Serial.begin(9600);
   
   Serial.println("Split record1: ");
   Serial.println(subStr(record1, ' ', 1));
   Serial.println(subStr(record1, ' ', 2));
   Serial.println(subStr(record1, ' ', 3));

   Serial.println("Split record2: ");
   Serial.println(subStr(record2, ' ', 1));
   Serial.println(subStr(record2, ' ', 2));
   Serial.println(subStr(record2, ' ', 3));
}

void loop () {
}

// Function to return a substring defined by a delimiter at an index
char* subStr (char* str, char delim, int index) {
   char *act, *sub, *ptr;
   char copy[MAX_STRING_LEN];
   int i;

   // Since strtok consumes the first arg, make a copy
   strcpy(copy, str);

   for (i = 1, act = copy; i <= index; i++, act = NULL) {
      //Serial.print(".");
      sub = strtok_r(act, &delim, &ptr);
      if (sub == NULL) break;
   }
   return sub;

}

output:

Split record1:
one
two
tQQQ
Split record2:
Hello
t
eQQQ

Hi Xtalker--

The problem with subStr is that you ultimately return a pointer (sub) that points into a buffer (copy) that has ceased to exist. Once subStr has exited, the stack space its local variables occupied is free to be reused by other calls. This is an insidious error, because the returned pointer points to data that was once (and may still be) valid.

A simple workaround would be to declare copy static. Or you could rewrite subStr to require that callers pass in their own "copy" buffer.

Good luck.

Mikal

Thanks mikalhart, that makes sense. I changed:

char copy[MAX_STRING_LEN];
to
static char copy[MAX_STRING_LEN];

The strange chars are gone from the output, but it's still not what I'd expect. There must be something else going on here?

Output:

Split record1:
one
two
t
Split record2:
Hello
t
ere

Thanks for your help on this!

Ah, I see. The "delim" parameter for strtok_r is supposed to be a pointer to a null-terminated string holding the set of delimiter characters, whereas you are passing a pointer to a single (non-terminated) character.

To fix, change subStr's second parameter to "char *delim", remove the "&" in the strtok_r call, and when you call subStr, use " " instead of ' '.

Do you see what's going on now?

Mikal
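
To make the difference concrete, here is a small sketch (not from the thread; countTokens is a made-up helper) that passes strtok_r a proper delimiter string. Because the argument is a set of characters, any one of them ends a token, and runs of consecutive delimiters are skipped. With &delim pointing at a lone char, strtok_r keeps reading past it until it happens to hit a zero byte, so whatever bytes follow in memory are most likely treated as delimiters too.

```c
#include <string.h>

// Hypothetical helper: counts the tokens in buf, splitting on any
// character found in the null-terminated string delims.
// Note: buf is modified in place (strtok_r writes '\0' over delimiters).
int countTokens(char *buf, const char *delims) {
   char *save;   // strtok_r keeps its position here between calls
   char *tok;
   int n = 0;

   for (tok = strtok_r(buf, delims, &save); tok != NULL;
        tok = strtok_r(NULL, delims, &save)) {
      n++;
   }
   return n;
}
```

For example, with "char line[] = "name:bob, age:42";" the call countTokens(line, ":, ") splits on colon, comma and space alike and finds four tokens: name, bob, age and 42.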

I think it's funny that we only get a "reentrant strtok()" on this platform, because without threads, reentrancy issues are exceedingly rare. Doing strtok() in ISRs and non-ISRs together is similarly unlikely.

If you want to develop safe string handling in C, you need to adopt the point of view that the caller owns the storage, the callee just manipulates it (and never goes outside the buffer limits expressed to it). No "static buffers" inside the string routines, just in the callers. It's possible to get the behavior right in other ways, but it's also possible to go bald from pulling out your hair.

And as mikalhart observes, carefully consider the differences between buffers (block of memory), strings (zero-terminated), and characters.

That was it! Not sure why I didn't realize this by looking at the ref pages! I guess this would allow you to define a string of several delimiters to tokenize with, pretty powerful.

The following now works as expected, thanks again!

//   strtok test

#include <string.h>

#define MAX_STRING_LEN  20

char *record1 = "one two three";
char *record2 = "Hello there friend";
char *p, *i;

void setup() {

   Serial.begin(9600);
   
   Serial.println("Split record1: ");
   Serial.println(subStr(record1, " ", 1));
   Serial.println(subStr(record1, " ", 2));
   Serial.println(subStr(record1, " ", 3));

   Serial.println("Split record2: ");
   Serial.println(subStr(record2, " ", 1));
   Serial.println(subStr(record2, " ", 2));
   Serial.println(subStr(record2, " ", 3));
}

void loop () {
}

// Function to return a substring defined by a delimiter at an index
char* subStr (char* str, char *delim, int index) {
   char *act, *sub, *ptr;
   static char copy[MAX_STRING_LEN];
   int i;

   // strtok_r modifies its first argument, so work on a bounded copy
   strncpy(copy, str, MAX_STRING_LEN - 1);
   copy[MAX_STRING_LEN - 1] = '\0';

   for (i = 1, act = copy; i <= index; i++, act = NULL) {
      //Serial.print(".");
      sub = strtok_r(act, delim, &ptr);
      if (sub == NULL) break;
   }
   return sub;

}

output:

Split record1:
one
two
three
Split record2:
Hello
there
friend

"I think it's funny that we only get a "reentrant strtok()" on this platform, because without threads, reentrancy issues are exceedingly rare."

I thought that was funny too, Halley. Even funnier is the fact that the solution I proposed defeats that re-entrancy protection by creating the static shared buffer in subStr. :)

@Xtalker: Halley is quite right in observing that next time you should move "copy" out of subStr and pass the caller-owned buffer in as a parameter to subStr. That way you could, for example, maintain buffers in two different places at once.

Mikal