How to recover char array modified/destroyed by strtok()

I want to use the strtok() function to split a comma-delimited char array.

char *strtok(char *str, const char *delim)

From what I have read and found by experimenting, strtok() needs to be able to write NUL characters ('\0') at the locations of the delimiters.

Since strtok() leaves a NUL character where each delimiter was, I wonder whether it would be possible to recover the original un-split array by overwriting each NUL character with the delimiter again after copying the required delimited strings out. Basically, I would like to preserve the original array as it is.

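ptr = strtok(array, ",");
// the first token needs no fix-up in front of it because the NUL is at its end
while (ptr != NULL)
{
	// Do something to extract the info using ptr
	//
	// Then put back the delimiter that strtok() replaced with NUL
	// (assumes tokens are separated by exactly one comma)
	if (ptr != array)
	{
		*(ptr - 1) = ',';  // <<<<< something like this
	}

	ptr = strtok(NULL, ",");
}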

Has anybody done something like this?

What are the pitfalls of doing this?

The best way is to copy the original string to another string and use strtok() on the copy.

char array[] = "1,76,2,0,19:31,- 6.0,E.-dorf Krkhaus,24,107,1,07601";

char tempChars[sizeof(array)];  // ***** will receive copy of array to preserve original array

char *strings[16]; // an array of pointers to the pieces of the above array after strtok()
char *ptr = NULL;

void setup()
{
   Serial.begin(9600);
   Serial.println(array);  // show the original before parsing
   byte index = 0;

   strcpy(tempChars, array); // **** copy array to preserve original

   ptr = strtok(tempChars, ",");  // comma is the only delimiter (some fields contain spaces)
   while (ptr != NULL && index < 16)  // stop before overrunning strings[]
   {
      strings[index] = ptr;
      index++;
      ptr = strtok(NULL, ",");
   }
   //Serial.println(index);
   // print all the parts
   Serial.println("The Pieces separated by strtok()");
   for (int n = 0; n < index; n++)
   {
      Serial.print("piece ");
      Serial.print(n);
      Serial.print(" = ");
      Serial.println(strings[n]);
   }

}

void loop()
{
   // step away, nothing to see here.
}

jk2021:
What are the pitfalls of doing this?

I would expect that strtok() would find the reinserted token on its next pass. Time for some research.
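On that point: the C standard describes strtok() as saving a pointer to the character after the one it overwrote, so a delimiter that is put back behind that position should not be seen again. A minimal sketch to check this, assuming a comma delimiter; tokenizeAndRestore and its handle callback are hypothetical names, not part of any library:

```cpp
#include <string.h>
#include <stddef.h>

// Hypothetical helper: tokenizes buf in place on ',' and puts each delimiter
// back once the token has been handled. Returns the number of tokens seen.
// handle may be NULL; it must not call strtok() itself (strtok is not reentrant).
size_t tokenizeAndRestore(char *buf, void (*handle)(const char *token))
{
  const size_t len = strlen(buf);      // measure before strtok() writes any NUL
  size_t count = 0;

  for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ","))
  {
    if (handle != NULL) handle(tok);
    count++;

    char *end = tok + strlen(tok);     // the NUL that ends this token
    if (end < buf + len) *end = ',';   // restore only a NUL that strtok() wrote,
                                       // never the string's own terminator
  }
  return count;
}
```

After the call, buf should compare equal to its original contents. The restore works from the end of the current token rather than from the start of the next one, so runs of consecutive commas are left intact as well.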

The question might be why you want to restore the original string.

You can achieve what you want using two pointers and strchr().

Little demo:

char input[100] = "Hello world, how are you?";

void setup()
{
  Serial.begin(57600);

  char *ptrBegin;
  char *ptrFound;

  char sep = ',';

  // set pointer to begin of input
  ptrBegin = input;
  do
  {
    // find separator
    ptrFound = strchr(ptrBegin, sep);

    // if found
    if(ptrFound != NULL)
    {
      // replace separator by terminating NUL character
      *ptrFound = '\0';
      // print
      Serial.print("Got: '"); Serial.print(ptrBegin); Serial.println("'");
      // restore separator
      *ptrFound = sep;
      // the next begin is the character after the separator that was just found
      ptrBegin = ptrFound;
      ptrBegin++;
    }
  } while (ptrFound != NULL);
  // print the last part
  Serial.print("Got: '"); Serial.print(ptrBegin); Serial.println("'");
  
  Serial.print("Original: '"); Serial.print(input); Serial.println("'");

}

void loop()
{

}

PS
Karma for using code tags in your first post

Thanks for all the answers.

I was considering strchr() but had not thought of using a second pointer.

That makes sense.

Will give it a try.

Just wanted to give a link to this lengthy post, which has some real insight into how strtok() works.

Hopefully it will be useful if someone stumbles upon this post while looking for strtok().

How does strtok() split the string into tokens in C?

I don't know if the link quite explains it. But basically it does what the code in reply #4 does; the difference is that strtok() does not restore the character.
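As an illustration of that difference, here is a simplified sketch of the kind of work strtok() does. It is not the library's actual source: myTok is a hypothetical name, and the saved position is passed explicitly (in the style of strtok_r) instead of being hidden in a static variable.

```cpp
#include <string.h>

// Hypothetical stand-in for strtok(). *save holds the position where the
// next call continues; pass str on the first call and NULL afterwards.
char *myTok(char *str, const char *delim, char **save)
{
  if (str == NULL) str = *save;           // continue where the last call stopped
  if (str == NULL) return NULL;

  str += strspn(str, delim);              // skip leading delimiters
  if (*str == '\0')                       // nothing left to tokenize
  {
    *save = str;
    return NULL;
  }

  char *end = str + strcspn(str, delim);  // first delimiter after the token
  if (*end != '\0')
  {
    *end = '\0';                          // overwrite the delimiter; it is never restored
    end++;                                // next search starts after it
  }
  *save = end;
  return str;
}
```

The only write to the buffer is the single `*end = '\0'`, which is why the original string appears truncated after tokenizing.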

I'm still curious why you want to keep the original? :wink:

I adapted the above example and managed to get the desired result.

I have one more related question:

I also used the following to split a key/value pair with strchr():

  char *pSplit = strchr(myKeyValPair, '=');
  char *pValue = pSplit + 1; // value starts at the character after '=' (assumes '=' was found)
  *pSplit = '\0';
  // Now the original pointer myKeyValPair yields only the left-hand part (the key) of the original string

When the function returns, will the release of memory be affected by the truncation of the original string myKeyValPair? (Hopefully not.)
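For what it is worth, writing a NUL into the middle of a buffer shortens the string that functions like strlen() see, but not the buffer itself. A minimal sketch, assuming the buffer came from malloc() (for a local or global array there is nothing to release by hand at all); keyLengthAfterSplit is a hypothetical name:

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical check: copy "key=value" into a heap buffer, split it at '=',
// and return the length of the key. The allocation itself is unchanged.
size_t keyLengthAfterSplit(const char *text)
{
  const size_t size = strlen(text) + 1;
  char *buf = (char *)malloc(size);
  if (buf == NULL) return 0;
  memcpy(buf, text, size);

  char *pSplit = strchr(buf, '=');
  if (pSplit != NULL) *pSplit = '\0';  // shortens the string, not the buffer

  const size_t keyLen = strlen(buf);
  free(buf);                           // releases the whole allocation, whatever it contains
  return keyLen;
}
```

free() works from the pointer it is given, not from the string contents, so the inserted NUL has no effect on how much memory is released.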

sterretje:
I'm still curious why you want to keep the original? :wink:

I have the original array stored in a structure and would need to reuse it again in a different iteration of the code.

You will have to show your complete function as well as how it is called.

I understand that you want to re-use it; but after parsing you have all the data available in variables, so in my opinion no need for the original anymore.

jk2021:
I have the original array stored in a structure and would need to reuse it again in a different iteration of the code.

As previously suggested, the obvious way to do this is to work on a copy of the original string.

sterretje:
You will have to show your complete function as well as how it is called.

I understand that you want to re-use it; but after parsing you have all the data available in variables, so in my opinion no need for the original anymore.

Actually, the above question relates to a different function, where I applied the same principle of replacing the delimiter with a NUL to extract key/value pairs.

Initially I did not restore the delimiter, and found that repeated calls to the function misbehaved.

Then I added a restore option at the end and now everything seems to be working fine.

The function will be called using a pointer to a character array which contains a string like "key=value".

void SetupGlobalPatternVariables(char* myKeyValPair){

  Serial.print("Received myKeyValPair: '"); Serial.print(myKeyValPair); Serial.println("'");

  char *pSplit = strchr(myKeyValPair, '=');
  if (pSplit == NULL) return; // no '=' in the input, nothing to split
  char *pValue = pSplit + 1;  // value starts at the character after '='
  Serial.print("pValue: '"); Serial.print(pValue); Serial.println("'");

  *pSplit = '\0'; // temporarily terminate the key

  Serial.print("myKeyValPair: '"); Serial.print(myKeyValPair); Serial.println("'");
  // myKeyValPair now reads as the key only. We keep the same variable name for simplicity

  if (strcmp(myKeyValPair, "color1") == 0) {
    Serial.print("passed color1: '"); Serial.print(pValue); Serial.println("'");  
    gColor1Value = pValue;
  } else if (strcmp(myKeyValPair, "color2") == 0) {
    Serial.print("passed color2: '"); Serial.print(pValue); Serial.println("'");
  } else if (strcmp(myKeyValPair, "fadeAmount") == 0) {
    Serial.print("passed fadeAmount: '"); Serial.print(pValue); Serial.println("'"); 
  } else if (strcmp(myKeyValPair, "loopDelay") == 0) {
    Serial.print("passed loopDelay: '"); Serial.print(pValue); Serial.println("'");    
  }

  *pSplit = '='; // restore the '=' that was replaced with NUL

}//SetupGlobalPatternVariables

With this, what I am trying to do is pass a character string containing plain text "key=value" via an HTTP request, then decode it and set up some global variables that will be picked up by the code currently executing in loop().
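An alternative that needs no restore step is to compare the key by length, so the input is never written to. A minimal sketch under that assumption; valueForKey is a hypothetical helper, not something from the code above:

```cpp
#include <string.h>
#include <stddef.h>

// Hypothetical helper: if pair has the form "key=value" with exactly the
// given key, returns a pointer to the value inside pair; otherwise NULL.
// pair is never modified, so there is nothing to restore afterwards.
const char *valueForKey(const char *pair, const char *key)
{
  const char *pSplit = strchr(pair, '=');
  if (pSplit == NULL) return NULL;                 // no '=' at all

  const size_t keyLen = (size_t)(pSplit - pair);   // characters before '='
  if (keyLen != strlen(key) || strncmp(pair, key, keyLen) != 0) return NULL;

  return pSplit + 1;                               // value starts after '='
}
```

Usage would look like `const char *v = valueForKey(myKeyValPair, "color1");` followed by a NULL check. The returned pointer still points into the caller's buffer, so the value should be copied if it has to outlive that buffer.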

I still don't understand why you want to parse the same text multiple times.

In this snippet of your code, you assign the received value to a variable gColor1Value. So you don't have to parse that again.

  if (strcmp(myKeyValPair, "color1") == 0) {
    Serial.print("passed color1: '"); Serial.print(pValue); Serial.println("'");  
    gColor1Value = pValue;

sterretje:
I still don't understand why you want to parse the same text multiple times.

In this snippet of your code, you assign the received value to a variable gColor1Value. So you don't have to parse that again.

Ok. Let me explain

  1. I have some patterns with custom parameters.

  2. The parameters have names and each pattern will have different number of parameters

  3. At the start of a new pattern it takes default values stored in PROGMEM

  4. The user interface (Web App) can display these names & values when the pattern is running

  5. User has a choice of changing these parameters in real time and the pattern will adopt them immediately

  6. All these work fine when coded in bits and pieces. Now I want to get them into one working setup

  7. Started by storing the pattern specific parameters as a string with multiple delimiters (analogy JSON)

  8. These strings will be stored in a structure in PROGMEM

  9. As each pattern is started, the default values will be picked from here and also displayed to the user with some user controls to change them. eg. color pickers to pick values for different color parameters etc

  10. For some parameter types Global Variables can be used (as shown in above snippet). However to make the system flexible I want to explore the possibility of making it data driven ie. the parameter string to be the driving force.

  11. So I am trying to see how efficiently I can handle all these by storing the configuration in PROGMEM in a single structure. Why I want to preserve the structure is that at the next iteration of the same pattern the values should be set to default values based on the contents of the structure.

  12. My idea was to manage everything with pointers to different string segments in the original structure. It is becoming a little more complicated than I expected.

First time you run a pattern, copy the parameters from PROGMEM to the structure. On successive runs of the pattern, just use the parameters in the structure. No need to parse again.

If you start another pattern for the first time, repeat above.

If you want to remember the settings of one pattern while running another one, use an array of structures (one element per pattern).
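The approach above can be sketched as follows. The field names are taken from the keys earlier in the thread, but PatternSettings, PATTERN_COUNT, loadDefaults, getSettings and the default values are all hypothetical. loadDefaults is only a stand-in for the real step of reading the defaults from PROGMEM and parsing them.

```cpp
#include <string.h>

const int PATTERN_COUNT = 3;      // hypothetical number of patterns

struct PatternSettings            // hypothetical layout, one per pattern
{
  bool loaded;                    // false until the defaults have been parsed
  char color1[8];
  int  fadeAmount;
  int  loopDelay;
};

PatternSettings settings[PATTERN_COUNT];  // global, so zero-initialized (loaded == false)

// Hypothetical stand-in for "copy the defaults from PROGMEM and parse them"
void loadDefaults(int pattern, PatternSettings &s)
{
  strcpy(s.color1, "ff0000");
  s.fadeAmount = 5 + pattern;
  s.loopDelay = 20;
  s.loaded = true;
}

// Returns the settings for a pattern, parsing the defaults only on first use.
// The caller must pass an index in the range 0..PATTERN_COUNT-1.
PatternSettings &getSettings(int pattern)
{
  PatternSettings &s = settings[pattern];
  if (!s.loaded) loadDefaults(pattern, s);
  return s;
}
```

Changes made by the user are written into the returned structure and survive a switch to another pattern. Clearing the loaded flag would bring back the defaults on the next run.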

If you are doing string processing, I suggest you check out my SafeString library (available from the library manager).
A detailed tutorial is here

It avoids all the coding errors associated with pointers and has extensive built-in error checking and error messages.
You can also easily 'wrap' a char array method argument in a SafeString for processing.
The SafeString strtoken() method does what you need without modifying the original string.
