String Handling Problem

Hi everyone, I am new to the Arduino community and I'm in need of a little help, I Have tried to implement a function which splits an incoming string into 5 arguments using the strtok_r function, the problem I am facing, when I get the first argument or the command, it separates fine using an ' ' (space) as the delimiter then on the second call to strtok_r it first checks the next character in the original array to see if its a '"' (quote) or not, if its a quote it sets the delimiter as a quote and calls strtok_r, and places the quoted string into 1 argument, then i change the delimiter back to a space and before calling strtok_r again it checks if the next character is an quote and if true does the same again except it does not work as expected on the third call, from testing it has either output the third argument including the space if not quoted and with the quote if it was enclosed in quotes,
heres the code any help is much appreciated.

#include <SPI.h>
#include <SdFat.h>
#include <SdStream.h>
#include <SdFatUtil.h>
#include <SdBaseFile.h>
#include <SdFatmainpage.h>
#include <SdFatStructs.h>
#include <bufstream.h>
#include <SdFatConfig.h>
#include <SdInfo.h>
#include <ios.h>
#include <istream.h>
#include <ostream.h>
#include <SdFile.h>
#include <ArduinoStream.h>
#include <SdVolume.h>
#include <Sd2PinMap.h>
#include <Sd2Card.h>
#include <iostream.h>
#include <string.h>

#define SD_CS 53
#define SD_SS 4

unsigned short int  serIn;
char serInString[100];
unsigned short int i;
unsigned short int  serInIndx  = 0;
unsigned short int  serOutIndx = 0;
char *str1, *cmd, *arg1, *arg2, *arg3, *arg4;
unsigned short int cmdLen = 0, arg1Len = 0, arg2Len = 0, arg3Len = 0, arg4Len = 0;
String cmdStr, arg1Str, arg2Str, arg3Str, arg4Str;
unsigned short int argc = 0;
char copiedstr[100];
char *sscpy;
unsigned short int copiedstrIndx;

Sd2Card  card;
SdVolume volume;
SdFile   root;
SdFile   file;

void setup() {
  pinMode(SD_CS, OUTPUT);
  Serial.begin(57600);
  Serial.println("Hello World");
  if(!card.init(SPI_FULL_SPEED, SD_SS))
  {
    Serial.println("SD Card Initialisation Failed");
    return;
  }

}

void readSerialString ()
{
  argc = 0;
  int sb;   
  if(Serial.available()) {
    sb = Serial.read();
    if(sb == '@')
    {
      while (Serial.available())
      { 
        sb = Serial.read();        
        serInString[serInIndx] = sb;
        serInIndx++;
      }
      for(copiedstrIndx = 0; copiedstrIndx < serInIndx; copiedstrIndx++)
      {
        copiedstr[copiedstrIndx] = serInString[copiedstrIndx];
      }
      sscpy = copiedstr;
    }
  }
}

void findArguments(char *sstring)
{
  char delim = ' ';
  char *saveptr1, *saveptr2, *saveptr3, *saveptr4, *saveptr5;
  str1 = sstring;
  cmd = strtok_r(str1, &delim, &saveptr1);
  cmdStr = String(cmd);
  cmdLen = cmdStr.length();
  if(cmd != NULL)
  {
    argc++;
  }
  str1 = saveptr1;
  if(serInString[cmdLen+1] == '"')
  {
    delim = '"';
  }
  arg1 = strtok_r(str1, &delim, &saveptr2);
  arg1Str = String(arg1);
  arg1Len = arg1Str.length();
  if(arg1 != NULL)
  {
    argc++;
  }
  str1 = saveptr2;
  delim = ' ';
  if(serInString[cmdLen+1+arg1Len+2] == '"')
  {
    delim = '"';
  }
  arg2 = strtok_r(str1, &delim, &saveptr3);
  arg2Str = String(arg2);
  arg2Len = arg2Str.length();
  if(arg2 != NULL)
  {
    argc++;
  }
  str1 = saveptr3;
  delim = ' ';
  if(serInString[cmdLen+1+arg1Len+2+arg2Len+2] == '"')
  {
    delim = '"';
  }
  arg3 = strtok_r(str1, &delim, &saveptr4);
  arg3Str = String(arg3);
  arg3Len = arg3Str.length();
  if(arg3 != NULL)
  {
    argc++;
  }
  str1 = saveptr4;
  delim = ' ';
  if(serInString[cmdLen+1+arg1Len+2+arg2Len+2+arg3Len+2] == '"')
  {
    delim = '"';
  }
  arg4 = strtok_r(str1, &delim, &saveptr5);
  arg4Str = String(arg4);
  arg4Len = arg4Str.length();
  if(arg4 != NULL)
  {
    argc++;
  }
}

void printSerialString() {
  if( serInIndx > 0) {
    Serial.print("you said: ");
    for(serOutIndx=0; serOutIndx < serInIndx; serOutIndx++) {
      Serial.print( serInString[serOutIndx] );
    }        
    serOutIndx = 0;
    serInIndx  = 0;
    Serial.println();

    str1 = NULL;
    cmd = NULL;
    arg1 = NULL;
    arg2 = NULL;
    arg3 = NULL;
    arg4 = NULL;
  }
}

void loop () {
  readSerialString();
  if(serInIndx > 0)
  {
    findArguments(sscpy);
  }
  if(argc > 0)
  {
    Serial.println(cmdStr);
    Serial.println(cmdLen);
    Serial.println(arg1Str);
    Serial.println(arg1Len);
    Serial.println(arg2Str);
    Serial.println(arg2Len);

    argc = 0;
  }
  printSerialString();
  delay(1000);
}

That is very confusing. However in case you don't realise, strtok modifies the original string. So if you are planning to run it multiple times on the same string (which you appear to be doing) it almost certainly won't work in the way you expect.

You may find regular expressions a bit easier to use:

Thank you for your quick response Nick, I am aware that it modifies the string passed to it and i have made a copy of the original string in the "readSerialString" function with the for loop then point to the copied string with *sscpy i understand it is confusing and i should have commented it, my bad, I will take a look at your link and see what i can come up with.

Mark.

andreotti09:
i have made a copy of the original string in the "readSerialString" function with the for loop then point to the copied string with *sscpy ...

You made a copy of the pointer you mean, it isn't exactly the same thing.

Thank's for the lib Nick, it should do the trick.

Mark.

I could still be wrong here, iv'e only been learning C for around 3 months, but i copied it into another char array first then pointed to that as strtok_r takes a pointer for its first argument? char copiedstr[100];, or have I misunderstood something, I just read up a little on regex's and the search and match functions are a little out of context, maybe I was a bit hasty when i quoted "should do the trick" XD is the only documentation available included in the link you posted?

I Have tried to implement a function which splits an incoming string into 5 arguments using the strtok_r function

The strtok_r() function is the thread safe version of the strtok() function. It is much more complicated to use than the strtok() function.

On a system that can not execute multiple threads, why do you want/need the added complexity/code size of the thread-safe version?

Use the proper function!

andreotti09:
I could still be wrong here, iv'e only been learning C for around 3 months, but i copied it into another char array first then pointed to that as strtok_r takes a pointer for its first argument? char copiedstr[100];, or have I misunderstood something,

I don't think you are using strtok_r the right way. The third argument is supposed to be:

s3 : Is a value-return parameter used by strtok_r() to record its progress through s1.

You aren't using it like that.

  char *saveptr1, *saveptr2, *saveptr3, *saveptr4, *saveptr5;
  str1 = sstring;
  cmd = strtok_r(str1, &delim, &saveptr1);
 ...
  arg1 = strtok_r(str1, &delim, &saveptr2);
...
  str1 = saveptr2;
  arg2 = strtok_r(str1, &delim, &saveptr3);
...
  str1 = saveptr3;
...
  arg3 = strtok_r(str1, &delim, &saveptr4);
...
  str1 = saveptr4;
  arg4 = strtok_r(str1, &delim, &saveptr5);

According to the docs:

To get the first token from s1, strtok_r() is called with s1 as its first parameter. Remaining tokens from s1 are obtained by calling strtok_r() with a null pointer for the first parameter.

You aren't calling strtok_r with NULL as the first parameter. You are somehow combining both ideas.

I just read up a little on regex's and the search and match functions are a little out of context, maybe I was a bit hasty when i quoted "should do the trick" XD is the only documentation available included in the link you posted?

There's lots of documentation on regular expressions, and examples in the library. Perhaps if you showed an example of the incoming string, and what you expect to extract from it (by example). I found your description:

... it first checks the next character in the original array to see if its a '"' (quote) or not, if its a quote it sets the delimiter as a quote and calls strtok_r, and places the quoted string into 1 argument, then i change the delimiter back to a space and before calling strtok_r again it checks if the next character is an quote ...

... confusing. I can't visualize what you are passing in, and what you expect to get out.

It worked as I expected until I added the if statement, what I want is to split the words into a separate variable as to obtain arguments like argv[0]-argv[4] and that worked but then I wanted to check if there was a quote and if so put whatever was between the quotes into argv[1] it worked up until this point, but when I try and check for the quote with if statement after this point it does not work as expected, the output of the above (modified) code with this input '@echo "hello world" "this string"' to the serial is


Hello World
echo
4
hello world
11
"this
5
you said: echo "hello world" "this string"

and with the input "@echo "hello world" this string" I get

Hello World
echo
4
hello world
11
this
4
you said: echo "hello world" this string

I don't think strtok will work easily for you, nor in this case will regular expressions without some fiddling around.

What you can do is "manually" parse the string, it isn't that hard.

Something like:

  1. Skip leading spaces.

  2. If we have a quote:
    a. bypass it (add 1 to pointer)
    b. remember where we are
    c. scan forwards until we get the same quote again
    d. The "word" is what was between the two quotes
    e. Go back to 1

  3. No quote:
    a. Remember where we are
    b. Scan forwards until we get a space
    d. The "word" is what was between the two spaces
    e. Go back to 1

You could probably write that in less lines than all the strtok stuff.

Yes you're right Nick, I figured why strtok_r was not working for me, it was because just before the third call to strtok_r I set the delimiter back to the space then I check if it is a quote and if so change the delimiter back to a quote therefore the next argument is a space when quotes are present in arg1 and arg2, its catch 22 so I'll have a try at the loop approach that you mentioned, your help is appreciated.

 pinMode(SD_CS, OUTPUT);
  Serial.begin(57600);
  Serial.println("Hello World");
  if(!card.init(SPI_FULL_SPEED, SD_SS))
  {
    Serial.println("SD Card Initialisation Failed");
    return;

The above is not correct.........

#include <pins_arduino.h>
// pin 4 is the SPI select pin for the SDcard
const int SD_CS = 4; 
// pin 10 is the SPI select pin for the Ethernet
const int ETHER_CS = 10;

pinMode(SS_PIN, OUTPUT);      // set the SS pin as an output (necessary to keep the board as master and not SPI slave)
digitalWrite(SS_PIN, HIGH);   // and ensure SS is high

// Ensure we are in a consistent state after power-up or a reset button
// These pins are standard for the Arduino w5100 Rev 3 ethernet board
// They may need to be re-jigged for different boards
  pinMode(ETHER_CS, OUTPUT); 		// Set the CS pin as an output
  digitalWrite(ETHER_CS, HIGH); 	// Turn off the W5100 chip! (wait for configuration)
  pinMode(SD_CS, OUTPUT);           // Set the SDcard CS pin as an output
  digitalWrite(SD_CS, HIGH); 		// Turn off the SD card! (wait for configuration)

  // initialize the SD card.
  Serial << F("Setting up SD card...\n");
  if (!card.init(SPI_FULL_SPEED, SD_CS)) // Pass over the speed and Chip select for the SD card
 {

The above is 100% correct.

Based on your variable names , you are switching the Arduino in and out of SPI master/slave mode, rather than switching the SD on & off.

HC