Problems processing a character string

I’m processing data sent to my Arduino Uno that is manually entered via the com port. The data is stored in a character string and by using pointers, strtok(), strcmp(), atoi() and atof() I’m able to process the data (if entered “correctly”) and get the instructions I need. I’m now trying to handle cases where the data is entered by an unsympathetic user.

I check the first part of the string for specific sets of characters, and if I don’t get an “acceptable” value, all the data gets ignored and the user is prompted to re-enter the data. I welcome any thoughts on how to handle this any “better” than I already have.

Here is where I need the help.

In some cases, there is a second part to the string that is supposed to get converted to a 16 bit integer. I have not figured out how to handle the 2nd part of the string if its characters do not represent an integer when processed by atoi(). Basically, I want to ignore all strings whose value falls outside the 16 bit int range of -32768 to 32767. One possible way forward might be to “ignore” strings that contain non-numerical characters, but I’m stumped on how to handle cases where the numerical characters would fall outside the 16 bit int range, or where the first strtok() completely consumes the string.

In the code below, I’m trying to correct for the condition where the user incorrectly enters a single “B”, or a “B “, followed by nothing or something other than the character equivalent of a 16bit int.

const byte numChars = 32;  //could be as high as 64 bytes
char receivedChars[numChars];
char tempChars[numChars];        // temporary array for use when parsing
char messageFromPC[numChars] = {0};
int integerFromPC = 0;
bool newData = false;
void setup() {
    Serial.begin(9600);
    Serial.println("type 'LIST' for valid inputs format");
    delay (1000);  
}

/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

void loop() {
    checkSerial();
}

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

void checkSerial() {

    static byte ndx = 0;
    char endMarker = '\n';  // '\n' is for nonprintable newline, check bottom right of serial monitor and set to 'Newline' '\r' is for 'Carriage return'
    char readChar;

    while (Serial.available() > 0 && newData == false) {
        readChar = Serial.read();  // reads a character
            if (readChar != endMarker) {  
                receivedChars[ndx] = readChar;  // writes the character to an array
                ndx++;
                if (ndx >= numChars) {  // limits the string to 32 characters
                    ndx = numChars - 1;
                }
            }
            else {  // '\n' is the Enter key, and when pressed will terminate the string
                receivedChars[ndx] = '\0'; // strings must be always terminated by null character
                ndx = 0;
                newData = true;
            }
    }
    if (newData == true) {
        strcpy(tempChars, receivedChars); // copy string to preserve original because strtok() destroys the string it works on
        parseData();
        newData = false;
    }
}

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////

void parseData() {      // split the data into its parts

    char * strtokIndx; // this is used by strtok() as an index by declaring a pointer to a memory address
    strtokIndx = strtok(tempChars," "); // get the first part of string which is terminated by null character
    strcpy(messageFromPC, strtokIndx); // copy first part of string to messageFromPC

// put this strtok in an if loop so that it is only called if the correct message is received
   if (strcmp(messageFromPC, "LIST") == 0){
       Serial.println("LIST    = returns list of all commands to serial port"); 
       Serial.println("A       = returns something"); 
       Serial.println("B       = returns an integer and expects input format 'B 1000' "); 
   }
   else if (strcmp(messageFromPC, "A") == 0){// this ignores everything after the first string
       Fonzi();  
   }
   else if (strcmp(messageFromPC, "B") == 0){// Expects an int and NEEDS ERROR CORRECTION
       strtokIndx = strtok(NULL, " ");       // this continues where the previous call left off
       integerFromPC = atoi(strtokIndx);     // convert this part to an integer
       Serial.print("B = ");
       Serial.println(integerFromPC);
   }
   else{ //this should handle all cases where the first part of the string is invalid
      Serial.println("data entered is invalid, type 'LIST' for valid inputs");
      Serial.println();

   }
}
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

void Fonzi(){
    Serial.println("Aaaaaaaaaaaaaa!");
}

References: I have tried to absorb as much of this thread as possible, https://forum.arduino.cc/index.php?topic=396450.0 and I have borrowed heavily from example 5. Per recommendations from this forum, I have purchased a copy of “C Programming Language 2nd Edition” by Kernighan & Ritchie, but I am an impatient beast and it has not yet arrived.

I also found this link helpful to find the functions that will convert the data into other lengths of ints. https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/rtref/itoi.htm

You can use strtol() instead of atoi(). The setup() below is for demo purposes only; it reuses receivedChars from your sketch.

void setup()
{
  Serial.begin(57600);

  strcpy(receivedChars, "12345678aa");

  char *endptr;
  long val = strtol(receivedChars, &endptr, 10);
  if (endptr == receivedChars)
  {
    Serial.println("No digits found");
  }
  Serial.println(val);
  if (*endptr != '\0')
  {
    Serial.print("Non-digit found; after number you got: ");
    Serial.println(endptr);
  }


  Serial.println("type 'LIST' for valid inputs format");
  delay (1000);
}
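
Building on the strtol() demo above, here is a minimal sketch of how the checks could be combined with a 16 bit range test. parseInt16() is a hypothetical helper name, not something from the original sketch or from any library, and the limits are written out as literals on the assumption that the target range is -32768 to 32767.

```cpp
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper (not part of the original sketch).
// Returns true and stores the value in *out only if the whole string
// is a decimal number inside the 16 bit range -32768..32767.
bool parseInt16(const char *text, int16_t *out) {
  if (text == NULL || *text == '\0') {
    return false;                      // missing or empty token, e.g. a lone "B"
  }
  char *endptr;
  errno = 0;
  long val = strtol(text, &endptr, 10);
  if (endptr == text) {
    return false;                      // no digits were found at all
  }
  if (*endptr != '\0') {
    return false;                      // something non-numeric follows the number
  }
  if (errno == ERANGE) {
    return false;                      // value did not even fit in a long
  }
  if (val < -32768L || val > 32767L) {
    return false;                      // fits in a long, but not in 16 bits
  }
  *out = (int16_t)val;
  return true;
}
```

In parseData(), the "B" branch could pass the pointer returned by the second strtok() call straight to this function and print the "data entered is invalid" message whenever it returns false. Because a null pointer is rejected first, a lone "B" with nothing after it is handled as well.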

To me, this is a perfect case for a parser..

#include "lilParser.h"

 
enum commands {   noCommand,  // ALWAYS start with noCommand. Or something similar.
                  cmdA,       // The rest is up to you. help would be a good one. Have it list
                  cmdB,       // What the other commands are, how they work and what they do.
                  };          // Our list of commands.

lilParser   ourParser;        // The parser object.

void setup() {
   
   //Serial.begin(9600);
   Serial.begin(57600);
   
   // Link typed in strings to commands.
   // These can be any string without whitespace.
   ourParser.addCmd(cmdA,"A");   // Type "A"  for command A.
   ourParser.addCmd(cmdB,"B");   // Type "B"  for command B.
}


// Your loop where it parses out all your typings.
void loop(void) {

   char  inChar;
   int   command;
   
   if (Serial.available()) {                       // If serial has some data..
      inChar = Serial.read();                      // Read out a character.
      Serial.print(inChar);                        // If using development machine, echo the character.
      command = ourParser.addChar(inChar);         // Try parsing what we have.
      switch (command) {                           // Check the results.
         case noCommand : break;                   // Nothing to report, move along.
         case cmdA      : handleCmdA();   break;  // Run the handler for command A.
         case cmdB      : handleCmdB();   break;  // Run the handler for command B.
         default        : sendHelp();     break;   // No idea. Try again?
      }
   }
}



// ******* Command handlers *******



void handleCmdA() {
   
   Serial.println("Aaaaaaaaaaaaaa!");
}


void handleCmdB() {

   char* charBuff;                                
   int   value;                                    
   
   if (ourParser.numParams()) {              // If they typed in something past the command.
      charBuff = ourParser.getParamBuff();   // We get the first parameter, assume its the number.
      value = atoi(charBuff);                // convert this part to an integer
      free(charBuff);                        // Dump the parameter buffer ASAP.
      Serial.print("B = ");                  // Do the output thing.
      Serial.println(value);
   } else {
      Serial.println("Where is my number, huh?");
   }
}


void sendHelp() {
   
   Serial.println("LIST    = returns this list of all commands."); 
   Serial.println("A       = returns something"); 
   Serial.println("B       = returns an integer and expects input format 'B 1000' "); 
}

If you would like to try this way, install LC_baseTools from your friendly IDE library manager.

-jim lee

jimLee,

Thanks!

I'll need to look into this further. I'm running multiple steppers in my ultimate program and I'm trying to avoid blocking functions. That being said, this looks useful and worthy of further exploration.

The use of atoi() has a huge disadvantage: it silently stops at the first non-numeric character, and it returns 0 for a 'non-value', which is indistinguishable from a real 0. I usually just write my own version and use that. Mine returns -1 on error (any non-numeric character, or a value out of bounds, which in my case is 0 - 255). This is one of the easiest functions to write yourself. No need for anything complicated: just test every character to see if it is within '0' - '9' (possibly allowing a '-' as the first character). If not, return the error value; otherwise multiply the accumulated value by 10 and add character - '0'. When you find the null terminator or another marker, return the value.
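
A minimal sketch of the hand-written conversion described above, assuming the 0 - 255 range and the -1 error value mentioned in the post. parseByteValue() is a hypothetical name, and this version does not accept a leading '-', since negative values are out of bounds for that range anyway.

```cpp
#include <stddef.h>

// Hypothetical helper: converts a string of decimal digits to 0..255.
// Returns -1 if the string is empty, contains any non-digit character,
// or represents a value larger than 255.
int parseByteValue(const char *s) {
  if (s == NULL || *s == '\0') {
    return -1;                         // nothing to convert
  }
  int value = 0;
  for (; *s != '\0'; s++) {
    if (*s < '0' || *s > '9') {
      return -1;                       // non-numeric character
    }
    value = value * 10 + (*s - '0');   // shift previous digits, add the new one
    if (value > 255) {
      return -1;                       // out of bounds; stop before int can overflow
    }
  }
  return value;
}
```

Checking the bound inside the loop keeps the running value small, so it cannot overflow even where int is only 16 bits wide, as on the Uno. For the full -32768 to 32767 range the same loop could be adapted with a long accumulator and a sign flag.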

jimLee,

Your parser works great so far, thanks! I still need to get under the hood a little bit further. FYI, it works fine on a Teensy 4.0, but seems to crap out on an Arduino Nano Every.

My C Programming book arrived and is super helpful regarding my original direction.

Oh, and my wife grew up on Anacortes, I know it well!

Deva_Rishi,

I'm going down the rabbit hole you suggested and am making some progress. Using strlen() and isdigit() functions is proving to be helpful.

I'm now also stuck on the case where the user enters a carriage return with no data before it. This crashes the program I originally posted.

underconstrained:
I’m now also stuck on the case where the user enters a carriage return with no data before it.

Always terminate the string as it comes in.

receivedChars[ndx++] = readChar;  // writes the character to an array
receivedChars[ndx] = '\0';        // keeps the string terminated at all times

… but make sure there’s enough room in your string…

TheMemberFormerlyKnownAsAWOL,

I thought the original code I posted does precisely what you mentioned, while also including a check to make sure the string is not too long. I'm still not sure what is breaking the code when the carriage return is entered with no data preceding it.

No, I don’t think I would have posted what I did, if you had done as you said you had.

You should always ensure there is space for your terminator.
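
One way to reserve that space, as a minimal sketch: refuse to store a character unless a slot for the '\0' remains after it. The names numChars, receivedChars and ndx follow the original sketch; appendChar() is a hypothetical helper, and the buffer is made global here only to keep the example short.

```cpp
#include <stddef.h>

const size_t numChars = 32;
char receivedChars[numChars] = {0};   // starts out as a valid empty string
size_t ndx = 0;

// Hypothetical helper: stores c and re-terminates the string.
// Returns false, and stores nothing, when only the terminator slot is left.
bool appendChar(char c) {
  if (ndx >= numChars - 1) {
    return false;                     // full: the last slot is kept for '\0'
  }
  receivedChars[ndx++] = c;
  receivedChars[ndx] = '\0';          // string is valid after every character
  return true;
}
```

With this approach the buffer is a properly terminated string at every moment, including before any character has arrived, so an early end marker simply leaves a valid empty string.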

Got it. If I understand correctly, my checkSerial() allows receivedChars to end up as a string of zero length (receivedChars[0] is '\0'). Honestly, I was just following example 4 from https://forum.arduino.cc/index.php?topic=396450.0

strtok() appears to crap out when it sees a string of zero length. I'll fix checkSerial() to prevent the creation of a zero length string.

In the meantime, I just nested the offending strtok() inside an if statement that checks whether the string length is != 0.

Cheers,

underconstrained:
strtok() appears to crap out when it sees a string of zero length. I'll fix checkSerial() to prevent the creation of a zero length string.

What do you mean by "crap out"? strtok() should return a null pointer in that case.

@underconstrained

Your wife grew up here?! Small world! You know the old falling down bong shop with the dragon on the side? My workshop is in the back of that old building.

As for the parser, it was originally written for a Teensy 3.2. Never had an issue running it on UNOs before. I would be interested if you could let me know what you find. It's no good if it doesn't behave well on them. I do have an old Nano in the shop..

What part of the planet did you guys end up on?

-jim lee

Check out my tutorial Arduino Serial I/O for the Real World
That includes non-blocking readers and string parsing functions that won't crash your program.

christop:
What do you mean by "crap out"? strtok() should return a null pointer in that case.

On an ESP strtok() may have unexpected results.

Here is a safer version using Strings (that does not have any memory fragmentation).

String receiveStr;
int integerFromPC = 0;

void setup() {
  Serial.begin(9600);
  for (int i = 10; i > 0; i--) {
    Serial.print(i); Serial.print(' ');
    delay(500);
  }
  Serial.println();
  Serial.println("type 'LIST' for valid inputs format");
  // receiveStr.reserve(80); // prevent heap fragmentation not really necessary here but good practice
}

void loop() {
  if (checkSerial()) {
  //  Serial.println(receiveStr); // echo input
    parseData();
    receiveStr = ""; // finished with this input
  }
}

bool checkSerial() {
  char endMarker = '\n';  // '\n' is for nonprintable newline,
  //check bottom right of serial monitor and set to 'Newline' or Both NL and CR
  // '\r' is for 'Carriage return'
  while (Serial.available() > 0) {
    char c = Serial.read();
    if ( c == endMarker) {
      return true;
    }
    receiveStr += c;
  }
  return false;
}

void parseData() {      // split the data into its parts
  int startIdx, endIdx;
  receiveStr.trim(); // removes any leading spaces and any trailing \r
  if (receiveStr.length() == 0) {
    // empty line skip
    return;
  }
  receiveStr += ' '; // for final indexOf;
  receiveStr.toUpperCase();
  startIdx = 0;
  endIdx = receiveStr.indexOf(" ", startIdx);

  String messageFromPC = receiveStr.substring(startIdx, endIdx);

  if (messageFromPC == "LIST") {
    Serial.println("LIST    = returns list of all commands to serial port");
    Serial.println("A       = returns something");
    Serial.println("B       = returns an integer and expects input format 'B 1000' ");
  }
  else if (messageFromPC == "A") {// this ignores everything after the first string
    Fonzi();
  }
  else if (messageFromPC ==  "B") { // Expects an int and NEEDS ERROR CORRECTION
    startIdx = endIdx + 1; // for next loop step over ' '
    endIdx = receiveStr.indexOf(" ", startIdx);
    String intStr = receiveStr.substring(startIdx, endIdx);
    integerFromPC = intStr.toInt();     // convert this part to an integer
    // returns 0 if the int is missing,  use SafeString toInt() for a better method
    Serial.print("B = ");
    Serial.println(integerFromPC);
  }
  else { //this should handle all cases where the first part of the string is invalid
    Serial.println("data entered is invalid, type 'LIST' for valid inputs");
    Serial.println();
  }
}
/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

void Fonzi() {
  Serial.println("Aaaaaaaaaaaaaa!");
}

My SafeString library has a much more robust toInt() method that returns false if it fails.

Deva_Rishi:
On an ESP strtok() may have unexpected results.

That's still very vague. What results do you get? Does it return a pointer to the string "banana"? That would be unexpected. (And if strtok() doesn't return a null pointer when it's given an empty string, then it would be a good idea to report that bug to the ESP developers.)

And if strtok() doesn't return a null pointer when it's given an empty string, then it would be a good idea to report that bug to the ESP developers.

The problem as I understand it is that strtok() returns a null pointer, i.e. a pointer with the value of 0 and therefore points to memory address 0. That is OK in the avr architecture, but it creates an error on the esp8266. I don't know about the esp32.

See the discussion here

cattledog:
The problem as I understand it is that strtok() returns a null pointer, i.e. a pointer with the value of 0 and therefore points to memory address 0. That is OK in the avr architecture, but it creates an error on the esp8266. I don't know about the esp32.

Ah, then it's actually a bug in the application and not in the strtok() function itself. The application should be written to detect a null pointer because strtok() is expected to return a null pointer when a string is zero length (or, to be more general, has no tokens).
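
A minimal sketch of such a check, for the first token of the line. firstToken() is a hypothetical helper, not part of the original sketch; it also refuses tokens that would not fit in the destination buffer, which is an extra precaution beyond what the thread discusses.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copies the first space-delimited token of line
// into dest. Returns false if there is no token (empty line or only
// spaces) or if the token does not fit in dest. Note that strtok()
// modifies line, so pass it a scratch copy such as tempChars.
bool firstToken(char *line, char *dest, size_t destSize) {
  char *tok = strtok(line, " ");
  if (tok == NULL) {
    return false;                     // no tokens: never dereference this pointer
  }
  if (strlen(tok) >= destSize) {
    return false;                     // token plus '\0' would overflow dest
  }
  strcpy(dest, tok);
  return true;
}
```

In the original parseData(), the unconditional strcpy(messageFromPC, strtokIndx) is the call that receives the null pointer when the line is empty. The same test applies to the second strtok() call in the "B" branch before its result is handed to atoi().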

Use my SafeString library, available from the library manager (detailed tutorial here).
Its stoken() method does not have those types of 'bugs' and won't make your sketch crash.

SafeString also has detailed error messages that help you find other logic issues in your string parsing