Converting data types - Guidance please

Apologies for a 2nd posting. This time I hope the subject may get wider attention.

I have files that I read from an SD card using the SD.h library. Each line has 10 fields: an ID, a data length number (always 8) and 8 fields of data payload.
Here is an example: 89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A

All of the fields carry Hex data but they are treated by the SD as char types.

I can read the SD one line of data at a time and store each field in an array, but my requirement is to load the data into a predefined STRUCT so I can then send it to a CAN bus interface.

The characteristics of the STRUCT are:
ID = 32bit unsigned
Data length = 8bit unsigned
Data = 8 x 8bit unsigned

I would appreciate guidance as to how to convert the char/string data to load into this STRUCT.

I have tried various methods using str-to-int style conversion but they do not allow handling the Hex codes.

Thanks in advance

You can specify the radix as the third parameter of strtoul()...

e.g.

const char hexstring[] = "89F80173";
void setup() {
  Serial.begin(9600);
  unsigned long value = strtoul(hexstring, NULL, 16);
  Serial.print(value);
}
void loop() { }

This sounds like TLVs (type-length-value records).

First read the 2-byte type and length; then, depending on the type, read the rest of the data into the struct for that type.

you could take an approach like this

struct t_data {
  uint32_t  id;
  uint8_t   len;
  uint8_t   payload[8];
};

char lineToParse[] = "89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A";

bool parseLine(const char* line, t_data& data) {
  Serial.print(F("\nParsing "));  Serial.println(line);
  char * tmpPtr = nullptr;
  data.id = strtoul(line, &tmpPtr, 16);
  data.len = strtoul(tmpPtr + 1, &tmpPtr, 16); // +1 to skip the comma
  for (byte i = 0; i < max(data.len, sizeof(data.payload)); i++) { // think twice about this line... :) 
    data.payload[i] = strtoul(tmpPtr + 1, &tmpPtr, 16);
  }
  return true;
}

void printData(t_data& data) {
  Serial.print(F("ID = 0x")); Serial.println(data.id, HEX);
  Serial.print(F("Length = ")); Serial.println(data.len);
  Serial.print(F("Payload = "));
  for (byte i = 0; i < data.len; i++) {
    if (data.payload[i] < 0x10) Serial.write('0');
    Serial.print(data.payload[i], HEX);
    Serial.write(' ');
  }
  Serial.println();
}

void setup() {
  t_data values;
  Serial.begin(115200);
  if (parseLine(lineToParse, values)) printData(values);
  else {
    Serial.print(F("Parsing error for "));
    Serial.println(lineToParse);
  }
}

void loop() {}

You would need to build more intelligence into the parseLine() function to catch parsing errors (check the value range, check whether tmpPtr actually advanced past the start of the field, etc.); here I assume you always have a well-formed line to parse.
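As a rough illustration of the extra checks mentioned above, here is a minimal sketch written as plain C++ so it can also be tried off the board. The names parseHexField() and parseLineChecked() are made up for this example and are not part of the original code. It assumes fields are separated by a comma with optional spaces, and it rejects a length larger than the payload array. Note that strtoul() itself also accepts a leading sign or a "0x" prefix, which this sketch does not filter out, and on boards where unsigned long is 32 bits an ID with too many digits is not detected by the range check.

```cpp
#include <stdint.h>
#include <stdlib.h>

struct t_data {
  uint32_t id;
  uint8_t  len;
  uint8_t  payload[8];
};

// Hypothetical helper (not from the original post): parse one hex field that
// must not exceed maxValue, then step over an optional trailing comma.
static bool parseHexField(const char*& cursor, unsigned long maxValue, unsigned long& out) {
  char* end = nullptr;
  unsigned long value = strtoul(cursor, &end, 16);
  if (end == cursor) return false;     // no hex digits were consumed
  if (value > maxValue) return false;  // does not fit in the target field
  while (*end == ' ') end++;           // tolerate spaces before the comma
  if (*end == ',') end++;              // step over the separator
  cursor = end;
  out = value;
  return true;
}

// Hypothetical checked variant of parseLine(): returns false on a malformed line.
bool parseLineChecked(const char* line, t_data& data) {
  const char* cursor = line;
  unsigned long v = 0;
  if (!parseHexField(cursor, 0xFFFFFFFFUL, v)) return false;
  data.id = (uint32_t)v;
  if (!parseHexField(cursor, sizeof(data.payload), v)) return false;  // len must be 0..8
  data.len = (uint8_t)v;
  for (uint8_t i = 0; i < data.len; i++) {
    if (!parseHexField(cursor, 0xFF, v)) return false;  // each payload item is one byte
    data.payload[i] = (uint8_t)v;
  }
  return true;
}
```

It can be called the same way as parseLine(), for example parseLineChecked(lineToParse, values), and the else branch in setup() then actually gets a chance to run.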

Chris & gcjr, tks for the tips. I'll experiment.
J-M-L thanks for taking the time to provide the code. I am going to try to integrate it, it looks like it could suit my project.
Ray

You have a data frame containing ten ASCII-formatted items. The sketch below shows one way of obtaining the numerical hex value of the ID (89F80173 in your example) from the ASCII-formatted codes and saving it in a structure. You may apply the same approach to extract the other items and save them in the structure. (Add a comma (,) as the last character of the array in which you are saving the frame, so that the same loop can be applied to every item.)

struct myData      //data structure from Post#4 of @J-M-L
{
  uint32_t  id;
  uint8_t   len;
  uint8_t   payload[8];
};

char myString[] = "89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A,";  //insert , as the last
char myId[10];
unsigned long x = 0;
myData data;

void setup()
{
  Serial.begin(9600);
  int i = 0;
  do
  {
    myId[i] = myString[i];
    i++;
  }
  while (myId[i - 1] != ',');
  myId[i - 1] = '\0';

  for (int i = 0; i < 8; i++) //89F80173
  {
    if (myId[i] <= '9')
    {
      x = x << 4;
      x += (myId[i] - '0');   
    }
    else
    {
      x = x << 4;
      x += (myId[i] - 0x37);
    }
  }
  data.id = x;
  Serial.println(data.id, HEX);  //shows: 89F80173

}

void loop()
{

}

Why do you need to first extract the ID substring into the myId C string? You could just parse it directly, as you do, by iterating over the input line until you find the comma. That would also avoid issues when the ID is shorter than 8 hex digits because there is no 0 padding in front of it.

0x37 is a magic value that makes the code hard to read; just use ('A' - 10), which is self-explanatory and also shows that you only handle capital letters.
Some robustness could be added by using toupper() and verifying that each character is indeed a legitimate hex digit and that you have at most 8 of them.

You don't need x; you could build the value straight into data.id.
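Put together, those suggestions might look something like the sketch below. It is only an illustration: parseHexId() is a made-up name, and the choice to return the number of digits consumed (0 meaning "no digits, or more than 8") is an assumption of this example. The caller would still check that the character following the digits is the expected comma.

```cpp
#include <ctype.h>
#include <stdint.h>

// Hypothetical helper: reads up to 8 hex digits from the start of `line`
// straight into `id`. Returns the number of digits consumed, or 0 if there
// were no hex digits or more than 8 of them.
uint8_t parseHexId(const char* line, uint32_t& id) {
  uint8_t digits = 0;
  id = 0;
  while (isxdigit((unsigned char)line[digits])) {
    if (digits == 8) return 0;  // a ninth digit would not fit in 32 bits
    char c = (char)toupper((unsigned char)line[digits]);
    uint8_t nibble = (c <= '9') ? (uint8_t)(c - '0') : (uint8_t)(c - ('A' - 10));
    id = (id << 4) | nibble;    // shift in 4 bits per hex digit
    digits++;
  }
  return digits;                // line[digits] is the first non-hex character
}
```

With the line from this thread, parseHexId(myString, data.id) consumes 8 digits and leaves myString[8] as the comma; an unpadded ID such as "1f, 8" works as well because the loop stops at the first non-hex character.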

Thank you for all the valuable inputs/comments. I have played/stumbled around all day experimenting with your ideas... BUT... I now realise my first description of the data from the SD as 'char' types is WRONG. Sorry, I just don't know enough!
I was reading the SD line by line using 'readStringUntil', so the result is placed in a String.
All of the suggestions so far have started with the data in 'char', so the process is different.
Is there a way to convert from a 'String' to the STRUCT?
Meantime I am looking at reading the data char by char into a 'char' variable/array.
Comments/suggestions welcome.
Ray

Assuming the String (capital S) is called lineToParse, you can use my exact parse function by calling it this way:

if (parseLine(lineToParse.c_str(), values)) printData(values);

the c_str() method gives you access to the underlying char * buffer

the code to test:

struct t_data {
  uint32_t  id;
  uint8_t   len;
  uint8_t   payload[8];
};

String lineToParse = "89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A";

bool parseLine(const char* line, t_data& data) {
  Serial.print(F("\nParsing "));  Serial.println(line);
  char * tmpPtr = nullptr;
  data.id = strtoul(line, &tmpPtr, 16);
  data.len = strtoul(tmpPtr + 1, &tmpPtr, 16); // +1 to skip the comma
  for (byte i = 0; i < max(data.len, sizeof(data.payload)); i++) { // think twice about this line... :) 
    data.payload[i] = strtoul(tmpPtr + 1, &tmpPtr, 16);
  }
  return true;
}

void printData(t_data& data) {
  Serial.print(F("ID = 0x")); Serial.println(data.id, HEX);
  Serial.print(F("Length = ")); Serial.println(data.len);
  Serial.print(F("Payload = "));
  for (byte i = 0; i < data.len; i++) {
    if (data.payload[i] < 0x10) Serial.write('0');
    Serial.print(data.payload[i], HEX);
    Serial.write(' ');
  }
  Serial.println();
}

void setup() {
  t_data values;
  Serial.begin(115200);
  if (parseLine(lineToParse.c_str(), values)) printData(values);
  else {
    Serial.print(F("Parsing error for "));
    Serial.println(lineToParse);
  }
}

void loop() {}

Thank you so much for your inputs, which always add value to my knowledge base.

I was motivated to post these discrete lines, which extract a numerical hex value from ASCII, to demonstrate that there is no function similar to atol(), which works only if the ASCII codes are limited to the digits 0 - 9.

Have you looked at my code? I'm using strtoul(), which is better than atol() because you have ways to catch parsing errors and you can handle different numerical bases.

I plan to study and experiment with your code.

@OP
1. I will now explain (referring to @J-M-L's code) the working principle and application of the strtoul() function, which @J-M-L has applied in his sketches in this thread to extract the numerical hex values from your ASCII-formatted data and save them into a structure.

2. Note that you are reading ASCII-formatted data bytes from the SD card, which could easily be saved in a C string using the following command:

byte m =  myFile.readBytesUntil(arg1, arg2, arg3);

3. Working Principle of strtoul() Function
Given:

char *tmpPtr = nullptr;
char lineToParse[] = "89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A";
unsigned long x = strtoul(lineToParse, &tmpPtr, 16);
Serial.println(x, HEX); //shows: 89F80173

(1) The function will parse the ASCII codes for the digits 0 - 9 and A - F (because the base is 16) until a non-digit is encountered. The resultant numerical value (here: 0x89F80173) will be assigned to the variable x.

(2) The first parsing will end after reading the first data item (89F80173); the remaining characters (, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A) will be there in the array, while the tmpPtr pointer will point to the first character (the comma) of the remaining string.

(3) Next parsing command will be:

byte y = strtoul(tmpPtr + 1, &tmpPtr, 16); // comma (,) is skipped; parsing begins from next character
Serial.println(y, HEX);   //shows: 8

The next item, 8, will be extracted and saved in the variable y.

(4) Then the next parsing command follows, and so on, until all the data items are extracted and saved.

4. The Sketch

struct t_data
{
  uint32_t  id;
  uint8_t   len;
  uint8_t   payload[8];
};
t_data data;

char lineToParse[] = "89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A";
char *tmpPtr = nullptr;

void setup()
{
  Serial.begin(115200);
  data.id = strtoul(lineToParse, &tmpPtr, 16);
  Serial.println(data.id, HEX);  //shows: 89F80173
  //----------
  data.len = strtoul(tmpPtr + 1, &tmpPtr, 16); //parse this: 8, 4C, 1, D6, EB, 90, CD, 1A, 5A
  Serial.println(data.len, HEX);  //shows: 8
  //----------
  for (int i = 0; i < sizeof(data.payload); i++)
  {
    data.payload[i] = strtoul(tmpPtr + 1, &tmpPtr, 16);
  }
  
  for (int i = 0; i < sizeof(data.payload); i++)
  {
    Serial.println(data.payload[i], HEX); //shows: 4C 1 D6 EB 90 CD 1A 5A
  }
}

void loop()
{

}

they are not stored in an array, they stay where they are, in the lineToParse array. You just get a pointer to the comma within that array (where the parsing stopped)

No, parsing will continue at the space after the comma, and the strtoul() function knows how to skip whitespace when looking for a number.
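To make the two points above concrete, here is a small sketch in plain C++ that reports how many characters strtoul() consumed. The helper name consumedBy() is made up for this illustration; it only wraps the standard strtoul() call and the end pointer it hands back, which points into the same buffer that was passed in.

```cpp
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: runs strtoul() in base 16 on `text`, stores the result
// in `value`, and returns how far the end pointer moved into that same buffer.
size_t consumedBy(const char* text, unsigned long& value) {
  char* end = nullptr;
  value = strtoul(text, &end, 16);
  return (size_t)(end - text);  // end points inside `text`, nothing is copied
}
```

For "89F80173, 8, 4C" the helper returns 8, so the end pointer sits on the comma. For " 8, 4C" it returns 2 with a value of 8, because the leading space is skipped. For ", 8" it returns 0, since a comma is not a digit and nothing is converted, which is why the code in this thread passes tmpPtr + 1.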

this

for (int i = 0; i < sizeof(data.payload); i++)

could be improved and take into consideration the length you just read instead of sizeof(data.payload)

I had assumed the OP could read and understand the documentation for strtoul()

1. When this command is issued: Serial.println(tmpPtr), this string appears: , 8, 4C, 1, D6, EB, 90, CD, 1A, 5A, from which I drew that conclusion.

OK! Now I understand what the tmpPtr pointer does.

2. Yes! I will edit the comment field of my post with the following message:
The comma is skipped and parsing begins from the next character (which could be whitespace).

3. I have seen the following inequality in your code. I would appreciate knowing in what way it is better.

for(int i = 0; i<max(data.len, sizeof(data.payload)); i++)
{
    //statements
}

4. The online documentation for the strtoul() function is cryptic (at least to me). The credit for what I have learnt about this function goes to @J-M-L's code and my own experiments.

Glad you caught this: it was a mistake left for the OP, to see if he would read the code... ;)
Instead of max(), it was supposed to be min().

We receive a frame that includes an ID, the number of bytes in the payload and then the payload.
So I parse only data.len elements and, to avoid any overflow risk, I limit the number of elements I read to the space available in the array. Hence you don't want to read in more than 8 bytes, but if the payload was only 3, you would not want to read 8 of them.
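A minimal sketch of that clamped loop, in plain C++: readPayload() is a made-up name for this example, and the clamp is written out with an if so that it does not depend on Arduino's min() macro. Like the original code, it assumes a well-formed line with at least as many payload fields as are read, and that the pointer passed in is the comma where the length parsing stopped.

```cpp
#include <stdint.h>
#include <stdlib.h>

struct t_data {
  uint32_t id;
  uint8_t  len;
  uint8_t  payload[8];
};

// Hypothetical helper: `afterLen` points at the comma that follows the length
// field. Returns the number of payload bytes actually stored.
uint8_t readPayload(const char* afterLen, t_data& data) {
  uint8_t count = data.len;
  // Same effect as min(data.len, sizeof(data.payload)): never write past the array.
  if (count > sizeof(data.payload)) count = (uint8_t)sizeof(data.payload);
  const char* cursor = afterLen;
  for (uint8_t i = 0; i < count; i++) {
    char* end = nullptr;
    data.payload[i] = (uint8_t)strtoul(cursor + 1, &end, 16);  // +1 to skip the comma
    cursor = end;
  }
  return count;
}
```

With a length of 3 only three items are read, and with a bogus length such as 200 the loop still stops after 8 items.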

:)

Excellent!

Though @J-M-L is not directly involved in the Zoom classroom where I teach my pupils (they are learning the full ATmega328P MCU hardware, moderate C, and a little C++ OOP programming using the Arduino UNO), in the next class I am going to refer to @J-M-L's code to show how code can be improved!

Wow, I wake to find so much help from Golam and J-M-L. Thankyou. I'm not sure where you are but I am in Sydney Australia....and in tight stay-at-home lockdown due to Covid. This little project is a good use of my time but if you read my profile you will understand why I find some of this stuff hard work.
I do look closely at the code you have offered and try it out. I then consider how to integrate it with the basic stuff I am working with. I also research on the web language I don't understand using the Arduino Reference and others. Unfortunately these are often quite cryptic without examples so I still have to play around a bit to make sense of them. No bad thing because I learn something. I really should have bought a book like C++ for Dummies or Arduino for Dummies.
J-M-L, your code did what I wanted, but then I found the problem of taking my String data whilst you were working with char. I went through the code and commented it for my own understanding. I think I 'got' most of it, but tracking all the stuff with pointers (once I learned what the '*' meant) was a bit harder.
Golam, I haven't had a chance to try your version because I was asleep, but I will now, and I will take into account your comments about cstring etc. I do appreciate you taking the time to explain the code. I look forward to experimenting with it in the next hour or two.
So, thanks for your inputs and understanding. I'll let you know how I get on.
Ray

J-M-L, I didn't spot your mistake with max() instead of min(), but I did understand that you were creating limits for the process. I am pleased it is clarified because it could have caused problems later if I changed the incoming structure of the data being parsed.
R

89F80173, 8, 4C, 1, D6, EB, 90, CD, 1A, 5A

Every text char 0-9 and A-F represents a 4-bit nybble.
if ( dataChar >= '0' && dataChar <= '9' )
{
  nybVal = dataChar - '0'; // turn decimal text to 0 to 9
}
else if ( dataChar >= 'A' && dataChar <= 'F' )
{
  nybVal = 10 + dataChar - 'A'; // turn A to F text to 10 to 15
}

As each char is read, the struct variable being filled gets shifted left 4 bits and the new nybVal OR'd with the variable, repeat to fill -- the ID gets shifted and set 8 times.
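One possible way to flesh that out, as a sketch only: the LineParser struct and its feed() method are made up for this example, and it assumes that fields are separated by commas, that spaces can be ignored, and that the caller stops feeding at the end of the line and starts the next line with a fresh LineParser. It is plain C++, so each character read from the file would simply be passed to feed().

```cpp
#include <stdint.h>

struct t_data {
  uint32_t id;
  uint8_t  len;
  uint8_t  payload[8];
};

// Hypothetical incremental parser: call feed() once per character read.
struct LineParser {
  t_data  data = {};
  uint8_t field = 0;  // 0 = id, 1 = len, 2..9 = payload bytes
  bool    ok = true;  // cleared when something unexpected is seen

  void feed(char c) {
    uint8_t nybVal;
    if (c >= '0' && c <= '9') {
      nybVal = (uint8_t)(c - '0');        // decimal text to 0..9
    } else if (c >= 'A' && c <= 'F') {
      nybVal = (uint8_t)(10 + c - 'A');   // A..F text to 10..15
    } else {
      if (c == ',') field++;              // separator: move to the next field
      else if (c != ' ') ok = false;      // anything else is unexpected
      return;
    }
    // Shift the variable being filled left by 4 bits and OR in the new nybble.
    if (field == 0)      data.id = (data.id << 4) | nybVal;
    else if (field == 1) data.len = (uint8_t)((data.len << 4) | nybVal);
    else if (field < 10) data.payload[field - 2] = (uint8_t)((data.payload[field - 2] << 4) | nybVal);
    else ok = false;                      // more than ten fields on the line
  }
};
```

After the last character of the example line has been fed, data.id holds 0x89F80173, data.len holds 8 and the payload array holds the eight bytes, with no intermediate string needed.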

Are those enough clues?