Handling fixed size packet over serial between arduino and PC

My arduino program for automotive use will receive a fixed size packet over serial at a fixed interval (maybe every 100 ms). It will always have the same size and format:

Packet:
< byte 1, byte 2, byte 3, byte 4 >
byte 1 = pedal % (0 to 100)
byte 2 & 3 = engine RPM (0 to 10000)
byte 4 = gear (1-15)

My question is, how should a fixed size packet be constructed to be sent over serial? Or more specifically, how do you read the contents of a single packet without getting mixed up between the start/end markers and packet data of the same value?

For example, say a struct packet is being sent from a visual C# interface to the arduino:

struct controllerPacket {
    char startMarker;
    uint8_t pedalPos;
    uint16_t engRPM;
    uint8_t gear;
    char endMarker;
};


controllerPacket packet = { '<', 62,  4500,  3, '>' };
// hex conversion =        { 3C, 3E, 11, 94, 03, 3E }

So as you read bytes from incoming serial you find the start marker, '<' (hex 3C), but then find the end marker '>' (hex 3E) as the next byte, even though the hex 3E is really part of the data and signifies 62% pedal. My first thought was to do:
if (Serial.available() >= 6) // take 6 bytes as a packet, check for start/end markers and process data

My issue with that is what if a byte gets lost or missed and the 6 byte packet you read no longer lines up with the data and then you lose data from the packet:

{ 3C, 3E, 11, 94, 03, 3E } missed byte, take another 6 bytes ---> { OLDBYTE, 3C, 3E, 11, 94, 03} **lost end marker

Even if you are just reading bytes waiting for '<' and then counting until 6 bytes have filled your packet, how would you prevent reading a '<' in the data and have a bad starting byte position?

My question is, how should a fixed size packet be constructed to be sent over serial?

My question is whether you really need to send RPM with that level of granularity. Can you really distinguish between 8500 RPM and 8502 RPM?

If you could live with sending RPM to the nearest 100 RPM, then you only need to send 3 bytes, and none of them would be greater than 100.

You could then use a sync byte with a value greater than 100, and send just 4 bytes.

Even if you NEED RPM to that granularity, you could send it where RPM = b1 * 100 + b2, so that b1 and b2 never have values over 100, and you'd send 5 bytes, including the sync byte.
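A minimal sketch of the 5-byte layout described above, written as plain C++ so it can be tried on a PC. The sync value 0xFF and the function names are assumptions for illustration; any sync value above 100 would do.

```cpp
#include <cstdint>

// Assumption: 0xFF is used as the sync byte. No data byte can exceed 100,
// so this value never appears inside the data.
const uint8_t SYNC = 0xFF;

// Build the packet: sync, pedal, RPM / 100, RPM % 100, gear.
void encodePacket(uint8_t pedal, uint16_t rpm, uint8_t gear, uint8_t out[5]) {
    out[0] = SYNC;
    out[1] = pedal;                             // 0..100
    out[2] = static_cast<uint8_t>(rpm / 100);   // b1: 0..100 for RPM up to 10000
    out[3] = static_cast<uint8_t>(rpm % 100);   // b2: 0..99
    out[4] = gear;                              // 1..15
}

// Receiver side: RPM = b1 * 100 + b2.
uint16_t decodeRpm(uint8_t b1, uint8_t b2) {
    return static_cast<uint16_t>(b1 * 100 + b2);
}
```

With this layout the receiver can resynchronise at any time by discarding bytes until it sees the sync value.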

Read the 1st byte, confirm it is the start symbol and continue receiving. If not, do your error housekeeping.
Read and store the next N bytes ignoring their content.
Read the end symbol and verify it. If it's OK, process your message. If not, do your error housekeeping.
If data stops prematurely or continues past the correct size, do your error processing.
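The steps above can be sketched as a small state machine that is fed one byte at a time. This is plain C++ for testing on a PC, not a drop-in Arduino sketch; the struct name, the error counter and the 4-byte payload length are assumptions taken from the original packet layout.

```cpp
#include <cstdint>
#include <cstddef>

const uint8_t START_SYMBOL = '<';
const uint8_t END_SYMBOL = '>';
const size_t PAYLOAD_LEN = 4;   // pedal, RPM high, RPM low, gear

struct FrameReceiver {
    uint8_t payload[PAYLOAD_LEN];
    size_t count = 0;        // payload bytes stored so far
    bool inFrame = false;    // true once a start symbol has been seen
    unsigned errors = 0;     // stands in for "error housekeeping"

    // Feed one received byte. Returns true when payload holds a complete,
    // correctly framed message.
    bool feed(uint8_t b) {
        if (!inFrame) {
            if (b == START_SYMBOL) {
                inFrame = true;
                count = 0;
            } else {
                ++errors;               // unexpected byte outside a frame
            }
            return false;
        }
        if (count < PAYLOAD_LEN) {
            payload[count++] = b;       // store without looking at the content
            return false;
        }
        inFrame = false;                // this byte must be the end symbol
        if (b == END_SYMBOL) {
            return true;
        }
        ++errors;                       // bad end symbol: discard the frame
        return false;
    }
};
```

Because the payload bytes are stored without being inspected, a 0x3C or 0x3E inside the data is harmless. Detecting data that stops prematurely would need a timeout, which is left out of this sketch.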

@PaulS, you are correct that there is no point in having a single RPM resolution. Using 10k RPM as a super safe upper limit, 1 byte max value = 255. 10k / 255 = 39.2. Use a factor of byte_value * 40 for RPM calculation to reduce the packet by a byte. Thanks for the tip!
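A small sketch of that scaling, assuming a step of 40 RPM per count and rounding to the nearest step. The function names are made up for illustration.

```cpp
#include <cstdint>

const uint16_t RPM_STEP = 40;   // 10000 / 255 = 39.2, rounded up to 40

// Sender side: squeeze RPM into one byte, rounding to the nearest step.
uint8_t rpmToByte(uint16_t rpm) {
    uint32_t scaled = (static_cast<uint32_t>(rpm) + RPM_STEP / 2) / RPM_STEP;
    return scaled > 255 ? 255 : static_cast<uint8_t>(scaled);
}

// Receiver side: expand the byte back to an approximate RPM.
uint16_t byteToRpm(uint8_t b) {
    return static_cast<uint16_t>(b * RPM_STEP);
}
```

With a 10000 RPM ceiling the largest value sent is 250, so 251 to 255 stay free and could serve as markers.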

@boolrules, this makes a lot of sense written out like that, and now I feel dumb for even asking the question lol. For some reason my brain was stuck on having to read each byte in the "data" portion of the packet and do something with it right then.

Have a look at the examples in Serial Input Basics - simple reliable ways to receive data. There is also a parse example to illustrate how to extract numbers from the received text.

The technique in the 3rd example will be the most reliable.

You can send data in a compatible format with code like this

Serial.print('<'); // start marker
Serial.print(value1);
Serial.print(','); // comma separator
Serial.print(value2);
Serial.println('>'); // end marker

...R

Serial.read() returns an int 0-255, the value of the first byte in the serial buffer, or -1 if no value exists. If you use Serial.available() as a condition for Serial.read(), you can avoid -1 and cast the value as a byte or a char. Byte 1 is probably an easy conversion to char, byte, int, unsigned int, long or unsigned long as it very likely looks identical in all variable types. Byte 2 and 3 are questionable. How are the bytes representing the value? If they are an int, you simply shift the high byte and or them together into one int. The last byte may very much be as the first, for 0-15.

dtbingle:
controllerPacket packet = { '<', 62, 4500, 3, '>' };
// hex conversion = { 3C, 3E, 11, 94, 03, 3E }

So as you read bytes from incoming serial you find the start marker, '<' (hex 3C), but then find the end marker '>' (hex 3E) as the next byte, even though the hex 3E is really part of the data and signifies 62% pedal.

A: To overcome this kind of problem (data bytes and control bytes getting mixed up, as seen in the quote above) in serial communication, Intel-Hex formatted frames (the format used by the Arduino IDE for uploading code) can be transmitted over the UART port. The structure of an Intel-Hex formatted frame is:

:           xx         xxxx       xx         xxxxxx, ..., xxxx      xx
Field-a   Field-b  Field-c     Field-d  Field-e                    Field-f

Field-a The ASCII code 3A for : indicates the start of an incoming Intel-Hex frame.
Field-b Number of information bytes in the frame = bytes contained in Field-e.
Field-c The beginning address (arbitrary) of a buffer (RAM) for the storage of the information bytes.
Field-d A zero value indicates that there are more frames to arrive in this transmission session.
Field-e Indicates the actual information bytes.
Field-f Indicates checksum and is computed as follows:
All data bytes from and including Field-b to Field-e are added. The carry is discarded. Two's complement of the remaining lower 8-bit is taken as CHKSUM and transmitted as the last byte in the frame.

B: Example: Intel-hex frame for OP's data bytes ((decimal)62 4500 3 = (binary)3E 1194 03)
: 04 1000 00 3E119403 06
(a) (b) (c) (d) (e) (f)

C: The Bit Pattern of Transmission over UART Port:
3A303431303030303033453131393430333036 (the ASCII codes for the symbol/digits of the Intel-Hex frame: 1 byte for : plus 18 hex digits)

The start of the frame is found by looking for 3A, which will never occur in any other byte of the frame; the end of the frame can be found by counting the characters that follow Field-b (number of characters after Field-b = 2 × Field-b + 2 × (2 + 1 + 1), which is 16 for the example above).
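To make the frame layout concrete, here is a sketch that builds the ASCII frame for the example data. It is plain C++ using std::string so it can be checked on a PC; on an AVR board a fixed char buffer would be the more likely choice. The function names are assumptions.

```cpp
#include <cstdint>
#include <cstddef>
#include <string>

// Two's-complement checksum over the raw bytes of Field-b through Field-e.
uint8_t hexChecksum(const uint8_t* bytes, size_t n) {
    uint8_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum = static_cast<uint8_t>(sum + bytes[i]);   // carry is discarded
    }
    return static_cast<uint8_t>(~sum + 1);
}

// Build ':' followed by two ASCII hex digits per raw byte.
std::string buildHexFrame(const uint8_t* data, uint8_t len, uint16_t addr) {
    static const char digits[] = "0123456789ABCDEF";
    uint8_t raw[4 + 255 + 1];
    size_t n = 0;
    raw[n++] = len;                                  // Field-b
    raw[n++] = static_cast<uint8_t>(addr >> 8);      // Field-c, upper byte
    raw[n++] = static_cast<uint8_t>(addr & 0xFF);    // Field-c, lower byte
    raw[n++] = 0x00;                                 // Field-d
    for (uint8_t i = 0; i < len; ++i) {
        raw[n++] = data[i];                          // Field-e
    }
    raw[n] = hexChecksum(raw, n);                    // Field-f
    ++n;

    std::string frame(":");                          // Field-a
    for (size_t i = 0; i < n; ++i) {
        frame += digits[raw[i] >> 4];
        frame += digits[raw[i] & 0x0F];
    }
    return frame;
}
```

For the data bytes 3E 11 94 03 at address 0x1000 this gives ":041000003E11940306", 19 characters in total. Every character after the colon is one of 0-9 or A-F, which is why 3A cannot appear again.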

@Robin2, I have looked through that thread, with many thanks to you and all of the others who provided example code and explanation. From what I saw, the examples were aimed at reading a string of serial data of unknown length, for example a name read in between start/end markers. In my situation, example 6 (same idea, but as raw binary data) would be closer to how the data would be coming in. I was hung up on:

The examples that follow assume that the binary data will NEVER include the byte values used for the start- and end-markers.

In my case, the byte values used for start and end markers WOULD show up in the binary data. As boolrules pointed out, I can modify your example to simply treat the data as a black box of N byte length - only checking for start and end markers in the proper byte position. THEN handle the black box of data if the start/end markers are correct.

To address your other point, I'm avoiding sending numbers as text with comma separators due to the overall larger size that the "packet" would be.

@Perehama, This makes sense. To clarify, Byte 2 and 3 would be combined to make a single uint16_t, which can be done using bitshifts. So if engine RPM is 5000 = 0x1388, byte 2 would be 13 and byte 3 would be 88. Given the fixed format, it is pretty easy to merge them back together as a single uint16_t. But based on PaulS's advice, I can sacrifice some resolution and scale engine RPM back to a single byte by using a scalar.
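A short sketch of the bitshift split and merge, assuming the high byte is sent first as in the 0x1388 example. The function names are made up for illustration.

```cpp
#include <cstdint>

// Sender side: split a 16-bit value into two bytes.
uint8_t highByte16(uint16_t v) { return static_cast<uint8_t>(v >> 8); }
uint8_t lowByte16(uint16_t v)  { return static_cast<uint8_t>(v & 0xFF); }

// Receiver side: shift the high byte up and OR in the low byte.
uint16_t mergeBytes(uint8_t hi, uint8_t lo) {
    return static_cast<uint16_t>((static_cast<uint16_t>(hi) << 8) | lo);
}
```

For 5000 (0x1388) this yields 0x13 and 0x88, and merging them returns 5000.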

static byte bytesRead = 0;
if (Serial.available()) {
  char c = Serial.read();
  if (bytesRead == 0) {
    if (c >= 0 && c <= 100) {   // first byte must be a valid pedal value
      pedal = c;
      bytesRead++;
    }
    else {
      bytesRead = 0;
    }
  }
  else if (bytesRead == 1) {
...

The problem with this method is that all your bytes are in the same range, or could be in the same range. For example, the pedal could be at 10, the first and second RPM bytes could be at 10, and the 4th byte could be at 10, so how would you know without delimiters that you are on the right byte?
I would add an out-of-range byte or byte sequence, such as 255 255 255, before or after the 3-4 bytes of data, so that if they don't check out, you discard the whole packet and start at 0 until you have 3-4 valid data bytes AND the initialization bytes.

Wow @GolamMostafa. A bit more error checking and robustness than I need for my project, but definitely addresses much skepticism I had about Serial port comm.

In my situation, reading a startmarker and endmarker with N bytes in between to process will probably work given its low complexity. However, I see how this system would be necessary in more critical communications, like flashing software (checking each data packet against a checksum, specific memory storage location, etc).

So with this system, it seems that in order to prevent the startmarker from showing up inside the packet, your packet is an array of hex values and then you send each character in each hex value individually. This makes it impossible for the startmarker to show up in the data. Nifty.

In other words if you just consider the startmarker and first byte...

Human interpretation:
: 04 = startmarker + hex value indicating data is 4 bytes long

What's transmitted using this format:
':' '0' '4'
3A 30 34

EDIT:
@Perehama
You would know your byte position because it's a fixed format. Let's reduce engine RPM to one byte. So how boolrules wrote it out:

Format:
< pedal_byte engRPM_byte gear_byte >

Keep reading serial bytes until '<' is found.
Keep reading bytes until 3 bytes have been read and store into an array
Read next byte
-this should be endmarker or else packet is bad
-if it IS the endmarker, we know that of the 3 bytes read in as data, byte 1 = pedal, byte 2 = engine RPM, and byte 3 = gear

Note, the pedal, engRPM, and gear would be transferred as raw binary data, NOT as characters. This means pedal_byte is always one byte. It's not if pedal = 0-9, it's 1 char, if pedal = 10-99, it's 2 char, if pedal = 100, it's 3 char.

dtbingle:
Keep reading serial bytes until '<' is found.
Keep reading bytes until 3 bytes have been read and store into an array
Read next byte
-this should be endmarker or else packet is bad
-if it IS the endmarker, we know that of the 3 bytes read in as data, byte 1 = pedal, byte 2 = engine RPM, and byte 3 = gear

This is essentially what I am saying so we are in agreement on the approach. That said, we need to be sure the endmarker is exclusive to the endmarker and cannot be one of the other bytes. endmarker is ASCII? If so, an RPM value of 62 == '>' so perhaps the endmarker should be above 100 or whatever the max of byte 3 or 4 is, or perhaps it needs to be a two byte sequence that cannot be 3 and 4, or 2 and 3 or 1 and 2 etc.

dtbingle:
In my case, the byte values used for start and end markers WOULD show up in the binary data.

Why?

You only need to sacrifice 2 bytes out of 256 - less than 1%

Of course my examples are designed for variable length messages, but they will also work with fixed length messages and using start- and end-markers makes the system a great deal more robust.

If you really do need to be able to send any byte value from 0 to 255 then that can be done by setting aside 3 byte values for markers. For example 254 might be the start marker and 255 the end marker. And 253 can be an indicator that the next byte is NOT to be treated as a start or end marker. So, for example, to send the value 254 as data you can send 253 254.
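A sketch of that escape scheme in plain C++ (std::vector is used for brevity and is not meant for a small AVR board). It assumes the escape value 253 itself is also sent as 253 253, which the description above implies but does not spell out. The function names are made up.

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

const uint8_t ESCAPE = 253;       // next byte is plain data
const uint8_t START_MARK = 254;
const uint8_t END_MARK = 255;

// Sender side: wrap the data in markers, escaping any byte >= 253.
std::vector<uint8_t> stuffFrame(const std::vector<uint8_t>& data) {
    std::vector<uint8_t> out;
    out.push_back(START_MARK);
    for (uint8_t b : data) {
        if (b >= ESCAPE) {
            out.push_back(ESCAPE);    // 253, 254 and 255 all get a prefix
        }
        out.push_back(b);
    }
    out.push_back(END_MARK);
    return out;
}

// Receiver side: returns true and fills data if the frame is well formed.
bool unstuffFrame(const std::vector<uint8_t>& frame, std::vector<uint8_t>& data) {
    data.clear();
    if (frame.size() < 2 || frame.front() != START_MARK || frame.back() != END_MARK) {
        return false;
    }
    for (size_t i = 1; i + 1 < frame.size(); ++i) {
        uint8_t b = frame[i];
        if (b == ESCAPE) {
            if (++i + 1 >= frame.size()) {
                return false;         // escape with no data byte after it
            }
            b = frame[i];
        } else if (b == START_MARK || b == END_MARK) {
            return false;             // unescaped marker inside the data
        }
        data.push_back(b);
    }
    return true;
}
```

The cost is one extra byte for each data byte that happens to be 253 or above, so a frame's length on the wire is no longer fixed.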

I think you can see the value of sacrificing a couple of bytes :)

...R

@OP

In case you are interested in transmitting your data as Intel-Hex frames, below is code showing how to create an Intel-Hex frame from given 'data bytes', transmit the frame over the UART port, receive the frame, and then retrieve the 'data bytes' from the received frame.

A: Creation of Intel-Hex Frame
1. Given data bytes : 3E 11 94 03 (binary/hex formatted) via:
byte dataArray[] = {0x3E, 0x11, 0x94, 0x03};

2. Length of intended Intel-Hex Frame (in byte): 1(a) + 1(b) + 2(c) + 1(d) + 4(e) + 1(f) = 10
byte frameArray[10]; //to be filled up by program codes

byte dataArray[] = {0x3E, 0x11, 0x94, 0x03};
byte frameArray[10]; //to be filled up by program code
//---------------------------------------------------

byte chksum()
{
   byte sum = 0;
   for (int k = 1; k < 9; k++) //add all bytes of the frame from Field-b to Field-e
   {
      sum += frameArray[k];
   }
   sum = ~sum;   //every bit is inverted
   sum++;        //1 is added to get 2's complement
   return sum;   //this value goes into frameArray[9]
}

//----------------------------------------------------

void buildFrame() //call once (e.g. from setup()) before sending
{
   frameArray[0] = ':';     //Field-a: start of frame
   frameArray[1] = 0x04;    //Field-b: number of data bytes
   frameArray[2] = 0x10;    //upper byte of Field-c
   frameArray[3] = 0x00;    //lower byte of Field-c
   frameArray[4] = 0x00;    //Field-d: 0x00 = not EOF (used when sending multiple frames from a file)

   for (int i = 0, j = 5; i < 4; i++, j++) //copy the data bytes into Field-e
   {
      frameArray[j] = dataArray[i];
   }

   frameArray[9] = chksum(); //sum-based CHKSUM for error control
}

B: Transmit Intel-Hex Frame over UART Port

//mySerial is assumed to be a serial object (e.g. SoftwareSerial wired to the HC-12) declared elsewhere

byte hexDigit(byte nibble)               //convert 0 - 15 to ASCII '0' - '9' or 'A' - 'F'
{
   if (nibble <= 9)
   {
      return nibble + 0x30;              //0x30 - 0x39 for 0 - 9
   }
   return nibble + 0x37;                 //0x41 - 0x46 for A - F
}

void sendFrame()                         //convert each byte to two ASCII digits and write to the HC-12 for transmission
{
   mySerial.write(frameArray[0]);        //always send the synchronize marker (:) as a binary code
   Serial.print((char)frameArray[0]);    //show the character on the hardware Serial Monitor

   for (int i = 1; i < 10; i++)
   {
      byte hi = hexDigit(frameArray[i] >> 4);     //1st digit of the byte
      byte lo = hexDigit(frameArray[i] & 0x0F);   //2nd digit of the byte
      mySerial.write(hi);
      Serial.write(hi);
      mySerial.write(lo);
      Serial.write(lo);
   }
   Serial.println();                     //new line on the Serial Monitor
}

C: Reception of Intel-Hex Frame over hardware UART Port

bool flag = false;          //true once the frame synchronizer (:) has arrived
byte frameArray[19];        //the 18 ASCII characters after : go into frameArray[1..18]; : (3A) itself is not stored
int i = 0;
byte x;

void setup()
{
  Serial.begin(9600);       //serial port to computer
}

void loop()
{
  if (i == 19)              //complete frame is received
  {
    //process received frame to retrieve 'information bytes' --> (binary)3E 1194 03 ((decimal)62  4500  03)
    //buildAndShowInformation();
    i = 0;
    flag = false;
  }
}

//-----------------------------------------------------------

void serialEvent()
{
  while (Serial.available() && i < 19)   //stop at 19 so frameArray is never overrun
  {
    x = Serial.read();
    if (flag == true)
    {
      Serial.write(x);      //echo the character to the Serial Monitor
      frameArray[i] = x;    //save in array
      i++;
    }
    else if (x == 0x3A)     //: (3A) is found as the start of an incoming Intel-Hex frame
    {
      Serial.write(x);
      flag = true;
      i = 1;                //characters after : are stored from index 1
    }
    //any other byte arriving outside a frame is ignored
  }
}

D: Process received characters to form 'byte oriented data'
byte infoData[4]; //this array will hold the information: 3E 11 94 03 (binary)

byte hexValue(byte c)   //convert an ASCII hex digit ('0' - '9', 'A' - 'F') to 0 - 15
{
   if (c < 0x41)
   {
      return c & 0x0F;   //'0' - '9'
   }
   return c - 0x37;      //'A' - 'F'
}

void buildAndShowInformation()
{
   Serial.println();
   for (int k = 9, j = 0; j < 4; k += 2, j++)   //the data characters are in frameArray[9..16]
   {
      byte y = hexValue(frameArray[k]) << 4;     //upper 4 bits
      byte y1 = hexValue(frameArray[k + 1]);     //lower 4 bits
      infoData[j] = y | y1;
      //Serial.println(infoData[j], HEX);
   }
}

Thanks, the code makes sense to me (I think haha). Maybe I'll give it a shot after getting it running with the basic start/end marker version first.

I like your original transmission scheme. { '<', 62, 4500, 3, '>' }. That's a particularly good example because character 62 is '>'. You see that in the hex translation: 0x3E. You correctly reasoned that the simple receive-with-end-markers will detect the second byte as an end marker and you've got an incorrect packet.

So the basic error-correction and re-sync should always count characters. It will only accept a '>' if it occurs in the 6th position. But don't do it via Serial.available(). Just read the characters in one-by-one and examine the 6th position in your buffer. If it isn't, then slide the characters left until you get another '<' at the 1st position.
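A sketch of that count-and-slide receiver in plain C++, assuming the 6-byte packet from the original post. The struct and method names are made up for illustration.

```cpp
#include <cstdint>
#include <cstddef>
#include <cstring>

const size_t PACKET_LEN = 6;   // '<', 4 data bytes, '>'

struct SlidingReceiver {
    uint8_t buf[PACKET_LEN];
    size_t count = 0;

    // Drop buf[0] and shift the remaining bytes one place to the left.
    void slide() {
        std::memmove(buf, buf + 1, --count);
    }

    // Feed one byte. Returns true when buf holds '<' in the 1st position and
    // '>' in the 6th. The packet stays in buf until the next call.
    bool feed(uint8_t b) {
        if (count == PACKET_LEN) {
            count = 0;                   // previous packet has been consumed
        }
        buf[count++] = b;
        while (count > 0) {
            if (buf[0] != '<') {
                slide();                 // not a start marker: discard it
                continue;
            }
            if (count < PACKET_LEN) {
                return false;            // plausible start, need more bytes
            }
            if (buf[PACKET_LEN - 1] == '>') {
                return true;             // markers in both expected positions
            }
            slide();                     // wrong 6th byte: look for the next '<'
        }
        return false;
    }
};
```

A '>' in the data (such as 62% pedal) is ignored because only the 6th position is checked, and a stray '<' only costs a re-sync rather than a lost packet.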

Another thing you might consider checking is look for a packet of the structure "><...>" because it's even more unlikely (although not impossible) that you'll get >< in your data. There is always some combination of noise and dropped characters that will give you invalid data.

To improve the noise immunity even further, a checksum can be used, like in the format that Golam explained. But for 4 bytes, you may as well just send each packet twice and only accept the data if both copies match. That would take 12 bytes instead of 18 in Golam's worked example.
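The send-it-twice idea can be sketched as below. Only the 4 data bytes are shown; the start/end markers around each copy are left out, and the function names are assumptions.

```cpp
#include <cstdint>
#include <cstddef>
#include <cstring>

const size_t DATA_LEN = 4;   // pedal, RPM high, RPM low, gear

// Sender side: place two identical copies of the data back to back.
void makeDoubled(const uint8_t data[DATA_LEN], uint8_t out[2 * DATA_LEN]) {
    std::memcpy(out, data, DATA_LEN);
    std::memcpy(out + DATA_LEN, data, DATA_LEN);
}

// Receiver side: accept the data only if both copies agree byte for byte.
bool acceptIfCopiesMatch(const uint8_t rx[2 * DATA_LEN], uint8_t data[DATA_LEN]) {
    if (std::memcmp(rx, rx + DATA_LEN, DATA_LEN) != 0) {
        return false;
    }
    std::memcpy(data, rx, DATA_LEN);
    return true;
}
```

A single corrupted bit in either copy makes the comparison fail, so the receiver keeps the previous values until the next packet arrives.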

GolamMostafa:
The start of the frame is found by looking at 3A which will never occur for any other byte in the frame;

You have not shown how this restriction is enforced. What happens if DT wants to send 58% in his first data byte? What if the checksum is 0x3A?

The : (start of an incoming Intel-Hex frame) is sent as 00111010; a data byte of 0x3A (e.g. the CHKSUM) is sent as two ASCII characters (00110011 and 01000001). So there is no way for the CHKSUM to be confused with : (00111010).

Data bytes of the Intel-Hex frame are limited to these patterns: 00110000 - 00111001 (0 - 9) and 01000001 - 01000110 (A - F), and these patterns do not contain 00111010. The reception of the Intel-Hex frame begins by finding 00111010 and ends after receiving a known number of bytes, which is computed in real time from the incoming Intel-Hex frame itself.

MorganS:
You have not shown how this restriction is enforced. What happens if DT wants to send 58% in his first data byte? What if the checksum is 0x3A?

Maybe I misunderstood the intel-hex, but I think each character in 58 is transmitted separately to prevent the possibility of 3A occurring in the packet data. Something like
: 5 8
3A 05 08

Not like:
: 58
3A 3A

EDIT: ooops didn't see Golam's reply

dtbingle:
Maybe I misunderstood the intel-hex, but I think each character in 58 is transmitted separately to prevent the possibility of 3A occurring in the packet data. Something like
: 5 8
3A 05 08

Not like:
: 58
3A 3A

EDIT: ooops didn't see Golam's reply

In Intel-Hex Frame, the data stream :58~~%~~ is sent as:

00111010(:) 00110101(5) 00111000(8) 00100101(%) //these are ASCII codes of : 5 8 %
3A 35 38 25

00111010(:) 00110101(5) 00111000(8)
3A 35 38

Edit: In view of Post#18.

Well that would be stupid. You're sending a % symbol and the original packet didn't need that symbol to know what it contains. You've introduced a new problem: which of these various packets contains which piece of data?

Like I said, I like the original scheme. It just needs a little thought put into the receiver side to re-sync on a lost '<' and handle a '>' or '<' appearing in the data bytes.

Why on earth would you use Intel hex format in a serial comm situation? I've been designing transmission networks for more than 40 years and have worked with virtually every protocol that's ever been invented, and I must say that this is a first.