Serial Input Advanced

This is an advanced tutorial that builds upon Robin2’s tutorial Serial Input Basics.

In this tutorial, we will go over how to:

  • construct more robust serial data packets
  • implement COBS
  • calculate Checksum and CRC values
  • handle transmission of multi-byte values
  • properly handle the reception of bad packets.

All of what is covered in this tutorial can be referenced in SerialTransfer.h (and the Python package pySerialTransfer) that is used to reliably and quickly transmit serial data.

Motivation: Many embedded projects rely on the transmission of serial data between several Arduino boards. For instance, a hobbyist who is building an RC car might use serial radios (e.g. XBee transceivers) for command and control. Another user might design a project where an Arduino in a field sends weather telemetry via a serial radio to another Arduino at the user’s house for display on an LCD screen.

In all cases, the project designer will want data transferred from one Arduino to the next with minimal latency, low overhead, and high reliability. Over the following replies, this tutorial will give you the tools to design software for your projects with truly robust serial communication.

Packet Anatomy:

When planning an Arduino project that requires UART serial communication between two or more boards, careful thought must be put into the serial “packet anatomy”. This is a low level template/schema of all the bytes within a complete packet. Knowing the packet anatomy allows you to understand how to create a packet in software on the transmitting Arduino AND how to properly parse the packet data in software on the receiving Arduino.

Before we get into the specifics of how to design a packet anatomy, here is a (very simple) example:

00111100 01111111 00101100 00000001 00111110
   |        |        |        |        |
   |        |        |        |        |__ End byte ('>')
   |        |        |        |___________ Second potentiometer reading
   |        |        |____________________ Delimiter (',')
   |        |_____________________________ First potentiometer reading
   |______________________________________ Start byte ('<')

With this packet anatomy, the context and meaning of each byte can be deciphered based on the position of the byte within the packet. For instance, if the above packet was received, the receiving Arduino can be programmed to grab the 2nd byte in the packet to be stored as the first potentiometer reading.
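As a minimal sketch of position-based parsing for the simple anatomy above (the helper names `build_simple_packet` and `parse_simple_packet` are made up for illustration and are not part of any library):

```python
START, DELIM, END = ord('<'), ord(','), ord('>')

def build_simple_packet(pot1, pot2):
    """Lay the five bytes out in the order the packet anatomy prescribes."""
    return bytes([START, pot1, DELIM, pot2, END])

def parse_simple_packet(packet):
    """Return (pot1, pot2), or None if the framing bytes are not where expected."""
    if len(packet) != 5:
        return None
    if packet[0] != START or packet[2] != DELIM or packet[4] != END:
        return None
    # Meaning comes purely from position: byte 1 and byte 3 are the readings
    return packet[1], packet[3]

packet = build_simple_packet(127, 1)
print(list(packet))                  # [60, 127, 44, 1, 62]
print(parse_simple_packet(packet))   # (127, 1)
```

Note that the only checks available here are on the framing bytes, which is exactly the weakness discussed next.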

There is a problem with this packet anatomy, though. What happens if a byte is dropped or corrupted during transmission? The receiving Arduino will not be able to detect these errors based on the design of our current packet anatomy. We can add certain fields to the packet to give the receiving Arduino more information about the packet as a whole. This data will be used to "catch" transmission errors and identify corrupted packets.

These fields include CRC and payload length. The CRC field is used to identify byte droppage and packet corruption. The payload length field is useful to determine byte droppage while also providing an extra advantage: allowing robust transmission of packets with varying length. Both of these will be discussed in detail in future posts.

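In addition to packet integrity problems, using a delimiter character (',') is inefficient: it adds one byte between every pair of values (nearly doubling the payload length when the values are single bytes) while carrying no information that the byte positions don't already provide.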

Lastly, if the parsing algorithm is designed to "restart" when reading in a start byte ('<') when parsing the payload, what happens if one of the payload bytes actually is '<'? The parsing algorithm will think the rest of the packet was dropped during transmission of the payload and a new packet is currently being parsed. In short, your entire packet and all of its associated data will be lost if one of the payload bytes happens to have the same value as the start-byte. In order to handle this case, you can implement COBS (Consistent Overhead Byte Stuffing). COBS will be covered in greater detail in a future post.
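To make the false-start-byte problem concrete, here is a small sketch of a naive parser that restarts on every '<'. The function `naive_parse` is hypothetical and only exists to show the failure:

```python
START, END = ord('<'), ord('>')

def naive_parse(stream):
    """Collect payload bytes between '<' and '>', restarting on every '<'."""
    payload, in_packet = [], False
    for b in stream:
        if b == START:
            payload, in_packet = [], True   # restart: throw away what we had
        elif b == END and in_packet:
            return payload
        elif in_packet:
            payload.append(b)
    return None

print(naive_parse([START, 10, 20, 30, END]))   # [10, 20, 30]
# Same packet, but the middle payload byte happens to be 60, i.e. '<'
print(naive_parse([START, 10, 60, 30, END]))   # [30]
```

The second call quietly returns a truncated payload, with no indication that anything went wrong.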

Putting everything together and accounting for the above considerations, one might design a more robust packet anatomy such as:

01111110 11111111 00000000 00000000 00000000 ...... 00000000 10000001
   |        |        |        |        |       |       |        |
   |        |        |        |        |       |       |        |__ Stop byte
   |        |        |        |        |       |       |___________ 8-bit CRC
   |        |        |        |        |       |___________________ Rest of payload
   |        |        |        |        |___________________________ 2nd payload byte
   |        |        |        |____________________________________ 1st payload byte
   |        |        |_____________________________________________ # of payload bytes
   |        |______________________________________________________ COBS overhead byte
   |_______________________________________________________________ Start byte
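As a rough sketch of how a transmitter might assemble a packet with this anatomy, assuming a start byte of 0x7E, a stop byte of 0x81, the CRC polynomial 0x9B, and a payload that contains no start byte (so the COBS overhead byte stays at 0xFF). The names `crc8` and `build_packet` are illustrative only; CRC and COBS are covered in the following posts:

```python
START_BYTE, STOP_BYTE = 0x7E, 0x81   # assumed framing values

def crc8(data, poly=0x9B):
    """Bitwise 8-bit CRC, MSB first, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def build_packet(payload, overhead=0xFF):
    """Assemble start, COBS overhead, length, payload, CRC and stop bytes."""
    if START_BYTE in payload:
        raise ValueError('payload must be COBS-stuffed first')
    return bytes([START_BYTE, overhead, len(payload), *payload,
                  crc8(payload), STOP_BYTE])

print(list(build_packet([2, 5, 8, 1])))
# [126, 255, 4, 2, 5, 8, 1, 19, 129]
```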

Calculating CRC Values:

In order to learn the theory of CRC and how to calculate them, check out this website. The website also has example C code for each step of the tutorial.

Note that when calculating CRC values, it is most efficient to create a lookup table at the beginning of the program. Then you can just use the table to lookup CRC values on the fly.

Here is an example C++ class that you can use to calculate 8-bit CRC values via a lookup table:

class CRC
{
  public: // <<---------------------------------------//public
    uint8_t poly = 0;




    CRC(uint8_t polynomial = 0x9B, uint8_t crcLen = 8)
    {
      poly      = polynomial;
      crcLen_   = crcLen;
      tableLen_ = pow(2, crcLen);
      csTable   = new uint8_t[tableLen_];

      generateTable();
    }

    void generateTable()
    {
      for (int i = 0; i < tableLen_; ++i)
      {
        int curr = i;

        for (int j = 0; j < 8; ++j)
        {
          if ((curr & 0x80) != 0)
            curr = (curr << 1) ^ (int)poly;
          else
            curr <<= 1;
        }

        csTable[i] = (byte)curr;
      }
    }

    void printTable()
    {
      for (int i = 0; i < tableLen_; i++)
      {
        Serial.print(csTable[i], HEX);

        if ((i + 1) % 16)
          Serial.print(' ');
        else
          Serial.println();
      }
    }

    uint8_t calculate(uint8_t val)
    {
      if (val < tableLen_)
        return csTable[val];
      return 0;
    }

    uint8_t calculate(uint8_t arr[], uint8_t len)
    {
      uint8_t crc = 0;

      for (uint16_t i = 0; i < len; i++)
        crc = csTable[crc ^ arr[i]];

      return crc;
    }




  private: // <<---------------------------------------//private
    uint16_t tableLen_;
    uint8_t  crcLen_;
    uint8_t* csTable;
};

Here is the Python class that is equivalent to the C++ class above:

import sys

class CRC(object):
    def __init__(self, polynomial=0x9B, crc_len=8):
        self.poly      = polynomial & 0xFF
        self.crc_len   = crc_len
        self.table_len = pow(2, crc_len)
        self.cs_table  = [' ' for x in range(self.table_len)]
        
        self.generate_table()
    
    def generate_table(self):
        for i in range(len(self.cs_table)):
            curr = i
            
            for j in range(8):
                if (curr & 0x80) != 0:
                    curr = ((curr << 1) & 0xFF) ^ self.poly
                else:
                    curr <<= 1
            
            self.cs_table[i] = curr
    
    def print_table(self):
        for i in range(len(self.cs_table)):
            sys.stdout.write(hex(self.cs_table[i]).upper().replace('X', 'x'))
            
            if (i + 1) % 16:
                sys.stdout.write(' ')
            else:
                sys.stdout.write('\n')
    
    def calculate(self, arr, dist=None):
        crc = 0
        
        try:
            if dist:
                indicies = dist
            else:
                indicies = len(arr)
            
            for i in range(indicies):
                try:
                    nex_el = int(arr[i])
                except ValueError:
                    nex_el = ord(arr[i])
                
                crc = self.cs_table[crc ^ nex_el]
                
        except TypeError:
            crc = self.cs_table[arr]
            
        return crc

Using these classes, you can calculate the CRC value of any subset of your packet and stuff the result in the packet's CRC field (as defined in the packet anatomy). Remember that the subset of the packet the CRC is calculated over is arbitrary - it will work as long as both the transmitter and receiver algorithms are standardized. However, it's a good practice to calculate the CRC over the entire payload at a minimum.

For instance, let's say we have a packet of 4 payload bytes with the values [2, 5, 8, 1]. Next, let's define the subset of the packet that the CRC will be calculated over as solely the payload bytes. Lastly, let's use the 8-bit CRC polynomial 0x9B (x^8 + x^7 + x^4 + x^3 + x + 1, where the leading x^8 term is implicit and not stored in the 8-bit value).

In this case, we can find the CRC value for this packet the following ways:

Python:

myCRC = CRC(polynomial=0x9B, crc_len=8)
myCRC.calculate([2, 5, 8, 1])

Arduino:

#include "CRC.h"

CRC myCRC(0x9B, 8);

void setup()
{
  Serial.begin(115200);

  uint8_t payload[] = {2, 5, 8, 1};
  // The array overload needs the element count as its second argument
  Serial.println(myCRC.calculate(payload, sizeof(payload)));
}

void loop()
{
  // do nothing
}

The result of both operations should be the 8-bit value (in decimal):

19

Now, the value of 19 can be inserted into the CRC field of the packet (if transmitting), or be compared to the value in the CRC field of a received packet (if receiving).

Note that when receiving a packet, if the calculated CRC value and the CRC value found in the received packet do not match, your packet is corrupt!!
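The receive-side check can be sketched as follows. This is a standalone bitwise version (it does not use the CRC class above, though it should produce the same values for polynomial 0x9B), and `payload_is_valid` is a hypothetical helper name:

```python
def crc8(data, poly=0x9B):
    """Bitwise 8-bit CRC, MSB first, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def payload_is_valid(payload, received_crc):
    """Recompute the CRC over the received payload and compare."""
    return crc8(payload) == received_crc

sent_crc = crc8([2, 5, 8, 1])
print(sent_crc)                                   # 19
print(payload_is_valid([2, 5, 8, 1], sent_crc))   # True
print(payload_is_valid([2, 5, 9, 1], sent_crc))   # False (one bit flipped)
print(payload_is_valid([2, 5, 8], sent_crc))      # False (last byte dropped)
```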

An aside:

Some serial standards use simple checksums instead of CRCs. Although there is less overhead with checksums, they are much less likely to detect packet corruption.

The most common checksum is the "add and invert" checksum. This checksum requires all bytes to be added (ignoring overflow) with the result of the additions bitwise inverted to find the checksum.

Example Checksum Generation Code:

uint8_t findChecksum(uint8_t arr[], uint16_t arrLen)
{
  uint8_t output = 0;

  for (uint16_t i = 0; i < arrLen; i++)
    output += arr[i];

  return ~output;
}

An example device that uses a checksum of this kind is the DFPlayerMini MP3 module, whose serial protocol carries a 16-bit checksum derived from the sum of the packet bytes.
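For comparison, here is a Python sketch of the same 8-bit "add and invert" checksum (`find_checksum` is just an illustrative name). It also shows one reason checksums are weaker than CRCs: since addition doesn't care about order, swapping two bytes goes unnoticed.

```python
def find_checksum(data):
    """Add all bytes (ignoring overflow past 8 bits), then bitwise invert."""
    total = 0
    for b in data:
        total = (total + b) & 0xFF
    return ~total & 0xFF

print(find_checksum([2, 5, 8, 1]))   # 239  (sum is 16, inverted is 0xEF)
print(find_checksum([5, 2, 8, 1]))   # 239  (bytes swapped, same checksum)
```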

Consistent Overhead Byte Stuffing:

Recall in Reply #2 this paragraph, which is the motivation for this section:

Lastly, if the parsing algorithm is designed to "restart" when reading in a start byte ('<') when parsing the payload, what happens if one of the payload bytes actually is '<'? The parsing algorithm will think the rest of the packet was dropped during transmission of the payload and a new packet is currently being parsed. In short, your entire packet and all of its associated data will be lost if one of the payload bytes happens to have the same value as the start-byte.

Consistent Overhead Byte Stuffing (or COBS) is a low-byte-overhead method of avoiding false start/end bytes within the packet payload. The main idea behind COBS is to add a field in the transmitted packet that contains a pointer. This pointer will point to the first occurrence of the value of the start/end byte in the original packet payload. The value pointed to is then replaced with another pointer. This pointer then points to the second value of the start/end byte in the original packet payload. This then repeats until the entire packet has been "stuffed".

The parsing algorithm needs to "unpack" the stuffed bytes in the same process as above, but in reverse.

You can use COBS to deconflict the use of the start byte or end byte. For example, this Wiki article describes how to do end byte stuffing. However, it is possible to do start byte stuffing, as is implemented in the SerialTransfer libraries.

In the next post, we'll go over some example code on how to implement COBS for your own packets.

COBS Start Byte Stuffing Code:

First, we'll go over how to stuff a packet based on the packet start byte (as a transmitter) and then how to "unpack" the same packet (as a receiver).

Assume we're using the data payload as described in Reply #2: [2, 5, 8, 1]. Note that this array only represents the payload portion of the packet we need to send. Next, let's also assume our start byte is 0x7E ('~'). In order to determine the packet's COBS byte we need to find the address of the first instance of the start byte in the payload. This is easy for this particular example since the start byte ('~') isn't present at all in the payload. Because of this, the COBS byte for this particular packet will be 255.

You might be tempted to say 0 is also a good default value, but this won't work. A COBS byte of 0 means the first payload byte (index 0) is equal to the start byte, which is false for this example. Because the COBS byte is a single byte and one of its values has to be reserved for "no start byte present", the max payload size for our serial protocol is 255 bytes. Payload indexes then run from 0 to 254, so 255 never refers to a real payload byte and we can use it as a COBS "error code". (I hope that explanation makes sense :D)

However, let's say our next packet's payload is now [1, 126, 126, 0] while using the same start byte as the previous packet. In this case, the COBS byte will not be 255 since the start byte is present in the second and third elements of the payload array (126 = 0x7E). This means the packet's COBS byte will now be 1 (remember indexing starts at 0).

Here is a C++ function that can be used to determine the packet's COBS byte:

/*
 byte calcOverhead(uint8_t arr[], uint8_t len)
 Description:
 ------------
  * Calculates the COBS (Consistent Overhead Byte Stuffing) overhead
  byte and returns it. This value is the byte position (within the
  payload) of the first payload byte equal to START_BYTE, or 0xFF
  if the payload contains no such byte
 Inputs:
 -------
  * uint8_t arr[] - Array of values the overhead is to be calculated
  over
  * uint8_t len - Number of elements in arr[]
 Return:
 -------
  * byte overheadByte - Payload COBS overhead byte
*/
byte calcOverhead(uint8_t arr[], uint8_t len)
{
  byte overheadByte = 0xFF;

  for (uint8_t i = 0; i < len; i++)
  {
    if (arr[i] == START_BYTE)
    {
      overheadByte = i;
      break; // stop at the FIRST match; without this the last match would win
    }
  }

  return overheadByte;
}

Here is the Python function that mirrors the above C++ function:

def calc_overhead(txBuff, pay_len):
    '''
    Description:
    ------------
    Calculates the COBS (Consistent Overhead Byte Stuffing) overhead
    byte and returns it. This value is the byte position (within the
    payload) of the first payload byte equal to START_BYTE, or 0xFF
    if the payload contains no such byte
    
    :param txBuff:  list - payload
    :param pay_len: int  - number of bytes in the payload
    
    :return: overheadByte
    '''

    overheadByte = 0xFF

    for i in range(pay_len):
        if txBuff[i] == START_BYTE:
            overheadByte = i
            break
    
    return overheadByte

After finding the packet's COBS byte, we then need to stuff the rest of the payload before transmission.

Let's continue the last example with payload [1, 126, 126, 0] and COBS byte of 1. Since the COBS byte is 1, we start at the second element (index 1) and find the distance between it and the next element that holds the value of the start byte. In this case, that distance is 1 since the third element also holds 126. We then replace the second element with this "distance". The payload now looks like [1, 1, 126, 0]. Be careful, the job isn't over! We have one last instance of the start byte in the payload (third element). Since this is the last element needed to be stuffed, we replace it with 0 like so: [1, 1, 0, 0]. Now we're done!

Let's look at the fully stuffed packet:
0x7E 0x01 0x01 0x01 0x00 0x00 0x81 (start byte, COBS byte, 1st payload byte, 2nd payload byte, 3rd payload byte, 4th payload byte, end byte)

Here's a C++ function to stuff the payload:

/*
 void stuffPacket(uint8_t arr[], uint8_t len)
 Description:
 ------------
  * Enforces the COBS (Consistent Overhead Stuffing) ruleset across
  all bytes in the packet against the value of START_BYTE
 Inputs:
 -------
  * uint8_t arr[] - Array of values to stuff
  * uint8_t len - Number of elements in arr[]
 Return:
 -------
  * void
*/
void stuffPacket(uint8_t arr[], uint8_t len)
{
  int16_t refByte = findLast(arr, len);

  if (refByte != -1)
  {
    for (uint8_t i = (len - 1); i != 0xFF; i--)
    {
      if (arr[i] == START_BYTE)
      {
        arr[i] = refByte - i;
        refByte = i;
      }
    }
  }
}

where findLast() is defined as:

/*
 int16_t findLast(uint8_t arr[], uint8_t len)
 Description:
 ------------
  * Finds last instance of the value START_BYTE within the given
  packet array
 Inputs:
 -------
  * uint8_t arr[] - Packet array
  * uint8_t len - Number of elements in arr[]
 Return:
 -------
  * int16_t - Index value of the last instance of START_BYTE in the given packet array (-1 if not found)
*/
int16_t findLast(uint8_t arr[], uint8_t len)
{
  for (uint8_t i = (len - 1); i != 0xFF; i--)
    if (arr[i] == START_BYTE)
      return i;

  return -1;
}

The Python mirrors of the above functions are:

def find_last(txBuff, pay_len):
    '''
    Description:
    ------------
    Finds last instance of the value START_BYTE within the given
    packet array
    
    :param txBuff:  list - payload
    :param pay_len: int  - number of bytes in the payload
    
    :return: int - location of the last instance of the value START_BYTE
                   within the given packet array
    '''

    if pay_len <= MAX_PACKET_SIZE:
        # Stop value of -1 so that index 0 is also checked
        for i in range(pay_len - 1, -1, -1):
            if txBuff[i] == START_BYTE:
                return i
    return -1

def stuff_packet(txBuff, pay_len):
    '''
    Description:
    ------------
    Enforces the COBS (Consistent Overhead Stuffing) ruleset across
    all bytes in the packet against the value of START_BYTE
    
    :param txBuff:  list - payload
    :param pay_len: int  - number of bytes in the payload
    
    :return: void
    '''

    refByte = find_last(txBuff, pay_len)

    if (not refByte == -1) and (refByte <= MAX_PACKET_SIZE):
        # Stop value of -1 so that index 0 is also stuffed when needed
        for i in range(pay_len - 1, -1, -1):
            if txBuff[i] == START_BYTE:
                txBuff[i] = refByte - i
                refByte = i
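To trace the walkthrough end to end, here is a compact standalone sketch. It is a condensed variant of the functions above, not the library code, and it assumes START_BYTE is 0x7E:

```python
START_BYTE = 0x7E  # assumed start byte value

def calc_overhead(payload):
    """Index of the first START_BYTE in the payload, or 0xFF if none."""
    for i, b in enumerate(payload):
        if b == START_BYTE:
            return i
    return 0xFF

def stuff_packet(payload):
    """Replace each START_BYTE with the distance to the next one (0 for the last)."""
    ref = -1  # index of the most recently stuffed byte, walking right to left
    for i in range(len(payload) - 1, -1, -1):
        if payload[i] == START_BYTE:
            payload[i] = 0 if ref == -1 else ref - i
            ref = i

payload = [1, 126, 126, 0]
overhead = calc_overhead(payload)   # must be computed BEFORE stuffing
stuff_packet(payload)
print(overhead, payload)            # 1 [1, 1, 0, 0]

# A start byte in the very first payload position works too
edge = [126, 3, 126]
print(calc_overhead(edge))          # 0
stuff_packet(edge)
print(edge)                         # [2, 3, 0]
```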

This post is getting long, so I'll go over how a receiver can "unpack" COBS-stuffed packets in the next post.

"Unpacking" COBS-Stuffed Packets:

Assume we've received the packet 0x7E 0x01 0x01 0x01 0x00 0x00 0x81 as described in the previous post, where 0x7E is the start byte, 0x01 is the COBS byte, the byte stuffed payload is [1, 1, 0, 0], and the end byte is 0x81.

In order to determine if "unpacking" is even necessary, we check to see what the value of the COBS byte is. If the COBS byte is 255, "unpacking" is unnecessary and the payload can be parsed "as is". In this example, however, the COBS byte is 1. Before replacing the value in the payload at index 1 with the value of the start byte, we must remember its current value (1) for the next step.

Next, we replace the current value with the start byte value. The payload is now [1, 126, 0, 0]. Since the value we just replaced was 1, we move one byte to the right, take note of that value and replace it with the value of the start byte. Since the value currently one byte over is 0, we've reached the last index of the payload that was COBS "stuffed", so all we need to do is replace it with 126.

The final payload is [1, 126, 126, 0], which is identical to the original payload as defined in the previous post.

Here is an example C++ function that can be used to "unpack" a COBS-stuffed payload:

/*
 void unpackPacket(uint8_t arr[], uint8_t len)
 Description:
 ------------
  * Unpacks all COBS-stuffed bytes within the array
 Inputs:
 -------
  * uint8_t arr[] - Array of values to unpack
  * uint8_t len - Number of elements in arr[]
 Return:
 -------
  * void
*/
void unpackPacket(uint8_t arr[], uint8_t len)
{
	uint8_t testIndex = recOverheadByte;
	uint8_t delta = 0;

	if (testIndex <= MAX_PACKET_SIZE)
	{
		while (arr[testIndex])
		{
			delta = arr[testIndex];
			arr[testIndex] = START_BYTE;
			testIndex += delta;
		}
		arr[testIndex] = START_BYTE;
	}
}

Here is the Python mirror of the above function:

def unpack_packet(recOverheadByte, rxBuff, pay_len):
    '''
    Description:
    ------------
    Unpacks all COBS-stuffed bytes within the array
    
    :param recOverheadByte: int - COBS byte
    :param rxBuff:  list - received packet payload
    :param pay_len: int  - number of bytes in the payload
    
    :return: void
    '''

    testIndex = recOverheadByte
    delta = 0

    if testIndex <= MAX_PACKET_SIZE:
        # Follow the chain of distances stored in the payload itself
        while rxBuff[testIndex]:
            delta = rxBuff[testIndex]
            rxBuff[testIndex] = START_BYTE
            testIndex += delta

        rxBuff[testIndex] = START_BYTE
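Here is a standalone sketch of the unpacking walkthrough. It is a condensed variant of the function above and assumes START_BYTE is 0x7E; the bounds check is an extra safeguard of mine so that a corrupted distance can't walk past the end of the list:

```python
START_BYTE = 0x7E  # assumed start byte value

def unpack_packet(overhead, payload):
    """Restore every stuffed byte in place by following the distance chain."""
    idx = overhead
    while idx < len(payload):      # 0xFF (nothing stuffed) never enters the loop
        delta = payload[idx]       # remember the distance before overwriting it
        payload[idx] = START_BYTE
        if delta == 0:             # 0 marks the last stuffed byte
            break
        idx += delta

received = [1, 1, 0, 0]
unpack_packet(1, received)
print(received)                    # [1, 126, 126, 0]

untouched = [2, 5, 8, 1]
unpack_packet(0xFF, untouched)
print(untouched)                   # [2, 5, 8, 1]
```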

In the next post we'll go over how to easily transfer multi-byte values such as floats and structs.

Transferring Multi-Byte Values:

So far we've only been discussing how to send a packet of individual, single byte values inside our packets. However, many users need to be able to send data with greater precision per value than can be achieved with 8 bits.

Regular (16-bit) Ints:


This is the easiest example to show since we only have 2 bytes to deal with. In this case, we can specify a packet payload of 2 bytes to send the entire int. We can then use bit-masking and bit-shifting to place the MSB (Most Significant Byte) in one of the indexes of the payload and the LSB (Least Significant Byte) in the other index of the payload.

For instance, let's say our transmitting Arduino has a sensor whose current output is 1400 and we want to send that sensor's value to a receiving Arduino for datalogging. Since we can't fit the value 1400 into a single byte, we'll have to use a 16-bit int. Whether the MSB or LSB comes first in the payload is arbitrary as long as the transmitter and receiver agree on the order.

Note: if the MSB is first, the order is called Big Endian; if the LSB is first, it is called Little Endian. Also note that the AVR- and ARM-based Arduino boards store multi-byte values in memory as Little Endian.
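If you want to see the two byte orders side by side, Python's standard `struct` module can pack the same 16-bit value both ways ('<' selects Little Endian, '>' selects Big Endian, and 'H' is an unsigned 16-bit integer):

```python
import struct

sensor = 1400  # 0x0578

little = struct.pack('<H', sensor)   # LSB first
big = struct.pack('>H', sensor)      # MSB first

print(list(little))                      # [120, 5]
print(list(big))                         # [5, 120]
print(struct.unpack('<H', little)[0])    # 1400
# Reading with the wrong byte order gives a different number entirely
print(struct.unpack('>H', little)[0])    # 30725
```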

For this example, let's define the byte order as Little Endian. We then define the packet anatomy as the following (ignoring COBS for simplicity):

Start-Byte | Sensor-LSB | Sensor-MSB | End-Byte

We can then stuff the payload fields using code such as the following:

uint16_t sensor = 1400;

uint8_t msb = (sensor >> 8) & 0xFF;
uint8_t lsb = sensor & 0xFF;

uint8_t packet[4] = {0x7E, lsb, msb, 0x81};

We can do the same thing in Python:

def msb(val):
    return byte_val(val, num_bytes(val) - 1)


def lsb(val):
    return byte_val(val, 0)


def byte_val(val, pos):
    # Shift the wanted byte down to the low 8 bits, then mask off the rest
    return (val >> (pos * 8)) & 0xFF


def num_bytes(val):
    num_bits = val.bit_length()
    num_bytes = num_bits // 8

    if num_bits % 8:
        num_bytes += 1
    
    if not num_bytes:
        num_bytes = 1

    return num_bytes


if __name__ == '__main__':
    sensor = 1400
    
    # Use distinct names so the msb()/lsb() functions are not overwritten
    sensor_msb = msb(sensor)
    sensor_lsb = lsb(sensor)
    
    packet = [0x7E, sensor_lsb, sensor_msb, 0x81]
    print(packet)

When receiving the integer, we simply apply the bit-masking/shifting in the opposite order:

C++:

uint8_t packet[4] = {0x7E, 0x78, 0x5, 0x81};

uint8_t msb = packet[2];
uint8_t lsb = packet[1];

// Cast before shifting so the shift happens in an unsigned 16-bit type
uint16_t sensor = ((uint16_t)msb << 8) | lsb;

Python:

packet = [0x7E, 0x78, 0x5, 0x81]

msb = packet[2]
lsb = packet[1]

sensor = (msb << 8) | lsb
print(sensor)

Floats:


Sometimes users need to transfer floating point numbers. For example, an Arduino collecting telemetry on a model rocket might need to send GPS coordinates to a ground station for datalogging/display on an LCD screen. What you can do in this case is multiply the sensor value by a given amount, transfer the value as a 16-bit integer (as described above), reconstruct the int on the receiving Arduino, and then divide by the same amount while saving to a float variable. Do be aware that you will lose accuracy with this method! If you want to retain full accuracy of the float during transmission, you will have to use the "Generalized Technique" as described in the next section!
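A quick sketch of the multiply/divide approach, assuming a scale factor of 100 (two decimal places) and Little Endian byte order. The names `SCALE`, `encode_scaled` and `decode_scaled` are made up for this example:

```python
SCALE = 100  # assumed: keep two decimal places

def encode_scaled(value):
    """Scale a float to an unsigned 16-bit int and split it into [LSB, MSB]."""
    raw = round(value * SCALE)
    if not 0 <= raw <= 0xFFFF:
        raise ValueError('scaled value does not fit in 16 bits')
    return [raw & 0xFF, (raw >> 8) & 0xFF]

def decode_scaled(lsb, msb):
    """Rebuild the 16-bit int and undo the scaling."""
    return ((msb << 8) | lsb) / SCALE

encoded = encode_scaled(23.456)
print(encoded)                    # [42, 9]
print(decode_scaled(*encoded))    # 23.46
```

The third decimal place is gone after the round trip, which is the accuracy loss mentioned above. The usable range is also limited to 0 through 655.35 with this scale factor.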

Note: see the IEEE 754 standard for how floats are represented in memory.

Generalized Technique: (C++ only - not Python)


A more elegant solution for transferring multi-byte values is to simply copy the bytes directly from memory into the packet's payload. We can do this easily with 2 pointers - one pointer to keep track of the current byte of the value we're copying over and one pointer to keep track of where we're at in the packet's payload (where we're saving the value to).

Here's an example function you can use to copy over the values from any object (except for "S"trings) to the packet payload:

	/*
	 void txObj(T &val, uint8_t len, uint8_t index)
	 Description:
	 ------------
	  * Stuffs "len" number of bytes of an arbitrary object (byte, int,
	  float, double, struct, etc...) into the transmit buffer (txBuff)
	  starting at the index as specified by the argument "index"
	 Inputs:
	 -------
	  * T &val - Reference to the object to be copied to the
	  transmit buffer (txBuff)
	  * uint8_t len - Number of bytes of the object "val" to transmit
	  * uint8_t index - Starting index of the object within the
	  transmit buffer (txBuff)
	 Return:
	 -------
	  * bool - Whether or not the specified index is valid
	*/
	template <typename T>
	bool txObj(T &val, uint8_t len, uint8_t index=0)
	{
		if (index < (MAX_PACKET_SIZE - len + 1))
		{
			uint8_t* ptr = (uint8_t*)&val;

			for (byte i = index; i < (len + index); i++)
			{
				txBuff[i] = *ptr;
				ptr++;
			}

			return true;
		}

		return false;
	}

We can then use pointers in a similar way on the receiving end to save an object (such as a full 32-bit float) directly to memory using a function like this:

	/*
	void rxObj(T &val, uint8_t len, uint8_t index)
	 Description:
	 ------------
	  * Reads "len" number of bytes from the receive buffer (rxBuff)
	  starting at the index as specified by the argument "index"
	  into an arbitrary object (byte, int, float, double, struct, etc...)
	 Inputs:
	 -------
	  * T &val - Reference to the object to be copied into from the
	  receive buffer (rxBuff)
	  * uint8_t len - Number of bytes in the object "val" received
	  * uint8_t index - Starting index of the object within the
	  receive buffer (rxBuff)
	 Return:
	 -------
	  * bool - Whether or not the specified index is valid
	*/
	template <typename T>
	bool rxObj(T &val, uint8_t len, uint8_t index=0)
	{
		if (index < (MAX_PACKET_SIZE - len + 1))
		{
			uint8_t* ptr = (uint8_t*)&val;

			for (byte i = index; i < (len + index); i++)
			{
				*ptr = rxBuff[i];
				ptr++;
			}

			return true;
		}

		return false;
	}
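Python has no raw pointers, so the two functions above don't carry over directly. If the other end of the link is a Python script, though, the standard `struct` module can produce and read the same byte layout. The sketch below assumes Little Endian order and a 32-bit float on the Arduino side; the names `tx_obj` and `rx_obj` are only borrowed for readability and this is not how any particular library does it:

```python
import struct

def tx_obj(tx_buff, fmt, value, index=0):
    """Pack value into tx_buff at index using struct format fmt; return next free index."""
    struct.pack_into(fmt, tx_buff, index, value)
    return index + struct.calcsize(fmt)

def rx_obj(rx_buff, fmt, index=0):
    """Read one value of struct format fmt from rx_buff starting at index."""
    return struct.unpack_from(fmt, rx_buff, index)[0]

tx_buff = bytearray(8)
next_index = tx_obj(tx_buff, '<f', 1.5)               # 32-bit float, 4 bytes
next_index = tx_obj(tx_buff, '<H', 1400, next_index)  # 16-bit unsigned int

print(list(tx_buff))              # [0, 0, 192, 63, 120, 5, 0, 0]
print(rx_obj(tx_buff, '<f', 0))   # 1.5
print(rx_obj(tx_buff, '<H', 4))   # 1400
```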

Next we'll go over how to parse packets without blocking code via a FSM (Finite State Machine).

Packet Parsing with FSM (Finite State Machine):

Before being able to correctly design a packet parsing algorithm, we must first define our protocol's packet anatomy. For this example, we're going to use all concepts previously covered in our packet anatomy plus support for dynamic payload sizes. This packet anatomy will directly mirror that which is used in the SerialTransfer libraries:

Start byte | COBS overhead byte | Payload length | Payload bytes | 8-bit CRC | Stop byte

Now that our packet anatomy is defined, we can begin designing our parsing algorithm. To outline the procedure the parser will need to complete, we'll make a finite state machine (FSM) diagram. A FSM is basically a high level framework for a process/procedure that can be depicted in something similar to a flowchart. FSMs can also be easily implemented in C++ using enum'd states and switch/case structures.

An example of a FSM implemented in C++ for a turnstile:

C++ Code:

enum fsm {
  locked,
  unlocked
};
fsm state = locked;

void setup()
{
  // do nothing
}

void loop()
{
  bool coin = true;

  switch (state)
  {

  case locked:
  {
    if (coin)
      state = unlocked;
    break;
  }

  case unlocked:
  {
    if (!coin)
      state = locked;
    break;
  }
  }
}

For more info on FSMs, check out the wiki.

Back to our parser. In order to create a parsing FSM, we will need to handle all of the following states:

  • find start byte
  • find COBS overhead byte
  • find payload length
  • find and process payload
  • find and verify crc byte
  • find end byte

To set up the FSM, we can create a set of enumerated types to represent each of these states:

enum fsm {
  find_start_byte,
  find_overhead_byte,
  find_payload_len,
  find_payload,
  find_crc,
  find_end_byte
};
fsm state = find_start_byte;

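In order to implement the parser's FSM, we'll use the following flow as a reference (the original post showed it as a diagram): the parser steps through the states in the order listed above, consuming one received byte per step, and falls back to find_start_byte whenever a check fails (payload length too large, CRC mismatch, or wrong end byte).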

Now that we have the overall idea of what our parser should do, we can look into how to actually implement it. For brevity, I'll just post the C++ and Python code that implements the parser's FSM:

C++:

/*
 uint8_t available()
 Description:
 ------------
  * Parses incoming serial data, analyzes packet contents,
  and reports errors/successful packet reception
 Inputs:
 -------
  * void
 Return:
 -------
  * uint8_t - Num bytes in RX buffer
*/
uint8_t available()
{
	if (Serial.available())
	{
		while (Serial.available())
		{
			uint8_t recChar = Serial.read();

			switch (state)
			{
			case find_start_byte://///////////////////////////////////////
			{
				if (recChar == START_BYTE)
					state = find_overhead_byte;
				break;
			}

			case find_overhead_byte://////////////////////////////////////
			{
				recOverheadByte = recChar;
				state = find_payload_len;
				break;
			}

			case find_payload_len:////////////////////////////////////////
			{
				if (recChar <= MAX_PACKET_SIZE)
				{
					bytesToRec = recChar;
					state = find_payload;
				}
				else
				{
					bytesRead = 0;
					state = find_start_byte;
					status = PAYLOAD_ERROR;
					return 0;
				}
				break;
			}

			case find_payload:////////////////////////////////////////////
			{
				if (payIndex < bytesToRec)
				{
					rxBuff[payIndex] = recChar;
					payIndex++;

					if (payIndex == bytesToRec)
					{
						payIndex = 0;
						state = find_crc;
					}
				}
				break;
			}

			case find_crc:///////////////////////////////////////////
			{
				uint8_t calcCrc = crc.calculate(rxBuff, bytesToRec);

				if (calcCrc == recChar)
					state = find_end_byte;
				else
				{
					bytesRead = 0;
					state = find_start_byte;
					status = CRC_ERROR;
					return 0;
				}

				break;
			}

			case find_end_byte:///////////////////////////////////////////
			{
				state = find_start_byte;

				if (recChar == STOP_BYTE)
				{
					unpackPacket(rxBuff, bytesToRec);
					bytesRead = bytesToRec;
					status = NEW_DATA;
					return bytesToRec;
				}

				bytesRead = 0;
				status = STOP_BYTE_ERROR;
				return 0;
				break;
			}

			default:
			{
				Serial.print("ERROR: Undefined state: ");
				Serial.println(state);

				bytesRead = 0;
				state = find_start_byte;
				break;
			}
			}
		}
	}
	else
	{
		bytesRead = 0;
		status = NO_DATA;
		return 0;
	}

	bytesRead = 0;
	status = CONTINUE;
	return 0;
}

Python:

def available():
    '''
    Description:
    ------------
    Parses incoming serial data, analyzes packet contents,
    and reports errors/successful packet reception
    
    :return bytesRead: int - number of bytes read from the received
                                  packet
    '''

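    # These names live at module level in this example, so they must be
    # declared global before being assigned inside the function
    global state, status, bytesRead, bytesToRec, payIndex, recOverheadByte

    if ser.in_waiting: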
        while ser.in_waiting:
            recChar = int.from_bytes(ser.read(),
                                     byteorder='big')

            if state == find_start_byte:
                if recChar == START_BYTE:
                    state = find_overhead_byte

            elif state == find_overhead_byte:
                recOverheadByte = recChar
                state = find_payload_len

            elif state == find_payload_len:
                if recChar <= MAX_PACKET_SIZE:
                    bytesToRec = recChar
                    payIndex = 0
                    state = find_payload
                else:
                    bytesRead = 0
                    state = find_start_byte
                    status = PAYLOAD_ERROR
                    return bytesRead

            elif state == find_payload:
                if payIndex < bytesToRec:
                    rxBuff[payIndex] = recChar
                    payIndex += 1

                    if payIndex == bytesToRec:
                        state = find_crc

            elif state == find_crc:
                found_checksum = crc.calculate(
                    rxBuff, bytesToRec)

                if found_checksum == recChar:
                    state = find_end_byte
                else:
                    bytesRead = 0
                    state = find_start_byte
                    status = CRC_ERROR
                    return bytesRead

            elif state == find_end_byte:
                state = find_start_byte

                if recChar == STOP_BYTE:
                    unpack_packet(recOverheadByte, rxBuff, bytesToRec)
                    bytesRead = bytesToRec
                    status = NEW_DATA
                    return bytesRead

                bytesRead = 0
                status = STOP_BYTE_ERROR
                return bytesRead

            else:
                print('ERROR: Undefined state: {}'.format(state))

                bytesRead = 0
                state = find_start_byte
                return bytesRead
    else:
        bytesRead = 0
        status = NO_DATA
        return bytesRead

    bytesRead = 0
    status = CONTINUE
    return bytesRead

Although I'm not going to go into detail on how exactly I designed the parser in code (too much typing I'm afraid, lol), I'm more than happy to answer any questions anyone may have about it.

Important Note:

You want to make sure your parser is as efficient as possible so that you don't tie up processing that may be needed for other elements of your project. Think of writing a parser similar to writing an ISR handler function - get in, do what processing you can, get out and finish later.

Note that both in C++ and Python, the parsing algorithm available() can (and should be) called many times before a packet is fully parsed without having to use blocking subroutines.
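To show the "call it many times" idea without any serial hardware, here is a self-contained sketch of the same state sequence as a small class that is fed one byte at a time. Everything in it is a simplification for illustration: the `Parser` class, the string state names and the constant values (0x7E, 0x81, 0xFE, polynomial 0x9B) are assumptions, not the library code.

```python
START_BYTE, STOP_BYTE, MAX_PACKET_SIZE = 0x7E, 0x81, 0xFE  # assumed values

def crc8(data, poly=0x9B):
    """Bitwise 8-bit CRC, MSB first, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

class Parser:
    def __init__(self):
        self.state = 'start'
        self.overhead = 0xFF
        self.length = 0
        self.payload = []

    def feed(self, byte):
        """Consume ONE byte; return the payload when a packet completes, else None."""
        if self.state == 'start':
            if byte == START_BYTE:
                self.state = 'overhead'
        elif self.state == 'overhead':
            self.overhead = byte
            self.state = 'length'
        elif self.state == 'length':
            if byte <= MAX_PACKET_SIZE:
                self.length, self.payload = byte, []
                self.state = 'payload' if byte else 'crc'
            else:
                self.state = 'start'          # bad length: drop the packet
        elif self.state == 'payload':
            self.payload.append(byte)
            if len(self.payload) == self.length:
                self.state = 'crc'
        elif self.state == 'crc':
            # CRC is checked over the still-stuffed payload, as in the code above
            self.state = 'stop' if byte == crc8(self.payload) else 'start'
        elif self.state == 'stop':
            self.state = 'start'
            if byte == STOP_BYTE:
                return self._unstuff()
        return None

    def _unstuff(self):
        data, idx = list(self.payload), self.overhead
        while idx < len(data):
            delta = data[idx]
            data[idx] = START_BYTE
            if delta == 0:
                break
            idx += delta
        return data

parser = Parser()
stuffed = [1, 1, 0, 0]   # COBS-stuffed form of [1, 126, 126, 0]
stream = [0x55, START_BYTE, 1, len(stuffed), *stuffed, crc8(stuffed), STOP_BYTE]
results = [parser.feed(b) for b in stream]
print(results[-1])       # [1, 126, 126, 0]

bad = list(stream)
bad[7] ^= 0x04           # corrupt the last payload byte in transit
bad_results = [parser.feed(b) for b in bad]
print(any(r is not None for r in bad_results))   # False
```

Each call to `feed()` does a tiny amount of work and returns straight away, so the rest of your program keeps running between bytes.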

In the next post, we'll go over an example that follows the entire transfer process.