Reliable Serial Binary Newline Character?

Hi all,

I am sending raw binary readings from an ADC through the Bluetooth Serial (on an ESP32 Board running Arduino). The format is as follows:

data1 data2 data3 data4 loopTime \r \n (All int32_t except "\r\n")

The code excerpt for that is:

// Send binary to Bluetooth Serial Port
ESP_BT.write((byte*)&data1, sizeof(int32_t));
ESP_BT.write((byte*)&data2, sizeof(int32_t));
ESP_BT.write((byte*)&data3, sizeof(int32_t));
ESP_BT.write((byte*)&data4, sizeof(int32_t));
ESP_BT.write((byte*)&t_finish, sizeof(int32_t));
ESP_BT.write('\r');
ESP_BT.write('\n');

On my PC (MacOS) I read the serial data using PySerial in the following manner:

for _ in range(10000):
    value = SER.read_until(b'\r\n')
    split_bytes = [value[4 * i:4 * i + 4] for i in range(5)]
    # signed=True because the values are sent as int32_t
    decoded_data = np.array(
        [int.from_bytes(split_bytes[i], "little", signed=True) for i in range(5)],
        dtype=float
    )

I should expect packets of 22 bytes, terminated by the '\r\n' newline sequence, but I find this isn't reliable, as occasionally '\r\n' shows up in the transmitted data itself (see highlighted line in attached image). The monitor output is: decoded_data, len(value), value

As you can see, '\r\n' shows up in the data and splits the packet into 18 bytes instead of 22. Obviously, I can handle this fairly easily with additional code, but since I am seeking a specific data rate (~500 Hz) I want a cleaner way of doing this.

Is there a better way to mark the end of each binary packet without having to send too many (or any) additional bytes?

Thanks in advance! :smiley:

The usual way around your problem is to change the message format: add a byte at the beginning of the message that gives the exact length of the message, including the new byte. Get rid of the carriage return and new line as you have it now. Your read then has to be sure to read that number of bytes before processing the message.

Thanks for the quick reply Paul,

I don't think I quite catch on. I know that my packets will always be a fixed size of 5 bytes, so I can tell the PC to read_until() exactly 5 bytes plus the byte at the front as you say. But how do I synchronise the start byte so that it is not taken as being part of the data?

Read 6 bytes, one at a time and put into an array of bytes. The first byte, zero, will be the length of the message. If you will NEVER, ever need to change the number of bytes, you do not even need the length.

You cannot use the SER.read_until() anymore. Just do it yourself.
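A minimal Python sketch of the PC side of this idea, assuming a hypothetical frame made of one length byte followed by five little-endian int32_t values. The helper names `read_exact` and `read_frame` are made up for illustration, and `port` can be anything with a `read(n)` method (a `serial.Serial` object, or the `io.BytesIO` stand-in used here so it runs without hardware).

```python
import io
import struct

PAYLOAD_FORMAT = "<5i"                            # five little-endian int32_t values
FRAME_LEN = 1 + struct.calcsize(PAYLOAD_FORMAT)   # length byte + 20 data bytes = 21


def read_exact(port, n):
    """Read exactly n bytes from port, or raise if the stream ends or times out."""
    data = bytearray()
    while len(data) < n:
        chunk = port.read(n - len(data))
        if not chunk:
            raise EOFError("timed out or stream closed mid-frame")
        data += chunk
    return bytes(data)


def read_frame(port):
    """Read one frame: a length byte (counting itself) followed by the payload."""
    length = read_exact(port, 1)[0]
    if length != FRAME_LEN:
        raise ValueError(f"unexpected length byte {length}")
    payload = read_exact(port, length - 1)
    return struct.unpack(PAYLOAD_FORMAT, payload)


# Stand-in for the serial port. The value 0x0A0D puts the bytes '\r\n' into the
# payload on purpose, to show that the data no longer affects the framing.
fake_port = io.BytesIO(
    bytes([FRAME_LEN]) + struct.pack(PAYLOAD_FORMAT, 1, -2, 3, 0x0A0D, 2000)
)
values = read_frame(fake_port)
print(values)
```

Note that counting bytes like this only works once the reader is already aligned with the start of a frame; it does not by itself recover if the port is opened in the middle of one.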

So let's say your expected input is a "start byte" (say 0x02), and you expect 5 bytes of data, you could do something like...

void loop()
{
  // static so the receive state survives between calls to loop()
  static byte buffer[10];
  static int idx = 0;
  static boolean started = false;

  while (Serial.available() > 0)
  {
    byte b = Serial.read();

    if (!started)           // Ignore anything received until we get a start byte.
    {
      if (b == 0x02)        // Start byte received
      {
        idx = 0;
        started = true;
      }
    }
    else
    {
      buffer[idx++] = b;    // Add data bytes to my buffer

      if (idx == 5)         // Received all bytes?
      {
        // Process buffer
        // Clear buffer
        started = false;    // Wait for another start byte
      }
    }
  }
}

This makes sense now, thanks guys, including why a starting character, rather than an ending one, makes more sense. I have played around using 'o\i' as a random character sequence and it does not show up in the data (as of my limited testing), albeit adding 3 extra bytes to each packet.

My question still stands: if the start byte is, say, 0x02, that will certainly show up in the middle of my data packet, so I may very well get my 5 bytes but the data will be framed incorrectly - am I missing something here?

I was just reading up on Consistent Overhead Byte Stuffing (COBS), which looks interesting, but I was hoping to avoid the added complexity of these embedded system protocols if I can.

Not if you use the code I posted above. It will essentially discard any bytes received prior to the start byte. It will then accept the next 5 bytes, then again discard anything before the next start byte.

For a fixed length input this should be fine. If you have a variable length input then this is better suited to having an "end byte" as well.

I can see how it does that, red_car, however I still have two fundamental issues, which is my fault for not explaining in enough detail.

Problem 1:
I am receiving fixed-size data packets, but the binary bytes are ADC readings and will vary significantly. From some experimentation, I can fairly safely assume any byte combination (in hex code on the PC side) will eventually show up in the data. Hence 0x02, for example, will not be a good starting byte since it will show up periodically.

Problem 2:
The above problem doesn't exist as you say, as long as you get a perfect communication synchronisation from the start of the data sending event. However, the readings will eventually be used in an application, with a lot of stop/start behaviour on the serial communication based on user inputs, so if 0x02 is showing up all over the place, and the COM port is opened half way through a data packet... I think the issue is clear.

What I am hoping for is to be directed towards a clever way of using a unique 'start' or 'end' byte without the complexity of something like COBS. Very grateful for all the wise advice I've been given so far :slightly_smiling_face:
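For a sense of how much complexity COBS actually involves, here is a simplified Python sketch of it, restricted to payloads of at most 253 bytes so that the block-splitting rule of the full algorithm never comes into play. The function names are made up for illustration and this is not taken from any particular library. The encoded bytes never contain 0x00, so 0x00 can serve as a packet delimiter that cannot appear in the data.

```python
import struct


def cobs_encode(data):
    """Encode data (at most 253 bytes) so that the result contains no 0x00 byte."""
    if len(data) > 253:
        raise ValueError("this simplified sketch only handles up to 253 bytes")
    out = bytearray()
    # Each zero-free block is preceded by a code byte equal to its length + 1,
    # which tells the decoder where the next (removed) zero used to be.
    for block in data.split(b"\x00"):
        out.append(len(block) + 1)
        out += block
    return bytes(out)


def cobs_decode(encoded):
    """Reverse cobs_encode; encoded must not include the 0x00 delimiter."""
    if len(encoded) > 254:
        raise ValueError("this simplified sketch only handles short packets")
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1:i + code]
        if code == 0 or len(block) != code - 1:
            raise ValueError("malformed packet")
        out += block
        i += code
        if i < len(encoded):
            out.append(0)          # restore the zero that separated two blocks
    return bytes(out)


# Five int32_t readings, including zeros and the '\r\n' byte pair (0x0A0D).
payload = struct.pack("<5i", 0, 0x0A0D, -1, 2, 2000)
frame = cobs_encode(payload) + b"\x00"      # 0x00 marks the end of the packet
print(len(payload), len(frame))
```

For payloads this short the encoding adds exactly one byte, so with the delimiter the packet is 22 bytes, the same size as the original '\r\n' format. On the PC side the packet could then be read with `read_until(b"\x00")` and the final byte stripped before decoding.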

You might try this to see if there is any improvement. First set up a unique EOL sequence in the microprocessor.

Serial.write((byte*)&data1, sizeof(int32_t));
Serial.write((byte*)&data2, sizeof(int32_t));
Serial.write((byte*)&data3, sizeof(int32_t));
Serial.write((byte*)&data4, sizeof(int32_t));
Serial.write((byte*)&data5, sizeof(int32_t));
Serial.write("****");

In Python, check for the EOL and the packet length:

EOL = "****".encode()
x = 0

while x < 10000:
    my_data = SER.read_until(EOL)

    # A good packet is 20 data bytes followed by the 4-byte EOL
    if my_data.endswith(EOL) and len(my_data) == 24:
        print(my_data)
        x += 1
    else:
        print("Error")

Adjust this to display your 5 values; it will give an indication of how many errors you get.


Is the sending application able to reserve a specific value (say 0xFF) to indicate a start byte?

If not, then your only real option is to use multiple bytes in sequence to be the start marker.

Even then it is not straightforward, if a transmission can terminate mid stream. I was thinking something like below...

void setup()
{

}

void loop()
{

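  // static so the receive state survives between calls to loop()
  static const byte startBytes[3] = {0xFF, 0x02, 0xFE};
  static uint8_t startByteCount = 0;

  static byte buffer[10];
  static int idx = 0;
  static boolean started = false;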

  while (Serial.available() > 0)
  {
    byte b = Serial.read();


    if (started)           // Ignore anything received until we get a start byte sequence.
    {
      buffer[idx++] = b;   // Add bytes to buffer

      if (idx == 5)        // Received all bytes?
      {
        // Process buffer
        // Clear buffer
        started = false;   // Wait for another start byte sequence
      }
    }
    else
    {
      if (b == startBytes[0])                                 // 1st start byte received
      {
        startByteCount = 1;
      }
      else if (b == startBytes[1] && startByteCount == 1)     // 2nd start byte received
      {
        startByteCount = 2;
      }

      else if (b == startBytes[2] && startByteCount == 2)     // 3rd start byte received
      {
        startByteCount = 0;
        idx = 0;
        started = true;
      }
      else                                                    // Non start byte
      {
        startByteCount = 0;
      }

    }
  }
}

but this will fail if let's say you have received 4 bytes of the message, then it fails, and begins with another start sequence.

What is the time between transmissions? It might be better to read everything received into a buffer, then evaluate the buffer looking for the start sequence pattern.
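The "read everything into a buffer, then look for the start sequence" idea could be sketched on the PC side roughly as follows. This is only an illustration under stated assumptions: a 20-byte payload (five int32_t values), the three start bytes from the sketch above, and a made-up helper name `extract_frames`. To guard against the start sequence appearing inside the data, a candidate frame is accepted only if it is immediately followed by another start sequence.

```python
import struct

START = b"\xFF\x02\xFE"          # start sequence from the sketch above
PAYLOAD_LEN = 20                 # assumption: five int32_t values per frame
FRAME_LEN = len(START) + PAYLOAD_LEN


def extract_frames(buffer):
    """Pop every confirmed payload out of buffer (a bytearray), in place."""
    payloads = []
    while True:
        start = buffer.find(START)
        if start < 0:
            # No candidate: keep only a tail that could be a split start sequence.
            del buffer[:max(0, len(buffer) - (len(START) - 1))]
            return payloads
        del buffer[:start]                       # drop bytes before the candidate
        if len(buffer) < FRAME_LEN + len(START):
            return payloads                      # wait for more data
        if buffer[FRAME_LEN:FRAME_LEN + len(START)] == START:
            payloads.append(bytes(buffer[len(START):FRAME_LEN]))
            del buffer[:FRAME_LEN]
        else:
            del buffer[:1]                       # false start inside data; keep searching


# frame_b's first value is chosen so that its payload begins with the start bytes.
frame_b = START + struct.pack("<5i", 0x00FE02FF, 7, 8, 9, 10)
frame_c = START + struct.pack("<5i", 11, 12, 13, 14, 15)
frame_a = START + struct.pack("<5i", 1, 2, 3, 4, 5)

# Simulate opening the port just after the real start sequence of frame_b.
rx = bytearray(frame_b[len(START):] + frame_c + frame_a + START)
decoded = [struct.unpack("<5i", p) for p in extract_frames(rx)]
print(decoded)
```

In real use, each chunk read from the port would be appended to `rx` before calling `extract_frames(rx)` again. The trade-off of this design is that a frame is only released once the following start sequence has arrived, which adds one packet period of latency.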


It seems like a unique EOL sequence will be the way to go, and I will just deal with the added load on the serial line.

The data consists of 32-bit numbers from the ADC code, so anything between 0x00 and 0xFF for each of the four bytes per channel. So no, probably no reserved bytes as far as I can tell.

The transmission frequency is approximately 500 Hz (2 ms) with my current hardware, however the goal is 1000 Hz (1 ms) and the application needs to be "real-time", which is why I'm searching for a computationally simple solution. I have tried a 3-byte EOL with success so far, but I might switch to a 2-byte 'start-byte' and 2-byte 'end-byte' approach for robustness. Let me know your thoughts!

What ADC are you using that can produce a 32-bit number with the high-order byte = 0xFF?

Paul, it is the ADS1262 from Texas Instruments. Technically, a full-scale reading will likely not happen with a load cell (which is what it is measuring), however I think it's more of a synchronisation issue, since 0xFF will show up in the middle bytes or LSB of the 32-bit numbers.

I have given this a bit of thought this afternoon.
My solution would be to split the data into 4-bit nibbles and put each nibble into a byte with the 4 high-order bits set to ZERO. That will double the number of bytes being transferred, but now you can code the high-order bits of ONE byte to indicate either the start of a complete message OR the last byte of a complete message. Your choice.
The receiving device will, of course, need to put the nibbles back into complete bytes to recover the original information.
What do you think?
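A small Python sketch of the nibble idea, for illustration only. It assumes the "start of message" choice, with a hypothetical flag value of 0x80 set on the first byte; the function names are made up.

```python
import struct

START_FLAG = 0x80   # hypothetical: high bit set only on the first byte of a message


def nibble_encode(payload):
    """Split each byte into two 4-bit nibbles; flag the first output byte."""
    if not payload:
        raise ValueError("empty payload")
    out = bytearray()
    for b in payload:
        out.append(b >> 4)       # high nibble, upper 4 bits are zero
        out.append(b & 0x0F)     # low nibble, upper 4 bits are zero
    out[0] |= START_FLAG         # mark the start of the message
    return bytes(out)


def nibble_decode(encoded):
    """Rebuild the original bytes from a message produced by nibble_encode."""
    if not encoded or not encoded[0] & START_FLAG or len(encoded) % 2:
        raise ValueError("not a complete message")
    nibbles = [b & 0x0F for b in encoded]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


payload = struct.pack("<5i", -1, 0x0A0D, 0x02, 123456789, 2000)
message = nibble_encode(payload)
print(len(payload), len(message))
```

Because only the first byte of a message ever has a high bit set, a receiver that joins mid-message can simply discard bytes until it sees one with `START_FLAG`, at the cost of sending 40 bytes instead of 20.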

I can see you are exploring your options, here is something else to consider.

My earlier example printed the data, which perhaps was not the best idea. I have a standalone Python script here that so far has been reading 10K packets containing 24 bytes per packet in about 20.4 seconds on my setup with zero errors.

The serial port runs at 115200 baud and uses the four asterisks as the EOL. We know there are 24 bytes in each packet, and the EOL in effect acts as both the start and the end marker of each packet; both of these things help us produce the simplest of protocols.
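As a rough sanity check on those numbers, the arithmetic below estimates how many 24-byte packets per second a UART link can carry. It assumes the common 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte), which is an assumption and not something stated here; the helper name is made up.

```python
def max_packets_per_second(baud, packet_bytes, bits_per_byte=10):
    """Upper bound on packet rate for a UART, assuming 10 wire bits per byte (8N1)."""
    return baud / (packet_bytes * bits_per_byte)


rate_at_115200 = max_packets_per_second(115200, 24)
baud_needed_for_1khz = 1000 * 24 * 10     # bits per second for 1000 packets/s

print(rate_at_115200, baud_needed_for_1khz)
```

Under that assumption, 115200 baud tops out at about 480 packets per second for 24-byte packets, which is in the same range as the 10K packets in roughly 20.4 seconds reported above. A 1000 Hz stream of 24-byte packets would need a link carrying at least 240000 bits per second. A Bluetooth serial link may behave differently, so this only applies directly to a wired UART.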

I used a UNO in the test and found that by sending a signal from the Arduino that told my script that the Arduino serial had begun, I was able to cut out some of the errors that occurred at the start of the serial stream. Here is what that looks like.

void setup()
{
Serial.begin(115200);  

Serial.print("<>");
}

It is acknowledged in the script with the "Arduino_Started" variable, you may or may not need that little routine.

The Python script is below. It splits the data into its 5 parts separated with commas and writes it all to a StringIO file in memory. When it has finished, it displays the contents of the memory file, reports the number of errors, closes the serial port, and finally closes the memory file.

import serial
import io

ser=serial.Serial("COM7",115200,timeout=5)
memory_file=io.StringIO()
err_cnt=0
idx=0
i=0

Arduino_Started="<>".encode()
EOL="****".encode()
	
ser.read_until(Arduino_Started)
print("acquisition started")

while idx<10000:

    my_data=ser.read_until(EOL)

    if EOL in my_data and len(my_data)==24:
        for i in range (0,20,4):
            # each value is a 4-byte int32_t, so slice i..i+4
            memory_file.write(f",{int.from_bytes(my_data[i:i+4],'little')}")
        idx+=1

    else:
        err_cnt+=1
        print("Error")

print(memory_file.getvalue())
print("Errors=" f"{err_cnt}")
ser.close()
memory_file.close()

P.S. although it reports errors those errors are not written to file

@Paul_KD7HB I had a similar thought... but rather than double the number of bytes by breaking into nibbles, just set the high bit of each data byte to zero. The original high bits could be stored in one extra byte... so 5 bytes total would be required for each 4-byte value. Using that approach, 0xFF could be used as the record delimiter, as it would never appear in the data.
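A Python sketch of that high-bit idea, under the assumption that each group of 4 data bytes is followed by one extra byte whose low 4 bits hold the stripped high bits. The function names and the exact bit layout are made up for illustration.

```python
import struct

DELIMITER = 0xFF    # can never appear in the body, since every body byte is <= 0x7F


def pack_group(group):
    """4 data bytes -> 5 bytes, none of which has its high bit set."""
    out = bytearray()
    high_bits = 0
    for pos, b in enumerate(group):
        out.append(b & 0x7F)                  # data byte with the high bit cleared
        high_bits |= (b >> 7) << pos          # remember the original high bit
    out.append(high_bits)                     # extra byte, value 0..15
    return bytes(out)


def unpack_group(packed):
    """Reverse pack_group: 5 bytes -> the original 4 data bytes."""
    high_bits = packed[4]
    return bytes(packed[pos] | (((high_bits >> pos) & 1) << 7) for pos in range(4))


def encode_record(payload):
    if len(payload) % 4:
        raise ValueError("payload must be a whole number of 4-byte values")
    body = b"".join(pack_group(payload[i:i + 4]) for i in range(0, len(payload), 4))
    return body + bytes([DELIMITER])


def decode_record(record):
    if record[-1:] != bytes([DELIMITER]) or (len(record) - 1) % 5:
        raise ValueError("malformed record")
    body = record[:-1]
    return b"".join(unpack_group(body[i:i + 5]) for i in range(0, len(body), 5))


payload = struct.pack("<5i", -1, 0x0A0D, 0x7FFFFFFF, -2**31, 2000)
record = encode_record(payload)
print(len(payload), len(record))
```

With five int32_t values this gives 25 body bytes plus the delimiter, 26 bytes per record, and the receiver can resynchronise at any time by waiting for the next 0xFF.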

You know, it's neither difficult nor expensive to implement an escape mechanism to ensure that a particular byte value appears ONLY as a packet terminator.

static const char TERMINATOR = 12;
static const char ESCAPE = 13;

void sendbyte(byte b) {
  if (b == TERMINATOR || b == ESCAPE) {
    b += 10;               // make the byte different.
    Serial.write(ESCAPE);  // but precede it by an escape.
  }
  Serial.write(b);         // write original or modified byte.
}

// bufptr is assumed to point at the next unread byte of the received packet.
byte readbyte() {
  char b = *bufptr++;      // next character
  if (b == ESCAPE) {       // escaped?
    b = (*bufptr++) - 10;  // get next byte and modify backward.
  }
  return b;
}

This is extremely helpful, thank you everyone. I am going to try the asterisk EOL, sumguy, and I particularly like the idea of indicating start/stop transmissions between devices.

That's quite neat, westfw. I can probably make 0xFF the terminator and modify it to take the int32 array as input and split it into bytes...
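A rough Python sketch of what that could look like, mainly to show the PC-side decoding. The assumptions are loud ones: 0xFF as the terminator, 0xFE as a hypothetical escape byte, and an XOR with 0x20 to alter escaped bytes instead of the "+10" used in the sketch above (any reversible change that never produces the terminator would do). The function names are made up.

```python
import struct

TERMINATOR = 0xFF
ESCAPE = 0xFE       # hypothetical choice of escape byte
FLIP = 0x20         # escaped bytes are sent XORed with this value


def encode_packet(values):
    """Pack int32 values and escape any byte equal to TERMINATOR or ESCAPE."""
    raw = struct.pack(f"<{len(values)}i", *values)
    out = bytearray()
    for b in raw:
        if b in (TERMINATOR, ESCAPE):
            out.append(ESCAPE)       # announce that the next byte was altered
            out.append(b ^ FLIP)
        else:
            out.append(b)
    out.append(TERMINATOR)           # the only place 0xFF can appear
    return bytes(out)


def decode_packet(packet, count=5):
    """Undo the escaping and unpack count int32 values."""
    if packet[-1:] != bytes([TERMINATOR]):
        raise ValueError("missing terminator")
    raw = bytearray()
    body = iter(packet[:-1])
    for b in body:
        if b == ESCAPE:
            try:
                b = next(body) ^ FLIP
            except StopIteration:
                raise ValueError("escape byte at end of packet") from None
        raw.append(b)
    if len(raw) != 4 * count:
        raise ValueError("wrong payload length, probably a partial packet")
    return struct.unpack(f"<{count}i", bytes(raw))


values = (-1, 254, 0x0A0D, 0, 2000)
packet = encode_packet(values)
print(len(packet), decode_packet(packet))
```

With pySerial, each packet could then be read with `read_until(bytes([TERMINATOR]))`. The packet length now varies with the data (up to twice the payload size in the worst case), so the length check after unescaping is what catches a packet that was joined halfway through.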

Will have a go at implementing all this and get back to you all soon!
