Req: Parse bytestream with Data Link Escaping via Serial to byte array

Hi

I’m trying to use an Arduino Mega to receive and parse streams of bytes from some laboratory equipment (received via the Serial1 port) into a 321 byte array in real-time.

My problem is how to implement a Data Link Escaping (DLE) protocol (also called “byte stuffing”, “octet stuffing”, or “data transparency” - see below) on the Mega in real time in a way which is efficient and stable.

A new packet of data is received at approximately 10 second intervals.
Character frame : Asynchronous. 19,200 baud 8-E-1 RTS/CTS
Packet length: Variable prior to parsing (>= 321 bytes: the actual number of bytes depends on the number of additional DLE bytes); Fixed length of 321 bytes after parsing.

Each packet of data is enclosed by a start and end flag: 0x7E.
Because the same flag is used for both start and end of data, there is also the possibility of the Arduino getting “out of phase” with the incoming data and trying to read the gaps between the flags rather than the data between the flags, and the code needs to prevent this.

If the data bytes contain 0x7E, then this would prematurely signal “End of Message” (EOM).
Therefore a DLE protocol has been employed by the equipment manufacturer as follows:
0x7D acts like a “shift” key. It should not be parsed as data, but indicates that the next byte to be read should be XORed (^) with 0x20, so:
0x7D 0x5E => 0x5E ^ 0x20 => 0x7E, and:
0x7D 0x5D => 0x5D ^ 0x20 => 0x7D
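
The escaping rule above can be sketched as a small routine that works on a plain byte buffer, so it can be checked on a laptop before it goes anywhere near the Mega. This is only a minimal sketch: the function name dleUnstuff, the constant names and the -1 error convention are made up for illustration and are not part of the equipment's protocol. It assumes the input is the run of bytes found between two 0x7E flags.

```cpp
#include <stddef.h>
#include <stdint.h>

// Framing bytes as described in the post.
const uint8_t FLAG_BYTE   = 0x7E;  // start/end-of-packet flag (not expected inside the input here)
const uint8_t ESCAPE_BYTE = 0x7D;  // "shift" byte: the byte after it is XORed with 0x20
const uint8_t ESCAPE_XOR  = 0x20;

// Hypothetical helper: remove the byte stuffing from in[0..inLen-1] and write
// the decoded bytes to out. Returns the number of bytes written, or -1 if the
// output buffer is too small or the input ends with a dangling escape byte.
long dleUnstuff(const uint8_t *in, size_t inLen, uint8_t *out, size_t outCap)
{
  size_t n = 0;          // next free position in out
  bool escaped = false;  // true when the previous byte was ESCAPE_BYTE

  for (size_t i = 0; i < inLen; i++) {
    uint8_t b = in[i];
    if (!escaped && b == ESCAPE_BYTE) {
      escaped = true;    // do not store the shift byte itself
      continue;
    }
    if (escaped) {
      b ^= ESCAPE_XOR;   // 0x5E -> 0x7E, 0x5D -> 0x7D
      escaped = false;
    }
    if (n >= outCap) {
      return -1;         // never write past the end of the array
    }
    out[n++] = b;
  }
  return escaped ? -1 : (long)n;
}
```

For example, the seven input bytes 01 7D 5E 02 7D 5D 03 decode to the five bytes 01 7E 02 7D 03.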

I’ve successfully written code to do the above in Python on a laptop, using Python’s powerful string handling routines; but I’m struggling to adapt this to work using a byte array on the Mega.

I’m using Serial1 to handle the data and Serial to report back via the USB cable to the Arduino Serial Monitor to keep track of program flow and aid debugging.

Despite several different approaches, I’m getting unpredictable results with intermittent freezes, board resets, and mangling of the text returned to the Serial Monitor; occurring apparently at random.

I wondered if overflow of the Mega’s serial buffer might be to blame; therefore I’ve edited the core file HardwareSerial.cpp (as explained elsewhere on this forum) to increase the buffer size from 64 to 256 bytes; however this has not resolved my problems; and despite an extensive search of the forums and web, I’ve been unable to find a solution.

A crude (non-working) minimal example of my code is pasted below to indicate my general approach; although I’ve modified this repeatedly in desperation to try and get the **** thing to work!

I’d be very grateful for any help you can give me.

Many thanks in anticipation

Dave

const int RxPin = 19; // Serial1 Rx. Connect to Tx pin on RS232-TTL adaptor

byte data[321];
byte d = 0; // buffer for reading data byte
byte e = 0;// buffer for reading data byte

void setup()
{
// Serial Rx and Tx are automatically configured to USB & pins 0 and 1 respectively
//Serial 1
pinMode(RxPin,INPUT);
Serial.begin(9600,SERIAL_8N1); // 
Serial1.begin(19200,SERIAL_8E1);
}

//=========================================================================================
// Main Loop starts here
//=========================================================================================
void loop()
{Serial.println("Starting main loop");  

//=========================================================================================
// Read incoming data from Serial1, applying correction for DLE bytes
//========================================================================================= 

int state = 1; // reset state to 'wait'
int n = 0; // reset pointer for data array

do{
Serial.println("Starting switch/case loop");
switch(state)
{ // start of switch/case loop
// state 1 = Wait; state 2 = Read; state 3 = Exit. Enter loop in "Wait" state.

//------------------------------------------------------
// State = "WAIT"
//------------------------------------------------------
case 1:
Serial.print("State = WAIT ");Serial.println(state);
while(true){
    if (Serial1.available()>0){ // wait for data to be available
    d = Serial1.read();
    if (d == 0x7E){
     break;
    }  
  }    
}     // if data available, wait for 7E ( = start or end flag)

Serial.println("Found 7E!!");

if (Serial1.available()>0){
e = Serial1.read(); // read the next byte. If it is available, 7E must represent a "start" flag, so:
Serial.println(e);
data[n] = d; // save these first two bytes to the data array
n +=1;
data[n] = e;
n +=1;
state = 2;  // return to the switch/case loop, and enter "Read" mode to read more data
break;
}
// otherwise, if the next byte is not available, 7E must represent an "end" flag. Return to the switch/case loop in "Wait" mode and keep waiting for more data
else{
break; 
}

//------------------------------------------------------
// State = "READ"
//------------------------------------------------------
case 2:
Serial.print("State: READ ");Serial.println(state);
do {
  if (Serial1.available()>0){
  d = Serial1.read(); // read the next byte
  }
  
// If the byte is not a stop (7E) or shift (7D) byte, it is data: add it to the array:
 if ((d != 0x7E) && (d != 0x7D)){
    //Serial.println("Collecting data");
     data[n] = d;
     n += 1;
   }   
   
// If the byte is 7D, then we need to read and shift the next byte:   
  if (d == 0x7D){ 
    Serial.println("Byte shifted");
    e = Serial1.read(); // read next byte
    e ^= 0x20; // Bitwise XOR next byte with 0x20: so 7D5E => 7E, and 7D5D => 7D
    data[n] = e; // and save the result to the array
    n += 1;
  }
   
// If the byte is 7E, then this must be a stop flag
  if (d == 0x7E){ 
    Serial.println("Stop flag");
    data[n] = d; // add this 7E to the collected data
    state = 3;  //and change the state to allow exit from the switch/case loop
  }
  
} while(state == 2);// continue to read frame until stop flag is found
break;

} // end of switch/case
} while (state < 3); // end of do/while loop

//------------------------------------------------------
// State = "EXIT"
//------------------------------------------------------
Serial.print("State = EXIT ");Serial.println(state);
Serial.print("Bytes collected: ");Serial.println(n);
Serial.flush(); //flush the buffers
Serial1.flush();

//continue with code to process data..
}

Despite several different approaches, I'm getting unpredictable results with intermittent freezes, board resets, and mangling of the text returned to the Serial Monitor; occurring apparently at random.

Here's a clue:

Serial.begin(9600,SERIAL_8N1); // 
Serial1.begin(19200,SERIAL_8E1);

Sending data out slower than you read it in isn't a good idea.

{Serial.println("Starting main loop");

Not a single recognized coding style accepts ANYTHING on the line after the {. I'm not impressed with your attempt to invent a new one.

Using do/while in place of a normal while loop is rarely a good idea. In a do/while loop, the body is executed and then the condition is checked. In a while loop, the condition is checked first, then, if appropriate, the body is executed.

The (almost) complete lack of indenting makes your code too hard to read.

Thanks for the reply Paul.

Apologies for the poor coding - I'm not a trained programmer. I'm a newbie to Arduino programming and to this forum, and I'm hoping for some constructive expert guidance in what is proving to be a difficult problem for me to implement on an Arduino.

"Not a single recognized coding style accepts ANYTHING on the line after the {. I'm not impressed with your attempt to invent a new one."

Not intentional - wish I had the skill to invent a new coding style :) Probably because I've come from a background in Python rather than C, I'm aware my formatting leaves a lot to be desired. As I'm sure you know, Python formatting is based around indents rather than curly brackets, so I'm still getting to grips with the new conventions associated with programming the Arduino.

"Here's a clue: Serial.begin(9600,SERIAL_8N1); Serial1.begin(19200,SERIAL_8E1); Sending data out slower than you read it in isn't a good idea."

Apologies - I should have clarified: All the data is being read in from an external device via a RS232-TTL converter attached to the Serial1 Rx pin of the Mega. Once I can read & parse the data properly from Serial1, I plan to send it out to another device via Serial2. I'm just using Serial as a temporary measure to communicate details of variables and program flow via the USB cable back to a laptop running the Arduino Serial Monitor, so I can debug my code. It's not intended to be part of the final implementation.

I thought that Serial, Serial1, Serial2 and Serial3 were completely independent, and therefore could run at different speeds without causing conflict, so left Serial at the slower default settings. If I understand you correctly though, you're saying that by not setting both serial ports to the same speed, I may be causing problems. Thanks for this suggestion. Will try increasing Serial to 19200 baud and see if this helps.

"Using do/while in place of a normal while loop is rarely a good idea."

I haven't used "do/while" loops much previously: they are not available in Python; but I thought it was the appropriate construct in the places that I've used it here - i.e. where a block of code had to be run through at least once before checking the condition. However if it is likely to cause problems (particularly if there are nested do/while loops?), I'm happy to try and re-write this using "while" instead.

Thanks again for your advice Paul. I'd be very grateful for any further helpful suggestions & guidance from the forum.

BW

Dave

Hi folks

I’ve re-written my code and incorporated everything that Paul suggested: Replacing Do-While loops, setting Serial and Serial1 to the same baud rate, and even improving the formatting (see code below); but still no joy. I get a load of garbled text in the Serial Monitor (screen dump jpg attached), and the routine never exits.

To make matters worse, last time I re-compiled the code, it failed to compile with the error message:
‘SERIAL_8E1 was not declared in this scope’
This is a new problem, and I haven’t changed anything: Still using the same Arduino compiler (1.0.5_v2) and selecting the right Arduino board (Mega 2560).

I’m guessing that the Mega 2560 must be plenty powerful and fast enough to parse 321 bytes of data using a short routine at the relatively slow speed of 19200 baud.
Is it possible that the Mega is running too fast for the data - do I need to slow it down with some carefully placed short delays?

Similarly, I’ve tried increasing the Hardware Serial buffer (cores>arduino>HardwareSerial.cpp) from 64 to 512 bytes, but this seems to make no difference; so I suspect I’m making a dumb mistake in my code somewhere or missing something…

I understand that indiscriminate flushing of buffers is generally a bad thing, as it can result in data loss; but could a strategically placed “Serial1.flush()” statement be the answer to my problem in this case?

Parsing serial data with DLE is a common enough task (I believe that the www uses a version of this protocol); so if my own efforts are futile, is there maybe a library out there which will perform the task?

Apart from the above, I really have no idea how to proceed… :~
I’d be really grateful for your expert advice.

Thanks again in anticipation for your tolerance and guidance

Dave

byte data[321]; // byte array to store parsed data
byte d = 0; // buffer for reading data
byte e = 0;// buffer for reading data

void setup()
{
  Serial.begin(19200,SERIAL_8E1); // for monitoring/debugging via USB & Serial Monitor
  Serial1.begin(19200,SERIAL_8E1); // for receiving data from RS232 to TTL adaptor
}
void loop()
{ // receive next packet of data
  int state = 1; // reset state to 'Wait'
  int n = 0; // reset pointer for data array

  // start of 'state machine' loop: state 1 = Wait; state 2 = Read; state 3 = Exit.
  while (state <3){ // Wait and Read; otherwise Exit and continue

    //------------------------------------------------------
    // State = "WAIT"
    //------------------------------------------------------
    while (state == 1){
      Serial.print("State = WAIT ");
      Serial.println(state);

      while(d != 0x7E){   // if data available, wait for 7E ( = start or end flag)
        while (Serial1.available() < 0){ // keep looping until data available
        }
        d = Serial1.read();  
      }
      /*
      If 7E is found, it MUST be a start or finish flag
       7E can't occur within the data, as it will have been shifted to "7D5E" by DLE protocol
       therefore if we start reading the data mid-packet, or between packets, the program flow will not
       get to this point, and just keep reading until 7E is found
       */

      Serial.println("7E found");

      if (Serial1.available() > 0 ){
        // If 7E is followed by more data, this must be a start flag 
        data[n] = d;
        n++;
        state = 2; // change state to "Read"
        Serial.print("State: READ ");
        Serial.println(state);
      }
      //If no data follows 7E, we're at a stop flag; so keep waiting in the 'WAIT' while loop

    } // end of WAIT 'while' loop

    //------------------------------------------------------
    // State = "READ"
    //------------------------------------------------------
    while (state == 2){

      while (Serial1.available() < 0){ // keep looping until data available
      }
      /*
      Shouldn't need to check Serial1.available(), as we can only get to this point if 
       we're at start of a data packet, therefore there MUST be bytes of data available
       following the start flag...but this might help stability
       */

      d = Serial1.read();
      // If the byte is not a stop (7E) or shift (7D) byte, add it to the array
      if ((d != 0x7E) && (d != 0x7D)){
        data[n] = d;
        n++;
      }   

      // If the byte is 7E, then this must be a stop flag
      if (d == 0x7E){ 
        Serial.println("Stop Flag found");
        data[n] = d; // add this 7E to the collected data
        state = 3;  //and change the state to allow exit from the state machine loop
      }

      // If the byte is 7D, then we need to shift the next byte   
      if (d == 0x7D){ 
        Serial.println("Byte shifted");

        while (Serial1.available() < 0){ // keep looping until data available
        }    
        /*
        Shouldn't need to check for Serial1.available(), as we can only get to this point if 
         we're in the middle of a data packet, therefore there MUST be bytes of data available
         following the shift flag...but this might help stability
         */

        e = Serial1.read(); // read next byte
        data[n] = e ^ 0x20; // Bitwise XOR next byte with 0x20; so 7D5E->7E, and 7D5D->7D; and save result to the array
        n++;
      }
    } // end of 'READ' while loop

  } // end of 'state' while loop
  //------------------------------------------------------
  // State = "EXIT"
  //------------------------------------------------------
  Serial.print("State = EXIT ");
  Serial.println(state);
  Serial.print("Bytes collected: ");
  Serial.println(n);

  //...continue with more code here to further process received data
} // end of main loop

[Attachment: Serial-Test_Screen-dump.jpg]

Finally cracked it !! :)

In no particular order:

  1. Like Paul said, seems as though Serial and Serial1 are not really independent at all, as some of the literature suggests.
    My attempts at debugging by sending back information from Serial.print to the Serial Monitor were part of the problem!
    Deleting all my Serial.print “debugging” calls, apart from at the end of the main loop, and allowing Serial1 to get on with reading the data without interruption made a big difference.

  2. Putting a “delay(1)” before every “Serial1.read()” statement also seems to be essential.
    Not sure why this is the case - maybe someone can explain?
    (It does not seem to be necessary before “if (Serial1.available())” statements).

  3. The two “while(Serial1.available() <0) {}” loops in the “READ” section of my code to check for data availability seem to be redundant - at least it works OK without them for now…

  4. Dumb error: The declaration “byte d = 0” should be inside the start of the “void loop()” instead of before “void setup()”.
    (Otherwise, the main loop exits with d = 0x7E (having detected the end flag). It then enters the main loop again with d still = 0x7E, so it thinks it has immediately found a start flag and falls through the loop the second time round. The net result is that the length of the collected data alternates between 1 and 320 bytes. D’Oh!)
    Re-setting d = 0 at the start of the main loop solves this problem.

On the hardware side of things:

  1. Using a different USB cable sorted out the problem with the ‘SERIAL_8E1 was not declared in this scope’ error, and other bizarre errors on trying to upload the code.

  2. Making sure I had good connections to both the Serial1Rx and Earth pin headers on the Mega helped - flying leads into the pin headers are great for prototyping, but probably not the most reliable for data transfer…

Anyway, I’m just relieved to have solved my problem & hope the above helps anyone with similar woes.

Thanks again for putting me on the right track Paul!

Cheers

Dave

  1. Like Paul said, seems as though Serial and Serial1 are not really independent at all, as some of the literature suggests.
    My attempts at debugging by sending back information from Serial.print to the Serial Monitor were part of the problem!
    Deleting all my Serial.print “debugging” calls, apart from at the end of the main loop, and allowing Serial1 to get on with reading the data without interruption made a big difference.

They ARE completely independent. But, there is a limited amount of memory and serial data receiving and sending takes time. That time takes away from doing other things. By removing the serial output statements, you sped up the input data processing and reduced the amount of memory needed.

Since Serial is only for debugging, there is no reason to not operate it at its maximum speed (115200).
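
To put rough numbers on "serial data receiving and sending takes time", the wire time of a run of characters can be worked out from the baud rate and the frame format. The sketch below is only back-of-envelope arithmetic. The function names are made up for illustration, and it assumes one start bit, eight data bits, an optional parity bit and one stop bit per character.

```cpp
#include <stdint.h>

// Bits on the wire per character: 1 start + 8 data + optional parity + 1 stop.
uint32_t bitsPerChar(bool parity)
{
  return parity ? 11 : 10;
}

// Whole microseconds needed to shift `count` characters over the wire at
// `baud` bits per second (fractions of a microsecond are truncated).
uint32_t transferMicros(uint32_t count, uint32_t baud, bool parity)
{
  return (uint32_t)((uint64_t)count * bitsPerChar(parity) * 1000000ULL / baud);
}
```

With these assumptions, one incoming byte at 19200 baud 8E1 takes about 573 microseconds, so a 64-byte receive buffer fills in roughly 37 ms. The 16 characters of "State = WAIT 1" plus the line ending take about 16.7 ms to leave at 9600 baud 8N1, during which about 29 bytes can arrive on Serial1. At 115200 baud the same message takes about 1.4 ms. How much of that time the sketch actually spends blocked depends on how full the transmit buffer is, so treat these figures as an indication of scale only.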

  2. Putting a “delay(1)” before every “Serial1.read()” statement also seems to be essential.
    Not sure why this is the case - maybe someone can explain?
    (It does not seem to be necessary before “if (Serial1.available())” statements).

The key there is “seems”. There should be no need to wait after you’ve determined that there is something available to read.

  1. Using a different USB cable sorted out the problem with the ‘SERIAL_8E1 was not declared in this scope’ error, and other bizarre errors on trying to upload the code.

The compiler knows nothing about the USB cable you will use. You changed something else.

        while (Serial1.available() < 0){ // keep looping until data available

The available() method returns a value from 0 (nothing in the buffer) up to the number of bytes the receive buffer can hold (at most 64 with the default buffer size). It never returns a negative value. So, this condition never evaluates to true, the loop body never executes, and the loop does not wait for anything. To wait for data, the test would have to be for available() being equal to 0.

      if (Serial1.available() > 0 ){
        // If 7E is followed by more data, this must be a start flag 
        data[n] = d;
        n++;
        state = 2; // change state to "Read"

Having just read the one character in the buffer, the 0x7E, it is nearly certain that the buffer will now be empty, so this statement will not evaluate to true. Or, rather, it wouldn’t if time wasn’t spent sending serial data between the last read and this test.

Building in timing dependencies is really not a good idea.

      while (Serial1.available() < 0){ // keep looping until data available
      }

Another incorrect assumption/test.

      if ((d != 0x7E) && (d != 0x7D)){
        data[n] = d;
        n++;
      }

Assuming that there is room in the buffer is not a good idea.
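
One way to avoid both the timing dependency and the unchecked array index is to handle one byte at a time and treat every 0x7E as a frame boundary, whether it is a start or an end flag. The following is a minimal sketch of that idea, not the code Dave ended up with. The struct name DleFrameDecoder and its members are made up for illustration, and it stores only the bytes between the flags, not the flags themselves.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical byte-at-a-time decoder; all names here are illustrative.
struct DleFrameDecoder {
  static const size_t CAPACITY = 321;  // size of the data array in the post
  uint8_t data[CAPACITY];
  size_t  length;    // decoded bytes collected for the current frame
  bool    escaped;   // previous byte was 0x7D
  bool    overflow;  // frame did not fit in data[]; it is dropped at the next flag
  bool    complete;  // last call to feed() returned a finished frame

  DleFrameDecoder() { reset(); }

  void reset()
  {
    length = 0;
    escaped = false;
    overflow = false;
    complete = false;
  }

  // Feed one received byte. Returns true when data[0..length-1] holds a
  // finished, non-empty frame. The frame stays valid until the next call.
  bool feed(uint8_t b)
  {
    if (complete) {
      reset();                  // previous frame was handed over; start afresh
    }
    if (b == 0x7E) {            // flag: closes a frame and/or opens the next
      if (length > 0 && !overflow && !escaped) {
        complete = true;
        return true;
      }
      reset();                  // gap between frames or damaged frame: drop it
      return false;
    }
    if (b == 0x7D) {            // shift byte: never stored as data
      escaped = true;
      return false;
    }
    if (escaped) {
      b ^= 0x20;                // 0x5E -> 0x7E, 0x5D -> 0x7D
      escaped = false;
    }
    if (length < CAPACITY) {
      data[length++] = b;       // bounds-checked store
    } else {
      overflow = true;
    }
    return false;
  }
};
```

On the Mega this would be driven from loop() with something like "while (Serial1.available() > 0) { if (decoder.feed(Serial1.read())) { ... } }", with no delay() calls and no blocking waits. If the decoder starts listening in the middle of a packet, the first frame it reports will be short, so the caller should still compare length with the expected packet size before using the data.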

Thanks for taking the time to explain where I've been going wrong Paul. Much appreciated.

It's all making sense now, and the code is working just fine.

BW

Dave