Undelimited Serial Data

Maybe undelimited isn’t wholly accurate, but I have an embedded source that spews via RS485. I have that terminated on a 485 to 232 shield, the serial stream coming into the AVR on Serial1 @ 9600.

I can see the stream coming in. There’s a lot of it and it comes fast, with maybe 1 second between streams. There are two issues I’m fighting: 1, there are multiple different nodes broadcasting on the bus with random-length stream events, and 2, the delimiter varies. Sometimes it is a padding of eight 0xFF bytes followed by the header and the interesting data, other times just the header 0xFF 0x0 0xFF 0xA5, and other times just a 0x10 0x2. Fortunately the latter terminates with a 0x10 0x3. The others terminate with a 2-byte checksum. Since samples 1 & 2 both include the same 0xFF 0x0 0xFF 0xA5, maybe I should disregard the eight 0xFF bytes. I don’t do this stuff every day, so bear with me. I wasted a weekend trying various code and I’m finally throwing in the towel. HELP!

Google and I have parsed the string out in a text editor and I know which fields hold the interesting values, trouble is I can’t get the code right to parse it.

Here’s a sample of the data stream and what I know. No, there is no public API and the manufacturer won’t give it up.

In #1, I am looking for HOUR, MIN, TMP 1 & TMP 3.
In #2, I am looking for RPMH, RPML, WATTH, WATTL.
In #3, I am looking for PCT.

Any help is appreciated.

                                            H                             T  T      T                        C C
                                  D  S   ?  O M                           M  M      M                        H H
                                  S  R      U I                           P  P      P                        K K
<------padding-----> <--header--> T  C      R N                           1  2      3                        H L
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D C 27 1 0 0 0 0 0 0 20 0 0 0 4 5A 5A 0 0 66 0 0 0 0 0 0 D0 7C 3 D 3 B8
This shows a broadcast msg (10 >> F) at 12:39, TMP1 (water temp) is 90, TMP3 (air temp) is 102
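
As a worked check of sample #1, here is a minimal sketch in plain C++ (so it can be tried off the board). It assumes things the post does not confirm: that the byte 0x1D is a length (it happens to equal the 29 bytes that follow it), and that CHKH/CHKL hold the 16-bit sum of every byte from 0xA5 up to the checksum. The sample is consistent with both guesses, since the bytes add up to 0x03B8. The offset names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sample #1 from the post, starting at the 0xA5 byte
// (the 0xFF padding and the FF 00 FF lead-in are dropped).
static const uint8_t kSample1[] = {
  0xA5, 0x07, 0x0F, 0x10, 0x02, 0x1D,  // A5, ?, DST, SRC, ?, 0x1D (assumed length = 29)
  0x0C, 0x27, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20,
  0x00, 0x00, 0x00, 0x04, 0x5A, 0x5A, 0x00, 0x00, 0x66, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0xD0, 0x7C, 0x03, 0x0D,
  0x03, 0xB8                           // CHKH, CHKL
};

// Hypothetical offsets, counted from the 0xA5 byte of this one sample.
static const size_t kHourAt = 6;
static const size_t kMinAt  = 7;
static const size_t kTmp1At = 20;
static const size_t kTmp3At = 24;

// Adds up every byte before the trailing CHKH/CHKL pair and compares the
// 16-bit total with that pair (high byte first).
bool checksumMatches(const uint8_t *frame, size_t len) {
  assert(frame != nullptr);
  if (len < 3) return false;
  uint16_t sum = 0;
  for (size_t i = 0; i + 2 < len; ++i) sum += frame[i];
  const uint16_t sent =
      static_cast<uint16_t>((frame[len - 2] << 8) | frame[len - 1]);
  return sum == sent;
}
```

With these offsets, kSample1[kHourAt] is 0x0C (12), kSample1[kMinAt] is 0x27 (39), kSample1[kTmp1At] is 0x5A (90) and kSample1[kTmp3At] is 0x66 (102), which matches the description above.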


                                   W  W
                            R  R   A  A                     C C
             D  S   ?       P  P   T  T                     H H
             S  R      U I  M  M   T  T                     K K
<--header--> T  C      R N  H  L   H  L                     H L
FF 0 FF A5 0 10 60 7 F 0 0  0  E6  5  78  0 0 0 0 0 1 C  38 2 DD
This shows a targeted update from 60 >> 10 (pump to control panel) reporting 230 watts @ 1400 RPM
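
The watt and RPM figures each span two bytes, high byte first. A small sketch of the arithmetic, with a made-up helper name: going by the caption above, the pair 0 E6 gives 230 (watts) and the pair 5 78 gives 1400 (RPM), so in this sample the watt pair appears to come before the RPM pair.

```cpp
#include <cassert>
#include <cstdint>

// Joins a high byte and a low byte into one 16-bit value (big-endian order).
constexpr uint16_t joinBytes(uint8_t high, uint8_t low) {
  return static_cast<uint16_t>((high << 8) | low);
}

// Checked at compile time against the bytes in the sample above.
static_assert(joinBytes(0x00, 0xE6) == 230, "0x00E6 is 230");
static_assert(joinBytes(0x05, 0x78) == 1400, "0x0578 is 1400");
```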


           P  C  
           C  H
<hdr>      T  K  <trm>
10 2 50 11 32 A5 10 3 
This shows a load of 50% on the salt chlorinator cell.
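
For this short frame type, the CHK byte looks like the 8-bit sum of everything before it: 0x10 + 0x02 + 0x50 + 0x11 + 0x32 = 0xA5. The other 10 2 frame in the raw capture below (10 2 0 12 4B 80 EF 10 3) adds up the same way. A hedged sketch of a validity check built on that observation follows; the function name and the PCT offset are assumptions taken from this one sample.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical offset of the PCT byte, counted from the leading 0x10.
static const size_t kPctAt = 4;

// Returns true if the buffer starts with 10 02, ends with 10 03, and the
// byte just before the terminator equals the 8-bit sum of all bytes ahead of it.
bool shortFrameOk(const uint8_t *f, size_t n) {
  assert(f != nullptr);
  if (n < 5) return false;  // 10 02 + checksum + 10 03 is the minimum
  if (f[0] != 0x10 || f[1] != 0x02) return false;
  if (f[n - 2] != 0x10 || f[n - 1] != 0x03) return false;
  uint8_t sum = 0;          // wraps modulo 256 by design
  for (size_t i = 0; i + 3 < n; ++i) sum = static_cast<uint8_t>(sum + f[i]);
  return sum == f[n - 3];
}
```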

These streams come all jammed together in raw format like this:

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37DFFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D102501132A51031020124B80EF103FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37DFFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37DFF0FFA50601041FF219FF0FFA50106041FF219FF0FFA50601061126FF0FFA50106061126FF0FFA506010142C457825D

This sounds like an interesting problem. I presume the coloured text is what you get if you capture all the data and I presume the colour changes represent the different messages.

What is the baud rate?

Is there always an obvious gap between messages? You mention 1 second.

If there is always a gap that lasts as long as several characters, it could easily be used as a delimiter. Then you would at least have the messages separate from each other and you could apply the appropriate code to extract the useful data.

What is the data coming from?

...R

Robin2, thanks for the reply.

Correct, the color was just to help point out the various run-on commands in a sample string.

Baud is 9600 from the source, which is a pool controller. I've been trying to adapt this code from a guy that decoded his controller, albeit, from a different company. Similar setup - both are RS485, neither has an API, half-duplex. He has the advantage that all commands begin with common header and terminator bytes 0x10 0x2 and 0x10 0x3 (eerily identical to the 3rd string I posted).

Not always a gap in the case of the 3rd string - when they do appear those seem to append to the tail of the 1st string. For probably 90% of the messaging, yes, there is a gap.

Do you have to capture every packet, or do you have the luxury of grabbing the data, parsing it - which might lose a few packets - and then getting some more?

KeithRB:
Do you have to capture every packet, or do you have the luxury of grabbing the data, parsing it - which might lose a few packets - and then getting some more?

They just keep coming so I think it'd be safe to grab some if it could be parsed, so long as it contains the bytes of interest. Are you thinking about just collecting some number of bytes, stop reading, start parsing to see what was collected?

Exactly. Of course you can start after a long string of 0xFF.

You can get RS485 module boards (and chips to roll your own) for Arduino.

9600 baud is slow for an Arduino. At 10 bits per character (start bit, 8 data bits, stop bit) that's 960 chars/sec, a little over a millisecond each.

Have you ever written a state machine? You don't need to buffer THEN parse and lex. You can process each char as it comes in and have your data as soon as the last char arrives, getting the evaluation done in between char reads. Instead of wasting time waiting for the whole message then making up for it with a bunch of processing that BTW requires buffers, use a state machine to run fast and light.
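
A minimal sketch of that idea for the FF 00 FF A5 frames, in plain C++ so it can be tried off the board. It rests on guesses drawn from the posted samples rather than on any documentation: the fifth byte after A5 is treated as a data length, and the last two bytes as a 16-bit sum of everything from A5 onward. The class and member names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class A5Parser {
 public:
  static const size_t kMaxBody = 64;  // header + data bytes kept, checksum excluded

  // Feed one received byte. Returns true on the byte that completes a frame
  // whose checksum matches; the frame is then readable via data()/size().
  bool feed(uint8_t b) {
    switch (state_) {
      case WAIT_FF:
        if (b == 0xFF) state_ = WAIT_00;
        break;
      case WAIT_00:                       // extra 0xFF padding keeps us here
        if (b == 0x00) state_ = WAIT_FF2;
        else if (b != 0xFF) state_ = WAIT_FF;
        break;
      case WAIT_FF2:
        state_ = (b == 0xFF) ? WAIT_A5 : WAIT_FF;
        break;
      case WAIT_A5:
        if (b == 0xA5) { len_ = 0; sum_ = 0; store(b); state_ = HEADER; }
        else state_ = (b == 0xFF) ? WAIT_00 : WAIT_FF;
        break;
      case HEADER:                        // five bytes after A5; the fifth is the assumed length
        store(b);
        if (len_ == 6) {
          remaining_ = b;
          if (len_ + remaining_ > kMaxBody) state_ = WAIT_FF;  // too long, resync
          else state_ = remaining_ ? DATA : CHK_HIGH;
        }
        break;
      case DATA:
        store(b);
        if (--remaining_ == 0) state_ = CHK_HIGH;
        break;
      case CHK_HIGH:
        sent_ = static_cast<uint16_t>(b << 8);
        state_ = CHK_LOW;
        break;
      case CHK_LOW:
        sent_ = static_cast<uint16_t>(sent_ | b);
        state_ = WAIT_FF;
        return sent_ == sum_;
    }
    return false;
  }

  const uint8_t *data() const { return buf_; }  // starts at the A5 byte
  size_t size() const { return len_; }

 private:
  enum State { WAIT_FF, WAIT_00, WAIT_FF2, WAIT_A5, HEADER, DATA, CHK_HIGH, CHK_LOW };

  // Saves the byte and adds it to the running checksum.
  void store(uint8_t b) {
    assert(len_ < kMaxBody);
    buf_[len_++] = b;
    sum_ = static_cast<uint16_t>(sum_ + b);
  }

  State state_ = WAIT_FF;
  uint8_t buf_[kMaxBody];
  size_t len_ = 0;
  size_t remaining_ = 0;
  uint16_t sum_ = 0;
  uint16_t sent_ = 0;
};

// Pushes a buffer through the parser and counts the frames that validate.
size_t countFrames(A5Parser &p, const uint8_t *bytes, size_t n) {
  size_t frames = 0;
  for (size_t i = 0; i < n; ++i) {
    if (p.feed(bytes[i])) ++frames;
  }
  return frames;
}
```

In a sketch, the call would be something like `if (parser.feed(Serial1.read())) { ... }` inside a `while (Serial1.available())` loop, with no delay() between reads. The 10 02 ... 10 03 frames would need their own states; they are simply ignored here.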

Simple data packet capture based on a delay between sending of the packets.

//zoomkat 6-29-14 Simple serial echo test
//type or paste text in serial monitor and send

String readString;

void setup() {
  Serial.begin(9600);
  Serial.println("Simple serial echo test"); // so I can keep track of what is loaded
}

void loop() {

  while (Serial.available()) {
    char c = Serial.read();  //gets one byte from serial buffer
    readString += c; //makes the String readString
    delay(2);  //slow looping to allow buffer to fill with next character
  }

  if (readString.length() >0) {
    Serial.println(readString);  //so you can see the captured String 
    readString="";
  } 
}
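
A variation on the same gap idea that avoids both delay() and String, sketched in plain C++ so it can be tried off the board. The class name, the buffer size and the gap value are all assumptions to tune; on the Arduino the timestamps would come from millis().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class GapFramer {
 public:
  static const size_t kCapacity = 128;  // assumed upper bound on one burst

  explicit GapFramer(uint32_t gapMs) : gapMs_(gapMs) { assert(gapMs > 0); }

  // Store one byte and remember when it arrived. Bytes that do not fit
  // are dropped and flagged.
  void push(uint8_t b, uint32_t nowMs) {
    if (len_ < kCapacity) buf_[len_++] = b;
    else overflow_ = true;
    lastMs_ = nowMs;
  }

  // True once something is buffered and the line has been quiet for gapMs.
  // Unsigned subtraction keeps this correct across millis() rollover.
  bool ready(uint32_t nowMs) const {
    return len_ > 0 && (nowMs - lastMs_) >= gapMs_;
  }

  const uint8_t *data() const { return buf_; }
  size_t size() const { return len_; }
  bool overflowed() const { return overflow_; }
  void clear() { len_ = 0; overflow_ = false; }

 private:
  uint32_t gapMs_;
  uint32_t lastMs_ = 0;
  uint8_t buf_[kCapacity];
  size_t len_ = 0;
  bool overflow_ = false;
};

// Intended use inside loop(), assuming a 5 ms gap is long enough at 9600 baud:
//   while (Serial1.available()) framer.push(Serial1.read(), millis());
//   if (framer.ready(millis())) { /* parse framer.data() */ framer.clear(); }
```

Because nothing blocks, loop() keeps running between bytes, and a burst that contains two back-to-back messages can still be split afterwards by looking for the headers.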

OK I think I'm making some progress. For now I'm using the buffer method. So far I am able to get the buffer filled and run IF statements for comparisons to the bytes. Only issue is if the byte has alpha chars in it it fails to compare.

So in my case most streams begin with FF FF FF FF FF FF FF FF but any comparisons against FF (or 0xFF) fail to match.

I know the serial mon is converting the char to FF. When I print the source char I get ascii goofyness.

So can I not compare against HEX? Interestingly a numeric HEX value compares properly.

Working snip:

  if (Serial1.available()){
    c = (uint8_t)Serial1.read();
    delay(20);
    switch( frameStatus ){
      case waitForFrameStart1:
        if(c == 0x10){
          frameStatus = waitForFrameStart2;
        }
        break;  // always leave the case, even when the byte does not match

Failing snip:

  if (Serial1.available()){
    c = (uint8_t)Serial1.read();
    delay(20);
    switch( frameStatus ){
      case waitForFrameStart1:
        if(c == 0xFF){
          frameStatus = waitForFrameStart2;
        }
        break;  // always leave the case, even when the byte does not match
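
One possible explanation, offered as a guess because the snippets do not show how c is declared: if c is a plain char, it is signed on AVR, so the byte 0xFF is stored as -1. In the comparison c == 0xFF both sides are promoted to int, giving -1 == 255, which is false. Values below 0x80 such as 0x10 survive unchanged, which would fit "numeric" hex values matching while FF and A5 do not. The sketch below reproduces the effect in plain C++, using signed char explicitly so it behaves the same on any platform; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Mimics storing the received byte in a signed char before comparing.
bool matchesAsSignedChar(uint8_t raw, int wanted) {
  signed char c = static_cast<signed char>(raw);  // 0xFF becomes -1
  return c == wanted;                             // compared as ints
}

// Same comparison with the byte kept in an unsigned 8-bit variable.
bool matchesAsByte(uint8_t raw, int wanted) {
  uint8_t c = raw;                                // 0xFF stays 255
  return c == wanted;
}
```

If that is the cause, declaring c as uint8_t (or byte) should make the 0xFF comparison behave like the 0x10 one.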

These streams come all jammed together in raw format like this:

Is a single byte being sent and you are reading it out in a two byte hex representation, or is a two byte hex representation of a single byte being received?

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D

It's almost like this protocol is also designed to tx over wireless and the first few FF's are preamble to allow a receiver AGC to adjust.

Have you really collected the data correctly as breaking your coloured display down a different way...

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
              FF0FFA506010142C457825D
              FF0FFA50601041FF219
              FF0FFA50106041FF219
              FF0FFA50601061126
              FF0FFA50106061126
              102501132A5103
              1020124B80EF103

This seems to show a potential 6 different message types, and some (most) messages have an odd number of characters. If this is a hex representation of raw serial data then something has gone wrong, as it should always be an even number.
Can you explain in detail how you're capturing the data and maybe post the code you were using?

dotJason:
Not always a gap in the case of the 3rd string - when they do appear those seem to append to the tail of the 1st string. For probably 90% of the messaging, yes, there is a gap.

It may be that there is a gap that the Arduino could detect even though a human can't.

It would not matter if two messages (with no gap between) were received as one because your code could figure them out.

The only concern would be whether there is space in SRAM to save all the data.

What is the maximum number of bytes between gaps?

And, following from Reply #9, is the device sending the character 'F' or the byte value 255?

I think it would be easy to modify the second example in serial input basics to use the gap between transmissions as a "delimiter". If you are interested, and can supply the above information, I will try.

...R

If I may suggest, also show the data in ASCII. A few things jump out at me:

  • 7B3D37D: 7B is { and 7D is } so this is {=7}
  • 03 is ETX (end of text)
  • 02 is STX (start of text)

Look at your ASCII chart, convert the characters that are less than 0x20 to their symbolic equivalents, and it should start to be obvious.

Do you have any idea how those devices manage to only transmit one at a time?
The fact that they manage (or you would get occasional total garbage) says they are managed.

When I start to dig into RS485, I see that there are such controls. That name Nick Gammon comes up in one Wiki as a good info source on the subject.
Yourduino How-To Wiki on RS485

I had to think a while about why a company would be so STUPID as to keep people from being able to use their products and come to the conclusion that these are overpriced repair parts rather than end products. You're not supposed to use those parts in any way but to plug them into existing end products from that company.
Anybody got stuck into owning a Trash-80? LOL!

7B 3D 37 D --or-- 7 B3 D3 7D?

It takes 2 hex digits to make a byte. Binary is funny that way.

Just pointing out a hiccup there Nick. You present stuff that amazes me regularly.

GoForSmoke:
Do you have any idea how those devices manage to only transmit one at a time?
The fact that they manage (or you would get occasional total garbage) says they are managed.

Things like this usually monitor the line and when they detect no traffic they wait a fixed + *random amount of time and check the line again. If it's still free then they seize it for themselves and start sending.
*The random amount of time could also be a fixed duration based on device ID.

You are absolutely right! :slight_smile:

And now I am wondering whether the data in the original post (under "These streams come all jammed together in raw format like this:") included leading zeroes or not.

For example:

2000

Might be two bytes: 20 00

Or three bytes: 20 00 00

Or four bytes: 02 00 00 00

So maybe the OP can clarify how they were printed?

Riva:
Things like this usually monitor the line and when they detect no traffic they wait a fixed + *random amount of time and check the line again. If it's still free then they seize it for themselves and start sending.
*The random amount of time could also be a fixed duration based on device ID.

There doesn't need to be much gap, but you just give me the thought that the controller monitoring the line must be an easy way to tell when one message ends and another starts.

Riva:

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D

It’s almost like this protocol is also designed to tx over wireless and the first few FF’s are preamble to allow a receiver AGC to adjust.

Have you really collected the data correctly as breaking your coloured display down a different way…

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D

FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
FFFFFFFFFFFFFFFF0FFA57F1021D1F20000000200004575700600000008D7B3D37D
             FF0FFA506010142C457825D
             FF0FFA50601041FF219
             FF0FFA50106041FF219
             FF0FFA50601061126
             FF0FFA50106061126
             102501132A5103
             1020124B80EF103




This seems to show a potential 6 different message types, and some (most) messages have an odd number of characters. If this is a hex representation of raw serial data then something has gone wrong, as it should always be an even number.
Can you explain in detail how you're capturing the data and maybe post the code you were using?

Riva,
you may be on to something. There is a wireless remote that connects to an RS485 port at the controller. I did a little more decoding of the stream and found some additional values that control the LEDs on the remote. There are LEDs for the CLEAN MODE, LIGHT and WATER FALL buttons. I know the command that toggles (for example) the light. When I issue it I get the FF FF FF FF FF FF FF FF pattern, a value in the MODE position changes, and then the LED comes on. So you are probably right that the leading FF FF FF FF FF FF FF FF pattern targets the wireless remote, although there are DST and SRC values that identify which device the message is intended for. In this case, 10 >> F means controller to broadcast.

For ex:

                                            H     M                                   T  T        T                                C  C
                                  D  S      O  M  O                                   M  M        M                                H  H
                                  S  R      U  I  D                                   P  P        P                                K  K
<------padding-----> <--header--> T  C      R  N  E                                   1  2        3                                H  L
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 14 1E 00 00 00 00 00 00 00 20 00 00 00 04 5A 5A 00 00 60 00 00 00 00 00 00 81 83 03 0D 03 68    pool off, light off
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 15 04 02 00 00 00 00 00 00 20 00 00 00 04 58 58 00 00 5F 00 00 00 00 00 00 81 83 03 0D 03 4C    pool off, light on
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 14 2D 22 00 00 00 00 00 00 20 00 00 00 04 58 58 00 00 5F 00 00 00 00 00 00 81 83 03 0D 03 94    pool on, light on
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 14 2D 20 00 00 00 00 00 00 20 00 00 00 04 58 58 00 00 5F 00 00 00 00 00 00 81 83 03 0D 03 92    pool on, light off
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 14 20 04 00 00 00 00 00 00 20 00 00 00 04 5A 5A 00 00 60 00 00 00 00 00 00 81 83 03 0D 03 6E    waterfall on, light off
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 14 20 06 00 00 00 00 00 00 20 00 00 00 04 59 59 00 00 60 00 00 00 00 00 00 81 83 03 0D 03 6E    waterfall on, light on
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 15 05 01 00 00 00 00 00 00 20 00 00 00 04 58 58 00 00 5F 00 00 00 00 00 00 81 83 03 0D 03 4C    cleaner on, light off
FF FF FF FF FF FF FF FF 0 FF A5 7 F 10 2 1D 15 06 03 00 00 00 00 00 00 20 00 00 00 04 58 58 00 00 5F 00 00 00 00 00 00 81 83 03 0D 03 4F    cleaner on, light on

Nick,
Here is how I printed the data - quick and dirty just to see what I was getting. I just took a random sample and trimmed it to end on a full event.

 while (Serial1.available()) {
  inBytes = Serial1.read();
    Serial.print(inBytes,HEX);
    }  




 while (Serial1.available()) {
  inBytes = Serial1.read();
    Serial.print(inBytes);
    } 
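
Serial.print(value, HEX) does not pad to two digits, so 0x0F comes out as F and 0x00 as 0, which is why the dump has an odd number of characters in places. A small helper that always produces two digits removes the ambiguity. This is a sketch in plain C++ with a made-up function name; on the Arduino the result could be passed to Serial.print().

```cpp
#include <cassert>
#include <cstdint>

// Writes exactly two uppercase hex digits plus a terminating NUL into out.
void byteToHex(uint8_t value, char out[3]) {
  static const char kDigits[] = "0123456789ABCDEF";
  assert(out != nullptr);
  out[0] = kDigits[value >> 4];    // high nibble
  out[1] = kDigits[value & 0x0F];  // low nibble
  out[2] = '\0';
}
```

Printing a space after each pair makes the byte boundaries obvious when reading the capture.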

