SoftwareSerial magic numbers

Next test: a longer string, “the quick brown fox jumps over the lazy dog” (42 chars), sent from A -> B at different speeds, starting at 100 baud with a step size of 100.
B sends back the number of chars correctly received from the start of the string, so when receiving “theXquick brown fox jumps over the lazy dog” the answer would be 3.
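
The scoring rule described above (count characters from the start until the first mismatch) can be sketched in plain C++ as below. The function name countLeadingMatches is made up for illustration; the actual test sketch was not posted.

```cpp
#include <cstddef>
#include <string>

// Number of leading characters of `received` that match `expected`.
// Counting stops at the first mismatch or at the end of the shorter string.
// (Hypothetical helper, not taken from the original test sketch.)
std::size_t countLeadingMatches(const std::string& expected,
                                const std::string& received)
{
    std::size_t n = 0;
    while (n < expected.size() && n < received.size() &&
           expected[n] == received[n]) {
        ++n;
    }
    return n;
}
```

Called with the reference string and “theXquick brown fox jumps over the lazy dog” it returns 3, as in the example above.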

Again we see that baud rates up to 70K perform 100%; above that, failures start…

BAUD 	CHARS
74800	34	 FAIL
79900	11	 FAIL
81000	3	 FAIL
84900	31	 FAIL
85200	23	 FAIL
85800	21	 FAIL
86200	15	 FAIL
86300	20	 FAIL
86600	38	 FAIL
86900	41	 FAIL
87100	38	 FAIL
87300	15	 FAIL
87400	20	 FAIL
87700	38	 FAIL
88300	20	 FAIL
88400	170	 FAIL  <<<< a very strange one ???
88500	37	 FAIL
88700	18	 FAIL
88800	15	 FAIL
89000	38	 FAIL
89200	38	 FAIL
89500	15	 FAIL
89600	12	 FAIL
89800	18	 FAIL
90000	31	 FAIL
..

For the statistics: the highest successful transfer rate was 103800 baud.

Conclusion:
Up to 70K the formula-based SoftSerial works as expected; that is about 20% faster than the 57600 from the fixed tables.

115200:
As the values are identical to those of the table-based SS for 115200, I do not expect the (existing) table version to work, at least for receiving data.

Another test: steps of 1000, starting at 1000 (so not as fine-grained, but much faster).

The string length is 42, the value returned is now the number of same characters.
So 41 means that of the characters received 41 matched “the quick brown fox …dog”.
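
A minimal sketch of this second scoring rule, assuming the comparison is done position by position (the post does not say how a dropped character, which shifts everything after it, was handled). The name countSamePositions is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Number of positions at which `received` holds the same character
// as `expected`. Assumption: strings are compared index by index.
std::size_t countSamePositions(const std::string& expected,
                               const std::string& received)
{
    const std::size_t len = std::min(expected.size(), received.size());
    std::size_t same = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (expected[i] == received[i]) {
            ++same;
        }
    }
    return same;
}
```

Unlike the prefix count of the first test, a single corrupted character in the middle costs only one point here.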

Missing chars now start at 83K and we see the quality gradually drop, missing 8 chars at 115K (that is about 20%!).

(baudrates without extra info are OK)

70000
71000
72000
73000
74000
75000
76000
77000
78000
79000
80000
81000
82000
83000	41	 FAIL
84000
85000	41	 FAIL
86000
87000	40	 FAIL
88000	41	 FAIL
89000
90000
91000	41	 FAIL
92000	41	 FAIL
93000	41	 FAIL
94000
95000	39	 FAIL
96000	41	 FAIL
97000	41	 FAIL
98000	41	 FAIL
99000	41	 FAIL
100000	40	 FAIL
101000	40	 FAIL
102000	39	 FAIL
103000	41	 FAIL
104000	40	 FAIL
105000	39	 FAIL
106000	39	 FAIL
107000	39	 FAIL
108000	38	 FAIL
109000	39	 FAIL
110000	37	 FAIL
111000	38	 FAIL
112000	39	 FAIL
113000	39	 FAIL
114000	37	 FAIL
115000	34	 FAIL
116000	34	 FAIL
117000	39	 FAIL
118000	36	 FAIL
119000	36	 FAIL
120000	38	 FAIL

No new conclusions from this test.

I like what you are doing robtillaart!

Would it be possible to write a code that auto calibrates the "magic numbers".

I would imagine using two serial communication links. One link could be the hardware serial port, which would send the slave the baud rate to be tested. Next, at that baud rate, the master would send a predetermined byte or string. The slave would then adjust the timing variables (within an allowed range) until the string is captured successfully "x" number of times. Lastly, the slave would perhaps save the resulting values to EEPROM or send them to the serial monitor.

Would it be possible to write a code that auto calibrates the "magic numbers".

Definitely, but because the tables existed it was easier to let Excel find them.

In fact you can derive the magic numbers from the protocol. You know that a data bit has 1/baudrate seconds between the edges. As you can see how much time the machine code takes to read and store a bit, you can derive the magic numbers quite exactly for a given baud rate.
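
As a worked example of the 1/baudrate relation, the sketch below converts a baud rate into a bit time and into a CPU cycle budget per bit. It is plain C++ rather than an Arduino sketch, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// CPU cycles available per serial bit (integer division truncates).
std::uint32_t cyclesPerBit(std::uint32_t cpuHz, std::uint32_t baud)
{
    assert(baud > 0);
    return cpuHz / baud;
}

// Duration of one bit in microseconds, i.e. 1/baudrate seconds.
double bitTimeMicros(std::uint32_t baud)
{
    assert(baud > 0);
    return 1e6 / static_cast<double>(baud);
}
```

At 16 MHz and 115200 baud this gives a budget of 138 cycles per bit; the code that reads and stores a bit plus the delay loop has to fit into that.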

The real problem is that you need to find a formula that works for all given baud rates, so an auto-calibrating mode needs a number of known patterns to learn from. Best to start with the highest baud rate, as this is the most critical one. If you have the basic formula rxbit = CLOCKSPEED/(alpha * baudrate) - beta; you can try all possible combinations for alpha and beta between 1..20, so after 400 bytes you get the ranges for alpha and beta that work. You try the next baud rate and the ranges will shrink, until all baud rates are done. Then take the middle of the ranges and you're done.
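
The brute-force sweep can be sketched offline as below. This is only an illustration: instead of receiving test bytes, it treats an (alpha, beta) pair as "working" at a baud rate when the formula lands within a tolerance of the 8 MHz rxintra value from the existing table (the table is quoted later in this thread). The names BaudSample, acceptable and sweepAlphaBeta are hypothetical, and the final "take the middle of the ranges" step is left out.

```cpp
#include <cstdlib>

struct BaudSample { long baud; long rxintra; };

// 8 MHz baud/rxintra pairs from the existing lookup table, highest baud first.
static const BaudSample kSamples8MHz[] = {
    {115200, 5},  {57600, 15}, {38400, 25},  {31250, 32},
    {28800, 35},  {19200, 55}, {14400, 75},  {9600, 114},
    {4800, 233},  {2400, 472}, {1200, 948},  {300, 3805},
};

constexpr int kMaxParam = 20;  // alpha and beta are tried in 1..20

// Stand-in for "was the known pattern received correctly?".
// Assumption: a delay within `tolerance` of the table value works.
bool acceptable(long clockHz, int alpha, int beta,
                const BaudSample& s, long tolerance)
{
    const long rxbit = clockHz / (alpha * s.baud) - beta;
    return std::labs(rxbit - s.rxintra) <= tolerance;
}

// Marks survivors[alpha][beta] = true for every pair that is acceptable
// at all baud rates and returns how many pairs survived.
int sweepAlphaBeta(long clockHz, long tolerance,
                   bool survivors[kMaxParam + 1][kMaxParam + 1])
{
    int count = 0;
    for (int alpha = 1; alpha <= kMaxParam; ++alpha) {
        for (int beta = 1; beta <= kMaxParam; ++beta) {
            bool ok = true;
            for (const BaudSample& s : kSamples8MHz) {
                if (!acceptable(clockHz, alpha, beta, s, tolerance)) {
                    ok = false;  // one failing baud rate rules the pair out
                    break;
                }
            }
            survivors[alpha][beta] = ok;
            if (ok) {
                ++count;
            }
        }
    }
    return count;
}
```

With a clock of 8000000 and a tolerance of 1, the pair alpha = 7, beta = 4 is among the survivors, which matches the 8 MHz formula used elsewhere in this thread. The caller should zero-initialise the survivors array, since row and column 0 are never written.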

A better, faster approach is to first find the optimal alpha, then search for the optimal beta, then alpha again, then beta again, until the same values appear. That will bring you to the optimal value within 50 or so bytes (~10x faster).

Instead of a linear search through the ranges you can do a binary search...

For me the strange thing is the value SEVEN where I expected EIGHT in the formula - 16000000L/(7 * baudrate) - 3; but I did not really investigate 16000000L/(8 * baudrate) + BETA ... => todo list ;)

I'm glad to hear it is possible. You have helped my understanding of the process a lot, although it is a bit over my head to try to write an auto-calibration program.

Thanks, Mark

There are 4 levels of acting: goal -> strategy -> tactics -> operations

You did define your goal, but you were thinking tactics (how to program) and operations (details). You skipped one level.

You must think of the strategies by which such a thing can be done, and do "thought experiments" to find the tactics that belong to each strategy, in order to make the strategy choice.

Strategies can be: brute force (test all), random search (aka darting ;), hill climbing (change any param that improves the result), analytics etc.

imho it is not over your head, it's not easy but you can do it, give it a try

Hey Rob, I just wanted to say thank you for your research on this. As it turns out I have a need for SoftwareSerial to do 7800 baud to read from a motorcycle ECU->Dash communication link and it sounds like you've solved my problem. :)

@synfinatic

Let me hear the results of your tests, I'm interested. I can imagine that a motorcycle can generate quite some noise, which may be disruptive to the signal. Did you think about shielding/grounding etc.?

I haven’t even begun to think/worry about that yet. Hopefully it just works. Ha!

In all seriousness, I’m literally waiting for FedEx to deliver my 'scope (hopefully Friday) so I’ll have a better idea then for what I’m dealing with. I do know that at 9600 baud it seems to kinda-sorta work, but I do seem to be getting some corruption, but that mostly seems to be due to the timing differences from what I can tell/guess. So it appears the line is pretty clean.

That being said, I haven’t tried it with the bike actually running yet, so things could get a whole lot noisier once those coils start firing.

So far so good. 7800 seems to be decoding messages without any errors and without my timing hack which was necessary at 9600.

Good to hear it works, "outside the lab" ;)

Hi robtillaart,

I wondered why a long division (int32) is needed. I tried to find a solution with int (int16) only, and came up with this precise solution (focus is on rxintra):

  // 8MHZ
  int baudrate = (int)speed;

  int j=-39 | baudrate;
  int m=2* (-baudrate & 2*j | baudrate);
  int a=((2*m) & m ) | baudrate;
  int rxintra=(-15240-3*a-j)/a-1;

It’s precise in the sense that it reproduces all table data with 0 difference. (But it’s not “linear” in your sense. Instead I call it “weird”.)
To make your proof simple I wrote this sketch:

#include "WProgram.h"

//
// Lookup table
//
typedef struct _DELAY_TABLE
{
  long baud;
  unsigned short rx_delay_centering;
  unsigned short rx_delay_intrabit;
  unsigned short rx_delay_stopbit;
  unsigned short tx_delay;
} DELAY_TABLE;

//  8 Mhz table
static const DELAY_TABLE table[] PROGMEM =
{
  //  baud    rxcenter    rxintra    rxstop  tx
  { 115200,   1,          5,         5,      3,      },
  { 57600,    1,          15,        15,     13,     },
  { 38400,    2,          25,        26,     23,     },
  { 31250,    7,          32,        33,     29,     },
  { 28800,    11,         35,        35,     32,     },
  { 19200,    20,         55,        55,     52,     },
  { 14400,    30,         75,        75,     72,     },
  { 9600,     50,         114,       114,    112,    },
  { 4800,     110,        233,       233,    230,    },
  { 2400,     229,        472,       472,    469,    },
  { 1200,     467,        948,       948,    945,    },
  { 300,      1895,       3805,      3805,   3802,   },
};

int SoftwareSerial_rxintraA(long speed)
{
  for (unsigned i=0; i<sizeof(table)/sizeof(table[0]); ++i)
  {
    long baud = pgm_read_dword(&table[i].baud);
    if (baud == speed)
    {
      return pgm_read_word(&table[i].rx_delay_intrabit);
    }
  }
  return -1;
}

int SoftwareSerial_rxintraB(long speed)
{
  long baudrate = speed;
  // 8MHZ
  int rxintra = 8000000L/(7 * baudrate) - 4;
  return rxintra;
}

int SoftwareSerial_rxintraC(long speed)
{
  // 8MHZ
  int baudrate = (int)speed;

  int j=-39 | baudrate;
  int m=2* (-baudrate & 2*j | baudrate);
  int a=((2*m) & m ) | baudrate;
  int rxintra=(-15240-3*a-j)/a-1;
  return rxintra;
}

void setup() {
  Serial.begin(9600);
  Serial.println("baud\tA\tB\tC");
  for (unsigned i=0; i<sizeof(table)/sizeof(table[0]); ++i)
  {
    long baud = pgm_read_dword(&table[i].baud);
    Serial.print(baud);  Serial.print("\t");
    Serial.print(SoftwareSerial_rxintraA(baud));  Serial.print("\t");
    Serial.print(SoftwareSerial_rxintraB(baud));  Serial.print("\t");
    Serial.print(SoftwareSerial_rxintraC(baud));  Serial.print("\t");
    Serial.println();
  }
}

void loop() {
}

Amazing! Seen quite some code in my life but this is a very amazing code (+1)

I didn't run the code, but I am wondering how you can represent 115200 in an int16. (You can't), so it is wrapped or truncated. That result is tweaked by bit manipulation and some magic numbers to the right values. Can you explain on which principle the math is based? Website?

Drawback is its maintainability as it is hard to understand. Furthermore as the table is only calculated once when begin(speed) is called, I doubt if there is a serious performance gain.

Does your magic code also calculate the intermediate values e.g. a baud rate of 7800 as in the previous post?

No, it will not deliver reasonable results for 7800. (At the beginning of this thread you ask for linearity. "weird" functions are not "linear" in your sense.)

I did not invent this "weird" function. Instead a genetic program invented it for me. I fed in the table you gave, reduced to the columns 'baud' and 'rxintra':

49664   5
57600   15
38400   25
31250   32
28800   35
19200   55
14400   75
9600    114
4800    233
2400    472
1200    948
300     3805

Because it's an int16 genetic programmer I had to truncate 115200 down to 16 Bits which is 49664.
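
The truncation mentioned here can be checked with a few lines of plain C++. The helper names are made up; the narrowing to int16_t is modular on mainstream compilers (and guaranteed from C++20 on).

```cpp
#include <cstdint>

// Keep only the low 16 bits of a baud rate.
std::uint16_t low16(long baud)
{
    return static_cast<std::uint16_t>(baud & 0xFFFFL);
}

// The same 16 bits read as a signed value, as a 16-bit int would hold them.
std::int16_t asInt16(long baud)
{
    return static_cast<std::int16_t>(low16(baud));
}
```

low16(115200) is 49664 (115200 - 65536), and the same bit pattern read as a signed int16 is -15872.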

After a while it found a perfect solution (that is, 0 differences from the given table). I kept it running for a night while the genetic programmer delivered even shorter code solutions. The next morning I picked the best code sequence (so far), which had a codelen of 24:

a = -39;
a |= x;
j = a;
a += a;
n = a;
a &= 0;
a -= x;
a &= n;
a |= x;
a += a;
m = a;
a += m;
a &= m;
a |= x;
n = a;
a += a;
y= a;
a = -15240;
a -= y;
a -= n;
a -= j;
a = a / n;
y= a;
y--;

Where x is baud and y is rxintra. I manually transformed to, what you already know as:

 // 8MHZ
  int baudrate = (int)speed;

  int j=-39 | baudrate;
  int m=2* (-baudrate & 2*j | baudrate);
  int a=((2*m) & m ) | baudrate;
  int rxintra=(-15240-3*a-j)/a-1;
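
On an AVR, int is 16 bits wide, so the snippet above relies on 16-bit wrap-around. To try it on a desktop compiler (where int is usually 32 bits) each intermediate result has to be narrowed explicitly. The port below is a sketch under that assumption; wrap16, rxintraWeird16 and countMismatches are hypothetical names, and the guard against a == 0 is an addition that the original does not have.

```cpp
#include <cstdint>

struct Row { long baud; int rxintra; };

// baud / rxintra columns of the 8 MHz table posted earlier in this thread.
static const Row kTable8MHz[] = {
    {115200, 5},  {57600, 15}, {38400, 25},  {31250, 32},
    {28800, 35},  {19200, 55}, {14400, 75},  {9600, 114},
    {4800, 233},  {2400, 472}, {1200, 948},  {300, 3805},
};

// Narrow to 16 bits the way a 16-bit int would wrap.
static std::int16_t wrap16(long v)
{
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(v & 0xFFFFL));
}

// Desktop port of the genetic-programming formula for rxintra at 8 MHz.
int rxintraWeird16(long speed)
{
    const std::int16_t b = wrap16(speed);
    const std::int16_t j = wrap16(-39 | b);
    const std::int16_t m = wrap16(2L * ((-b & (2 * j)) | b));
    const std::int16_t a = wrap16(((2 * m) & m) | b);
    if (a == 0) {
        return -1;  // added guard: avoid division by zero for odd inputs
    }
    const std::int16_t num = wrap16(-15240L - 3L * a - j);
    return num / a - 1;
}

// Number of table rows the formula fails to reproduce.
int countMismatches()
{
    int bad = 0;
    for (const Row& r : kTable8MHz) {
        if (rxintraWeird16(r.baud) != r.rxintra) {
            ++bad;
        }
    }
    return bad;
}
```

Worked through by hand, this port reproduces all twelve rxintra values, while for 7800 baud it yields 1900 where the linear 8 MHz formula gives 142, in line with the remark that the function is useless between the table points.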

This is real fun!

Do you have a link to that genetic programmer you used? Or did you write it yourself? post code ?

I would like to have the ability to read serial data on an ATtiny at 1 MHz, and while transmitting isn't essential, it would be nice for debugging. I don't need a high baud rate.

As the current standard SoftwareSerial only works for 8, 16 and 20MHz I was considering the original SoftwareSerial in another Thread when Rob suggested I look at this Thread and subsequently suggested I add my comments here.

DISCLAIMER - I'm not a mathematician and I'm not a professional programmer - just curious.

In Post#18 Rob listed the three versions of his formula for each of the three CPU frequencies. I was looking for a pattern that might suggest how to evolve the formula for a different frequency. Rob's work, so far, has been to experiment with different baud rates at a given frequency.

The formulas all have a similar pattern - divide the frequency by an adjusted baud rate and add a correction. So I made a small table listing these numbers. For convenience I will use 16M to mean 16000000.

Frequency  Baud-adj  rxStop-adj  tx-adj  rxCen-adj
20M        7         -3          -3      -7
16M        7         -2          -4      -7
8M         7         -4          -2      -10

However if you look at any of the formulas, for example,

// 8MHZ
rxstop = 8000000L/(7 * baudrate) - 4;
rxintra = rxstop;
tx = rxstop - 2;
rxcenter = rxstop/2 - 10;

you will see that tx is derived directly from rxstop and it could easily be the other way round. So, in my mind, I changed the formulas to be like this (the other bits are unchanged)

// 8MHZ
tx = 8000000L/(7 * baudrate) - 6;
rxstop = tx + 2;

And when you do that my revised first table looks like this

Frequency  Baud-adj  tx-adj  rxStop-adj
20M        7         -6      3
16M        7         -6      4
8M         7         -6      2

and you can see that the tx formula is independent of frequency. Also, there is no frequency related pattern to rxStop-adj which leads me to wonder if a constant would do for that also.

This has prompted a lot more thoughts in my head. For example would it be difficult to run your tests without any of the adjustment factors - so tx, rxintra and rxstop = MHz/7/baudrate and rxcenter = tx/2. If that doesn't work, my suspicion is that a single correction to push everything left or right would be all that's needed rather than corrections for each part.

Another thing is that the numbers in the table must take account of the time used by DebugPulse().

...R

I know this might be slightly off-topic, but I thought it might be worth mentioning the AltSoftSerial library I wrote some time ago. It uses a formula intended for all baud rates, rather than a lookup table.

http://www.pjrc.com/teensy/td_libs_AltSoftSerial.html

AltSoftSerial uses a completely different design than SoftwareSerial. Rather than busy looping to transmit and interrupt-triggered busy looping to receive, AltSoftSerial uses Timer1's input capture to receive data and output compare to create the transmit waveform. As a result, it doesn't hog the CPU while transmitting or receiving, and it does properly buffered transmit like hardware serial.

The downside of this approach is that it consumes Timer1, so libraries like Servo can't be used. It also can't quite achieve 115200 speed, since it runs an interrupt for each signal transition. Coded in C, there's just a bit too much overhead to run fast enough to keep up with 115200 with the AVR at 16 MHz. But it's very close. Maybe some crafty code optimization, or implementing some or all of it in assembly, could enable 115200?

At slower baud rates, like 7800, AltSoftSerial works far better than SoftwareSerial, because it never hogs the CPU. SoftwareSerial at 7800 disables interrupts for at least 1.15 ms, which might be long enough to cause millis() to lose time!
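
The 1.15 ms figure is consistent with nine bit times at 7800 baud. That these nine bits are the start bit plus eight data bits is an assumption made for this sketch, not something stated in the post; the function name is hypothetical.

```cpp
#include <cassert>

// Time spent on one received frame, in milliseconds.
// Assumption: interrupts stay off for start bit + 8 data bits = 9 bit times.
double frameBusyMillis(long baud, int bitsPerFrame = 9)
{
    assert(baud > 0);
    return 1000.0 * bitsPerFrame / static_cast<double>(baud);
}
```

frameBusyMillis(7800) is about 1.154 ms. For comparison, on a 16 MHz AVR Arduino the timer overflow that drives millis() fires every 1.024 ms, so a blocked period of that length can indeed swallow a tick.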

Very nice contribution Robin!

The adjustments are most important for the higher baud rates. As these are not reliable (>70K) anyway with SW serial, stretching them should not be a problem.

Alignment of the formulas into one single formula would make SW serial code simpler. A quick check with the spreadsheet shows that the formula below looks like the optimum.

tx = F_CPU/(7 * baudrate) - 6;  // replaced the hard coded clock speed with F_CPU (see Arduino.h) 
rxstop = tx + 3;
rxintra = rxstop;
rxcenter = tx/2 - 5;

The relative error of the single formula approach rises compared with the dedicated formulas and the original numbers from SW serial lib. The relative error is important to minimize as these add up per bit. A relative error of 10% timing per bit results in a failed byte as the last bit is out of sync. The most erratic is the rxcenter variable which is fortunately only used once during a byte transfer, to get the middle of a pulse.
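
To get a feel for the relative error, the single formula can be evaluated on a desktop and compared with a table value. This is a sketch only: clockHz stands in for F_CPU so that the code compiles off-target, and Timing, unifiedTiming and relativeError are made-up names.

```cpp
#include <cassert>
#include <cmath>

struct Timing { long tx, rxstop, rxintra, rxcenter; };

// The single-formula timing posted above, with the clock as a parameter.
Timing unifiedTiming(long clockHz, long baudrate)
{
    assert(baudrate > 0);
    Timing t;
    t.tx = clockHz / (7 * baudrate) - 6;
    t.rxstop = t.tx + 3;
    t.rxintra = t.rxstop;
    t.rxcenter = t.tx / 2 - 5;
    return t;
}

// Relative deviation of a computed delay from a reference (table) delay.
double relativeError(long computed, long reference)
{
    assert(reference != 0);
    return std::fabs(static_cast<double>(computed - reference)) /
           std::fabs(static_cast<double>(reference));
}
```

For 8 MHz and 115200 baud this gives rxstop = 6 against 5 in the 8 MHz table, a 20% deviation, whereas at 9600 baud it gives 116 against 114, which is under 2%. With the same 8 MHz / 115200 inputs rxcenter comes out as -4, so an implementation would presumably have to clamp it.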

Robin, when you have tested your formula on your 1MHz ATtiny, please share the results. A formula based timing for the ATTINY would be important as the tiny has less RAM than UNO or MEGA

@Paul, (you're not off topic imho)

Have you tried AltSoftSerial on a 1 MHz ATtiny, as that is what Robin is looking for?

A quick look at the AltSoftSerial.cpp file didn't suggest any optimizations; the code looks quite tight. The only thing I noticed is that receive buffer overflow detection could be implemented very easily.

ISR(CAPTURE_INTERRUPT)
{
        ...
            if (state >= 9) {
                DISABLE_INT_COMPARE_B();
                head = rx_buffer_head + 1;
                if (head >= RX_BUFFER_SIZE) head = 0;
                if (head != rx_buffer_tail) {
                    rx_buffer[head] = rx_byte;
                    rx_buffer_head = head;
                }
                else
                {
                    overflow++;
                }
        ...

I wonder to what extent high baud rates are necessary with SoftwareSerial (in any of its varieties). If an application needs more than one fast serial connection it seems much simpler to use a Mega.

For the project I have in mind 2400 baud is probably quite adequate.

...R