Topic: SoftwareSerial magic numbers

robtillaart

Good to hear it works, "outside the lab" ;)
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

genom2

#31
Feb 07, 2013, 11:00 pm Last Edit: Feb 08, 2013, 10:56 pm by genom2 Reason: 1
Hi robtillaart,

I wondered why a long division (int32) is needed. I tried to find a solution with int (int16) only, and came up with this precise solution (the focus is on rxintra):
Code: [Select]

 // 8MHZ
 int baudrate = (int)speed;

 int j=-39 | baudrate;
 int m=2* (-baudrate & 2*j | baudrate);
 int a=((2*m) & m ) | baudrate;
 int rxintra=(-15240-3*a-j)/a-1;

It's precise in the sense that it reproduces all table data with 0 difference. (But it's not "linear" in your sense. Instead I call it "weird".)
To make your proof simple I wrote this sketch:
Code: [Select]

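// WProgram.h was renamed to Arduino.h in Arduino 1.0; pick whichever exists
#if defined(ARDUINO) && ARDUINO >= 100
#include "Arduino.h"
#else
#include "WProgram.h"
#endif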

//
// Lookup table
//
typedef struct _DELAY_TABLE
{
 long baud;
 unsigned short rx_delay_centering;
 unsigned short rx_delay_intrabit;
 unsigned short rx_delay_stopbit;
 unsigned short tx_delay;
} DELAY_TABLE;

//  8 Mhz table
static const DELAY_TABLE table[] PROGMEM =
{
 //  baud    rxcenter    rxintra    rxstop  tx
 { 115200,   1,          5,         5,      3,      },
 { 57600,    1,          15,        15,     13,     },
 { 38400,    2,          25,        26,     23,     },
 { 31250,    7,          32,        33,     29,     },
 { 28800,    11,         35,        35,     32,     },
 { 19200,    20,         55,        55,     52,     },
 { 14400,    30,         75,        75,     72,     },
 { 9600,     50,         114,       114,    112,    },
 { 4800,     110,        233,       233,    230,    },
 { 2400,     229,        472,       472,    469,    },
 { 1200,     467,        948,       948,    945,    },
 { 300,      1895,       3805,      3805,   3802,   },
};

int SoftwareSerial_rxintraA(long speed)
{
 for (unsigned i=0; i<sizeof(table)/sizeof(table[0]); ++i)
 {
   long baud = pgm_read_dword(&table[i].baud);
   if (baud == speed)
   {
     return pgm_read_word(&table[i].rx_delay_intrabit);
   }
 }
 return -1;
}

int SoftwareSerial_rxintraB(long speed)
{
 long baudrate = speed;
 // 8MHZ
 int rxintra = 8000000L/(7 * baudrate) - 4;
 return rxintra;
}

int SoftwareSerial_rxintraC(long speed)
{
 // 8MHZ
 int baudrate = (int)speed;

 int j=-39 | baudrate;
 int m=2* (-baudrate & 2*j | baudrate);
 int a=((2*m) & m ) | baudrate;
 int rxintra=(-15240-3*a-j)/a-1;
 return rxintra;
}

void setup() {
 Serial.begin(9600);
 Serial.println("baud\tA\tB\tC");
 for (unsigned i=0; i<sizeof(table)/sizeof(table[0]); ++i)
 {
   long baud = pgm_read_dword(&table[i].baud);
   Serial.print(baud);  Serial.print("\t");
   Serial.print(SoftwareSerial_rxintraA(baud));  Serial.print("\t");
   Serial.print(SoftwareSerial_rxintraB(baud));  Serial.print("\t");
   Serial.print(SoftwareSerial_rxintraC(baud));  Serial.print("\t");
   Serial.println();
 }
}

void loop() {
}


robtillaart

Amazing!
I have seen quite some code in my life, but this is truly amazing code (+1)

I didn't run the code, but I am wondering how you can represent 115200 in an int16 (you can't), so it must be wrapped or truncated.
That result is then tweaked by bit manipulation and some magic numbers to the right values.
Can you explain which principle the math is based on? Is there a website?


The drawback is its maintainability, as it is hard to understand.
Furthermore, as the delay values are only determined once, when begin(speed) is called, I doubt there is a serious performance gain.

Does your magic code also calculate the intermediate values e.g. a baud rate of 7800 as in the previous post?



genom2

No, it will not deliver reasonable results for 7800. (At the beginning of this thread you asked for linearity; "weird" functions are not "linear" in your sense.)

I did not invent this "weird" function. Instead a genetic program invented it for me. I fed in the table you gave, reduced to the columns 'baud' and 'rxintra':
Code: [Select]

49664   5
57600   15
38400   25
31250   32
28800   35
19200   55
14400   75
9600    114
4800    233
2400    472
1200    948
300     3805

Because it's an int16 genetic programmer I had to truncate 115200 down to 16 bits, which gives 49664.
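As a rough illustration of that truncation, here is a small sketch in plain C++ (so it can run on a PC, where int is wider than 16 bits). The helper names `truncate_to_16_bits` and `as_signed_16` are made up for this example and are not part of SoftwareSerial.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 16 bits of a baud rate, which is what survives when a
// long is squeezed into a 16-bit integer.
uint16_t truncate_to_16_bits(long baud) {
    assert(baud >= 0);
    return static_cast<uint16_t>(baud & 0xFFFFL);
}

// The same 16-bit pattern read as a signed value, as a 16-bit int would see it.
int16_t as_signed_16(uint16_t bits) {
    // Done arithmetically so the result does not depend on cast behaviour.
    long value = bits >= 0x8000 ? static_cast<long>(bits) - 0x10000L
                                : static_cast<long>(bits);
    return static_cast<int16_t>(value);
}
```

`truncate_to_16_bits(115200L)` gives 49664 (115200 - 65536), and `as_signed_16(49664)` gives -15872, which is the value a signed 16-bit `int` holds after `(int)speed`. All other baud rates in the table fit in 15 bits and are unchanged.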

After a while it found a perfect solution (that is, 0 differences from the given table). I kept it running for a night, during which the genetic programmer delivered even shorter code solutions. The next morning I picked the best code sequence (so far), which had a codelen of 24:
Code: [Select]

a = -39;
a |= x;
j = a;
a += a;
n = a;
a &= 0;
a -= x;
a &= n;
a |= x;
a += a;
m = a;
a += m;
a &= m;
a |= x;
n = a;
a += a;
y= a;
a = -15240;
a -= y;
a -= n;
a -= j;
a = a / n;
y= a;
y--;

Where x is baud and y is rxintra. I manually transformed it into what you already know as:
Code: [Select]

// 8MHZ
  int baudrate = (int)speed;

  int j=-39 | baudrate;
  int m=2* (-baudrate & 2*j | baudrate);
  int a=((2*m) & m ) | baudrate;
  int rxintra=(-15240-3*a-j)/a-1;
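
The formula only behaves as posted when `int` is 16 bits wide, as on an 8-bit AVR. A hedged sketch of how it could be checked on a PC follows: every intermediate result is wrapped back to 16 bits by a helper, `wrap16`, which is an assumption of this example and not part of the original code. The function name `rxintra_weird_8mhz` is likewise made up.

```cpp
#include <cassert>
#include <cstdint>

// Wrap a wider intermediate result back into the signed 16-bit range,
// mimicking int arithmetic on an 8-bit AVR.
static int16_t wrap16(long v) {
    long low = v & 0xFFFFL;  // keep the low 16 bits
    return static_cast<int16_t>(low >= 0x8000L ? low - 0x10000L : low);
}

// The "weird" formula for the 8 MHz rxintra column, with the 16-bit
// wrapping made explicit so it also runs where int is 32 bits.
int rxintra_weird_8mhz(long speed) {
    assert(speed > 0);
    int16_t baudrate = wrap16(speed);  // 115200 becomes -15872
    int16_t j = wrap16(-39 | baudrate);
    // & binds tighter than |, so the original expression groups like this:
    int16_t m = wrap16(2L * wrap16((-baudrate & (2 * j)) | baudrate));
    int16_t a = wrap16((wrap16(2L * m) & m) | baudrate);
    assert(a != 0);
    return wrap16(-15240L - 3L * a - j) / a - 1;
}
```

For example, `rxintra_weird_8mhz(9600)` works out as j = -39, m = -256, a = -128, and then -14817 / -128 - 1 = 114, the table value.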

robtillaart

This is real fun!

Do you have a link to that genetic programmer you used?
Or did you write it yourself? Could you post the code?

Robin2

#35
Sep 15, 2013, 10:19 am Last Edit: Sep 15, 2013, 11:15 am by Robin2 Reason: 1
I would like to be able to read serial data on an ATtiny at 1 MHz, and while transmitting isn't essential, it would be nice for debugging. I don't need a high baud rate.

As the current standard SoftwareSerial only works for 8, 16 and 20MHz I was considering the original SoftwareSerial in another Thread when Rob suggested I look at this Thread and subsequently suggested I add my comments here.

DISCLAIMER - I'm not a mathematician and I'm not a professional programmer - just curious.

In Post#18 Rob listed the three versions of his formula for each of the three CPU frequencies. I was looking for a pattern that might suggest how to evolve the formula for a different frequency. Rob's work, so far, has been to experiment with different baud rates at a given frequency.

The formulas all have a similar pattern - divide the frequency by an adjusted baud rate and add a correction. So I made a small table listing these numbers. For convenience I will use 16M to mean 16000000.

    Frequency  Baud-adj  rxStop-adj  tx-adj  rxCen-adj
    20M        7         -3          -3      -7
    16M        7         -2          -4      -7
     8M        7         -4          -2      -10

However if you look at any of the formulas, for example,

     // 8MHZ
    rxstop = 8000000L/(7 * baudrate) - 4;
    rxintra = rxstop; 
    tx = rxstop - 2;
    rxcenter = rxstop/2 - 10;

you will see that tx is derived directly from rxstop and it could easily be the other way round. So, in my mind, I changed the formulas to be like this (the other bits are unchanged)

     // 8MHZ
    tx = 8000000L/(7 * baudrate) - 6;
    rxstop = tx + 2;

And when you do that my revised first table looks like this

    Frequency  Baud-adj  tx-adj  rxStop-adj
    20M        7         -6      3
    16M        7         -6      4
     8M        7         -6      2

and you can see that the tx correction is the same at every frequency. Also, there is no frequency-related pattern in rxStop-adj, which leads me to wonder whether a constant would do for that as well.

This has prompted a lot more thoughts in my head. For example, would it be difficult to run your tests without any of the adjustment factors, so that tx, rxintra and rxstop = frequency/7/baudrate and rxcenter = tx/2? If that doesn't work, my suspicion is that a single correction to push everything left or right would be all that's needed, rather than corrections for each part.
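
One way to get a feel for this without hardware is to compare the uncorrected value, frequency/7/baudrate, with the tx column of the 8 MHz table quoted earlier in the thread. The sketch below is plain C++ for a PC; the names `TxEntry`, `raw_delay` and `tx_offset` are invented for this example.

```cpp
#include <cassert>

// tx column of the 8 MHz lookup table quoted earlier in this thread.
struct TxEntry { long baud; int tx; };
static const TxEntry kTable8MHz[] = {
    {115200, 3}, {57600, 13}, {38400, 23}, {31250, 29},
    {28800, 32}, {19200, 52}, {14400, 72}, {9600, 112},
    {4800, 230}, {2400, 469}, {1200, 945}, {300, 3802},
};
static const int kTableSize = sizeof(kTable8MHz) / sizeof(kTable8MHz[0]);

// Delay with no correction at all: clock / 7 / baud (integer division).
long raw_delay(long f_cpu, long baud) {
    assert(f_cpu > 0 && baud > 0);
    return f_cpu / (7L * baud);
}

// How far the table's tx value sits below the uncorrected delay.
long tx_offset(int index) {
    assert(index >= 0 && index < kTableSize);
    return raw_delay(8000000L, kTable8MHz[index].baud) - kTable8MHz[index].tx;
}
```

For this table the offset stays between 6 and 8 for every row (for example 119 - 112 = 7 at 9600 baud), which is at least consistent with the idea that one constant correction might be enough for tx. It says nothing about whether the uncorrected values would actually receive data correctly; that still needs a test on real hardware.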

Another thing is that the numbers in the table must take account of the time used by DebugPulse().

...R

Paul Stoffregen

I know this might be slightly off-topic, but I thought it might be worth mentioning the AltSoftSerial library I wrote some time ago.  It uses a formula intended for all baud rates, rather than a lookup table.

http://www.pjrc.com/teensy/td_libs_AltSoftSerial.html

AltSoftSerial uses a completely different design than SoftwareSerial.  Rather than busy looping to transmit and interrupt-triggered busy looping to receive, AltSoftSerial uses Timer1's input capture to receive data and output compare to create the transmit waveform.  As a result, it doesn't hog the CPU while transmitting or receiving, and it does properly buffered transmit like hardware serial.

But the downsides to this approach are consuming Timer1, so libraries like Servo can't be used.  It also can't quite achieve 115200 speed, since it runs an interrupt for each signal transition.  Coded in C, there's just a bit too much overhead to run fast enough to keep up with 115200 with the AVR at 16 MHz.  But it's very close.  Maybe some crafty code optimization or implementing some or all in assembly could enable 115200?

At slower baud rates, like 7800, AltSoftSerial works far better than SoftwareSerial, because it never hogs the CPU.  SoftwareSerial at 7800 disables interrupts for at least 1.15 ms, which might be long enough to cause millis() to lose time!
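
The arithmetic behind that figure can be sketched as follows. Two things here are assumptions, not statements from the post: that 1.15 ms corresponds to nine bit periods, and that Timer0 (which drives millis() in the standard Arduino AVR core) overflows every 256 * 64 clock cycles.

```cpp
#include <cassert>

// Time spent busy for `bits` bit periods at a given baud rate, in microseconds.
double busy_time_us(long baud, int bits) {
    assert(baud > 0 && bits > 0);
    return bits * 1000000.0 / baud;
}

// Assumption: Timer0 overflows every 256 * 64 clock cycles, as in the
// standard Arduino AVR core (1024 us at 16 MHz).
double timer0_overflow_period_us(long f_cpu) {
    assert(f_cpu > 0);
    return 256.0 * 64.0 * 1000000.0 / f_cpu;
}
```

With these assumptions, `busy_time_us(7800, 9)` is roughly 1154 us, which is longer than the 1024 us returned by `timer0_overflow_period_us(16000000L)`. If interrupts stay disabled for more than one overflow period, a second overflow can occur before the first one has been serviced, and that tick is lost.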


robtillaart

Very nice contribution Robin!

The adjustments are most important for the higher baud rates. As these are not reliable (>70K) anyway with SW serial, stretching them should not be a problem.

Alignment of the formulas into one single formula would make SW serial code simpler. A quick check with the spreadsheet shows that the formula below looks like the optimum.

Code: (experimental) [Select]

tx = F_CPU/(7 * baudrate) - 6;  // replaced the hard coded clock speed with F_CPU (see Arduino.h)
rxstop = tx + 3;
rxintra = rxstop;
rxcenter = tx/2 - 5;
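
For trying the experimental formula with different clock speeds on a PC, it can be wrapped in a function as sketched below. The struct and function names are invented for this example. The clamping of results below 1 is an added assumption: the post does not say what should happen when the formula goes negative, but the library stores the delays as unsigned values, so a negative result would wrap to a very long delay.

```cpp
#include <cassert>

struct SoftSerialDelays {
    long tx;
    long rxstop;
    long rxintra;
    long rxcenter;
};

// Assumption: delays below 1 are clamped to 1; the post does not specify this.
static long at_least_one(long v) { return v < 1 ? 1 : v; }

// The experimental single formula, with the clock speed passed in
// instead of taken from F_CPU.
SoftSerialDelays delays_from_formula(long f_cpu, long baudrate) {
    assert(f_cpu > 0 && baudrate > 0);
    long tx = f_cpu / (7L * baudrate) - 6;
    SoftSerialDelays d;
    d.tx = at_least_one(tx);
    d.rxstop = at_least_one(tx + 3);
    d.rxintra = d.rxstop;
    d.rxcenter = at_least_one(tx / 2 - 5);
    return d;
}
```

At 8 MHz and 9600 baud this gives tx = 113, rxstop = rxintra = 116 and rxcenter = 51. At 1 MHz and 9600 baud, tx is 8 and tx/2 - 5 is already negative, so rxcenter is clamped; at 1 MHz and 2400 baud all four values are comfortably positive (53, 56, 56, 21).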


The relative error of the single-formula approach is higher than that of the dedicated formulas and the original numbers from the SW serial lib. It is important to minimize the relative error because it adds up with every bit. A relative timing error of 10% per bit results in a failed byte, as the last bit is sampled out of sync. The most erratic value is rxcenter, which fortunately is used only once per byte, to find the middle of a pulse.
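
A simplified model of how the error adds up, assuming the receiver aims at the middle of each bit and that every delay is off by the same relative error (in reality rxcenter has its own, different error). The function names are made up for this sketch.

```cpp
#include <cassert>

// Offset of the sample point from its ideal position, in units of one bit
// time, when every delay is off by the same relative error.
// data_bit is 0 for the first data bit and 7 for the last.
double sample_offset_in_bits(double relative_error, int data_bit) {
    assert(data_bit >= 0 && data_bit <= 7);
    // Ideal sample instant: 1.5 bit times after the start edge for the
    // first data bit, then one more bit time per further bit.
    return (1.5 + data_bit) * relative_error;
}

// A sample is still inside its bit while the offset is under half a bit.
bool sample_lands_in_bit(double relative_error, int data_bit) {
    double offset = sample_offset_in_bits(relative_error, data_bit);
    return offset < 0.5 && offset > -0.5;
}
```

Under this model a 10% error moves the last data bit's sample by 0.85 bit times, well outside the bit, while the first data bit is only 0.15 off. The break-even point is 0.5 / 8.5, about 5.9%.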

Robin, when you have tested your formula on your 1MHz ATtiny, please share the results.
A formula based timing for the ATTINY would be important as the tiny has less RAM than UNO or MEGA

robtillaart

@Paul (you're not off topic, imho)

Have you tried AltSoftSerial on a 1 MHz ATtiny, as that is what Robin is looking for?

A quick look at the AltSoftSerial.cpp file didn't suggest any obvious optimizations; the code looks quite tight.
The only thing I noticed is that receive buffer overflow detection could be implemented very easily:

Code: [Select]

ISR(CAPTURE_INTERRUPT)
{
        ...
        if (state >= 9) {
                DISABLE_INT_COMPARE_B();
                head = rx_buffer_head + 1;
                if (head >= RX_BUFFER_SIZE) head = 0;
                if (head != rx_buffer_tail) {
                        rx_buffer[head] = rx_byte;
                        rx_buffer_head = head;
                }
                else
                {
                        overflow++;  // proposed addition: count bytes dropped because the buffer is full
                }
        ...
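
The same idea as a stand-alone ring buffer that can be compiled and tried on a PC. This is only a sketch: the struct, the function names and the buffer size are invented for the example, and the read side is an assumption about how such a buffer is usually drained, not code taken from AltSoftSerial.

```cpp
#include <cassert>
#include <cstdint>

const uint8_t RX_BUFFER_SIZE = 8;  // illustrative size, not the library's value

struct RxRing {
    uint8_t data[RX_BUFFER_SIZE];
    uint8_t head;       // index of the most recently written slot
    uint8_t tail;       // index of the most recently read slot
    uint16_t overflow;  // bytes dropped because the buffer was full
};

void ring_init(RxRing &r) { r.head = 0; r.tail = 0; r.overflow = 0; }

// Store one byte, or count it as lost when the buffer is full.
void ring_push(RxRing &r, uint8_t value) {
    uint8_t head = static_cast<uint8_t>(r.head + 1);
    if (head >= RX_BUFFER_SIZE) head = 0;
    if (head != r.tail) {
        r.data[head] = value;
        r.head = head;
    } else {
        r.overflow++;
    }
}

// Fetch the oldest byte; returns false when nothing is waiting.
bool ring_pop(RxRing &r, uint8_t &value) {
    if (r.head == r.tail) return false;
    uint8_t tail = static_cast<uint8_t>(r.tail + 1);
    if (tail >= RX_BUFFER_SIZE) tail = 0;
    value = r.data[tail];
    r.tail = tail;
    return true;
}
```

Because one slot is always left empty to tell "full" from "empty", a buffer of size 8 holds 7 bytes. Pushing 10 bytes without reading stores the first 7 and leaves `overflow` at 3.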

Robin2

I wonder to what extent high baud rates are necessary with SoftwareSerial (in any of its varieties). If an application needs more than one fast serial connection it seems much simpler to use a Mega.

For the project I have in mind 2400 baud is probably quite adequate.

...R

Robin2

I have spent the day studying this business of reading bits from first principles and I think I am learning a little.

It seems to me the timing from bit to bit only needs to survive for 8 bits because the start position gets zeroed with every new byte - that's the purpose of the start bit.

And if that's correct it seems to follow that it would be a good strategy to arrange the first "delay" so that the read of the first bit value will happen soon after the start of that bit interval. Then if there is a positive error in the number for the "delay" between bits each successive read will take place a little further along the actual bit interval. But so long as the interval isn't too big it won't fall off the edge of the bit until after the 8 bits have been read. And it can't fall short.

The advantage of doing it this way (it seems to me) is that there is more control of the inevitable error compared to trying to find the centre of the bit interval and work out exact timings from bit to bit. For example it is easy to calculate the theoretical delay but in practice the actual time will always be longer than the number you choose. And it would be illogical to aim for a total delay that is shorter than the theoretical value.

If this is all BS please tell me.
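
The trade-off can be put into numbers with a very simple model: the first data bit is sampled some fraction of the way into the bit, and each of the seven following delays is too long or too short by a fixed fraction of a bit time. The function names are invented for this sketch, and the model ignores jitter and edge rise times.

```cpp
#include <cassert>

// Largest per-bit timing error (as a fraction of a bit time) that still keeps
// all 8 data-bit samples inside their bits when the delay is too LONG.
// first_sample_pos is where the first data bit is sampled:
// 0.0 = start of the bit, 0.5 = centre.
double max_positive_error(double first_sample_pos) {
    assert(first_sample_pos >= 0.0 && first_sample_pos < 1.0);
    // Seven inter-bit delays separate the first and the last sample.
    return (1.0 - first_sample_pos) / 7.0;
}

// Same idea for a delay that is too SHORT: the sample point drifts towards
// the leading edge of the bit.
double max_negative_error(double first_sample_pos) {
    assert(first_sample_pos >= 0.0 && first_sample_pos < 1.0);
    return first_sample_pos / 7.0;
}
```

Sampling at the centre tolerates about 7.1% in either direction. Sampling 10% of the way into the bit tolerates about 12.9% of extra delay but only about 1.4% of missing delay, so it only pays off if the real delay can be relied on never to come out shorter than intended.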

...R

Robin2

Following some more experimenting it seems to me that another source of complexity in SoftwareSerial is the way it uses the various intervals.

I think it would be much easier to keep things in step if, when the start bit is encountered, the time is recorded [for example with micros()] and all of the subsequent timings are referenced to that, rather than being added on top of each other.

The way the intervals are added in SoftwareSerial it is impossible to allow for the time taken to execute the code in between - which will remain constant regardless of baud rate. By measuring the time from a single base that is taken care of automatically.
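
A small model of the difference, in plain C++ for a PC. The per-step overhead is a made-up parameter standing in for the code that runs between delays; the function names are invented for this sketch.

```cpp
#include <cassert>

// Sample instant of data bit `bit` (0..7), in microseconds after the
// start-bit edge, when each delay is added on top of the previous one and
// every step costs an extra overhead_us that the delay does not know about.
long accumulated_sample_us(long bit_us, long overhead_us, int bit) {
    assert(bit >= 0 && bit <= 7);
    long t = bit_us + bit_us / 2 + overhead_us;  // centring delay plus its overhead
    for (int i = 0; i < bit; ++i) {
        t += bit_us + overhead_us;               // one intra-bit delay per further bit
    }
    return t;
}

// Sample instant when every deadline is computed from the recorded start time.
long absolute_sample_us(long bit_us, int bit) {
    assert(bit >= 0 && bit <= 7);
    return bit_us + bit_us / 2 + bit * bit_us;
}
```

With a 416 us bit (2400 baud) and 5 us of overhead per step, the accumulated scheme samples the last data bit at 3576 us instead of 3536 us, 40 us late, while the scheme referenced to the start time carries no accumulated error. The table values in SoftwareSerial can be read as absorbing this overhead empirically, one clock speed at a time.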

...R

Paul Stoffregen

Quote
Have you tried AltSoftSerial on a 1 MHz ATtiny, as that is what Robin is looking for?


Oh, no, I haven't tried it, but I'm pretty sure it won't work.  Most ATTINY chips have only 8 bit timers.  AltSoftSerial needs a 16 bit timer with input capture and 2 compare units.

robtillaart

Quote
The way the intervals are added in SoftwareSerial it is impossible to allow for the time taken to execute the code in between - which will remain constant regardless of baud rate. By measuring the time from a single base that is taken care of automatically.


The time taken to execute code depends on the clock speed (MHz), which makes timing non-trivial. The resolution of micros() (4 us) makes it hard for software to support higher baud rates; for the lower ones your idea should work.
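
To see where the limit lies, the timer resolution can be expressed as a fraction of one bit time. This is a sketch with invented function names; the resolution is passed in as a parameter because the 4 us figure depends on the clock speed and core in use.

```cpp
#include <cassert>

// Duration of one bit in microseconds.
double bit_time_us(long baud) {
    assert(baud > 0);
    return 1000000.0 / baud;
}

// Worst-case timestamp error as a fraction of one bit time, for a clock
// that only advances in steps of resolution_us.
double resolution_as_fraction_of_bit(long baud, double resolution_us) {
    assert(resolution_us > 0.0);
    return resolution_us / bit_time_us(baud);
}
```

At 115200 baud a bit lasts about 8.7 us, so a 4 us step is roughly 46% of a bit, which is hopeless for sampling. At 2400 baud a bit lasts about 417 us and the same step is just under 1% of a bit.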

Robin2

#44
Sep 16, 2013, 06:15 pm Last Edit: Sep 16, 2013, 06:55 pm by Robin2 Reason: 1
Any given device will (usually) only work at a single clock speed. I recognize the limitations of micros(). (deleted a bit here) And, presumably, someone can write a loop [like tunedDelay()] that gives a finer count. My point is really that measuring everything from the start point automatically takes account of the time for every little bit of code and, if the device creating the characters works at a consistent speed it should be easy to match it without the need for empirical adjustments.

Indeed I suspect it should be possible to write a sketch that would automatically adjust its timing if it was sent a stream of known bytes against which to test itself.
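
One possible way to do this, offered only as a sketch and not something tested in this thread: have the sender transmit a byte whose bits alternate, such as 'U' (0x55), timestamp every edge on the receive pin, and derive the bit period from the first and last edge. The function name is invented, and how the edges are captured (pin-change interrupt, polling loop) is left open.

```cpp
#include <cassert>
#include <cstddef>

// Estimate the bit period from edge timestamps (in microseconds) captured
// while receiving a byte whose bits alternate, so that consecutive edges are
// exactly one bit apart. 'U' (0x55) sent LSB first is such a byte: together
// with the start and stop bits it produces 10 edges spanning 9 bit periods.
double estimate_bit_period_us(const unsigned long *edge_us, size_t edge_count) {
    assert(edge_us != nullptr && edge_count >= 2);
    // Unsigned subtraction stays correct even if the timestamp counter
    // wrapped around between the first and the last edge.
    unsigned long span = edge_us[edge_count - 1] - edge_us[0];
    return static_cast<double>(span) / static_cast<double>(edge_count - 1);
}
```

Averaging over nine bit periods also reduces the effect of a coarse timestamp step: an error of a few microseconds on the first or last edge is divided by nine.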

...R

