submitted as proposal - https://github.com/arduino/Arduino/issues/1198 -
@Marc G
A serial protocol is like a train with wagons and on each wagon there is one bit, 10 bits in total (including start/stop bits). The baud rate represents the speed of the train.
The software serial receiving code is triggered by the edge of the start bit (train). To read a bit properly one wants to read the value of the signal (HIGH/LOW) in the middle of the bit, not at the edges.
rxCenter is the time to the (approximate) middle of the first (start) bit; rxIntra and rxStop are the timings from the middle of one bit to the middle of the next. The higher the baud rate, the lower these numbers.
tx is the timing used for transmitting.
You can see this in the code of the library - C:\Program Files (x86)\arduino-1.0\libraries\SoftwareSerial - (windows) and search for this function - void SoftwareSerial::recv() -
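To put some numbers on the train picture, here is a minimal sketch in plain C++ (not taken from the library). The function names are made up for illustration, and the idea of sampling data bit n at 1.5 + n bit times after the start edge is the textbook ideal; the real rxCenter/rxIntra values are smaller because the code itself takes time to run.

```cpp
#include <cassert>

// Duration of one bit ("one wagon") in microseconds.
double bitTimeMicros(long baud) {
    assert(baud > 0);
    return 1000000.0 / baud;
}

// CPU clock cycles available per bit; cpuHz is an assumed parameter,
// e.g. 16000000 for a 16 MHz UNO.
long cyclesPerBit(long cpuHz, long baud) {
    assert(baud > 0);
    return cpuHz / baud;
}

// Ideal time from the start-bit edge to the middle of data bit n (0-based):
// half a bit to the middle of the start bit, then one full bit per step.
double sampleOffsetMicros(long baud, int dataBit) {
    assert(dataBit >= 0 && dataBit < 8);
    return (1.5 + dataBit) * bitTimeMicros(baud);
}
```

At 115200 baud on a 16 MHz board cyclesPerBit gives 138, which shows how little room there is for interrupt latency at the high end.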
Did a more extensive test of SoftwareSerial using the formula approach. I connected two Arduinos - UNO, 16 MHz (resonator) + 2009, 16 MHz (crystal) + IDE 1.0 - both using SoftwareSerial, one as Master and the other as Slave (essentially an echo).
The master sends byte 0x55, starting at baud rate 100, and waits until the slave echoes it back. If the answer is not 0x55, the test fails and the master prints a message. Either way it then increases the baud rate by 100 and starts over.
The results are pretty good, as it only gets consistently distorted above 190K baud. Between 90K and 190K it only failed 10 times.
I took 0x55 as test pattern (0x55 == 01010101); it helps to see what happened (see comments after output).
Typical output (multiple runs had comparable output) Note: started with baud rate 100 in steps of 100...
start...
BAUD BYTE
97600 F5 FAIL // = 11110101 ??
111200 AA FAIL // = 10101010 1 bit shifted
114400 D5 FAIL // = 11010101 1 bit failed (interference with start bit?)
124600 AA FAIL
140600 D5 FAIL
145500 AA FAIL
149000 D5 FAIL
163200 AF FAIL // = 10101111 ??
190500 FF FAIL // = 11111111 expect sync lost
190600 FF FAIL
190700 FF FAIL
190800 FF FAIL
190900 FF FAIL
191000 FF FAIL
...
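The binary comments in the output can be reproduced with a small helper. This is just a sketch in plain C++ for reading such logs, not part of the test sketches; toBinary and bitErrors are names made up here.

```cpp
#include <cassert>
#include <string>

// Render a received byte as 8 binary digits, MSB first.
// Takes an int because SoftwareSerial::read() returns an int.
std::string toBinary(int value) {
    assert(value >= 0 && value <= 255);
    std::string bits;
    for (int i = 7; i >= 0; --i) {
        bits += ((value >> i) & 1) ? '1' : '0';
    }
    return bits;
}

// Number of bit positions in which the received byte differs from the expected one.
int bitErrors(unsigned char expected, unsigned char received) {
    unsigned char diff = expected ^ received;
    int count = 0;
    while (diff != 0) {
        count += diff & 1;
        diff >>= 1;
    }
    return count;
}
```

With 0x55 as the expected byte, 0xD5 differs in one bit, while 0xAA differs in all eight, which fits the "1 bit shifted" reading.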
The master and the slave were kept in sync by starting at the same baud rate and waiting for each other.
To repeat the test start the master, then start the slave, and press a char in the serial monitor of the master.
Slave program (essentially echo)
//
// FILE: serialSlave (echo)
// AUTHOR: Rob Tillaart
// DATE: 2013-01-02
//
// PURPOSE: test SW serial with formulas
//
#include <SoftwareSerial.h>
SoftwareSerial mySerial(2, 3);  // RX, TX
void setup()
{
  Serial.begin(9600);
  Serial.println("start slave...");
}
unsigned long baud = 0;
void loop()
{
  // step to the next baud rate, in lockstep with the master
  baud += 100;
  mySerial.begin(baud);
  // block until the master's test byte arrives
  while (mySerial.available() == 0);
  int b = mySerial.read();
  mySerial.write(b);  // echo it back
  Serial.println(b, DEC);
  delay(10);
}
Master program
//
// FILE: serialMaster
// AUTHOR: Rob Tillaart
// DATE: 2013-01-02
//
// PURPOSE: test SW serial
//
#include <SoftwareSerial.h>
SoftwareSerial mySerial(2, 3);  // RX, TX
void setup()
{
  Serial.begin(9600);
  Serial.println("start...");
}
unsigned long baud = 0;
void loop()
{
  // a character from the serial monitor starts the test; it is never
  // read, so the test keeps stepping through the baud rates
  if (Serial.available() > 0)
  {
    Serial.flush();
    baud += 100;
    mySerial.begin(baud);
    mySerial.write(0x55);
    // block until the slave echoes the byte
    while (mySerial.available() == 0);
    int b = mySerial.read();
    if (b != 0x55)
    {
      Serial.print(baud);
      Serial.print("\t");
      Serial.print(b, HEX);
      Serial.print("\t");
      Serial.println(" FAIL");
    }
    delay(20);
  }
}
As always comments/remarks are welcome
The same test with step size 10 gave some more errors (typical run, started at baud rate 10, step 10)
start...
BAUD BYTE
70660 D5 FAIL
81950 AD FAIL
88870 AF FAIL
89570 BD FAIL
94410 D5 FAIL
95340 AA FAIL
96590 D5 FAIL
98980 AA FAIL
100750 AB FAIL
103590 BD FAIL
105740 D5 FAIL
110600 AA FAIL
113260 AF FAIL
120200 AA FAIL
...
Up to 70K there were no failures (that is 7000 different baud rates tested!). Between 70K and 115K there were "only" 13 failures (13 failures in 4500 baud rates tested, roughly 1 in 350). Above 120K the failures increased (not shown).
Conclusion from the tests: SoftwareSerial "by formula" works very well up to 70,000 baud and reasonably well up to 115,200. Tweaking the formulas further may improve the test results, but for now I'm quite satisfied.
This SoftwareSerial "by formula" allows one to build a communication channel in which the baud rate is constantly altered, making it very difficult to eavesdrop - and yes, also to get in sync :)
@Rob,
I applaud this effort. Thanks for investigating so thoroughly. I always thought it might be fun to develop some equations that allow the synthesis of the "table" values on the fly, and now it looks like you are pretty close to doing just that.
It's good that you are getting error-free transmission up to 70K. Make sure you test not just the single byte round trip, but also lots of bursts. The values should vary. Example:
- Arduino sends 0x55 as fast as possible to host for one minute.
- Arduino sends 0xFE as fast as possible to host for one minute.
- Arduino sends 0x01 as fast as possible to host for one minute.
- Host sends 0x55 as fast as possible to Arduino for one minute.
- Host sends 0xFE as fast as possible to Arduino for one minute.
- Host sends 0x01 as fast as possible to Arduino for one minute.
When constructing the tables, I found several times that I thought the values were good--until I tested the large bursts.
If we want to improve performance at baud rates > 57.6K, I think we're going to have to optimize the timer tick vector. I studied this for some time with the logic analyzer and discovered that the occasional glitch was due to a timer tick interrupt being processed exactly when a pin change was pending.
Lastly, and you probably already know this, but if your formula is off a bit for the lower baud rates, it shouldn't be a big deal. They are very tolerant.
Nice!
Mikal
robtillaart
Like the idea of a formula, but could it be slow to change the baud rate?
Not certain, and for interest: did you try two boards connected using the standard software serial code? Did you try two boards using the standard hardware UART?
@Dr John,
Yep, communication with a changing baud rate would certainly be slower than at a fixed speed, but calculating the values takes microseconds; no FP math involved.
I did the test with 2 Arduinos - UNO + Duemilanove - so one with a crystal and one with a resonator (?), and used SW serial on both (you could have seen this in the code).
I did not try a HW serial against the SW serial yet, although I did test it with a (19200) HW serial LCD - see earlier post.
This analysis is not final yet, as I expect the formulas can be improved a bit for the higher speeds. This can be done with non-linear polynomials at the cost of extra footprint, or maybe by slightly tuning the constants in the formulas. Need some time to test (a lot more).
@Mikal, stuff to think through, thanks
well done
testing this sort of thing is a real pain I know,
Tweaked the numbers in the spreadsheet to minimize the cumulative relative error. There was a large relative error at the higher baud rates; now the relative error is minimized while keeping the functions linear.
(not extensively tested yet)
// 16MHZ
rxstop = 16000000L/(7 * baudrate) - 2;
rxintra = rxstop;
tx = rxstop - 4;
rxcenter = rxstop/2 - 7;
// 8MHZ
rxstop = 8000000L/(7 * baudrate) - 4;
rxintra = rxstop;
tx = rxstop - 2;
rxcenter = rxstop/2 - 10;
// 20MHZ
rxstop = 20000000L/(7 * baudrate) - 3;
rxintra = rxstop;
tx = rxstop - 3;
rxcenter = rxstop/2 - 7;
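For anyone who wants to play with the numbers off-board, the three sets of formulas above can be wrapped into one function. This is a sketch in plain C++, not a patch for the library; the Timings struct and the timingsByFormula name are made up for this example, and only the three clock speeds listed above are handled.

```cpp
#include <cassert>

// The four delay values, named as in the formulas above.
struct Timings {
    long rxcenter;
    long rxintra;
    long rxstop;
    long tx;
};

// Compute the delay values "by formula" for 8, 16 or 20 MHz boards.
Timings timingsByFormula(long cpuHz, long baud) {
    assert(baud > 0);
    assert(cpuHz == 8000000L || cpuHz == 16000000L || cpuHz == 20000000L);

    long stopOffset, txOffset, centerOffset;
    if (cpuHz == 16000000L) {
        stopOffset = 2; txOffset = 4; centerOffset = 7;
    } else if (cpuHz == 8000000L) {
        stopOffset = 4; txOffset = 2; centerOffset = 10;
    } else {  // 20 MHz
        stopOffset = 3; txOffset = 3; centerOffset = 7;
    }

    Timings t;
    t.rxstop = cpuHz / (7 * baud) - stopOffset;  // integer division, as on the AVR
    t.rxintra = t.rxstop;
    t.tx = t.rxstop - txOffset;
    t.rxcenter = t.rxstop / 2 - centerOffset;
    return t;
}
```

For 16 MHz and 9600 baud this gives rxstop = 236, tx = 232 and rxcenter = 111; at 115200 baud rxcenter is down to 1, so there is hardly any margin left at that speed.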
to be continued...
Run with the previous formulas
start...
BAUD BYTE
70660 D5 FAIL
81950 AD FAIL
88870 AF FAIL
89570 BD FAIL
94410 D5 FAIL
95340 AA FAIL
96590 D5 FAIL
98980 AA FAIL
100750 AB FAIL
103590 BD FAIL
105740 D5 FAIL
110600 AA FAIL
113260 AF FAIL
120200 AA FAIL
...
Now a run with the new offsets
start...
BAUD BYTE
90440 D5 FAIL
97150 AD FAIL
101140 AA FAIL
101210 D5 FAIL
103180 D5 FAIL
105430 AA FAIL
106130 D5 FAIL
108400 A9 FAIL
108990 AA FAIL
109440 D5 FAIL
111270 AA FAIL
111320 D5 FAIL
117300 D5 FAIL
118480 AA FAIL
...
The first fail with the new parameters lies about 20K higher, but other runs started to fail at ~79/80K.
Conclusion for now: the new offsets are definitely better than the previous ones, but still not good enough to get a fail-free software serial up to 115200. TODO: test @ 8 MHz and @ 20 MHz (don't have such duinos).
A deep dive into the code might be needed. TBC...
Next test - a longer string, "the quick brown fox jumps over the lazy dog" (42 chars), sent from A -> B at different speeds, starting at 100 baud with step size 100. B sends back the number of chars correctly received from the start of the string, so when receiving "theXquick brown fox jumps over the lazy dog" the answer would be 3.
Again we see that baud rates up to 70K perform 100%; above that, the failures start...
BAUD CHARS
74800 34 FAIL
79900 11 FAIL
81000 3 FAIL
84900 31 FAIL
85200 23 FAIL
85800 21 FAIL
86200 15 FAIL
86300 20 FAIL
86600 38 FAIL
86900 41 FAIL
87100 38 FAIL
87300 15 FAIL
87400 20 FAIL
87700 38 FAIL
88300 20 FAIL
88400 170 FAIL <<<< a very strange one ???
88500 37 FAIL
88700 18 FAIL
88800 15 FAIL
89000 38 FAIL
89200 38 FAIL
89500 15 FAIL
89600 12 FAIL
89800 18 FAIL
90000 31 FAIL
..
For the statistics: the highest successful transfer rate was 103800 baud.
Conclusion: up to 70K the formula-based SoftwareSerial does work as expected; that is about 20% faster than the 57600 from the fixed tables.
115200: As the formula values are identical to those of the table-based SoftwareSerial for 115200, I do not expect the (existing) table version to work reliably at that speed either, at least for receiving data.
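For reference, the number that B reports in the string test (chars correctly received from the start of the string) can be computed as in the sketch below. It is plain C++ written for this post, not the code that ran on the board; the name matchingPrefix is made up here.

```cpp
#include <cassert>

// Count how many characters of `received` match `expected`,
// counting from the start and stopping at the first mismatch.
int matchingPrefix(const char* expected, const char* received) {
    assert(expected != 0 && received != 0);
    int n = 0;
    while (expected[n] != '\0' && expected[n] == received[n]) {
        ++n;
    }
    return n;
}
```

Called with the test string and "theXquick brown fox jumps over the lazy dog" it returns 3, as in the example above.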
Another test; steps of 1000 start @ 1000 (so not as fine grained but much faster).
The string length is 42, the value returned is now the number of same characters.
So 41 means that of the characters received 41 matched “the quick brown fox …dog”.
Missing chars now start at 83K and we see the quality gradually drop (missing 8 chars at 115K, which is about 20%!).
(baudrates without extra info are OK)
70000
71000
72000
73000
74000
75000
76000
77000
78000
79000
80000
81000
82000
83000 41 FAIL
84000
85000 41 FAIL
86000
87000 40 FAIL
88000 41 FAIL
89000
90000
91000 41 FAIL
92000 41 FAIL
93000 41 FAIL
94000
95000 39 FAIL
96000 41 FAIL
97000 41 FAIL
98000 41 FAIL
99000 41 FAIL
100000 40 FAIL
101000 40 FAIL
102000 39 FAIL
103000 41 FAIL
104000 40 FAIL
105000 39 FAIL
106000 39 FAIL
107000 39 FAIL
108000 38 FAIL
109000 39 FAIL
110000 37 FAIL
111000 38 FAIL
112000 39 FAIL
113000 39 FAIL
114000 37 FAIL
115000 34 FAIL
116000 34 FAIL
117000 39 FAIL
118000 36 FAIL
119000 36 FAIL
120000 38 FAIL
No new conclusions from this test.
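The "number of same characters" in this last test can be counted as sketched below, assuming the received string is compared with the original position by position (that detail is an assumption; the test code itself is not shown here). Plain C++, with matchingChars as a made-up name.

```cpp
#include <cassert>

// Count the positions at which `received` holds the same character as
// `expected`. Unlike a prefix count, a mismatch does not stop the count.
int matchingChars(const char* expected, const char* received) {
    assert(expected != 0 && received != 0);
    int n = 0;
    for (int i = 0; expected[i] != '\0' && received[i] != '\0'; ++i) {
        if (expected[i] == received[i]) {
            ++n;
        }
    }
    return n;
}
```

A single corrupted character costs exactly one point in this count, whereas a dropped character would shift everything after it and cost many more, so values close to the string length suggest single-character corruption.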
I like what you are doing robtillaart!
Would it be possible to write code that auto-calibrates the "magic numbers"?
I would imagine using two serial communication links. One link could be the hardware serial port, which would send the baud rate to be tested to the slave. Next, at the determined baud rate, the master would send a predetermined byte or string. The slave would then adjust the timing variables (within an allowed range) until the string is captured successfully "x" number of times. Lastly, the slave would perhaps save the resulting values to EEPROM or send them to the serial monitor.
Would it be possible to write code that auto-calibrates the "magic numbers"?
Definitely, but because the tables existed it was easier to let Excel find them.
In fact you can derive the magic numbers from the protocol. You know that a data bit has 1/baudrate seconds between its edges. As you can see how much time the machine code takes to read and store a bit, you can derive the magic numbers quite exactly for a given baud rate.
The real problem is that you need to find a formula that works for all given baud rates, so an auto-calibrating mode needs a number of known patterns to learn from. Best to start with the highest baud rate, as this is the most critical one. If you have the basic formula rxbit = CLOCKSPEED/(alpha * baudrate) - beta; you can try all possible combinations for alpha and beta between 1..20, so after 400 bytes you get the ranges for alpha and beta that work. You try the next baud rate and the ranges will decrease, until all baud rates are done. Then take the middle of the ranges and you're done.
A better, faster approach is to first find the optimal alpha, then search for the optimal beta, then alpha again, then beta again, until the same values appear.
That will bring you to the optimal value within 50 or so bytes (~10x faster).
Instead of linear search through the ranges you can do a binary search,...
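The brute-force variant (try every alpha and beta in 1..20 and keep what works for all baud rates) could look roughly like the sketch below. It is plain C++ and only a thought experiment: LinkTest is a hypothetical callback that would have to send a known byte with the given timing and report whether it came back intact, and Calibration and calibrate are names made up here.

```cpp
#include <cassert>

// Hypothetical callback: returns true when a known test byte survives a
// round trip at `baud` with rxbit = CLOCKSPEED/(alpha * baud) - beta.
typedef bool (*LinkTest)(long baud, int alpha, int beta);

struct Calibration {
    bool found;   // false when no (alpha, beta) pair worked for every baud rate
    int alpha;
    int betaLo;   // lowest beta of the widest working run
    int betaHi;   // highest beta of the widest working run
    int beta() const { return (betaLo + betaHi) / 2; }  // middle of the range
};

// Try all 400 combinations and keep the alpha with the widest run of
// consecutive betas that works for every baud rate in the list.
Calibration calibrate(const long* bauds, int count, LinkTest works) {
    assert(bauds != 0 && count > 0 && works != 0);
    Calibration best = {false, 0, 0, 0};
    for (int alpha = 1; alpha <= 20; ++alpha) {
        int runStart = 0;  // 0 means "not inside a working run of betas"
        for (int beta = 1; beta <= 20; ++beta) {
            bool ok = true;
            for (int i = 0; i < count && ok; ++i) {
                ok = works(bauds[i], alpha, beta);
            }
            if (!ok) {
                runStart = 0;
                continue;
            }
            if (runStart == 0) {
                runStart = beta;
            }
            if (!best.found || beta - runStart > best.betaHi - best.betaLo) {
                best.found = true;
                best.alpha = alpha;
                best.betaLo = runStart;
                best.betaHi = beta;
            }
        }
    }
    return best;
}
```

Taking the middle of the widest run gives the most margin on both sides. The alternating alpha/beta search and the binary search mentioned above would replace the two outer loops but could keep the same callback.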
For me the strange thing is the value SEVEN where I expected EIGHT in the formula - 16000000L/(7 * baudrate) - 3; but I did not really investigate 16000000L/(8 * baudrate) + BETA ... => todo list ;)
I'm glad to hear it is possible. You have helped my understanding of the process a lot, although it is a bit over my head to try to program an auto-calibration program.
Thanks, Mark
There are 4 levels of acting: goal -> strategy -> tactics -> operations
You did define your goal, but you were thinking tactics (how to program) and operations (details). You skipped one level.
You must think of the strategies by which such a thing can be done, and do "thought experiments" to find the tactics that belong to each strategy, so you can make the strategy choice.
Strategies can be: brute force (test all), random search (aka darting ;), hill climbing (change any param that improves the result), analytics, etc.
IMHO it is not over your head; it's not easy, but you can do it. Give it a try.
Hey Rob, I just wanted to say thank you for your research on this. As it turns out I have a need for SoftwareSerial to do 7800 baud to read from a motorcycle ECU->Dash communication link and it sounds like you've solved my problem. :)
@synfinatic
Let me hear the results of your tests, I'm interested. I can imagine that a motorcycle can generate quite some noise, which may be disruptive to the signal. Did you think about shielding/grounding etc.?
I haven’t even begun to think/worry about that yet. Hopefully it just works. Ha!
In all seriousness, I’m literally waiting for FedEx to deliver my 'scope (hopefully Friday) so I’ll have a better idea then for what I’m dealing with. I do know that at 9600 baud it seems to kinda-sorta work, but I do seem to be getting some corruption, but that mostly seems to be due to the timing differences from what I can tell/guess. So it appears the line is pretty clean.
That being said, I haven’t tried it with the bike actually running yet, so things could get a whole lot noisier once those coils start firing.