Serial comms / hardware flow control = nightmare

Hopefully this is the right section for this topic.

I have just spent a week or so trying to port some code. I just need to "vent" in the hope...

  1. someone else might be able to help
  2. someone googling for an answer in the future can avoid the same headaches
  3. let others know of my code to try to implement diagnostics
  4. let others know that there is a hardware flow control solution (but YMMV)
  5. my blood pressure reduces

So I have a hardware project which is controlled by a Teensy 3.5 variant of Arduino. It has a built-in SD card socket and I need to get 500 kB to 1 MB of data from the PC to the card. After much searching I found that someone had implemented ZMODEM for the Arduino. I normally use TeraTerm for comms and NOT the built-in terminal, and TeraTerm can transfer files using the ZMODEM protocol. Well, I was able to port the code, with the usual minor headaches, mainly to do with the SD card library. Amazingly it all works and I can transfer data at 921600 baud.

Now I have version 2 of my project. It needs more Arduino I/O pins. Since I don't have a spare Teensy but I do have a Mega 2560, I thought that this would be a perfect solution. So I added an SD card module and wired it up. The SD card worked first time! So did my own user code. The ZMODEM part however did not work and there were no diagnostics. I suspected some kind of serial comms overflow. I dug deep into the code, adding lots of debug output to track down the problem. Often, having made steps forward, I went backwards again. I have spent tens of hours going down this rabbit hole! Ironically, adding debug code (serial print) slows the code and makes data loss more likely.

NOTE: I am using an FTDI board for the ZMODEM data channel (on Serial3). I use the standard Serial port for user I/O and debug messages.

WARNING: the Mega 2560 will NOT operate reliably (at all?) at any standard baud rate over 115200.
At 230400 there are framing errors. At 921600 TeraTerm just displays Japanese characters!!! In hindsight this should not be a surprise, but I just wanted to alert anybody googling about this. The problem is that the UART gets its baud rate by dividing the 16MHz xtal frequency by an integer, and no integer divisor lands close enough to these higher standard rates. HOWEVER if you use a non-standard baud rate of 250k or 500k, then serial comms once again work (no framing errors), because 16MHz divides down to these rates exactly. This information can be found by googling but is not obvious. It took me a while to figure out, but you can simply type the non-standard baud rate into the TeraTerm baud drop-down menu. NOTE: I debugged all this using simple software which sent data typed in Serial to Serial3 and data typed in Serial3 to Serial.
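
For anyone who wants to check the numbers, here is a rough sketch of the divisor arithmetic. It runs on a PC, not on the Arduino. It assumes the UART is in double-speed mode (U2X = 1), where baud = F_CPU / (8 * (UBRR + 1)), and that the divisor is picked with the integer formula the AVR core is understood to use in HardwareSerial::begin(). It ignores any special cases the core applies, so treat that part as an assumption and check your own core version. BaudResult and baud_for are made-up names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// What the USART actually produces for a requested baud rate.
struct BaudResult {
    uint16_t ubrr;       // divisor register value
    double actual_baud;  // rate the hardware really generates
    double error_pct;    // (actual - requested) / requested * 100
};

// Double-speed asynchronous mode: baud = f_cpu / (8 * (UBRR + 1)).
// ASSUMPTION: divisor chosen as ((f_cpu / 4 / baud) - 1) / 2 using integer
// arithmetic, which is believed to match the Arduino AVR core.
BaudResult baud_for(unsigned long f_cpu, unsigned long baud) {
    assert(baud > 0 && baud <= f_cpu / 8);  // fastest rate this mode allows
    uint16_t ubrr = static_cast<uint16_t>((f_cpu / 4 / baud - 1) / 2);
    double actual = static_cast<double>(f_cpu) / (8.0 * (ubrr + 1));
    double error = (actual - static_cast<double>(baud)) / baud * 100.0;
    return {ubrr, actual, error};
}
```

With a 16MHz clock this gives an error of about +2.1% at 115200, about -3.5% at 230400 and about +8.5% at 921600 (where the hardware really runs at 1M baud), but exactly 0% at 250000 and 500000. That lines up with what I saw: 115200 is borderline but works, and the two higher standard rates do not.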

Adding the debug code takes Mega 2560 CPU time and makes serial receive buffer overflow more likely. I have ZMODEM sort of working at 115200 baud for small files, but only intermittently. The lack of diagnostics has annoyed me.

So I searched for a way to increase the Arduino RX buffer size. The best solution I found was on a website from which I have bought Arduino boards in the past. I do not know if I can post links, but search for "Arduino Serial Port Buffer Size Mod". The idea was to copy the Arduino core folder, modify the serial buffer size #define (it now has a different name from the one given in the article), and add a new board to boards.txt (so you do not impact compiles for any other Arduino boards). This was only a partial solution to the problem, because I found that TeraTerm was trying to send big packets. There are some variables in the file, but although I reduced them, TeraTerm was still trying to send serial packets bigger than my new buffer size (512 bytes FWIW). I have a 32k byte file of test data. Sometimes it transferred correctly (32767 bytes); other times the file size was smaller, suggesting that data was being dropped.
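
To put a number on why a bigger buffer alone only buys time, here is a small sketch that works out how long a continuous stream takes to fill the receive buffer. It assumes 8N1 framing (10 bits on the wire per byte) and that the sketch reads nothing in the meantime, for example while it is busy writing to the SD card. fill_time_ms is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>

// Bits on the wire per byte with 8N1 framing: 1 start + 8 data + 1 stop.
const unsigned kBitsPerByte = 10;

// Milliseconds for a continuous stream at `baud` to fill `buffer_bytes`,
// assuming the sketch removes nothing from the buffer during that time.
double fill_time_ms(std::size_t buffer_bytes, unsigned long baud) {
    assert(baud > 0);
    return 1000.0 * buffer_bytes * kBitsPerByte / baud;
}
```

For example, fill_time_ms(512, 115200) is about 44 ms, while fill_time_ms(512, 500000) is only about 10 ms. A 64 byte buffer (believed to be the core's default on the Mega) lasts under 6 ms at 115200. Any pause in the main loop longer than that loses data unless the sender can be told to wait.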

So now, with more searching, I found someone had posted a hardware solution. The FTDI board has some extra pins, one of which is CTS. I found that when this pin was held high, TeraTerm stopped sending, but buffered the data (i.e. it was not lost). When CTS was taken low, the buffered data was transmitted. I used a spare port bit (initially forgetting to set pinMode to OUTPUT, doh!) and was able to confirm that this worked. But again I had mixed success with the bigger files (>500k).
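
One thing that may matter here is WHEN the pin is raised. USB serial adapters are generally described as able to send a few more characters after CTS changes, so raising the pin only once the buffer is completely full is probably too late. A common approach is a high-water mark that leaves some headroom, plus a lower mark for releasing the sender so the pin does not toggle on every byte. Below is a minimal sketch of that decision logic, runnable on a PC. CtsThrottle and all its parameters are hypothetical names, and the right headroom for a given adapter is something to measure, not something I can state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: decides when the "stop sending" line should be high.
class CtsThrottle {
public:
    // capacity:          receive buffer size in bytes
    // overrun_allowance: bytes the sender may still emit after being stopped
    // low_mark:          fill level at which the sender is released again
    CtsThrottle(std::size_t capacity, std::size_t overrun_allowance,
                std::size_t low_mark)
        : high_(capacity - overrun_allowance), low_(low_mark) {
        assert(overrun_allowance < capacity && low_mark < high_);
    }

    // Call with the current fill level after every byte stored or read.
    // Returns true while the line should be held high (sender paused).
    bool update(std::size_t fill) {
        if (!stop_ && fill >= high_) {
            stop_ = true;   // near full: pause the sender
        } else if (stop_ && fill <= low_) {
            stop_ = false;  // drained enough: release the sender
        }
        return stop_;
    }

private:
    std::size_t high_;
    std::size_t low_;
    bool stop_ = false;
};
```

On the Arduino the return value would be written to the pin wired to the adapter's CTS input, e.g. with digitalWrite() or a direct port write. With CtsThrottle t(512, 16, 128) the pin goes high at 496 bytes and stays high until the fill level drops to 128.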

Now I jumped into the system Serial source code to add diagnostics. In the RX interrupt routine, it checks for parity error (no idea why as framing error would be a better choice) and RX buffer full (in which case the last byte received is thrown away). I added new variables to HardwareSerial.h to track status and buffer fullness and modified the interrupt routine in HardwareSerial_private.h. Now perhaps there is a better way to achieve this and I can re-define this routine in my own code files rather than modifying system files??? The code hints at this, but I am no C++ expert. I can now reset my new variables to zero with a call to a new function, do my big data transfer, then call the same function to read back the values. At least now I can confirm buffer overflow. Even though I added the CTS control in the interrupt routine I am still getting buffer overflow.
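
For anyone wanting to try similar diagnostics, here is a rough sketch of the kind of bookkeeping involved, written so it can be run on a PC. The three flag positions are those of the AVR USART status register UCSRnA (FE, DOR, UPE). Everything else (RxDiagnostics, record_rx, the decision to drop bytes with a framing or parity error) is hypothetical and is not what the core does. It counts hardware overrun (DOR, the interrupt was serviced too late) separately from ring buffer overflow (the sketch did not read fast enough), since the two point to different fixes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Error flag positions in the AVR USART status register UCSRnA.
constexpr uint8_t kFrameError  = 1u << 4;  // FEn: stop bit was not high
constexpr uint8_t kDataOverrun = 1u << 3;  // DORn: hardware lost a byte
constexpr uint8_t kParityError = 1u << 2;  // UPEn

// Hypothetical per-port counters; none of these names exist in the core.
struct RxDiagnostics {
    uint32_t bytes_stored = 0;
    uint32_t frame_errors = 0;
    uint32_t hw_overruns = 0;
    uint32_t parity_errors = 0;
    uint32_t buffer_overflows = 0;
    std::size_t max_fill = 0;  // highest buffer fill level seen

    void reset() { *this = RxDiagnostics(); }
};

// Bookkeeping for one received byte. `status` is UCSRnA as read BEFORE the
// data register, `fill` is the buffer level before storing this byte.
// Returns true if the byte should be stored in the ring buffer.
bool record_rx(RxDiagnostics& d, uint8_t status, std::size_t fill,
               std::size_t capacity) {
    assert(fill <= capacity);
    if (status & kDataOverrun) ++d.hw_overruns;  // an earlier byte was lost
    bool damaged = false;
    if (status & kFrameError)  { ++d.frame_errors;  damaged = true; }
    if (status & kParityError) { ++d.parity_errors; damaged = true; }
    if (damaged) return false;                   // this byte is suspect
    if (fill == capacity) { ++d.buffer_overflows; return false; }
    ++d.bytes_stored;
    if (fill + 1 > d.max_fill) d.max_fill = fill + 1;
    return true;
}
```

The usage matches what I described: call reset() before the transfer, then read the counters afterwards. On real hardware the counters are written from the interrupt, so reads from the main loop should be done with interrupts briefly disabled.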

So I really need to get hardware flow control working because I need to use high baud rates such as 500k baud to minimise the data transfer time of 1Mbyte files. At least now I have a way of seeing what is going wrong. Anyone interested can either reply to this thread or PM(?) me and I would be happy to share details. If I am not understanding the correct way to use C++ to achieve this, please help educate me.

BTW my solution is far from ideal because...

  1. all 3 serial hardware buffers become 512 bytes when I only need one (wasting 1k bytes for Serial1 and Serial2)
  2. the PORT used for flow control is hard-wired in (there must be a better way to pass this information from my user code to the low level code)
  3. the new "debug" code impacts all 3 serial hardware ports as the same code is used for all 3, and I have no idea how to use #defines (or whatever) so that my extra code is only compiled in for Serial3.
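
One possible direction for all three points, offered as a sketch and not as a drop-in replacement for the core: make the buffer and the extra behaviour belong to each port object instead of to the shared code. In the hypothetical PortRx class below, the caller supplies the storage (so each port can have a different size) and an optional hook that is called whenever the fill level changes. Flow control and diagnostics then live in user code inside the hook, the low level code never needs to know which pin is used, and a port created without a hook pays almost nothing. All names here are made up, and the class is written so it can be exercised on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-port receiver; this is NOT the core's HardwareSerial.
class PortRx {
public:
    // Optional callback, run after every change in fill level.
    using FillHook = void (*)(std::size_t fill, std::size_t capacity);

    // The caller owns the storage, so each port can have its own size.
    PortRx(uint8_t* storage, std::size_t capacity, FillHook hook = nullptr)
        : buf_(storage), cap_(capacity), hook_(hook) {
        assert(storage != nullptr && capacity > 0);
    }

    // Would be called from this port's receive interrupt.
    // Returns false if the buffer was full and the byte was dropped.
    bool push(uint8_t b) {
        if (count_ == cap_) return false;
        buf_[(tail_ + count_) % cap_] = b;
        ++count_;
        if (hook_) hook_(count_, cap_);
        return true;
    }

    // Would be called from the main loop. Returns -1 when empty.
    int pop() {
        if (count_ == 0) return -1;
        uint8_t b = buf_[tail_];
        tail_ = (tail_ + 1) % cap_;
        --count_;
        if (hook_) hook_(count_, cap_);
        return b;
    }

    std::size_t available() const { return count_; }

private:
    uint8_t* buf_;
    std::size_t cap_;
    FillHook hook_;
    std::size_t tail_ = 0;   // index of the oldest byte
    std::size_t count_ = 0;  // bytes currently stored
};

// Demo stand-in for user code: real code might drive the CTS pin here.
std::size_t g_last_fill = 0;
unsigned g_hook_calls = 0;
void demo_hook(std::size_t fill, std::size_t /*capacity*/) {
    g_last_fill = fill;
    ++g_hook_calls;
}
```

Usage would look like PortRx fast(big_buffer, sizeof big_buffer, demo_hook) for the ZMODEM port, and a plain PortRx with a small buffer and no hook for any other port. On the Arduino, push() and pop() touch the same counters from interrupt and main loop, so pop() would need interrupts briefly disabled. Whether your own handler for the Serial3 receive vector can coexist with the core depends on the core's own handler not being linked in, which I believe is only the case if the sketch never refers to Serial3. Please treat that as an assumption to verify.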

Any help or advice would be gratefully accepted!