Windows serial unreliable - Linux 100% reliable

In my extensive and brutal testing, I’ve found that Arduino serial communication cannot be trusted at all on Windows. Either I don’t get all of the bytes back, or one or more bytes are corrupt. Conversely, on Linux, every single byte makes its way to and from the device with 100% accuracy, every time. Has anyone else noticed this? Does anyone know how to make Windows reliable?

Note that I’m currently using 500000 baud, but I’ve found that NO baud rate makes Windows reliable.

Windows Code:

#include "windows.h"
#include <stdio.h>   // printf
#include <stdint.h>  // uint64_t
#include "fcntl.h"
#include "commctrl.h"
#include "conio.h"
typedef unsigned char byte;
typedef uint64_t uint64;
HANDLE _sp = INVALID_HANDLE_VALUE;
uint64 Ms() {return GetTickCount64();}

bool Open()
{
    if(_sp != INVALID_HANDLE_VALUE) ::CloseHandle(_sp);
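    // CreateFileW is named explicitly so the wide-string literal compiles with or without UNICODE defined
    _sp = ::CreateFileW(L"\\\\.\\COM10", GENERIC_READ|GENERIC_WRITE, 0, 0, OPEN_EXISTING, 0, 0);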
    if(_sp==(HANDLE)-1) return 0;
    DCB dcb = {0}; dcb.DCBlength = sizeof(dcb);
    dcb.fBinary = 1;
    dcb.BaudRate = 500000;
    dcb.ByteSize = 8;
    dcb.fDtrControl = DTR_CONTROL_ENABLE; //IMPORTANT!
    if(!::SetCommState(_sp, &dcb)) return 0;

    COMMTIMEOUTS timeouts;
    timeouts.ReadIntervalTimeout          = MAXDWORD;
    timeouts.ReadTotalTimeoutMultiplier   = 0;
    timeouts.ReadTotalTimeoutConstant     = 0;
    timeouts.WriteTotalTimeoutMultiplier  = 0;
    timeouts.WriteTotalTimeoutConstant    = 0;
    if(!SetCommTimeouts(_sp, &timeouts)) return 0;

    Sleep(2000);
    return 1;
}

int Write(const byte* d, int len)
{
    DWORD w = 0;
    ::WriteFile(_sp, d, len, &w, NULL);
    return w;
}
int ReadAll(byte* buf, int len, uint64 timeoutTime)
{
    for(byte* p = (byte*)buf;;)
    {
        DWORD r = 0;
        ::ReadFile(_sp, p, len, &r, NULL);

        if(r > 0)
        {
            len -= r;
            p += r;
            if(!len) return int(p-buf);
            timeoutTime = Ms()+500;
        }
        else if(Ms() >= timeoutTime)
            return int(p-buf);
        else
            Sleep(0);
    }
}

int main()
{
    if(!Open()) return -1;
    PurgeComm(_sp, PURGE_RXCLEAR|PURGE_TXCLEAR);
    const int sendSize = 1024; byte buf[sendSize];

    for(int loop = 1;; ++loop)
    {
        int left = sendSize; byte b = 0xF;
        for(int i = 0; i < sendSize; ++i) buf[i] = 0xF; Write(buf, sendSize);

        buf[0] = 0; ReadAll(buf, 1, Ms()+1000);
        if(buf[0] != 6) return -3;
        if(ReadAll(buf, sendSize, Ms()+1000) != sendSize) return -2;

        for(int i = 0; i < sendSize; ++i) if(buf[i] != 0xF0) return -4;
        printf("%i\n", loop);
    }
}

Arduino Code:

typedef unsigned char byte;
int expSize = 1024, left = expSize;

void loop()
{
  while(Serial.available())
  {
    byte b = (byte)Serial.read();
    if(b != 0xF) {left = expSize; Serial.write((byte)5);}
    else if(!(--left))
    {
      left = expSize; Serial.write((byte)6);
      while(left--) Serial.write(0xF0);
      left = expSize;
    }
  }
}

void setup() {Serial.begin(500000);}

Linux Code:

#include <iostream>
#include <cstdlib>
#include <cstdio> // printf
#include "termios.h"
#include "fcntl.h"
#include "unistd.h"
#include "sys/ioctl.h"
typedef unsigned char byte;

int _port;

bool Open()
{
    const bool nonBlocking = false; // not defined in the original post; blocking reads assumed
    _port = open("/dev/ttyACM0", O_RDWR|O_NOCTTY|O_SYNC|(nonBlocking?O_NONBLOCK:0));
    if(_port == -1) // was `!_port==-1`, which can never be true
        return 0;
    termios tty;

    if(tcgetattr(_port, &tty) < 0)
        return 0;

    cfsetspeed(&tty, B500000);
    tty.c_cflag |= (CLOCAL|CREAD);
    tty.c_cflag &= ~CSIZE;
    tty.c_cflag |= CS8;
    tty.c_cflag &= ~PARENB;
    tty.c_cflag &= ~CSTOPB;
    tty.c_cflag &= ~CRTSCTS;
    tty.c_iflag &= ~(IGNBRK|BRKINT|PARMRK|ISTRIP|INLCR|IGNCR|ICRNL|IXON);
    tty.c_lflag &= ~(ECHO|ECHONL|ICANON|ISIG|IEXTEN);
    tty.c_oflag &= ~OPOST;
    tty.c_cc[VMIN] = 1;
    tty.c_cc[VTIME] = 1;
    if(tcsetattr(_port, TCSANOW, &tty) != 0)
        return 0;
    int v = TIOCM_DTR;
    ioctl(_port, TIOCMBIS, &v); //For Arduino
    return 1;
}

int Write(const byte* d, int len)
{
    int w = write(_port, d, len);
    return w;
}
int ReadAll(byte* buf, int len)
{
    for(byte* p = (byte*)buf;;)
    {
        int r = read(_port, p, len);

        if(r > 0)
        {
            len -= r;
            p += r;
            if(!len) return (p-buf);
        }
    }
}

int main(int argc, char* argv[])
{
    if(!Open())
        return -1;
    sleep(2);
    const int sendSize = 1024; byte buf[sendSize];
    for(int loop = 1;; ++loop)
    {
        int left = sendSize; byte b = 0xF;
        for(int i = 0; i < sendSize; ++i) buf[i] = 0xF; Write(buf, sendSize);
        buf[0] = 0; ReadAll(buf, 1);
        if(buf[0] != 6) return -3;
        if(ReadAll(buf, sendSize) != sendSize) return -2;
        for(int i = 0; i < sendSize; ++i) if(buf[i] != 0xF0) return -4;
        printf("%i\n", loop);
    }
}

Someone else here who reported the same problem, but didn’t provide source code:

Now you have all of the source code you need.

Windows serial unreliable - Linux 100% reliable

Nice to know.

But I'm sufficiently broad minded not to trust that assertion.

...R

awarnick: I've found that NO baud rate makes Windows reliable.

NOW look what you have done.........

You are the first person in the known universe to have discovered that dirty little secret, Microsoft will be out of business by next Tuesday, and thousands of people all over the world will be out of a job.

Still, all those Arduino<>Windows users will wonder how the hell they missed it, but this explains the problems they have been suffering for years, so karma.......

I'm so disappointed. Compiled your code in VS2017 (community edition), ran the executable from the command prompt and redirected the output to a file. Stopped it after a while. 11529 lines written to file and last number is 11529; checked a couple of numbers in the file as well against the line number and they match.

Windows 7 Home Basic on Intel I3 with 8GB.

//EDIT Windows code initially did not compile (something with the different types of string in the CreateFile call); I'm not a Windows C++ developer so did a rough fix without bothering.

Thank you for posting your OS. I ran some tests with Win7, and it’s more stable, but still fails after around 3000 requests. With Win8 and up, it fails after 200 requests or less. Compile it as a console program, and just run it instead of piping to a file. Let it run all night and see if it fails.

I think it is a little hasty to blame Windows. I believe a better description would be “Windows serial is unreliable with my Windows application”.

I paired a different Windows application with your Arduino sketch and ran numerous tests without a single failure. All but one of the tests were run on a Win 10 machine at 500000 baud with a total transmit/receive of 10 Gb (5000 iterations).

I am sure your Win app and Arduino sketch were put together for test purposes and perhaps a little fun, so you are probably aware of certain optimizations that can vastly increase the transaction speed. By using Arduino Serial.write(buf, len), the total transmit/receive cycle can be reduced by 80% to 90%. It is not always feasible, but sending byte packets instead of single bytes will always show an increase in speed.

I don't normally send one byte at a time. I only did so because I wanted to keep the sample as small and simple as possible. This isn't for fun; we need reliable software for production environments.

  1. You must have modified the sample, because only 1K is being sent at a time, so 5000 iterations is only about 5 megs. If you did in fact send 10G, I'd love to know what serial settings you're using. Maybe there is a magic setting I need to toggle in Windows, or maybe you're using flow control....?

  2. If you try the sample code without modification (both Windows and Arduino) is it reliable for an hour or more?

Just in case you were insinuating that writing the entire buffer all at once would fix the problem:

typedef unsigned char byte;
byte buf[1024+1], *p = buf+1, *pSt = p, *pEnd = buf+sizeof(buf);

void loop()
{
  while(Serial.available())
  {
    *p = (byte)Serial.read();

    if(*p != 0xF)
    {
      p = pSt;
      Serial.write((byte)5);
    }
    else
    {
      *p++ = 0xF0;
      if(p==pEnd)
      {
        p = pSt;
        Serial.write(buf, sizeof(buf));
      }
    }
  }
}

void setup()
{
  Serial.begin(500000);
  *buf = 6;
}

It doesn't.

Hi awarnick, I was not insinuating that writing the entire buffer would fix the problem. I figured that you were already aware that writing packets has a huge speed advantage over writing single bytes; my comments were aimed at anyone following this thread who might not have known that.

I am not criticizing what you are doing, and I see you have a serious interest in it. For me it is fun, and so far I have enjoyed this thread.

Sorry, 10 Gb was a typo and should have been 10 Mb. The application I wrote transmitted one 1024-byte packet and received one 1024-byte packet 5000 times, so as you pointed out, that is 5 Mb each way for a total of ~10 Mb. There were a few extra bytes for the value of 6 written at the head of each Arduino packet. I went to 5000 because you mentioned 3000 as being a failure point and I wanted to go well beyond that.

This is the Arduino code, not as pretty as the original but it worked for testing.

typedef unsigned char byte;
int expSize = 1025, left = expSize;
byte buf[1025];

void loop()
{
  while(Serial.available())

  {
    byte b = (byte)Serial.read();
    if(b != 0xF) {left = expSize; Serial.write((byte)5);}
    else if(!(--left))
    {
    
      Serial.write(buf,1025);
    
      left = expSize;
    }
  }
}

void setup()
{
  Serial.begin(500000);
  while(expSize--) buf[expSize]=0xF0;
  buf[0]=6;
  expSize=1025;
}

The Windows app was a .NET app, and I used the System.IO.Ports SerialPort object at 500000 baud, with no flow control and the standard read buffer (which is 4096, I think). The program started the cycle by transmitting 1024 bytes of 0xF, then waited for the byte of 6 followed by 1024 bytes of 0xF0, continuing for 5000 iterations. I did try a higher baud rate, but that failed immediately; I may revisit that.

The data was written to a rich text box. I thought that might be slowing things down, but after trying a few things I could see it was not making a lot of difference. The text box gave me a visual of what was happening, and at the end of each run I would get the total time taken and a count of all received characters so I could determine if there were errors. The total time for 10 Mb was in the range of 4 minutes, which, when calculated against the baud rate, was fairly good.
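For anyone who wants to check that figure against the baud rate, here is a rough sketch of the arithmetic. It assumes 8N1 framing (10 bits on the wire per byte) and that the request and the reply never overlap, which is how this test protocol behaves. The names TransferSeconds and TestBytes are made up for the illustration.

```cpp
#include <cassert>

// Lower bound, in seconds, for moving `bytes` over an asynchronous serial link.
// bitsPerByte = 10 assumes 8N1 framing (1 start + 8 data + 1 stop bit).
double TransferSeconds(double bytes, double baud, double bitsPerByte = 10.0)
{
    assert(baud > 0.0);
    return bytes * bitsPerByte / baud;
}

// Total bytes crossing the link in the test described above: each iteration
// sends 1024 bytes and receives 1025 (the 6 plus 1024 bytes of 0xF0).
double TestBytes(double iterations)
{
    return iterations * (1024.0 + 1025.0);
}
```

TransferSeconds(TestBytes(5000), 500000) comes out at roughly 205 seconds, or about 3.4 minutes, so a run of around 4 minutes is within about 20% of what the wire allows.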

In my mind the Arduino works well and looks to me to be reliable with the right Windows environment.

This issue depends on many factors. One major factor is CPU usage that can drain time from serial polling. I've found that the issue is much worse when there are other applications interfering.

When you run a test:

  1. Use the code I wrote, or at least model your code the same, and don't log to file or to a textbox: Make your app a console app. Reducing CPU usage in your own application in between serial communication is critical for maximum throughput.

  2. Let the test run for at least an hour before you give up. Ideally, let it run for at least 8 hours. If it's not running when you come back, there was a failure somewhere.

  3. Run applications that hog CPU time. For us, this isn't necessary, but for an initial test, do this. Once you've proven the concept, you should be able to reproduce it under normal conditions.

It's very reproducible if one doesn't redirect to a file; it might even be reproducible when redirecting to a file if one waits long enough. I've been playing around a bit with the Windows code; I created a bigger serial buffer (eventually 16k) using SetupComm(), but it did not help. Reducing the baud rate on both sides seemed to help (possibly resolving it, not tested long enough); I used 115200 to test.

Your Arduino code has a flaw. You send 1024 bytes to the Arduino; once you have received the first one in the Arduino, you send 1024 bytes back. Sending will block once the software transmit buffer is full (64 bytes), and as a result the software receive buffer (64 bytes) will overflow and you will lose about 960 bytes that were sent by the PC.
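The overflow mechanism described here can be sketched as a toy model. This is not the Arduino core's actual buffer code. It only counts what happens when bytes keep arriving at a fixed-size buffer that nobody drains, using the 64-byte figure quoted above. RxModel and LostWhileBlocked are hypothetical names, and whether the sketch under test ever gets into this state is a separate question.

```cpp
#include <cassert>
#include <cstddef>

// Capacity taken from the 64-byte figure quoted in the post (an assumption
// of this model, not read from any Arduino header).
const std::size_t kRxCapacity = 64;

// Toy receive buffer that is never drained: bytes arriving while it is
// full are counted as dropped.
struct RxModel
{
    std::size_t stored = 0;   // bytes currently waiting in the buffer
    std::size_t dropped = 0;  // bytes lost because the buffer was full

    void Arrive(std::size_t count)
    {
        for(std::size_t i = 0; i < count; ++i)
        {
            if(stored < kRxCapacity) ++stored;
            else ++dropped;
        }
    }
};

// Bytes lost if `incoming` bytes arrive while the sketch is blocked in a
// write and never gets around to calling Serial.read().
std::size_t LostWhileBlocked(std::size_t incoming)
{
    RxModel rx;
    rx.Arrive(incoming);
    assert(rx.stored + rx.dropped == incoming);
    return rx.dropped;
}
```

With 1024 incoming bytes the model loses 1024 - 64 = 960, which is where the "about 960 bytes" estimate comes from.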

Now the question is: what are you trying to achieve? Your code represents a theoretical scenario flooding both the PC and the Arduino with data. So what's the real purpose of the exercise?

Note: Have you considered or tried giving your application a higher priority? See e.g. https://stackoverflow.com/questions/4208/windows-equivalent-of-nice.

Reproduced, GREAT!!! Thank you! I've tried a TON of different things. Reducing speed does help, but there are still failures. I even tried sending in small chunks, and again, more reliable, but still not acceptable. The purpose is simply to make sure that when I send a message, I know it will be received properly every time. I'm really starting to think I should be using Teensy instead.

I don't see the coding flaw you refer to.

If you're able to try Linux, you'll see that it will run forever with 0 failures.

Quote by pjrc:

"On Teensy, where Serial is USB virtual serial, it's possible to transfer to the PC at speeds approaching 1 Mbyte/sec, if the data is written in blocks with Serial.write(buf, size). It works extremely well, with zero data loss, due to USB's end-to-end flow control. Yes, that's 1 million BYTES per second, not 1 million BITS per second."

It seems to me that this is a known issue, but most people here pretend that it doesn't exist.

awarnick: Reproduced, GREAT!!!

Note: I only encountered situations where the return value indicated missing data (only 1023 bytes received instead of 1024). Forgot to mention that.
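To tell a missing byte from a corrupt one on every failure, the test could classify the reply instead of just returning an error code. A minimal sketch follows, assuming the caller passes in the byte count that ReadAll returned. CheckReply and the ReplyStatus values are names invented for this example and are not part of the posted code.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char byte;

enum ReplyStatus { ReplyOk, ReplyShort, ReplyCorrupt };

struct ReplyCheck
{
    ReplyStatus status;
    std::size_t index; // first bad offset for ReplyCorrupt, bytes received for ReplyShort
};

// Classifies a reply: `got` is the number of bytes actually read into buf,
// `want` the number expected, `expected` the value every byte should have.
// Corruption in the bytes that did arrive is reported before a short count.
ReplyCheck CheckReply(const byte* buf, std::size_t got, std::size_t want, byte expected)
{
    assert(buf != nullptr || got == 0);
    for(std::size_t i = 0; i < got && i < want; ++i)
        if(buf[i] != expected) return ReplyCheck{ReplyCorrupt, i};
    if(got < want) return ReplyCheck{ReplyShort, got};
    return ReplyCheck{ReplyOk, want};
}
```

In the Windows test this would be called as CheckReply(buf, n, sendSize, 0xF0), with n being the value returned by ReadAll, and the status and index printed before exiting.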

awarnick: I don't see the coding flaw you refer to.

I tried to modify the Windows code to prove the flaw but failed; so I'm probably not quite understanding your Arduino code :(

awarnick: It seems to me that this is a known issue, but most people here pretend that it doesn't exist.

Do you have references that this is a known issue? Just trying to understand what it might be or what is given as the root cause.

awarnick: I'm really starting to think I should be using Teensy instead.

I'm currently running the Windows code against a Leonardo (native USB). Currently at 40k 'packets' (2 to 10 times further than I got before).

Currently at 611k 'packets' ;)

Hmm, and it died at 633570 :( Just when I switched the light off; maybe a power spike.