Serial.flush() hangs on 2Mbps Baud Rates

With a high baud rate (2Mbps), Serial.flush() hangs after a very short time. I might get 100 or so lines from the script below before it hangs. How long it takes to hang varies from run to run.

void setup(){
  Serial.begin(2000000);
}

void loop(){
  Serial.println("a");
  Serial.flush();
}

I suspect the race condition that matthijskooijman mentioned here is occurring at this baud rate. I do not seem to have this issue at, say, 1Mbps. Is there any fix that could be used to prevent this race condition? I realize the issue on GitHub is a few years old but I assume it still applies. My setup is Arduino IDE 1.8.4, Uno R3, and Windows 10 with PuTTY.

I can reproduce the problem with an Arduino Uno 3.

The problem disappears if flush is removed.

There is at least one other problem...

...
a
a
aa
a
a
...
...
a
a

a
a
...

Adding a blink is interesting...

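void setup( void )
{
//  Serial.begin(250000);
  Serial.begin(2000000);
  DDRB |= (1 << PINB5);  // PB5 (the Uno's on-board LED, pin 13) as output
}

typedef uint8_t millis_t;  // only the low 8 bits of millis() are used
static millis_t mark;
const millis_t rate = 125;

void loop( void )
{
  Serial.println("a");
  Serial.flush();

/**/
  // 8-bit subtraction wraps modulo 256, so this survives rollover
  if ( (millis_t)((millis_t)millis() - mark) >= rate )
  {
    PINB = (1 << PINB5);  // writing 1 to a PINx bit toggles the pin
    mark += rate;
  }
/**/
}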

With flush the blink never occurs.
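The blink test in loop() relies on unsigned 8-bit subtraction wrapping modulo 256, which is why it keeps working when the truncated millis() value rolls over. Here is a minimal host-side sketch of that idiom in plain C++ (not an Arduino sketch). The helper names elapsed8 and intervalDue are made up for illustration and do not appear in the code above.

```cpp
#include <cstdint>

// Hypothetical helper: ticks elapsed between two 8-bit timestamps.
// The cast back to uint8_t makes the subtraction wrap modulo 256.
inline uint8_t elapsed8(uint8_t now, uint8_t mark) {
    return static_cast<uint8_t>(now - mark);
}

// Hypothetical helper mirroring the sketch's test: true once at least
// `rate` ticks have passed since `mark`.
inline bool intervalDue(uint8_t now, uint8_t mark, uint8_t rate) {
    return elapsed8(now, mark) >= rate;
}
```

For example, with mark = 250 and rate = 125, intervalDue(119, 250, 125) is true because (119 - 250) mod 256 is 125, while intervalDue(118, 250, 125) is still false. The idiom only holds if the check runs often enough that the elapsed time never exceeds 255 ticks, so a loop() that stalls inside flush() would defeat it regardless.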

Yes, without flush() the output occasionally skips bytes. My best guess as to what is happening without flush() is that the Serial buffer is being overwritten before it has had time to send the data out. That is why I was trying to use flush(). To my understanding, flush() waits for the Serial buffer to finish sending everything out before it continues. That really is my end goal. If that can be accomplished some other way, I would be up for hearing that solution. I just figured I would post this possible bug to get some other/better eyes on it.

wlong007:
Yes, without flush() the output occasionally skips bytes.

Are you basing that on what arrives inside PuTTY?

Yes, that is correct. Sorry, I should have clarified that. I have not hooked up an oscilloscope or anything to check what it is actually sending out. I am simply basing it on what PuTTY prints.

Well, it is within the realm of possibility that PuTTY is dropping the characters.

Is PuTTY part of your final solution?

No, I have this same issue when using Node.js to read the Serial line using the serialport package found here. Node.js is my final solution. I have tried the Arduino setup on two different computers (both Windows 10 with PuTTY and Node.js), and both have the issue. I have even tried the sketch on an Uno R1 and a SparkFun RedBoard with no luck. My end goal is to have an Arduino Uno R3 talk with Node.js reliably at 2Mbps via serial. I am using Node.js v6.11.3. Ideally, I do not want to have to implement some sort of handshaking between the two devices to ensure data integrity, but I may have to resort to that if I cannot get reliable data transfers.

Do you actually need 2 Mbps? A typical Arduino is going to struggle to maintain that rate.

Have you considered using a board with a USB pad?

I believe 2Mbps is within the acceptable limits of the Atmel ATmega328 specs (see page 190 of the datasheet). As to my need, I would like to keep it to the Uno platform. I am a professor at a university and all the students already have an Uno from previous classes. As to the 2Mbps need, I have to push a lot of data, so the faster the better. Regardless of my needs and setup, flush() is still behaving unexpectedly. Whether my expectation is incorrect or flush() is misbehaving is really the question at hand. If the answer is simply that you cannot do what I am trying to do, then that is fair. I just wanted to bring to light a possible bug in flush().
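For reference, the datasheet's asynchronous baud rate formulas are BAUD = f_osc / (16 * (UBRR + 1)) in normal mode and BAUD = f_osc / (8 * (UBRR + 1)) in double-speed (U2X) mode. A minimal host-side sketch of that arithmetic follows, assuming the Uno's 16 MHz clock. The function names actualBaud and ubrrFor are hypothetical, and this is not how the Arduino core itself is written.

```cpp
#include <cassert>
#include <cstdint>

// Baud rate produced by a given UBRR value.
// The divisor is 16 in normal mode and 8 in double-speed (U2X) mode.
inline uint32_t actualBaud(uint32_t fCpu, uint16_t ubrr, bool doubleSpeed) {
    const uint32_t divisor = doubleSpeed ? 8u : 16u;
    return fCpu / (divisor * (static_cast<uint32_t>(ubrr) + 1u));
}

// Nearest UBRR setting for a requested baud rate (rounded division).
inline uint16_t ubrrFor(uint32_t fCpu, uint32_t baud, bool doubleSpeed) {
    assert(baud > 0);
    const uint32_t step = (doubleSpeed ? 8u : 16u) * baud;
    const uint32_t n = (fCpu + step / 2u) / step;
    return n > 0 ? static_cast<uint16_t>(n - 1u) : 0;
}
```

With fCpu = 16000000, ubrrFor(16000000u, 2000000u, true) gives 0 and actualBaud(16000000u, 0, true) gives exactly 2000000, so 2Mbps is the UBRR = 0 double-speed case, the very top of the range at this clock. In normal mode, UBRR = 0 only reaches 1000000.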

wlong007:
Whether my expectation is incorrect or flush() is misbehaving is really the question at hand.

We already have an answer. There is no doubt flush has a nasty deadlock bug and little doubt Serial has a "dropped data" bug.

There are two more questions at hand: what to do about the bugs and what to do to help you move forward.

I believe 2Mbps is within the acceptable limits of the Atmel ATmega328 specs (see page 190 of the datasheet).

Yes, but that's essentially beside the point.

  • Have you made sure that everything else works at 2Mbps? Did you take care to lay out the tracks like a transmission line, with proper termination and such? At 2Mbps, such things start to be relevant.
  • 2Mbps is essentially 5 microseconds, or a max of 80 instructions, for each byte. I count (roughly) 100 cycles or more in the Serial TX ISR; it's just not written to be that fast. So by running at 2Mbps instead of (say) 1Mbps, you probably wouldn't get any additional throughput, even if it worked. And it's not working. If it DOES work at 1Mbps, that would be both interesting to know with respect to fixing the bugs, AND it might be fast enough for your purposes.
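The per-byte budget in the second point can be checked with a few lines of arithmetic. This is a minimal host-side sketch, assuming a 16 MHz clock and 8N1 framing (10 bits on the wire per byte). The names kBitsPerFrame, cyclesPerByte and isrFitsBudget are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bits on the wire per byte, assuming 8N1: 1 start + 8 data + 1 stop.
const uint32_t kBitsPerFrame = 10;

// CPU cycles available between consecutive bytes at a given baud rate.
inline uint32_t cyclesPerByte(uint32_t fCpu, uint32_t baud) {
    assert(baud > 0);
    // 64-bit intermediate so fCpu * 10 cannot overflow.
    return static_cast<uint32_t>(
        static_cast<uint64_t>(fCpu) * kBitsPerFrame / baud);
}

// True if an ISR costing isrCycles fits inside the per-byte budget.
inline bool isrFitsBudget(uint32_t fCpu, uint32_t baud, uint32_t isrCycles) {
    return isrCycles <= cyclesPerByte(fCpu, baud);
}
```

At 16 MHz this gives 80 cycles per byte at 2Mbps, 160 at 1Mbps and 640 at 250000 baud, so an ISR of roughly 100 cycles fits the budget at 1Mbps but not at 2Mbps.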

westfw:
if it DOES work at 1Mbps...

For what it's worth, it does for me. (I even use a 1 Mbps version of Optiboot.) There is definitely something magic about 2 Mbps.

I count (roughly) 100 cycles or more in the Serial TX ISR

You might try an older version of Arduino. The TX ISR in v1.0.6 is about half the size of the current ISR, and it's enough different that it might not have the bugs, either. Or different bugs.

Edit: OK, it's not QUITE that bad. I count 138 bytes and 123 cycles for the current code, vs 116 bytes and 73 cycles for the v1.0.6 code (not including ISR entry and reti). The old code (not "objectified") has a bunch (13) of 4-byte "lds/sts" instructions, while the new code makes good use of index registers.
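Those cycle counts put a rough ceiling on throughput. The sketch below is an optimistic host-side estimate only: it assumes a 16 MHz clock, 8N1 framing, one ISR run per byte, and it ignores ISR entry, reti and everything the main loop does, so real figures would be lower. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Upper bound on bytes per second if the TX ISR ran back to back.
inline uint32_t isrByteCeiling(uint32_t fCpu, uint32_t isrCycles) {
    assert(isrCycles > 0);
    return fCpu / isrCycles;
}

// Effective byte rate: the slower of the line rate and the ISR ceiling.
inline uint32_t effectiveBytesPerSec(uint32_t fCpu, uint32_t baud,
                                     uint32_t isrCycles) {
    const uint32_t lineRate = baud / 10u;  // 10 bits per byte with 8N1
    return std::min(lineRate, isrByteCeiling(fCpu, isrCycles));
}
```

With the 123-cycle ISR the ceiling is about 130081 bytes per second, so moving from 1Mbps (100000 bytes per second on the wire) to 2Mbps (200000) could gain at most about 30 percent under this model. With the 73-cycle v1.0.6 ISR the ceiling is about 219178 bytes per second, only just above the 2Mbps line rate before any overhead is counted.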

westfw:
I count (roughly) 100 cycles or more in the Serial TX ISR; it's just not written to be that fast.

The serial converter (ATmega8U2) may be responsible for the dropped data bug. If the Serial code is not able to sustain 2 Mbps then the serial converter code may not be able to either.

So I got a chance to try v1.0.6 and flush() still hangs. Without flush(), there do not seem to be any dropped bytes. However, whenever I run my project code (I did not list it here because it is beyond the scope of my initial post) I get results in line with what I see at 250Kbps. Just to explain a little bit of my project: from the Arduino I am sending out micros() over the Serial line (along with other data). Then with Node.js I calculate the time difference between two packets (a packet being the micros() value and my other data sent from the Arduino). With my project, I can incrementally increase the baud rate and I see the time difference decrease up to a certain point. After that, the time difference flatlines no matter how high the baud rate is. I suspect an issue similar to the answer provided here is occurring when using v1.0.6. I actually do not know if that is the case, seeing as I have not taken an oscilloscope to it. I will keep digging.

Again, according to the ATmega8U2 datasheet (see page 175) it should be able to support this speed. Now I do understand the difference between theoretical limits and what actually happens in practice. However, I just feel that if it is in the docs then we should be able to do it. I suppose that might be naive of me. Call me an optimist if you want.

wlong007:
Again, according to the ATmega8U2 datasheet (see page 175) it should be able to support this speed.

The chip being able to support that speed and the code on the chip being able to support that speed are two completely separate questions. You've shown that the first should be affirmative. But the datasheet doesn't say anything about the code running on the chip.

Delta_G:
The chip being able to support that speed and the code on the chip being able to support that speed are two completely separate questions.

Fair enough. It would also appear that the latter is not possible from our tests so far.

Yup. The serial converter is a bottleneck. It is (partially? / fully?) responsible for dropped data. If you want 2 Mbps from a stock Uno you are going to have to modify the ATmega8U2 program. (I will post the test program later today.)

(It is still possible that PuTTY is (partially? / fully?) responsible for dropped data. I get the same results with a Python application. PuTTY is not the problem.)