Data loss when sending to native USB port (SerialUSB)

Hi Stimmer,

I found kicking out the circular buffer a great demonstration "out of the box" thinking.

I have tested your implementation with ArduinoISP. I think this is a good real life use scenario of the serial port, with lots of messages of different sizes (though relatively small) in both directions. I powered my leonardo from the due's 3v3 and modified ArduinoISP to use SerialUSB instead of Serial. It worked well: I could upload download the bootloader to/from the leonardo several times.

Nevertheless dropping the serial buffer also has consequences for feeding the due lots of small messages. Each message is received in a separate bank and you only have two of them. Take for example this shell script:

n=0
while true; do

        echo "$n" > /dev/ttyACM1
        echo $?
        sleep 1
        let "n = $n + 1"
done

I ran this sketch on the due:

static const int led = 13;
static const int button = 3;

void setup() {
  // initialize serial:
  SerialUSB.begin(9600);
  Serial.begin(115200);
  pinMode(led, OUTPUT);     
  pinMode(button, INPUT);
  digitalWrite(button, HIGH); 
}

void loop() {
  // if there's any serial available, read it:
  int i = 0;
  if (digitalRead(button)) {
    while (SerialUSB.available() > 0) {
      char c = SerialUSB.read();
      Serial.write(c);
      i++;
    }
  }
  Serial.print("i=");
  Serial.println(i);
  digitalWrite(led, HIGH);
  delay(500);
  digitalWrite(led, LOW);
  delay(500);
}

If you press the button for a few seconds, some of the messages are lost.
Odd enough the return code of echo is always 0.

I am not sure where the bytes get lost. And maybe it can be prevented with stty calls, but a circular buffer would prevent this. A circular buffer which you fill up with memcpy would also make sense as an implementation.

Interesting - I am not losing any data with that script. What happens for me is that echo blocks until ttyACM1 accepts the data. I expect that there is some mysterious stty setting which controls this.

However, it isn't really the right way to use a serial port in a script. The echo command opens the serial port, writes to it, then closes it again each time it executes. That's not what you'd do in, say, a C++ program, where you'd open the serial port once, write to it periodically, and close it at the end of the program. To do this in a shell script you'd use a fifo or a fd, for example:

exec 3<>/dev/ttyACM1
n=0
while true; do

        echo "$n" >&3
        echo $?
        sleep 1
        let "n = $n + 1"
done

This doesn't lose any data for me either, and it buffers the data on the computer rather than blocking.

What happens for me is that echo blocks until ttyACM1 accepts the data.

Well, that is what I expected to happen for me too when I wrote the script. (And on another pc (linux 3.5 vs. 3.0.0)), I did actually observe echo blocking). The point I wanted to make is that having no circular buffer, one could temporarily block a program on the pc.

But you are right. Closing and reopening the file every time causes the os to flush the file and this makes echo block. If a program would write to ttyACM1 in a normal way, the os would buffer the output... So it looks like we can indeed live without the buffer in the Due.

I read you have Albert's test running. Is it easy to test with SerialUSB?

Yes, Albert's program does work with my SerialUSB. In SimpleArduinoSerPro3 change "Serial." to "SerialUSB." in all 5 places. Then in setup() after SerialUSB.begin(BAUD_RATE); add this line: delay(1000);
I'll have a think about buffers. I'm sure I could reintroduce a buffer with not too much loss of speed. But right now I am sick at the sight of USB code :grin: So I'm going to take a break for a week or two and try some other projects which are more fun.

PeterVH:
I read you have Albert's test running. Is it easy to test with SerialUSB?

Please note that the real author of SerPro is Alvaro (on arduino forum Arduino Forum) who build some interesting applications here, one being arduinoscope running on UNO and MEGA, see old posts on arduino forum like http://arduino.cc/forum/index.php/topic,7762.0.html

The topic of secure USB serial communication with arduino's was always tricky so Alvaro came up with SerPro. What happened is that my scientific and extreme electronics project required very safe RX/TX error-free so Alvaro added 16 CRC's along with HDLC protocol and other stuff as we both are telco engineers with critical software.

For some of you interested in non-java SerPro like GTK or C or C++, you'll find a mine of info directly at Alvaro's github GitHub - alvieboy/arduino-serpro: Serial Protocol (packet/frame based) library for Arduino intercommunication

The IDE version i've published on my website has removed the arduino-oscilloscope hence not using anymore ADC's link.

In fact, Alvaro is building a new arduino tech based on ZPU, namely you create your own softcore uC inside a FPGA with dedicated instructions, like emulate part of the Atmega2560 you want keep and build low level instructions not found in the market
http://www.alvie.com/zpuino/

Regarding ElectroShaman.jar, I've made for my own projects and might publish on github if there is more interest.

stimmer:
Fasten your seatbelts 8)

I have been trying to debug and simplify the code, and also add in multi-byte reads. The (modified) loopback test now runs at 4.3 MB/s (that's about 45 million baud). My 888888888-byte test passes through in about 3 min 30 sec. I also added in some code to control the LEDs. I did this by removing the troublesome ringbuffer code and using UDD_ calls directly (My reasoning was, since there is 2 banks of USB fifo for buffering, why do we need more?)

I do seem to be getting occasional errors but only about once every several gigabytes, which makes it hard to debug as I don't know if it's a problem with the code or the cable or with the way I am testing it. Testing so far indicates it is a problem with transmission (SerialUSB.write())

The new loopback test with multi-byte read looks like this:

void setup() {

Serial.begin(115200);
 SerialUSB.begin(0);
}
int s=0;
int t=0;
void loop() {
 int l;
 byte b[2048]; // 512 might be enough
 if(l=SerialUSB.available()){
   l=SerialUSB.read(b,l);
   SerialUSB.write(b,l);
   s+=l;
 }
 if((millis()-t) > 1000){t=millis();Serial.println(s);}
}




I've attached the three changed USB files as there are many changes.

The way I am testing on linux is as follows:


seq 99999999 >o1
stty -F /dev/ttyACM1 sane raw -iexten -echo -echoe -echok -echoctl -echoke -onlcr min 1
time cat o1 >/dev/ttyACM1 & cat /dev/ttyACM1 >o2

ctrl-c when it is finished

cmp o1 o2



I've also tested with gtkterm (which has been 100% reliable)

Please test thoroughly! I am especially interested to see if removing the ringbuffer causes performance degradation in any situation - I haven't found anything yet which runs slower despite having less buffering. Also I am unable to test on Windows or Mac.

Sorry to revive an old topic, but,

How do you achieve that speed? (4.3MB's) ?

I dont have linux to test your sending-code but i have one made on VB.NET which is fairly simple, made a USB packet of 100KB and receive on the arduino, it takes 700ms to do so, which matches this tests:

http://dangerousprototypes.com/2013/06/03/usb-virtual-serial-benchmarks/

That say that the arduino serialUSB speed is around 110/130kbs and it matches my speed.

My receiving code is:

void setup() {
  // put your setup code here, to run once:
      SerialUSB.begin(1200000);
      delay(2000);
      //Wait for start-mark packet
      while (SerialUSB.available() < 1){}

      unsigned long ChronoOne = micros();      
      int readed = 0;
      int total_bytes = 100000;      
      while (readed < total_bytes)
      {
          while (SerialUSB.available() > 1){SerialUSB.read();readed = readed + 1;}
      }
      
      ChronoOne = micros() - ChronoOne;
      SerialUSB.println("T:" + String(ChronoOne));    //Total bytes received           
      SerialUSB.println("R:" + String(readed));         //Time it took
      delay(50000);
}

void loop() {
  // put your main code here, to run repeatedly: 
  
}

Sending code on VB.NET

        Dim dt1970 = DateTime.Now

        Dim bArraySend2(1) As Byte
        bArraySend2(0) = 1
        SerialPort1.Write(bArraySend2, 0, bArraySend2.Length) 'Start Packet


        Dim bArraySend(100012) As Byte
        bArraySend(0) = 255
        bArraySend(1) = 47
        SerialPort1.Write(bArraySend, 0, bArraySend.Length)

        Dim now = DateTime.Now
        MsgBox("done:" & (now - dt1970).TotalMilliseconds.ToString())

I cant get any faster than this, any ideas? Using 1.5.2

Did you replace the three files Stimmer attached to his post?

I can't explain it any better than I already did - I modified the Arduino source code to add a multi-byte read method and removed some unnecessary buffering. The speedup comes from using multi-byte reads, how that works is explained quite well on the blog off the page you linked to. Note that I have overloaded the read method instead of overriding readBytes, the signature is the same as readBytes though.

I don't think my modification would get accepted into a future Arduino release as it is too much of a change. But it is interesting to note the benchmark results, at the moment the Due is 8x slower than a Teensy, with my code it would be over 4x faster :slight_smile: (The reason is that the Due can use USB 2.0 High Speed)

I've modified my code so that readBytes now also uses the multi-byte read and now Paul Stoffregen's benchmark will run as intended. Here are the results:

~/arduino/sketchbook/usb_serial_receive/host_software$ ./receive_test /dev/ttyACM0 
port /dev/ttyACM0 opened
ignoring startup buffering...
ignoring startup buffering...
ignoring startup buffering...
ignoring startup buffering...
ignoring startup buffering...
Bytes per second = 7385524
Bytes per second = 7273609
Bytes per second = 7414731
Bytes per second = 7309051
Bytes per second = 7339450
Bytes per second = 7260407
Bytes per second = 7387343
Bytes per second = 7384615
Bytes per second = 7293946
Bytes per second = 7390073
Bytes per second = 7265682
Bytes per second = 7386434
Bytes per second = 7385524
Bytes per second = 7275373
Bytes per second = 7384615
Average bytes per second = 7342425

$) $) $) $) $)
I've reattached the 3 files that need to be changed (in the arduino/hardware/arduino/sam/cores/arduino/USB directory)

USBAPI.h (7.07 KB)

CDC.cpp (7.83 KB)

USBCore.cpp (22.7 KB)

duh, sorry, i thought this was implemented in the current version since this thread was from jan.

Alright, i used the new files and with your read-example it works alright, but with the original read method (reading 1 by 1, my posted receiving-code) i only receive 512b out of 100k and gets stuck, i assume this might be because every time i read the buffer with read() or read(b,1) it does not read just 1 byte and it drops all the data received so i end up with only 512b out of 100k.

Is there a way to make it backwards compatible with the current read method, so we can use the old-read-1-by-1 and when needed fast speed for fast transfer use this one

Thanks for trying it out. It is my intention that the code should be backwards compatible - reading one byte shouldn't drop the rest of the packet. I knew that there was a problem with data loss, the difficulty for me has been triggering the data loss bug frequently and reliably (in my own code it only happens once every few hours, which makes debugging very slow work.) There may be some difference between USB handling in Linux and Windows complicating things too.

I've found a way to get the receive_test benchmark to hang with my multi-byte code, which should also help with debugging.

From what i understand, this Read code:

int Serial_::read(void)
{
	LockEP lock(CDC_RX);
  	USB_LED_UPDATE;
	if(peeked){
		peeked=0;
		return peeked_u8;	  
	}
	if (!UDD_FifoByteCount(CDC_RX))
		return -1;
	
	uint8_t c = UDD_Recv8(CDC_RX);
	if (!UDD_FifoByteCount(CDC_RX)){
		UDD_ReleaseRX(CDC_RX);
	}
	return c;
}

In here:
uint8_t c = UDD_Recv8(CDC_RX);
do you receive just 1 byte with UDD_Recv8?
Then if you releaseRX, doesnt it drops all the data stored? if so, we readed one byte and then dropped all the rest, it is consistent with the 512b im getting, since the packets seem to be coming in by 512 im only getting the 1st byte and then dropping everything till the next packet.

I managed to modify the library and by not using a ring buffer (i still use a buffer tho, using memcpy) i managed to achieve 220kbps instead of the 110kbps, but this is far far away from your tests so it would be good if we can get the 1-byte read working :slight_smile:

You can try my code, it makes the bug instant, i can compile the app and post it if u dont have VB

UDD_ReleaseRX should only get called when UDD_FifoByteCount is 0, therefore no data should get lost.

If it is only getting one byte in 512 it should be fairly easy to prove by sending packets such as 0 1 2 3... 254 255 and seeing what bytes get lost.

I don't have (easy) access to windows to try your code but might be able to get the VB source running in mono on Linux.

@stimmer,

would you like to make a pull request for your patch?

may this issue Atmel SAM UHD_Pipe_Read can't handle packet size >255 bytes · Issue #82 · arduino/ArduinoCore-sam · GitHub be related to the data loss?

C

Not yet, I'll wait until the patch is reliable before making a pull request. At the moment it's still losing data somewhere.

It can't be directly related to the issue you linked to (because UHD_ functions are for USB host, but here we are using UDD_ functions for USB device mode). However I am convinced that the bug is in libsam rather than SerialUSB.

were you able to test it with my code?

I do seem to be getting occasional errors but only about once every several gigabytes, which makes it hard to debug as I don't know if it's a problem with the code or the cable or with the way I am testing it. Testing so far indicates it is a problem with transmission (SerialUSB.write())

I think there's a problem in the ring buffer, where OVERFLOW of variable "i" (which is an unsigned 32 bit integer) is not dealt with. In the ring buffer code below, where i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;, the value of "i" will keep counting up by one for each iteration (until it overflows), but the value of the statement will count from 0-511 then recycle (for CDC_SERIAL_BUFFER_SIZE = 512).

Ring buffer code in CDC.cpp:

	ring_buffer *buffer = &cdc_rx_buffer;
	uint32_t i = (uint32_t)(buffer->head+1) % CDC_SERIAL_BUFFER_SIZE;

	// if we should be storing the received character into the location
	// just before the tail (meaning that the head would advance to the
	// current location of the tail), we're about to overflow the buffer
	// and so we don't write the character or advance the head.
	while (i != buffer->tail) {
		uint32_t c;
		if (!USBD_Available(CDC_RX)) {
			udd_ack_fifocon(CDC_RX);
			break;
		}
		c = USBD_Recv(CDC_RX);
		// c = UDD_Recv8(CDC_RX & 0xF);
		buffer->buffer[buffer->head] = c;
		buffer->head = i;

		i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;
	}

I think there's a problem in the ring buffer

The ring buffer no longer exists in the code being discussed (see stimmers attached files and the thread).

I'm having a lot of problems running Paul Stoffregen's multi-byte benchmark on a fairly new Macbook pro, which is strange as he managed to get reportable results on a Macbook pro.

stimmers test scripts all work perfectly however :~.

The host hangs before the end of the test. Resetting the due causes the host to signal an error (number of bytes delivered) and it aborts. The due however is happily waiting in a 'nothing to receive' loop.

I'm now sending sequenced data and stuff appears to be going missing. I really want to use this so I'll be resuming in the morning!

This is an updated version of the alternative USB serial code. This fixes one of the problems I was having with the receive_test benchmark and fixes a memory corruption bug.

CDC.cpp (7.84 KB)

USBAPI.h (7.07 KB)

USBCore.cpp (22.7 KB)