Pages: [1] 2 3   Go Down
Author Topic: Data loss when sending to native USB port (SerialUSB)  (Read 8179 times)
0 Members and 1 Guest are viewing this topic.
Offline Offline
Full Member
***
Karma: 9
Posts: 109
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I noticed that when sending 64 or more bytes from a pc to the Due's native USB port, bytes get lost.
This sketch reproduces it:

Code:
static const int led = 13;

void setup() {
  // initialize serial:
  SerialUSB.begin(9600);
  pinMode(led, OUTPUT);     
}

void loop() {
  // if there's any serial available, read it:
  int i = 0;
  while (SerialUSB.available() > 0) {
    SerialUSB.read();
    i++;
  }
  SerialUSB.println(i);
  digitalWrite(led, HIGH);
  delay(250);
  digitalWrite(led, LOW);
  delay(250);
}

If you send say 80 bytes to the serial port (e.g. simply via an echo to /dev/ttyACM1), the sketch prints series of numbers like this : 0 0 0 63 0 0, which indicates that (at least) 17 bytes are lost.

It can easily be fixed in CDC.cpp: the code defines a ring buffer of size CDC_SERIAL_BUFFER_SIZE (==512). But the code that uses it uses the SERIAL_BUFFER_SIZE macro which is 64. So replacing SERIAL_BUFFER_SIZE with CDC_SERIAL_BUFFER_SIZE in CDC.cpp fixed the problem for me (I wanted to run ArduinoISP which requires receiving 128 byte messages from avrdude).

It will go wrong if you feed the due more than 511 bytes at once. Therefore it would be better to use USB's built in handshaking mechanism (make the device reply with NAK or NYET until it has room to store more incoming data). The same was done for the Leonardo. I could look into how to do that with the due's otg controller, but if someone already knows, that would save time...

Logged

Forum Administrator
Milano, Italy
Offline Offline
Sr. Member
*****
Karma: 23
Posts: 292
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

PeterVH,

It will go wrong if you feed the due more than 511 bytes at once. Therefore it would be better to use USB's built in handshaking mechanism (make the device reply with NAK or NYET until it has room to store more incoming data). The same was done for the Leonardo. I could look into how to do that with the due's otg controller, but if someone already knows, that would save time...

Thanks for the suggestion I've fixed the buffer macros. Can you point me out how the Leonardo implements the (NAK,NYET) handshaking?
Logged

C.

Offline Offline
Full Member
***
Karma: 9
Posts: 109
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

The idea is to consume only as much data from the usb controller as you can store. As long as you don't have room for new data, the usb controller must be instructed to reply NACK or NYET to the host (it does that autonomously).

On the sam, the usb controller sends these NAKs/NYETs as long you don't "return" the fifo (or rather the memory bank(s) attached to it) to the the contoller. This is done by the  udd_ack_fifocon(CDC_RX) function. There is an excellent diagram in the data sheet at pg. 1092 (40.5.2.13 Management of OUT Endpoints).

I gave it a try. In USBCore.cpp in USB_ISR():

Code:
...
#ifdef CDC_ENABLED
        if (Is_udd_endpoint_interrupt(CDC_RX))
        {
                udd_ack_out_received(CDC_RX);

                // Handle received bytes
                if (USBD_Available(CDC_RX))  // (this test is probably not needed)
                        SerialUSB.accept();

//              udd_ack_fifocon(CDC_RX);  // <-- don't return the fifo to the controller here,
                                                               // do it from accept() if we could store all bytes
                  ...
        }
In CDC.cpp:
Code:
void Serial_::accept(void)
{
        ring_buffer *buffer = &cdc_rx_buffer;
        uint32_t i = (uint32_t)(buffer->head+1) % CDC_SERIAL_BUFFER_SIZE;

        // if we should be storing the received character into the location
        // just before the tail (meaning that the head would advance to the
        // current location of the tail), we're about to overflow the buffer
        // and so we don't write the character or advance the head.
        while (i != buffer->tail) {
                uint32_t c;
                if (!USBD_Available(CDC_RX)) {
                        udd_ack_fifocon(CDC_RX);
                        break;
                }
                c = USBD_Recv(CDC_RX);
                // c = UDD_Recv8(CDC_RX & 0xF);
                buffer->buffer[buffer->head] = c;
                buffer->head = i;

                i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;
        }
}
In read(), after we consumed some data we must call accept again, thi might return the fifo to the controller:
Code:
int Serial_::read(void)
{
        ring_buffer *buffer = &cdc_rx_buffer;

        // if the head isn't ahead of the tail, we don't have any characters
        if (buffer->head == buffer->tail)
        {
                return -1;
        }
        else
        {
                unsigned char c = buffer->buffer[buffer->tail];
                buffer->tail = (unsigned int)(buffer->tail + 1) % CDC_SERIAL_BUFFER_SIZE;
                if (USBD_Available(CDC_RX))
                        accept();
                return c;
        }
}


It does not yet work correctly: it misses every 512th byte. The best test is to use Stimmer's loopback sketch http://arduino.cc/forum/index.php/topic,132811.msg1000171.html#msg1000171
Cat a file of e.g. 8KB to /dev/ttyACM1. Capture the reply also via cat:
cat /dev/ttyACM1 > txt. A diff shows you miss every 512th byte.

I don't think it is some off by one error in the circular buffer because if I define the CDC_SERIAL_BUFFER_SIZE as 64, I still loose each 512th byte.

I think it is somewhere in the code accessing the rx endpoint, but the low level api's are complex and there is some redundancy in them.

To be continued...
I thought I already share this as a starting point, maybe somebody else sees what is wrong...



 

Logged

Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

You have a race condition - under certain circumstances accept() can be executing twice simultaneously in the main loop (called by read() ) and in the interrupt handler. This causes the data loss.

The following change greatly reduces the problem:
Code:
void Serial_::accept(void)
{
  static int r=0;
  if(r)return;
  r=1;
         ring_buffer *buffer = &cdc_rx_buffer;
        uint32_t i = (uint32_t)(buffer->head+1) % CDC_SERIAL_BUFFER_SIZE;

        // if we should be storing the received character into the location
        // just before the tail (meaning that the head would advance to the
        // current location of the tail), we're about to overflow the buffer
        // and so we don't write the character or advance the head.
        while (i != buffer->tail) {
                uint32_t c;
                if (!USBD_Available(CDC_RX)) {
                        udd_ack_fifocon(CDC_RX);
                        break;
                }
                c = USBD_Recv(CDC_RX);
                // c = UDD_Recv8(CDC_RX & 0xF);
                buffer->buffer[buffer->head] = c;
                buffer->head = i;

                i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;
        }
   r=0;
       
}
It's still theoretically possible for accept() to get called twice even with that guard in place although it hasn't happened to me yet in over 20 megabytes of testing. There is probably a better way to solve the problem. But I think we are getting somewhere smiley
Logged


Forum Administrator
Milano, Italy
Offline Offline
Sr. Member
*****
Karma: 23
Posts: 292
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

PeterVH, stimmer,

awesome! can't wait to check this one.

It's still theoretically possible for accept() to get called twice even with that guard in place

Using synchronization primitives could help in this case?
I'm talking about:

Code:
unsigned int __ldrex(volatile void *ptr);
int __strex(unsigned int val, volatile void *ptr);
void __clrex(void);

From the datasheet (paragraph 12.5.7):

Quote
12.5.7.1
A Load-Exclusive instruction
Used to read the value of a memory location, requesting exclusive access to that location.

12.5.7.2
A Store-Exclusive instruction
Used to attempt to write to the same memory location, returning a status bit to a register. If this
bit is:
0: it indicates that the thread or process gained exclusive access to the memory, and the write
succeeds,
1: it indicates that the thread or process did not gain exclusive access to the memory, and no
write is performed,
The pairs of Load-Exclusive and Store-Exclusive instructions are:
• the word instructions LDREX and STREX
• the halfword instructions LDREXH and STREXH
• the byte instructions LDREXB and STREXB.
Software must use a Load-Exclusive instruction with the corresponding Store-Exclusive
instruction.
Logged

C.

Forum Administrator
Milano, Italy
Offline Offline
Sr. Member
*****
Karma: 23
Posts: 292
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

It works great for me!

https://github.com/arduino/Arduino/pull/1255

I'll merge this before the coming release 1.5.2.

Thank you guys!
Logged

C.

Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Thanks, that's useful... I'm not sure I'm using the instructions exactly right but the following seems to work:
Code:
void Serial_::accept(void)
{
  static uint8_t r=0;
  int t=__LDREXB(&r);
  if(t)return;
  if(__STREXB(1,&r))return;
         ring_buffer *buffer = &cdc_rx_buffer;
        uint32_t i = (uint32_t)(buffer->head+1) % CDC_SERIAL_BUFFER_SIZE;

        // if we should be storing the received character into the location
        // just before the tail (meaning that the head would advance to the
        // current location of the tail), we're about to overflow the buffer
        // and so we don't write the character or advance the head.
        while (i != buffer->tail) {
                uint32_t c;
                if (!USBD_Available(CDC_RX)) {
                        udd_ack_fifocon(CDC_RX);
                        break;
                }
                c = USBD_Recv(CDC_RX);
                // c = UDD_Recv8(CDC_RX & 0xF);
                buffer->buffer[buffer->head] = c;
                buffer->head = i;

                i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;
        }
   r=0;      
}

(cmaglie - I just saw your code, looks good, I'll try it out)

It's not amazingly fast (I am getting a 7MByte file through in about 50s) but it is reliable - no diff in 50MBytes of testing so far smiley
« Last Edit: January 29, 2013, 05:43:16 am by stimmer » Logged


Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

It works great for me!

https://github.com/arduino/Arduino/pull/1255

I'll merge this before the coming release 1.5.2.

Thank you guys!

I've been testing this code - no errors in 250Mbytes of data so far smiley-grin

Update: No errors in 500Mbytes of data in ~ 1 hour  smiley-mr-green

Update 2: All 888888888 bytes transferred in 1h48m24s, no errors  smiley-cool
« Last Edit: January 29, 2013, 07:44:16 am by stimmer » Logged


Offline Offline
Full Member
***
Karma: 9
Posts: 109
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Woohoo!
I just got back from work and noticed the problem is diagnosed, fixed, tested and committed in git-hub!
Thanks guys.

Nevertheless I still feel we should think about the approach taken to avoid entering accept() from the isr. This is what happens without the guards:
- bytes arrive for the 1st time from the host
- the isr fills up the serial rx buffer, it can't free the fifo because it can only write 511 bytes and it has 512 bytes available.
- the sketch performs SerialUSB.read() so it reads 1 byte.
- so we end up in accept() which calls USBD_Recv(). This function automatically frees the fifo if we read the last byte from it! I think this is where the problem lies. This causes a new irq to arrive while we are still in our loop and did not yet update the head ptr of the rx buffer.

Therefore I tried with the function UDD_Recv8(): it does not automatically free the fifo, so we can finish our loop and call udd_ack_fifocon(CDC_RX) when we are done accessing the state in rx buffer.

I could not see the problem yesterday, but now, after seeing your fix, I can. Following code works without guards:
 
Code:
void Serial_::accept(void)
{
        ring_buffer *buffer = &cdc_rx_buffer;
        uint32_t i = (uint32_t)(buffer->head+1) % CDC_SERIAL_BUFFER_SIZE;

        if (!USBD_Available(CDC_RX))
                return;

        // if we should be storing the received character into the location
        // just before the tail (meaning that the head would advance to the
        // current location of the tail), we're about to overflow the buffer
        // and so we don't write the character or advance the head.
        while (i != buffer->tail) {
                // UDD_Recv8() does not automatically free fifo
                uint32_t c = UDD_Recv8(CDC_RX & 0xF);
                buffer->buffer[buffer->head] = c;
                buffer->head = i;

                if (!USBD_Available(CDC_RX)) {
                        udd_ack_fifocon(CDC_RX);
                        break;
                }
                i = (i + 1) % CDC_SERIAL_BUFFER_SIZE;
        }
}

Safer would be to disable the rx interrupt during accept(). The various low level api's like USBD_Recv() already do something like that via the LockEP class. (though I don't like the idea to use a class for this)

Let us further think about this...

Quote
It's not amazingly fast (I am getting a 7MByte file through in about 50s) but it is reliable
Indeed, consuming these bytes one by one is really inefficient. But now we have something reliable, we can think about optimization strategies.
B.t.w if we define CDC_SERIAL_BUFFER_SIZE as 513, would this be noticable? (then we have room to accept an entire fifo bank).
 





Logged

Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

513-byte buffers might cause the compiler to calculate all those % operators by division. In any case I did try the code with 2048-byte buffers, but got data loss for some unknown reason. I'll have to look into that further (I can't think of any logical reason why it should happen). Shortening the buffer down to 4 bytes surprisingly didn't cause much of a performance hit.

We aren't being bottlenecked by write speed - I get nearly 10MByte/sec from the Due to the computer and transferred 800MBytes with perfect reliability.

Transferring more than one byte at a time is probably the best idea. I noticed there is a UDD_Recv(ep,*data,len) which can be used to get multiple bytes at once, and maybe we could have an overloaded SerialUSB::read(*data,len) too.

Update: PeterVH, I tried your fix without the guard but I can't get it to work reliably. It works for 100KB or so, then corruption, then nothing. Are there any other changes I have to make elsewhere?
« Last Edit: January 29, 2013, 07:13:38 pm by stimmer » Logged


Offline Offline
Full Member
***
Karma: 9
Posts: 109
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Update: PeterVH, I tried your fix without the guard but I can't get it to work reliably. It works for 100KB or so, then corruption, then nothing.


I can confirm the behavior. I had only tested with an 8KB file. With an 170K file I have corruption too: it goes well for some time but then each  byte gets sent several times.

I don't understand. I also tried with replacing the guard code with:
Code:
irqflags_t flags = cpu_irq_save();
... body of accept ...
cpu_irq_restore(flags);
The idea was instead of avoiding the isr to enter the routine, to avoid an isr alltogether by disabling interrupts. It does not work either.

The good thing is that I can also confirm the version in github works ok (tested with a 170KB file).

I will do tests with ArduinoISP also.

Logged

Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Fasten your seatbelts smiley-cool

I have been trying to debug and simplify the code, and also add in multi-byte reads. The (modified) loopback test now runs at 4.3 MB/s (that's about 45 million baud). My 888888888-byte test passes through in about 3 min 30 sec. I also added in some code to control the LEDs. I did this by removing the troublesome ringbuffer code and using UDD_ calls directly (My reasoning was, since there is 2 banks of USB fifo for buffering, why do we need more?)

I do seem to be getting occasional errors but only about once every several gigabytes, which makes it hard to debug as I don't know if it's a problem with the code or the cable or with the way I am testing it. Testing so far indicates it is a problem with transmission (SerialUSB.write())

The new loopback test with multi-byte read looks like this:
Code:
void setup() {
  Serial.begin(115200);
  SerialUSB.begin(0);
}
int s=0;
int t=0;
void loop() {
  int l;
  byte b[2048]; // 512 might be enough
  if(l=SerialUSB.available()){
    l=SerialUSB.read(b,l);
    SerialUSB.write(b,l);
    s+=l;
  }
  if((millis()-t) > 1000){t=millis();Serial.println(s);}
}

I've attached the three changed USB files as there are many changes.

The way I am testing on linux is as follows:
Code:
seq 99999999 >o1
stty -F /dev/ttyACM1 sane raw -iexten -echo -echoe -echok -echoctl -echoke -onlcr min 1
time cat o1 >/dev/ttyACM1 & cat /dev/ttyACM1 >o2
# ctrl-c when it is finished
cmp o1 o2
I've also tested with gtkterm (which has been 100% reliable)

Please test thoroughly! I am especially interested to see if removing the ringbuffer causes performance degradation in any situation - I haven't found anything yet which runs slower despite having less buffering. Also I am unable to test on Windows or Mac.

* CDC.cpp (7.83 KB - downloaded 46 times.)
* USBAPI.h (6.99 KB - downloaded 39 times.)
* USBCore.cpp (22.66 KB - downloaded 46 times.)
« Last Edit: February 02, 2013, 12:52:57 pm by stimmer » Logged


France
Offline Offline
Sr. Member
****
Karma: 0
Posts: 266
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi Stimmer,

Would it be possible for you to make the same test but on Programming Port USB ?

Thank you, Albert
Logged

Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Albert: not really - programming port serial doesn't have a direct multi-byte read, it is too slow to need one. And an 888888888 byte test would take over 21 hours to run smiley

If you need a programming port loopback sketch use MultiSerialMega and connect pins 18 and 19. As for the 16u2 firmware it is almost the same as the Uno firmware and I have tested that to death 2 years ago - it can be tested by uploading Blink and linking pins 0 and 1.
Logged


France
Offline Offline
Sr. Member
****
Karma: 0
Posts: 266
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Ok Stimmer I see now but I'm confused, in other threads there were issues with USB routines, are there just on native USB ?

As i've reported when moving an arduino MEGA code to arduino DUE, the USB link between Java running on my MacBook air and DUE would never work, in that case i was using USB programing port.

Maybe it would help someone makes clear thread on USB programming port and USB native port, different type of issues, have they been solved in new IDE...

I'm lost now  smiley-confuse
Logged

Pages: [1] 2 3   Go Up
Jump to: