Pages: [1] 2 3   Go Down
Author Topic: Measuring Arduino's FT232 throughput rate ?  (Read 5514 times)
0 Members and 1 Guest are viewing this topic.
0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi all,

I was interested in quantifying the actual USB/serial throughput rate that one can get from an Arduino Duemillanove, which has an FTDI FT232 USB chip. I am using a Duemillanove with ATmega 328 + Arduino IDE 0018 on Linux (Ubuntu 10.04/Lucid).

I have concluded that: regardless of the USB serial rate which has been set - the FTDI chip 'adjusts' itself, so that the average throughput (from the whole program loop) becomes about 60-80% of the requested rate; even though the chip is perfectly capable of even higher rates than the requested one (i.e., an individual Serial.print or Serial.write could return sooner (faster) that the expected duration for the requested rate)!

My questions are:
  • Is this conclusion correct - or do I have a problem with my measurement process?
  • If the conclusion is correct - is it something due to
  • USB protocol (i.e. we should expect the same from all USB devices)?
  • Particular connection between Arduino's AVR microcontroller and FTDI USB chip (i.e. [mis]match of clock frequencies) ?
  • Particular connection settings between Arduino's FTDI USB chip and a PC?
  • Something entirely different? smiley  
[/list]

Let me clarify a bit the measurement process: for the purpose, I used the following small Arduino .pde program:

Code:
static const long SERSPEED=9600; //115200, 2000000

static unsigned long ts_value = 0;

void setup()
{
  Serial.begin(SERSPEED);  
  Serial.println("xSTART:");
}

void loop()
{
  ts_value = micros();
  Serial.write((uint8_t *)&ts_value, 4);
}

This code simply takes a timestamp as 'unsigned long' (4 bytes) and sends this value through USB serial to the PC.

Then:
  • I capture the output of this Arduino sketch on the PC
  • Since I can limit the time of capture, I can calculate the throughput "seen from the PC side" as captured file size / time of capture
  • I analyze the output using a separate C program - where I read timestamps, and I calculate a delta difference (difference between 'current' and 'previous' timestamp).
  • Since the 'delta' in this case describes the time in microseconds that it takes one loop to execute - it describes the time it takes for one micros() command (which I assume is small, so I can ignore it) + the time it takes to send four bytes via serial;
  • this I take as a base, to calculate the time it takes to send one byte via serial - and thus the throughput "seen from the Arduino side"
[/list]

For the capture - first a note, that XON/XOFF should be turned off when sending binary data as per Long arrays, Serial, and the case of missing bytes. Then, I either use screen to capture a log file:
Code:
$ SPEED=9600 # or 115200
$ screen -L /dev/ttyUSB0 $SPEED,-ixon
... however, with the note that screen apparently doesn't handle speeds greater than 115200 (in that case it seems to set the speed to 9600 anyways)...

Or, I use cat:
Code:
# init serial settings with 'stty'
# note 'min 100 time 2' is needed right after programming
#  the Arduino, else 'cat' will exit after reading only
#  a few bytes
# SPEED can be 9600, 115200, 2000000

$ SPEED=9600; stty $SPEED -ixon min 100 time 2 </dev/ttyUSB0

# capture output with 'cat', with a pre-set timeout (here 55 seconds)

$ tm=55; xa=$(timeout -2 $tm cat /dev/ttyUSB0 > btest) ; bts=$(du -b btest | cut -f1) ; echo "bytes: $bts; time: $tm s; Bytes/s:" ; wcalc $bts/$tm

Timeout: aborting command ``cat'' with signal 2
bytes: 52869; time: 55 s; Bytes/s:
 = 961.255

Obviously, repeated measurements with timeout / cat will give different Bytes/s values, but those will still be close. For the above speed of 9600, I get a rather stable delta of 4160 [ch956]s; which points to a throughput of 4/4160e-6 = 961.538 Bytes/s.

That at least tells me, that the measurements from PC side (via timeout / cat) and from Arduino side (via timestamp delta) mostly agree: that for a serial setting of 9600 - we get effectively about 961 Bytes/s throughput.  

So, I repeat the same experiment for serial speed setting of 115200 and 2000000 - with that note, that for 2000000, I use serialBeginU2X (from Arduno Serial - Gabe's Arduino/AVR Blog, via Arduino Forum - 2Mbaud hack, but i dont get more than ~300Kbaud...) instead of the usual Serial.begin..

And so, here are the results (cannot get table border to show - select the table text to highlight it, columns are more easy to see then):


set speed bits/sexpected B/smeasured B/sefficiency deltadelta rate B/s
9,6001,20096180.10% 4160 [ch956]s962
115,20014,40011,76281.68% 340 [ch956]s11,765
2,000,000250,000151,48660.59% 28 [ch956]s142,857
[/tt]

The table seems to state: no matter what speed we request by setting, the system will respond with an effective speed of anywhere between 60-80% of the requested one - even though the chip is, in itself, capable of delivering the exact requested speed! Also, note:
  • For 115200 bps request, I can get anywhere between 77% and 81% efficiency
  • For 2000000 bps request, I can get anywhere between 60% and 66% efficiency
  • For 2000000 bps request, a delta of 24 [ch956]s occurs ~44% of the time, and a delta of 28 [ch956]s occurs ~53% of the time. This is because micros() function has a resolution of four microseconds - and so:
    • a more accurate assessment for delta in this case would be (0.44*24+0.53*28)/0.97 =26.1856 [ch956]s;
    • in which case we get Bytes/s of 4/26.1856e-6 =152756, which is closer to the measured rate of 151486 Bytes/s.
Well, as this behavior is truly puzzling to me - I am thankful in advance for any comments in response to this! smiley

Cheers!

Logged

UK
Offline Offline
Sr. Member
****
Karma: 1
Posts: 491
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

I don't know much about serial comms but I believe sending an 8 bit byte actually requires something like 10 signal transitions worth of time. A serial speed of 9600 is actually 9600 baud (signal transitions per second) not 9600 bits per second. So if an 8 bit byte takes 10 signal transitions, 9600 baud gives you 960 bytes per second with is exactly what you got (give or take).

I don't know why the higher baud rates deviate from this.

Andrew
Logged

0
Offline Offline
Faraday Member
**
Karma: 13
Posts: 2857
ruggedcircuits.com
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I think there are two things that explain your observations. First, the time it takes for Serial.write to execute should be factored in, or your code should bypass it and write right to the serial port for minimum overhead. At higher baud rates, the fixed overhead of Serial.write becomes a larger and larger percentage of the total byte transmission time. At 2000000bps a byte takes 5us to transmit, or 80 AVR clock cycles at 16 MHz. Clearly, the code execution time of writing to the serial port is going to eat into that as a non-trivial percentage.

Second, as the previous poster mentioned, sending an 8-bit byte at 9600 bps actually requires 10 bits (overhead is start bit and stop bit) thus the most you could get is 960Bps rather than your expected 1200Bps, or 80% efficiency right off the bat due to serial protocol overhead. Your measurement of 961.538Bps is bang-on within reasonable measurement errors.

--
Check out our new shield: http://www.ruggedcircuits.com/html/gadget_shield.html
Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi guys,

Thanks for your responses!

Quote
A serial speed of 9600 is actually 9600 baud (signal transitions per second) not 9600 bits per second.
Thanks so much for the correction - I *always* make this mistake, of thinking of Baud as byte/s (and that is because I usually think of a byte as a 'symbol' - not a bit [or rather, a 'signal transition']).

Quote
I believe sending an 8 bit byte actually requires something like 10 signal transitions worth of time.
Again, thanks for the clarification - while I have observed 8-N-1 previously on an oscilloscope, I had completely forgotten to relate that here - probably because I thought that since USB signaling is different from RS-232, the USB/serial driver in any case has to emulate a serial device; and so I thought this emulation would simply honor the baud rate on a 'byte' level, that is... ehm... in bytes/sec smiley

So, taking that the USB/serial driver actually does the serial emulation on a 'bit' level, then - indeed:
Quote
So if an 8 bit byte takes 10 signal transitions, 9600 baud gives you 960 bytes per second with is exactly what you got (give or take).

So, with this correction, the previous table becomes:



set speed baudexpected B/smeasured B/sefficiency deltadelta rate B/s
9,600960961100.13% 4160 [ch956]s962
115,20011,52011,762102.10% 340 [ch956]s11,765
2,000,000200,000151,48675.74% 26.18 [ch956]s152,756
[/tt]

And it can be easily seen that:
Quote
the most you could get is 960Bps rather than your expected 1200Bps, or 80% efficiency right off the bat due to serial protocol overhead. Your measurement of 961.538Bps is bang-on within reasonable measurement errors.

Then only the higher baud rate remains to be clarified:
Quote
... the time it takes for Serial.write to execute should be factored in, or your code should bypass it and write right to the serial port for minimum overhead. At higher baud rates, the fixed overhead of Serial.write becomes a larger and larger percentage of the total byte transmission time.
Thanks for that - I wasn't even aware that Serial.write has a "fixed overhead"; that certainly explains a lot. I'll try to look into it further...

Well, thanks again for the responses - they helped a lot,
Cheers !


Logged

Huntsville, Alabama, USA
Offline Offline
Sr. Member
****
Karma: 2
Posts: 327
Arduino rocks
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

smd,

That's a nice experiment you came up with.

It's moot whether the USB protocol requires all 10 bits or not.  On USB Arduino boards, the UART on the AVR chips clocks in/out 10 bits for every byte.  The UART pins are connected to the FT232 chip.  So, no matter how many bits the FT232 sends over the USB wire, it's going to take 10 bit times for each byte to get to or from the FT232.

To be precise, "baud" is really only meaningful when you are talking about modems, although it's colloquially used as a synonm for bits per second.  For our purposes here, 9600 baud _IS_ 9600 bits per second.  It's just that we're talking about 10-bit words, not 8-bit words.

Regards,

-Mike
« Last Edit: August 12, 2010, 09:44:08 am by mfm9 » Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi everyone,

Just wanted to say - thanks again for the good pointers; I finally managed to get the following stats for 2000000 baud:


set speed baudexpected B/smeasured B/sefficiency deltadelta rate B/s
2,000,000200,000199,77699.89% 20.0168 [ch956]s199,832
[/tt]

I'd say, this is as good as it gets (without getting into the 'magic clock frequency' issue mentioned in [TUT] [SOFT] Using the USART - Serial communications :: AVR Freaks - especially since this measurement falls within the acceptable +/- 2% mentioned in the same topic.)

Note that the value for delta is calculated as '(20*97.21+24*1.49+16*1.07)/(97.21+1.49+1.07)', as those were the statistics (the three top ones, anyway) for the delta values I got for that particular measurement.

Here is the Arduino sketch used:
Code:
static const long SERSPEED=2000000;

static unsigned long ts_value = 0;
static uint8_t* tsval_byte;

//http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1231326297/0
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

//http://arduino.ghearing.com/
void serialBeginU2X(long baud) {
      // enable U2X mode
      UCSR0A |= 1 << U2X0;
 
      // set the baud rate
      int ubbr = ( F_CPU / 4 / baud - 1 ) / 2;
      UBRR0H = ubbr >> 8;
      UBRR0L = ubbr;
 
      // enable the serial pins
      sbi(UCSR0B, RXEN0);
      sbi(UCSR0B, TXEN0);
 
      // enable the serial interrupt
      sbi(UCSR0B, RXCIE0);
}

void setup()
{
  serialBeginU2X(SERSPEED);            // Serial.begin(SERSPEED);  
  tsval_byte = (uint8_t *) &ts_value;
  Serial.println("xSTART:");
}

void loop()
{
  ts_value = micros();
  
  // Serial.write((uint8_t *)&ts_value, 4);
  
  // trying only one write at a time, to avoid while check
  // not much difference: 172476, 173420 Bytes/s
//  Serial.write(tsval_byte[0]);
//  Serial.write(tsval_byte[1]);
//  Serial.write(tsval_byte[2]);
//  Serial.write(tsval_byte[3]);  

  // with inline while...
  // use direct defines as in:
  //  Arduino Forum - Porting HardwareSerial to C -
  //  http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1248653427
  // ... getting 199377, 199776 Bytes/s with this
  
  while (!((UCSR0A) & (1 << UDRE0)))
    ;
  UDR0 = tsval_byte[0];  
  while (!((UCSR0A) & (1 << UDRE0)))
    ;
  UDR0 = tsval_byte[1];  
  while (!((UCSR0A) & (1 << UDRE0)))
    ;
  UDR0 = tsval_byte[2];  
  while (!((UCSR0A) & (1 << UDRE0)))
    ;
  UDR0 = tsval_byte[3];
}

I simply looked through:
Code:
// ./hardware/arduino/cores/arduino/Print.cpp:

void Print::write(const uint8_t *buffer, size_t size)
{
  while (size--)
    write(*buffer++);
}

// ./hardware/arduino/cores/arduino/HardwareSerial.cpp:

HardwareSerial::HardwareSerial(ring_buffer *rx_buffer,
  volatile uint8_t *ubrrh, volatile uint8_t *ubrrl,
  volatile uint8_t *ucsra, volatile uint8_t *ucsrb,
  volatile uint8_t *udr,
  uint8_t rxen, uint8_t txen, uint8_t rxcie, uint8_t udre, uint8_t u2x)
...

// shift 1 for UDRE(5) places left (so we get 00100000)
// check that bit in *_ucsra by &ing - if it is 0, (then !0=1), loop emptily
// so - wait until 6th LSbit becomes 1
//   and only then write a character to *_udr..
void HardwareSerial::write(uint8_t c)
{
  while (!((*_ucsra) & (1 << _udre)))
    ;

  *_udr = c;
}
And realized one overhead could be the "while (size--)", so I tried to avoid it by calling Serial.write explicitly with a single byte argument - that did not gain much.

Then, I thought I'd get rid of the Serial.write function call altogether and simply repeat the loop it performs four times - and that did, indeed, work smiley

Mike - thanks for the kind words! smiley

Quote
To be precise, "baud" is really only meaningful when you are talking about modems, although it's colloquially used as a synonm for bits per second.  For our purposes here, 9600 baud _IS_ 9600 bits per second.  It's just that we're talking about 10-bit words, not 8-bit words.
Thanks for this formulation - it is far easier for me to think in that way, to be honest.

Quote
It's moot whether the USB protocol requires all 10 bits or not.  On USB Arduino boards, the UART on the AVR chips clocks in/out 10 bits for every byte.  The UART pins are connected to the FT232 chip.  So, no matter how many bits the FT232 sends over the USB wire, it's going to take 10 bit times for each byte to get to or from the FT232.
And since I don't actually know how the USB protocol is really on wire ( as opposed to RS232 ) - it is most helpful to think of the AVR-to-FT232 serial connection, as the bottleneck that determines overall transfer speed from Arduino to PC.

Thanks again for all the responses, everyone!

Cheers!


Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi all, me again,

Well, I hope that this post will not go unnoticed; I am trying to test the, I believe, last stage in these experiments - which would be full-duplex throughput, and I'm getting dropped bytes @ 2000000 baud - so I hope to get some comments on that.

Here is what I have been doing. First of all, my concept in measuring throughput for full-duplex is that the Arduino is programmed simply to write back (echo) any byte it has received from serial. Then, from the PC side, we could initiate sending a file to serial - and at the same time, we could read the response from serial and save it into a logfile. We could then measure how much time this operation took, and given that the number of bytes sent in each direction would be the size of the original file sent - we can then determine the throughput rate.

There are a couple of problems here. First, in this case we need to both read and write to the serial port "at the same time", and so, using just plain Linux tools (like cat) in bash (like in the first examples above) gets to be a bit difficult. I thought that "process substitution" in bash could help (since it can, sort of, "parallelize" running of processes), but as far as I can see, there is simply no way to control which process starts first, when using substitution. I have posted about this on "Serial port loopback/duplex test, in Bash or C? (process substitution) - Stack Overflow", and in the process, put together a small program, writeread.c. The source of writeread.c is pasted on the StackOverflow post, and I am using the "threaded" version to run the tests (note that writeread.c takes care of setting the proper serial port speed, and setting the serial port in "raw" (character) mode).

The second thing is, that using the standard Serial commands on the Arduino side, will again be problematic (at high speeds in particular), because of the time (computational overhead) it takes for function calls; so Arduino .pde code like this:
Code:
void loop()
{
    // read in a byte, and send it back immediately
  
        // send data only when you receive data:
      while (Serial.available() > 0) {
            // read the incoming byte:
            _c = Serial.read();

            // say what you got:
            Serial.write(_c);
      }
        
      // must have delay for 'emulino' to work
      //delay(1);
}
will, in essence, not work: when used at 115200 baud, writeread.c on the PC side will read a couple of hundred bytes and then freeze; at 2000000 baud, writeread.c most often fails with "SERIAL EAGAIN ERROR" (althrough sometimes it can also read a couple of hundred bytes, and then freeze like in the 115200 case).

So, like previously, I thought of getting rid of all "Serial" function calls altogether; and "Serial.write" was already done in a previous post. For "Serial.read" - if we look through the source of /arduino-0018/hardware/arduino/cores/arduino/HardwareSerial.cpp, we can see that basically reception of bytes from serial happens through a callback/interrupt; i.e. each time a byte arrives, the interrupt is triggered, which stores the byte in an array (in a ring_buffer struct); afterwards, when we call "Serial.read", we simply obtain this byte which is stored in this intermediate "ring_buffer" array.

Well, in short - it seems the only way I can get "loopback" code on the Arduino which is more sturdy in respect to higher speeds, is to put the code for "Serial.write" directly in the interrupt handler - and avoid the use of intermediate "ring_buffer" array altogether - and below is the .pde sketch code:
Code:
/*
duplexTstamp - b2b - profiler
// sdaau 2010

b2b - byte by byte
read incoming byte;
then send it back
*/


static const long SERSPEED=2000000;
static unsigned char _c;

int ledPin = 13;  

//http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1231326297/0
#define sbi(sfr, bit) (_SFR_BYTE(sfr) |= _BV(bit))

//http://arduino.ghearing.com/
void serialBeginU2X(long baud) {
      // enable U2X mode
      UCSR0A |= 1 << U2X0;
 
      // set the baud rate
      int ubbr = ( F_CPU / 4 / baud - 1 ) / 2;
      UBRR0H = ubbr >> 8;
      UBRR0L = ubbr;
 
      // enable the serial pins
      sbi(UCSR0B, RXEN0);
      sbi(UCSR0B, TXEN0);
 
      // enable the serial interrupt
      sbi(UCSR0B, RXCIE0);
}


// ./arduino-0018/hardware/arduino/cores/arduino/HardwareSerial.cpp
#if defined(__AVR_ATmega8__)
SIGNAL(SIG_UART_RECV)
#else
SIGNAL(USART_RX_vect)
#endif
{
#if defined(__AVR_ATmega8__)
  unsigned char c = UDR;
#else
  unsigned char c = UDR0;
#endif
      // skip everything relate to 'ring_buffer' here
      // and simply write back to serial the byte (c)
      // that was just received:
      while (!((UCSR0A) & (1 << UDRE0)))
        ;
      UDR0 = c;  
}


void setup()
{
  //Serial.begin(SERSPEED);  // cannot use .begin here
  serialBeginU2X(SERSPEED);

  pinMode(ledPin, OUTPUT);   // for debug purposes
}

void loop()
{

  // usually nothing needs to happen in the loop,
  // so usually the content here is just ';'
  ;

  // however, for debug purposes,
  //   here we make a small timer:
  // each 10 seconds, the LED on pin13 wil blink,
  // and at the same time, a single byte
  // will be written to serial
  
  digitalWrite(ledPin, HIGH);
  
  while (!((UCSR0A) & (1 << UDRE0)))
      ;
  UDR0 = 65;
  
  delay(100);
  digitalWrite(ledPin, LOW);
  delay(9900);
  ;
}


Note there is a slight addition above for debugging - so one has to be careful to run the tests in the time period between two LED pulses (which also send a single character via serial) smiley-wink  

Now, with this code, at 2000000 baud, I get (almost) no problems doing a duplex text using a short file (below, it is writeread.c source itself, 6633 bytes in size):
Code:
$ ./writeread /dev/ttyUSB0 2000000 writeread.c 3>myout.txt
stdalt opened; Alternative file descriptor: 3
Opening port /dev/ttyUSB0;
Got speed 2000000 (4107/0x100b);
Got file/string 'writeread.c'; opened as file (6633).
write_thread_function spawned
   write: 6633 - 6633
   read: 1188
   read: 2382
   read: 3579
   read: 4994
   read: 5388
   read: 6578
   read: 6633

+++DONE+++
Wrote: 6633 bytes; Read: 6633 bytes; Total: 13266 bytes.
Start: 1284541725 s 954277 us; End: 1284541725 s 989516 us; Delta: 0 s 35239 us.
2000000 baud for 8N1 is 200000 Bps (bytes/sec).
Measured: write 188228.95 Bps (94.11%), read 188228.95 Bps (94.11%), total 376457.90 Bps.
... although, sometimes even short files can cause a freeze, or a "SERIAL EAGAIN ERROR". Also, the above example shows measurement of 94% of the expected speed - note that it can get as low as 70%.

Now, if sending a larger file, it almost always blocks - and I have noticed that usually it blocks a couple of bytes before reaching the full file size; therefore the 10-second 'pulses' that send a single byte to serial - they end up completing this transfer so writeread.c finishes, and then we can compare the sent and the received files. Below is example of sending the writeread executable, 25072 bytes in size:
Code:
$ ./writeread /dev/ttyUSB0 2000000 writeread 3>myout.txt
stdalt opened; Alternative file descriptor: 3
Opening port /dev/ttyUSB0;
Got speed 2000000 (4107/0x100b);
Got file/string 'writeread'; opened as file (25072).
write_thread_function spawned
   write: 25072 - 25072
   read: 799
   read: 1599
   read: 2400
   read: 3199
   read: 3999
...
   read: 24185
   read: 24981
   read: 25071      <= NOTICE, here it paused, 1 byte before reaching file size
   read: 25072      <= here the 10 second pulse kicks in, and completes the transfer

+++DONE+++
Wrote: 25072 bytes; Read: 25072 bytes; Total: 50144 bytes.
Start: 1284542153 s -214842 us; End: 1284542154 s 81605 us; Delta: 1 s 296447 us.
2000000 baud for 8N1 is 200000 Bps (bytes/sec).
Measured: write 19339.01 Bps (9.67%), read 19339.01 Bps (9.67%), total 38678.02 Bps.

$ diff -Naur <(hexdump -v -e "1/1 \" %02x\n\"" writeread) <(hexdump -v -e "1/1 \" %02x\n\"" myout.txt)
--- /dev/fd/63      2010-09-15 11:18:34.421458885 +0200
+++ /dev/fd/62      2010-09-15 11:18:34.421458885 +0200
@@ -12033,7 +12033,6 @@
  00
  00
  00
- 00            <= missing byte
  ff
  ff
  ff
@@ -25070,3 +25069,4 @@
  69
  74
  00
+ 41            <= 0x41 = 65 dec; the byte sent by the 10 second pulse

Here I have used hexdump to convert bytes of binary files into an ascii representation where each byte occupies one line - and then use diff (can use meld as well for visual) to detect single byte deletions/insertions (see Is there a Linux application that can compare two binary files? - Ubuntu Forums).

So, from the diff, we can see that the original file (writeread) has a byte 0x00 located at approx 12033 bytes in, which the read back file (myout.txt) doesn't (additionally, myout.txt has a 0x41 byte at the end which writeread doesn't have - obviously, this is the byte sent by the 10-second pulse; and as we needed to wait for this, it also messes up the time measurements, as expected).

(Continues with questions in next post - bloody 9500 chars per post limit...  >smiley-sad)
« Last Edit: September 15, 2010, 04:51:10 am by smd » Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

(... continued from previous post ...)

In other words - a single byte was dropped from the incoming stream; this could be the PC failing to write it (not very likely, I guess) - or the Arduino failing to write back this byte, when it received it. And the only reason I can imagine for the Arduino dropping the byte, is that another call to the interrupt kicked in, before the previous interrupt managed to finish writing to the port ...


So this is where my question is - what could be the cause of this byte dropping, and would it be possible to fix it?

In other words - did we come to a sort of a fundamental limit of the Arduino as a system (i.e. at 2000000 baud, we can *always* expect a byte dropped, say, each 30000 bytes); or is there anything we could do to expect a stream without drops with programming (either the PC or the Arduino side)?

Well, I must say thanks again to all that have helped in this thread - and I really hope to hear some comments on this full-duplex byte-drop issue..
Cheers!
Logged

0
Offline Offline
Faraday Member
**
Karma: 13
Posts: 2857
ruggedcircuits.com
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I would be lying if I said I read your whole article, but as soon as you said 2000000 baud I thought of a limitation with the FT232RL. It has an internal buffer of 256 bytes for receive and 128 bytes for transmit. The way USB works is every 1ms the PC asks the FT232RL if it has any data for it -- the FT232 can never initiate a transmission on its own but must wait for this request.

So how many bytes can queue up in its buffer in 1 ms? At 2e6 bits/second that's 2e5 bytes per second or 2e2=200 bytes per 1ms interval. So you might be running up against the 128 byte internal buffer limit for transmission.

--
The Gadget Shield: accelerometer, RGB LED, IR transmit/receive, light sensor, potentiometers, pushbuttons
Logged

Offline Offline
Edison Member
*
Karma: 3
Posts: 1001
Arduino rocks
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
So how many bytes can queue up in its buffer in 1 ms? At 2e6 bits/second that's 2e5 bytes per second or 2e2=200 bytes per 1ms interval. So you might be running up against the 128 byte internal buffer limit for transmission.
Not likely. The FTDI design specification is for 3Mbit sustained. There may however be driver issues/limitations on the PC side.
Logged

Global Moderator
Dallas
Online Online
Shannon Member
*****
Karma: 178
Posts: 12291
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset


Turn off Timer 0 (millis will stop running) and rerun the test.  Any change?
Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi all,

Thanks so much for your responses!

Quote
.. the FT232RL. It has an internal buffer of 256 bytes for receive and 128 bytes for transmit. The way USB works is every 1ms the PC asks the FT232RL if it has any data for it -- the FT232 can never initiate a transmission on its own but must wait for this request.

Thanks for mentioning this! I was aware of the difference between the sizes of the receive and transmit internal buffers of the FT232, but hadn't thought of connecting it to this problem. Even more, this is extremely confusing to me - what is the benefit of having the receive and transmit buffer in different sizes? If there was place on chip, couldn't they have split this into equal TX and RX buffers?! (come to think of it, maybe that is the reason - maybe there's no actual benefit, but there was just available place on chip, just for the extra bytes for RX buffer?)

Quote
So how many bytes can queue up in its buffer in 1 ms? At 2e6 bits/second that's 2e5 bytes per second or 2e2=200 bytes per 1ms interval. So you might be running up against the 128 byte internal buffer limit for transmission.

Thanks - a good calculation to have in mind; and also puts my confusion with the differing RX / TX buffer sizes in context. However, it also assumes that the FT232 is the 'weak' link here - and instead, it turns out that it is the Atmega328 eating the byte (see further in the post). The difficulty for me, however, is that I cannot see any regularity/periodicity in the drops that I experience - if it was a buffer overflow issue, I guess some repetative pattern would be observed?

Quote
Not likely. The FTDI design specification is for 3Mbit sustained. There may however be driver issues/limitations on the PC side.

If I ever knew the specs go to 3Mbit, I certainly have forgotten it until I read this - thanks! smiley And now, I certainly believe the FT232 is capable of 2Mbit/s sustained - and I think I can also confirm that the problem is not the PC side (see further in the post).

Quote
Turn off Timer 0 (millis will stop running) and rerun the test.  Any change?

Thanks for this suggestion! Just to make sure that I properly understand it: for 2MBPs, I am not using the code in the  OP which calls 'micros()'; I am instead using the code in Reply #6 (I've even removed everything in the 'loop()', except for a single ';').

However, from the way you phrase it, I guess I should understand that "Timer 0" actually runs all the time in the background (possibly providing the counter increases for 'millis()') - regardless of whether I actually call 'millis()' or not? It's certainly worth giving a try - I'll post back about the results.

Well, in any case, in the meantime I tried to get a bit better understanding of the problem - and I was lucky to be able to get my hands on an old Agilent 54621A scope, and capture RX and TX traces of both working and non-working transmission; which will be outlined in the next post.

(Continues...)
Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

(...Continued:)

Well, as mentioned, I used an Agilent 54621A to acquire captures of RX and TX signals during both a succesful and non-succesful transmission of 6633 bytes. Now, I have definitely experienced drops even for short ammount of bytes, so my previous assertion "I get (almost) no problems doing a duplex text using a short file" doesn't hold (luckily, I must add - coz I'm not sure I'd have had enough memory on the scope for a longer capture).

For the interested, I have posted captured data (see serial_loopback_channel_data.tar.bz2) and screenshots, discussed in this post, here:
Hope you won't mind I document a couple of n00b notes: since I want to observe a signal with transitions at 2MHz, by Nyquist I need at least twice that rate in my sampling equipment (if we're talking of a sinusoid; for square signals we'd have to take into account at least up to third harmonic, I guess); although I've heard professionals recommend at least 3x the bandwidth, or "the sample rate should be at least four times the highest frequency component of the waveform." (Agilent Oscilloscopes User Manual). Luckily, the 54621A claims 60 MHz bandwidth, which should be quite enough; unfortunately, I'm not sure there is a cheaper method to do this kind of measurement.

In any case, thanks to the lovely, open-source agiload, I can "fetch data and screenshots from Agilent 5462x oscilloscopes", so here I'd like to post some of these results. Note that the 54621A supports only 9600 bps transmit speed for its captures; it also adjusts how many samples it will take depending on the span shown based on the current time/div. For me, I needed a capture of 50 ms, and 54621A could sample two channels at 25MHz each. In the end, ironically, I end up with some 1.2 million points per channel, representing a span of 50 ms - and it takes 45 minutes for each of them (1.5 hours for both) to be downloaded from the scope to the PC over a 9600 bps serial smiley

Additionally, as mentioned previously, I'm using the Arduino code in Reply #6 - having removed everything in the 'loop()', except for a single ';'. So, that means that the PC side must be able to know when to stop waiting for data - so I modified writeread.c to send (an) extra byte(s) after 1ms, whenever it detects it has collected one or two bytes less than the size of the file sent originally; this is writeread_bonus.c. Measurements were taken using two probes, sat into the RX and TX pin of the Arduino, respectively, through a 5cm piece of wire; for most captures, trigerring was set to occur on negative transition across 2.5V level on the RX signal (as it always starts first in the loopback process).

Also, I have captured in analog mode, simply because I don't know any better smiley (like, how to work with digital mode (if any) on this scope) - analog may be a bit of overkill, but at least it is fun to observe a bit of 'crosstalk' between TX and RX (although, that may as well be an effect of the probes proximity). Also, in most of the scope screenshots below, RX is channel 1 rendered on top - while TX is channel 2 on bottom.

Well, there are a couple of things that can be said from first measurements; first - the RX signal seems to be somewhat 'lowpassed' in respect to the TX [that is, RX (on left) has somewhat smoother edges than TX (on right) on image below]:



At first I thought something was wrong with my probes, but the image above was obtained with the same probe, going into each pin socket directly. Maybe this has to do with differences in copper trace lengths for the RX and TX, up to the pin sockets - or with some capacitors along the way, I cannot tell. In any case, this doesn't seem to be a problem.

Second thing - there is jitter. Hardly a surprise, but its somewhat different when you observe it for the first time smiley-wink For your viewing pleasure, I have captured the start of several fully working transmissions, and assembled them using imagemagick in an animated gif:



So, we can see, that even with this level of jitter, data seems to be able to be fed back without problems from PC to Arduino and back to PC. Additionally, to - sort of - quantify the jitter, I took a few captures in the scope's persistant mode, which reveals a bit about the size of the jitter (some 0.26 [ch956]s):



I think it's also important to note, that in the above images, RX (the 'lowpassed' signal from the FT232) actually seems "rock" stable (even in a persistent capture) - it is, in fact, the TX, which is under the control of the Atmega328, that is jitterin' ! ( smiley )

In addition, note that in the animation above, one of the first frames doesn't seem right - it is as if TX doesn't follow RX bit by bit (notice how jitter-13.png's RX ends at a high level, unlike jitter-12.png):



As we'll see later on, this could just mean that the TX is 'taking a break', but not necesarilly a "dropped" byte (again, all these captures for the animaton were kept only for transfers that were succesful).

Here is also a zoomed out, family photo of all 6633 bytes in the (succesfully transmitted) transfer packet, in all their glory:




And here a zoomed out capture of the unsuccesfully transmitted sequence - notice the single byte popping up about 1ms (actually, 1.8ms) later, sent by writeread_bonus.c:



Finally, one more thing I was interested in, is how much time does it take from the first incoming serial RX transition to the first outgoing TX transition (which would be the time it takes to raise the interrupt and copy the byte from RX to TX on the ATmega328 side):



and the answer to that, seems to be - around 6.940 [ch956]s.

However, this doesn't necesarilly reveal what happened "on wire" in the unsuccesful case, as opposed to the succesful case. For that, I needed to perform some analysis of the high-res data capture of each transfer - which is the topic of post below.

(Continues...)

« Last Edit: September 24, 2010, 08:28:43 pm by smd » Logged

0
Offline Offline
Jr. Member
**
Karma: 0
Posts: 55
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

(...Continued:)

The first thing I did when I managed to capture a failed transmission (which had a single byte missing), was to take a diff between the original and the received file:
Code:
$ diff -Naur <(hexdump -v -e "1/1 \" %02x\n\"" writeread.c) <(hexdump -v -e "1/1 \" %02x\n\"" myout.txt)
--- /dev/fd/63      2010-09-19 16:22:27.657840947 +0200
+++ /dev/fd/62      2010-09-19 16:22:27.657840947 +0200
@@ -5649,7 +5649,6 @@
  74            116,      't'
  61            97,       'a'
  2e            46,      '.'
- 63            99,      'c'
  6f            111,      'o'
  6d            109,      'm'
  6d            109      'm'
@@ -6631,3 +6630,4 @@
  0a
  7d
  0a
+ 58
This tells me that the received file has a byte 'c' missing at offset 5649, between byte strings 'ta.' and 'omm'. The interesting thing now, would be to try and analyse this particular location in the captured data.

As mentioned, agiload can download high-res captures from the scope as ASCII data. Each (succesful or failed) transfer, produced a capture of both TX and RX signals across a 50ms span, sampled at 25MHz; which results with two ASCII data files - for each signal - around 9MB in size each.

This asks the question - what sort of software can be used to analyse this kind of data? I looked into some open source tools, and thought the effort would be worth a writeup - and you can read about my initial observations in Waveform Viewers - Plotting Large Analog Data - Ubuntu Wiki.

In brief, what worked for me was to use python with matplotlib, and write a sort of a serial decoder for mwfview.py - you can find this software here:
The path to the data files should be written directly inside the script, and then you can call it with python mwfview-ser.py. This script will load the RX and TX channel ASCII data of a given loopback transmission, and run a serial analysis, which will: mark start (green) and stop (red) bits, decode each byte, and label the start of each byte with its value. Then it is relatively easy to navigate to start of each decoded byte with the mwfview interface.

And, here are the results - for the succesful transmission, we can see the bytes happily passing along, in a sequence, after byte 5649 - as we expect them to; and no bytes are dropped:



For the unsuccesful transmission, we can see that a byte has been dropped (I have tried to indicate byte 5651, which is the last correctly transmitted byte, in both RX (top) and TX (bottom) trace):
 


... however, there is no delay whatsoever, or any visible problem on TX line! It seems that RX signal has promptly pushed the 'c' byte value into the ATMega328, and the ATMega328 simply "lost" it somehow - without otherwise loosing rhythm in its transmission sequence!!

However, this is not to say that delays or hiccups don't occur in the TX signal - they do, as the following image indicates:



... however, you can notice that the ATMega, here, simply picks up from where it left before the "break" - so this is not a case of a dropped byte! Btw, note that I used some statistics analysis of the times between consecutive start bits, in order to identify where these "pauses" occur (again, see the mwfview for more); I should note that the statistics doesn't reveal such deviations for the FT232's RX signal.

All of the above taken into account, I can conclude that:
  • The FT232 produces an RX signal towards the ATMega328, which even if somewhat 'lowpassed', is very stable in respect to jitter
  • The ATMega328 produces a TX signal, which even if it is sharp, it still jitters - although this in itself is not a cause for a drop
  • ATMega328's TX signal can occasionally "pause" in the idle state, but that is not necesarilly a cause for a drop either
  • When the ATMega328 actually does drop a byte, the traffic between the previous and the next byte seems uninterrupted - so it seems that the ATMega somehow "failed to place" the dropped byte in its TX buffer, to begin with
... which, in essence, doesn't tell me anything other, than that the ATMega is somehow at fault smiley-wink However, I'd like to have a better understanding in what way is it actually "at fault" - so I'd appreciate any comments to this!

(In the meantime, I'll try to experiment with the "Timer 0" suggestion...)

Thanks again, all, for a great discussion - and I hope to hear some more feedback ! smiley

Cheers!


« Last Edit: September 24, 2010, 08:32:40 pm by smd » Logged

Huntsville, Alabama, USA
Offline Offline
Sr. Member
****
Karma: 2
Posts: 327
Arduino rocks
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

smd,

Timer0, being a hardware timer, does not directly incur any processor overhead.  However, in the Arduino enviroment, it's set up to produce an interrupt, which is used to count milliseconds for benefit of millis().  It's the interrupt that has the potential to cause you problems.

A syntax note:  you don't need a semicolon to have an empty function.  The following is perfectly valid:
Code:
void loop()
{
}

Regards,

-Mike
Logged

Pages: [1] 2 3   Go Up
Jump to: