I ran a program with a few sensors first on an Uno (R2) board, then on a Micro board, and I have the feeling the Micro is not as fast as the Uno, at least when transmitting data to the host computer through the USB port. Is this normal, given that both boards have the same CPU clock speed? Does it have anything to do with the CPU type (ATmega32u4 versus ATmega328) or the way the board is connected to the USB port?
I think it's the USB: the 32U4 has to do all the USB processing along with the sketch, whereas on the Uno the USB is handled by the '16U2 in parallel with the sketch running on the '328.
Thanks CrossRoads. Do you think I would get better results with the Micro using an external FTDI USB daughterboard? Of course it increases the overall size, where the Micro alone is very small.
On the other hand, with the Micro's native USB there is no real UART involved, and the UART is normally the bottleneck.
I think the core's implementation for sending from the Arduino is simple but not optimized. With a little tweaking I expect this can be made much faster than the classic UART.
Can you tell us at what pace data becomes available, and how many bytes it is? Then with a simple sketch we could measure the performance difference between a real UART and a virtual COM port over native USB.
I don't know, amundsen; I'd have to experiment and see.
Maybe PeterHV is on to something. I don't know anything about what the bootloader/core files/serial library/whatever do to interface a sketch with the 32U4 hardware.
amundsen:
I ran a program with a few sensors first on an Uno (R2) board, then on a Micro board, and I have the feeling the Micro is not as fast as the Uno, at least when transmitting data to the host computer through the USB port.
That directly contradicts all of my experiences with the ATmega32u4 processor (including the Arduino Micro). Given the fact that the m32u4 is capable of sending data at 12 Mbps in chunks of up to 256 bytes your claim defies logic.
The problem very likely has nothing to do with the processor. Post details of your test.
I agree with Coding Badly. After all, if the USB is a bottleneck, you will reach it sooner or later. However the USB can run much faster than 115200 baud.
Not that this really matters, because sending one byte at 115200 baud takes 87 µs, which is not very long. For comparison, it takes 104 µs to make an analog reading.
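For reference, the 87 µs figure follows from the usual 8N1 framing, where each byte costs 10 bits on the wire (start bit, 8 data bits, stop bit). A minimal sketch of the arithmetic, assuming 8N1; the helper name `byte_time_us` is made up for illustration:

```ruby
# Time to shift one byte out of a UART, in microseconds.
# Assumption: 8N1 framing, i.e. 1 start bit + 8 data bits + 1 stop bit = 10 bits.
BITS_PER_BYTE_8N1 = 10

def byte_time_us(baud)
  BITS_PER_BYTE_8N1 * 1_000_000.0 / baud
end

puts byte_time_us(115_200).round(1)        # 86.8 (microseconds per byte)
puts(115_200 / BITS_PER_BYTE_8N1)          # 11520 (bytes per second at best)
```

So even a saturated 115200 baud link tops out at roughly 11.5 bytes per msec.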
I did some measurements with a (borrowed) USB analyzer. I took the ASCIITable example, changed the baud rate to 115200, compiled it with IDE 1.0.5 and uploaded it to a Duemilanove and a Leonardo. (It is a pity I don't have an Uno.)
The dataflow looks like this:
The Duemilanove sends 4974 bytes in 368 msec (13.52 KBps). About every msec, ~14 bytes get sent.
(The Duemilanove has to send more data because the FTDI protocol adds a two-byte protocol header to every chunk of data, so of the 14 bytes only 12 are useful payload.)
The Leonardo sends 4224 bytes in 151 msec (27.14 KBps). About every msec, ~30 bytes get sent.
So the native USB is indeed faster, but not spectacularly so, and nowhere near the theoretical upper bound of 12 Mbps.
Actually I expected even worse: I noticed that the code in CDC.cpp sends out one byte at a time, but fortunately the lower level puts the single bytes into the hardware FIFO. When the FIFO is full, it gets sent out as one chunk over USB.
The chunks are 64 bytes (the max packet size), not 256. No idea why I see only 30 bytes. I will do a test sending out 1 byte in a loop to avoid effects/delays imposed by the formatting code...
More importantly, the FIFO is flushed only once per frame of 1 msec. The analyzer shows that the host polls ~75 more times for new data during the rest of the frame... I think this is the first place to look for improvements.
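As a rough check of what one flush per frame would imply: if at most one packet goes out per 1 msec frame, the packet size alone caps the throughput. The sketch below is back-of-the-envelope arithmetic only, assuming exactly one packet per frame and 1000 frames per second; the method name is hypothetical.

```ruby
# Upper bound on throughput when a fixed number of packets goes out per USB frame.
# Assumptions: full-speed USB with 1000 frames per second, one packet per frame
# unless stated otherwise.
def bytes_per_sec_ceiling(packet_bytes, packets_per_frame = 1, frames_per_sec = 1000)
  packet_bytes * packets_per_frame * frames_per_sec
end

puts bytes_per_sec_ceiling(64)   # 64000: one full 64-byte packet per frame
puts bytes_per_sec_ceiling(30)   # 30000: the ~30 bytes per frame seen on the analyzer
```

Under these assumptions, using more of the host's polls within a frame (a larger `packets_per_frame`) is what would raise the ceiling.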
I wonder how the USB serial port speed compares to the Teensy implementation, which is completely different.
Would a Teensy owner care to run the test sketch below on a Teensy?
Or can I run a sketch compiled for Teensy 2.0 on my Leonardo? Since this sketch needs only the USB controller and no other pins, I thought this should work and uploaded the hex file as is to my Leo. The sketch seems not to run; at least, the device is not enumerated...
Here is the sketch. On a Leo it prints results like this:
Leonardo, chunk size 16, sent 4096 bytes in 84 millis (46)KBpsec
Leonardo, chunk size 64, sent 4096 bytes in 74 millis (53)KBpsec
#define DATA_SIZE 4096
void setup() {
  // Initialize serial and wait for port to open:
  Serial.begin(115200);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for Leonardo only
  }
}
I think there are errors in the sketch you uploaded because it doesn't make use of the parameter chunk_size.
Doooooh! Thanks for pointing this out. I edited the post to correct the sketch.
10 bytes per msec is 10,000 per second which, I think, is about what one would expect at 115200 baud.
Sure. That is normal for an Uno; its UART is the limit. But this thread is about the difference in throughput between a classic Arduino and one with native USB (Leonardo, Micro, ... Teensy).
though I want to send and receive data.
OK, but I first want to check out the send path; the receive part (of course) follows a completely different code path.
I now have a JRuby program (using the Java Rxtx serial port stuff) which sends some bytes to the Uno which immediately sends them back.
JRuby and the Uno seem quite happy to communicate at 230 400, 500 000 and 1 000 000 baud.
However if I send (and receive) 50 bytes at a time it takes about 460 msecs to do 30 repeats.
That's 30 x 50 x 2 = 3000 bytes in 460ms or about 6500 bytes per second or 65k bits per second or 6.5% of the baudrate.
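The same arithmetic, spelled out so it can be rerun with other measurements. This is only a sketch: it assumes 10 bits on the wire per byte (8N1) and a 1,000,000 baud link, which is the rate the 6.5% figure implies; the method name `throughput` is made up.

```ruby
# Throughput and link utilisation for a measured transfer.
# Assumptions: 10 bits per byte on the wire (8N1 framing); the baud rate
# passed in is the nominal rate of the link.
def throughput(bytes, millis, baud, bits_per_byte = 10)
  bytes_per_sec = bytes * 1000.0 / millis
  bits_per_sec  = bytes_per_sec * bits_per_byte
  { bytes_per_sec: bytes_per_sec, utilisation: bits_per_sec / baud }
end

r = throughput(30 * 50 * 2, 460, 1_000_000)
puts r[:bytes_per_sec].round               # 6522 bytes per second
puts((r[:utilisation] * 100).round(1))     # 6.5 percent of the nominal baud rate
```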
Increasing the number of bytes in each parcel has increased the throughput, but I haven't yet tried more than 50 bytes at a time.
Interesting, I did not know the max baud rate is that high, but it is indeed consistent with the datasheet.
Time is easily lost if at some point in the chain the arrival of data is not noticed soon enough or not forwarded immediately.
This can be anywhere in the chain: pc -> 16u2 usb -> 16u2 uart -> 328 uart (HardwareSerial.cpp) -> sketch -> 328 uart -> 16u2 uart -> 16u2 usb -> pc
The 328P UART (HardwareSerial.cpp) part looks OK to me.
The sketch: do you use flush() when sending? Do you use write()?
I would certainly also check out the 16u2 firmware.
What transfer rates do you achieve with my sketch, at 1M baud? Even for this one directional test the chain is already rather long.
I don't have an Uno but can do tests with my Due.
Maybe post your test?
I decided to change my program a little to see how much time is consumed by the JRuby code. I did this by writing the data to a local string rather than the serial port and reading from the same string rather than from the serial port. On that basis the "sending and receiving" of 3000 bytes (1500 each way) takes 36msecs of which printing the data on-screen takes 30ms. So JRuby is not causing any significant delay.
However, the interesting thing was that, in order to keep the "real" send and receive as close as possible to the "dummy" version, I changed the JRuby code from
ck = ""
x = "z"
while (x != "Z" && x != "W")
  x = $sp.getc
  ck = ck + x
end
to
while $sp.available < 50
end
ck = $sp.read
and the time to do the 3000 bytes fell from about 500ms to about 180ms - clearly reading each byte separately is very slow compared to reading all at once. This didn't matter at lower baud rates. 3000 bytes in 180ms is about 16500 bytes/sec.
To make the above easier to follow, $sp is a global variable representing the serial port and I have suffixed the data with a Z or W which is interpreted by the Arduino as the end of one piece of data or the end of all the data, respectively.
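The two read styles can be tried without any hardware. The sketch below is only an illustration: it uses a `StringIO` as a stand-in for the serial port (`$sp` above), and the method names are made up. It shows that both approaches return the same data; the difference is the number of read calls, and on a real port each call carries its own overhead.

```ruby
require "stringio"

# Read one byte per call until a terminator ("Z" or "W") arrives.
# Returns the data and the number of read calls made.
def read_per_byte(port)
  data = String.new
  calls = 0
  loop do
    ch = port.getc
    calls += 1
    break if ch.nil?                      # stand-in port ran dry
    data << ch
    break if ch == "Z" || ch == "W"
  end
  [data, calls]
end

# Read the whole parcel with a single call.
def read_bulk(port, size)
  [port.read(size), 1]
end

parcel = ("a" * 49) + "Z"                 # 50 bytes; "Z" ends one piece of data

data1, calls1 = read_per_byte(StringIO.new(parcel))
data2, calls2 = read_bulk(StringIO.new(parcel), parcel.size)

puts data1 == data2                       # true: same bytes either way
puts calls1                               # 50 read calls
puts calls2                               # 1 read call
```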
I haven't tried your code at 1 Mbaud yet, as the Arduino serial monitor doesn't go beyond 115200.
It just occurred to me that I could test your code at higher speeds using PuTTY. That worked fine up to 500000 baud but wouldn't read all the data at 1000000 baud.
At 500000 baud the transfer rate was about 46 bytes per msec which is pretty close to the full baud rate.
I think it is safe to conclude that the transfer rate is unlikely to cause problems. The bottleneck will be the code to make sense of the transferred data.