I probably wasn't clear enough... I'll try again.
As the code shows (please check it), the PC simply sends 8 bits of data (their meaning doesn't matter here, but if you're curious, it's described in the code I posted too), then waits for a response and stores the byte it receives:
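s.Write(New Byte() {byteTosend}, 0, 1)   ' send a single byte to the Arduino
While s.BytesToRead() <= 0               ' busy-wait until at least one byte has arrived
End While
byteRead = s.ReadByte()                  ' read the Arduino's one-byte answer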
This transmission is the blue waveform (the "pulse" is the transmission of 8 bits all set to "0" at 115200 bps), taken with the scope probe on the Arduino RX pin (RX0 for the native USB, RX3 for the USB dongle).
When the Arduino receives this data, it analyzes it (again, how it does so doesn't matter, but everything is in the sketch I posted) and answers with 8 bits of data. This is the yellow waveform at the bottom, and as you can see, it starts less than 50 us after the end of the blue one.
This exchange (PC sends 8 bits of data, Arduino answers) is repeated in a loop many times. With the USB dongle, the PC is able to send the next 8 bits after about 200 us (first graph: from the end of the first yellow "pulse" to the beginning of the second blue one there are about two divisions, so at 100 us/div that is 200 us); with the native USB there are 4 divisions at 1 ms/div, so 4 ms.
The effect is this: when I repeat the loop, say, 2000 times, it takes 1 s with the USB dongle but 20 s with the native USB.
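As a rough sanity check of these numbers, here is a small sketch (in Python, only because it is pure arithmetic) that adds up one round trip from the scope readings. It assumes 8N1 framing, i.e. 10 bits on the wire per byte, which I have not stated above, and it uses the 50 us reply delay as an upper bound. The function names are made up for this example.

```python
# Rough per-loop timing estimate from the scope readings (all values approximate).
BAUD = 115200
BITS_PER_FRAME = 10  # assumption: 8N1 framing (1 start + 8 data + 1 stop bit)


def frame_time_us(baud=BAUD, bits=BITS_PER_FRAME):
    """Time on the wire for one byte, in microseconds."""
    return bits / baud * 1e6


def loop_time_us(gap_us, reply_delay_us=50.0):
    """One round trip: PC byte + Arduino reply delay + Arduino byte
    + gap before the PC sends the next byte."""
    return 2 * frame_time_us() + reply_delay_us + gap_us


def total_time_s(gap_us, iterations=2000):
    """Estimated time for the whole loop, in seconds."""
    return iterations * loop_time_us(gap_us) / 1e6


if __name__ == "__main__":
    # Gaps read off the two scope captures: ~200 us and ~4 ms.
    for name, gap_us in (("USB dongle", 200.0), ("native USB", 4000.0)):
        print(f"{name}: {loop_time_us(gap_us):.0f} us per loop, "
              f"{total_time_s(gap_us):.2f} s for 2000 loops")
```

With these inputs the estimate is roughly 0.85 s for the USB dongle, which is close to the 1 s measured, and roughly 8.4 s for the native USB, which is well below the 20 s measured. So the scope readings only give an approximate picture, and the 4 ms gap may not be the same on every iteration.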
I'll try to post some simpler code as soon as possible; if you're not familiar with VB, please let me know which language you prefer.
Other suggestions are welcome (if something is not clear, I'll be happy to try to clarify it).