Topic: fast SerialUSB for Arduino DUE

GaM3r2Xtreme

I am curious to see if DMA is possible with the Due. Thanks for telling us where to look in the datasheet, Graham. I plan to read through it to see if I can make any sense of it. That datasheet is over 1400 pages of a weird world to me, haha. Do you guys think it's possible to code an implementation? I don't think the hardware layout is the issue, since it looks like DMA is built into the chip. Maybe a 3rd party library can be created?

ghlawrence2000

I am absolutely certain a DMA HS implementation can be made. The Atmel application note gives the basic layout and requirements for ASF/Atmel Studio, but how to tie it in with Arduino, I have no clue.

Regards,

Graham
UTFT_SdRaw now included in library manager!! ;) High speed image drawing from SD card to UTFT displays for Mega & DUE.
UTFT_GHL - a VASTLY upgraded version of UTFT_CTE. Coming soon to a TFT near you! 8) Shipping April 1 2016!

GaM3r2Xtreme

I checked out the Atmel application notes you mentioned for the SAM3X8E and did a quick ctrl+f for the term "DMA" and these are the links that popped up for anyone curious to read up on the examples.

[PDF] Atmel AT07892: SAM3A/3U/3X/4E DMA Controller (DMAC) Driver
(file size: 284KB, 28 pages, revision B, updated: 07/2015)
This application note describes how to use the ASF driver for interfacing to the Direct Memory Access Controller on SAM4E.

[PDF] Atmel AT08642: SAM3A/3N/3S/3U/3X/4E/4N/4S/G Peripheral DMA Controller (PDC) Driver
(file size: 246KB, 19 pages, revision B, updated: 07/2015)
This application note describes how to use the ASF driver for interfacing to the Peripheral DMA controller on SAM4.



Those are the only two that popped up for the SAM, but below are what shows up for "Atmel AVR 8-bit and 32-bit Microcontrollers". They might be irrelevant, but probably a good read for information on DMA.

[PDF | ZIP] Atmel AVR1304: Using the XMEGA DMA Controller
(file size: 200KB, 11 pages, revision D, updated: 05/2013)
This application note describes the basic functionality of the XMEGA DMAC with code examples to get up and running quickly. A driver interface written in C is included as well.

[PDF | ZIP] AVR1502: Xplain Training - Direct Memory Access Controller
(file size: 22766, 10 pages, revision A, updated: 08/2011)
This Application Note will get you started with Atmel® AVR® XMEGA® Direct Memory Access Controller (DMAC). You will learn to perform simple memory transfers almost without using CPU time, and reading and writing to peripherals with hardly any CPU intervention.

[PDF | ZIP] AVR32108: Using the 32-bit AVR AP7 Peripheral Direct Memory Access controller
(file size: 57022, 8 pages, revision A, updated: 05/2006)
The 32-bit AVR AP7 has a dedicated Peripheral Direct Memory Controller (PDC). This application notes describes how to use it and includes an example of using the USART with the Peripheral DMA Controller (PDC) with and without interrupt control.

[PDF | ZIP] Atmel AVR1504: XMEGA-A1 Xplained training - XMEGA Event System
(file size: 30990, 15 pages, revision A, updated: 08/2010)
This Application Note will get you started with Atmel® AVR® XMEGA™ Event System which allows inter-peripheral communication, enabling a change of state in one peripheral to automatically trigger actions in other peripherals, without any use of interrupts or CPU and DMA resources.

[PDF | ZIP] Atmel AVR1510: XMEGA-A1 Xplained training - XMEGA USART
(file size: 26637, 10 pages, revision A, updated: 08/2010)
This Application Note will get you started with using Atmel® AVR® XMEGA™ USART (Universal Synchronous Asynchronous Receiver Transmitter) in polling mode, interrupt mode and how to use the DMA Controller to transfer data without CPU interaction.

[PDF | ZIP] Atmel AVR1514: XMEGA-A1 Xplained Training - Direct Memory Access Controller
(file size: 26324, 10 pages, revision A, updated: 07/2011)
This application note covers the basic features of the Atmel® AVR® XMEGA® Direct Memory Access Controller (DMAC). The goal for this training is to get started with simple memory transfers almost without using CPU time, and reading / writing to peripherals with hardly any CPU intervention.

[PDF | ZIP] Atmel AVR1522: XMEGA-A1 Xplained Training - XMEGA USART
(file size: 114259, 10 pages, revision A, updated: 07/2011)
The USART (Universal Synchronous Asynchronous Receiver Transmitter) is the key element in serial communications between computers, terminals and other devices. This training covers basic setup and use of the Atmel® AVR® XMEGA® USART and the three tasks will demonstrate how to use the USART In polling-mode, interrupt mode and how to use the DMAC (Direct Memory Access Controller) to transfer data without CPU interaction.



I've also found a good read about the USB standard that might help with understanding how everything works together here.

ghlawrence2000

Quote
USB in a NutShell
Making sense of the USB standard
Starting out new with USB can be quite daunting. With the USB 2.0 specification at 650 pages one could easily be put off 
No shit!!

I thought the Atmel notes were enough! :P

I do think the DMA aspect is only part of this larger picture, how the f*** do we start to integrate it into Arduino?....

I am comfortable with libraries in general, and I am moderately comfortable with C(++). What I am struggling with is the overall structure and how the Arduino modules link together. This really does need to be an Arduino team project if it is going to progress, but since they dumped the DUE, it is unlikely to happen :( .

Regards,

Graham

GaM3r2Xtreme

You mean there is no support from the Arduino team for the DUE? How is that possible when this board seems to be one of the fastest ones they have? Most people do seem to migrate to either the UNO or the Mega, but I would assume the DUE has more potential.

I'm still trying to find time to read through most of that documentation, but we might be able to tap into the same libraries Sin was in. I guess the first thing to do is find some method of using the DMA controller. Maybe lock the processor into constantly checking a memory block of data and changing a set of digital pins (maybe 8, for a byte's worth) driving LEDs?

If I understand correctly, the processor will not have to spend time looking into a buffer to see if anything was received and then storing that data into memory if there was something, which would then be read again to change those LEDs. The DMA controller would automatically take care of watching for data transfers and send that into memory while the processor is doing other things.

Collin80

Kind of. DMA allows you to tell the hardware that you've got a chunk of data somewhere and it should be transferred somewhere else, but you don't want to have to make the processor do it in 4 byte chunks like you'd normally have to do.

In some cases this isn't a big deal. Let's say you need to transfer 256 bytes of data from one place to another. The processor can do it in 4 byte chunks (32 bit) so that's 64 transfers. You can probably get it to do a transfer and loop in like 3-4 instructions. That's 256 instruction times if it takes 4 instructions. They're all on an 84MHz clock so it takes right about 3 microseconds to do the copy. Chances are you can spare the time most of the time.

Where it really shines is if things are happening at a slower rate but you want to transfer a big batch of data and not have to babysit. A classic example is analog readings. They take some time (about a microsecond for each reading), so if you want to capture 256 readings you have to periodically check back (a lot) to read one, store it, read one, store it, over and over. With DMA you can tell the ADC hardware "read these ADC channels (1, 3, 4, 6) 64 times each and store the result here with DMA" and it'll do it like magic. At the end you get a notification that you've got 256 readings waiting for you. You didn't have to poll or constantly deal with interrupts or start ADC readings each time or any of that. You just fire and forget and get an interrupt when the job is done. It's the awesomest thing ever.

Now, getting back to USB, the best use of DMA would be for streaming a lot of data out of the Due. You could say "Hey, USB hardware, here's 4096 bytes of data I need to transfer out of the USB port. Lemme know when that's done." It will then do it as quickly as the hardware and the other side allow. While this is happening you are free to do anything else in the sketch. Maybe you want to work on getting the next batch of data ready while this batch is transferring. If you have two buffers you can. In fact, DMA lets you tell it the next buffer to use while it is doing the current buffer. That way you can load one, tell it the next one, and then it'll automatically go to the next one and interrupt.

If you have three buffers you can have it send one, have one ready to go as the next buffer, and work in the third one. Then when the first one transfers it automatically loads the second. Hopefully you're done with the third so you then set the third as the next buffer to transfer. The first is open so you begin working in there. That way there is never a single moment where the DMA hardware does not have something to send. This way you can max out the hardware transfer rate. You might not get a full 480 Mbit/s (you pretty much cannot) but you can do as well as the Due is capable of doing.

For reception it's a little more weird. DMA requires a length you want to transfer, and you generally wouldn't know the length in advance. Except that the USB hardware has a 4K buffer, so things transferred to the Due end up there first. Once the USB hardware has a block of data stored in its special buffer you could use DMA to transfer it to RAM. This doesn't save a whole lot of time but it might help a bit.
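To put a number on the CPU copy estimate above (256 bytes, 4 instructions per 32-bit word, 84MHz clock), here is a minimal sketch of the arithmetic. The function name is made up for illustration, and it assumes every instruction takes exactly one clock cycle, which is a simplification rather than a measured figure.

```cpp
#include <cstdint>

// Estimated time, in microseconds, for the CPU to copy `bytes` bytes
// one 32-bit word at a time.
// Assumptions (illustrative, not measured): each instruction takes one
// clock cycle, and `instructions_per_word` covers the load, the store
// and the loop overhead.
double cpu_copy_time_us(std::uint32_t bytes,
                        std::uint32_t instructions_per_word,
                        double clock_hz) {
    const std::uint32_t words = (bytes + 3) / 4;  // round up to whole words
    const std::uint32_t instructions = words * instructions_per_word;
    return static_cast<double>(instructions) / clock_hz * 1e6;
}
```

Calling `cpu_copy_time_us(256, 4, 84e6)` gives a little over 3 microseconds, which matches the "right about 3 microseconds" figure in the post.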

Hopefully that kind of explains DMA and how it is used.
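The three-buffer rotation described above can be sketched as a small host-side model. Nothing here touches real DMA hardware: the struct and its member names are hypothetical, and the model assumes the sketch has always finished filling its buffer by the time the transfer-complete notification arrives.

```cpp
#include <cstddef>

// Host-side model of the three-buffer rotation; no real DMA involved.
// Buffers are identified by index 0, 1, 2.
struct TripleBuffer {
    std::size_t sending = 0;  // buffer the (imaginary) DMA engine is draining
    std::size_t queued  = 1;  // buffer already registered as "next"
    std::size_t filling = 2;  // buffer the sketch is writing into

    // Call from the "transfer complete" notification.
    // The hardware has already moved on to `queued`, the buffer that was
    // being filled is queued behind it, and the finished buffer becomes
    // free for the sketch to fill again.
    void on_transfer_complete() {
        const std::size_t finished = sending;
        sending = queued;
        queued  = filling;
        filling = finished;
    }
};
```

Each buffer cycles through filling, queued, and sending, so after three completions the roles are back where they started.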

GaM3r2Xtreme

Alright I think I'm starting to understand how the DMA controller works. Thanks for the explanation Collin! I do have a few questions if you have any spare time to answer these:

When you say instructions, you're talking about the assembly-level instructions the compiler produces from the C code we write. I can't remember the exact assembly terms or the correct syntax, but it would be something like:
- Something with a loop, BEGIN maybe?
- READ x, dma_buffer
- STORE ram_buffer, x
- Something to return back to the top, RETURN maybe?

From what I understand, we decide where in RAM we want the DMA controller to send the data, maybe through some set function given by Atmel in the datasheet for the microcontroller. If we have some char array, say 8 bytes long to give 8 total characters, would the interrupt be thrown once the buffer is completely filled in or read out, or could we decide to tell the controller, "HEY, I'd like you to read/write only halfway up this buffer"?

I love the idea of the interrupt being thrown once complete, but what would happen if the microcontroller is already taking care of another interrupt? I know for the UNO, this new interrupt would be lost. My guess is that for the DUE, interrupts are put into a queue, since we have to clear the fired interrupt within the function call.

Would you happen to know what the hardware buffer size limit is for the DMA controller? What if we are reading/writing a big chunk of data, say 1kB worth. Would we have a possible buffer overflow?



Again, thanks for the great explanation. It does help to understand how everything is really working internally.

ghlawrence2000

#52
Apr 26, 2016, 07:46 pm Last Edit: Apr 26, 2016, 10:10 pm by ghlawrence2000
I have to give you points for enthusiasm! Unfortunately, as I have already stated, the DMA aspect is only a small part of this issue. The larger part is how to use the HS interface in preference to the FS interface, and that is well beyond my capabilities/time available at this point. Judging by your huge oversimplification of a DMA transfer, I suspect yours too. Your comment about poking around in the same area that sined modified would also lead me to the same conclusion. What we need is for the Arduino team (who already know exactly how all of the modules gel together) to do something, as I believe I have a better grasp of what is required than you do. It is a huge undertaking, more so if you first need to trace your way through all of the existing modules! This is not just going to be a bit of a modification to a couple of already existing modules (no offence sined)!!

Don't let me put you off, and I would be one of the first to wish you well if you think you can do this, but honestly, it is going to be a serious project if you do.

Regards,

Graham

Collin80

Quote
When you say instructions, you're talking about the assembly-level instructions the compiler produces from the C code we write. I can't remember the exact assembly terms or the correct syntax, but it would be something like:
- Something with a loop, BEGIN maybe?
- READ x, dma_buffer
- STORE ram_buffer, x
- Something to return back to the top, RETURN maybe?
Yes, I was talking about assembler / machine instructions. The ARM processor has a lot of neat modes that fold lots of operations into a single instruction but it still takes several instructions to build a loop that copies data from one place to another.
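For reference, a minimal sketch of the kind of copy loop being discussed, written in C++ rather than assembler. The function name is hypothetical. A compiler for the Due's Cortex-M3 would typically turn each iteration into a load, a store, a counter update and a branch, but the exact instruction sequence depends on the compiler and optimisation settings, so treat the "3-4 instructions" figure as a rough guide.

```cpp
#include <cstddef>
#include <cstdint>

// Copy `words` 32-bit words from src to dst, one word per iteration.
// This is the work the CPU has to do itself when no DMA is used.
void copy_words(std::uint32_t* dst, const std::uint32_t* src, std::size_t words) {
    while (words--) {
        *dst++ = *src++;  // one load and one store per word
    }
}
```

Copying 256 bytes means calling this with `words` set to 64, which is where the 64 transfers in the earlier estimate come from.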

Quote
From what I understand, we decide where in RAM we want the DMA controller to send the data, maybe through some set function given by Atmel in the datasheet for the microcontroller. If we have some char array, say 8 bytes long to give 8 total characters, would the interrupt be thrown once the buffer is completely filled in or read out, or could we decide to tell the controller, "HEY, I'd like you to read/write only halfway up this buffer"?
With DMA what you do is give the hardware a starting address and a size you want to transfer. Then it does that. You can optionally give it the starting address and length of a second buffer. It'll automatically load those values into the primary location and keep going if the second buffer details are filled out. Those second details are then cleared and you either have to load it again or you're done after the second buffer. The hardware doesn't care what the actual size of the array or buffer was. You can send half of it, you can go past the buffer, you can send the middle third. It doesn't matter. All it cares about is the starting address and transfer length.
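The "current plus next" behaviour described above can be modelled in plain C++ to make the bookkeeping concrete. This is a software model only: the struct and field names are hypothetical and are not the real register names, and the model moves one byte per call where real hardware would run on its own.

```cpp
#include <cstddef>
#include <cstdint>

// Software model of a channel with "current" and "next" descriptors.
// Field names are hypothetical, not real register names.
struct DmaChannelModel {
    const std::uint8_t* ptr = nullptr;       // current start address
    std::size_t count = 0;                   // bytes left in the current transfer
    const std::uint8_t* next_ptr = nullptr;  // optional second buffer
    std::size_t next_count = 0;

    // Move one byte into `sink`; returns false when nothing is left.
    bool step(std::uint8_t& sink) {
        if (count == 0) {
            if (next_count == 0) return false;  // no second buffer: done
            ptr = next_ptr;                     // promote "next" to "current"
            count = next_count;
            next_ptr = nullptr;                 // the "next" slot is now empty
            next_count = 0;
        }
        sink = *ptr++;
        --count;
        return true;
    }
};
```

Note that the model only ever looks at a start address and a length. Pointing `ptr` at the middle of an array with a short `count` transfers just that slice, which is the "middle third" case from the post.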

Quote
I love the idea of the interrupt being thrown once complete, but what would happen if the microcontroller is already taking care of another interrupt? I know for the UNO, this new interrupt would be lost. My guess is that for the DUE, interrupts are put into a queue, since we have to clear the fired interrupt within the function call.
The DMA interrupt will wait until the current interrupt is done and then be serviced. You don't miss interrupts. You're right, if you don't clear an interrupt it can repeatedly fire forever and that's no fun.
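A toy model of the two points above: a request raised while another handler is running is remembered rather than lost, and a handler that never clears the flag keeps getting called. The struct and member names are hypothetical, and the model deliberately ignores interrupt priorities (on the Due's Cortex-M core a higher-priority interrupt can preempt a running handler instead of waiting).

```cpp
// Toy model of an interrupt request that stays asserted until the
// handler clears the peripheral's status flag. Priorities are ignored.
struct IrqModel {
    bool status_flag = false;  // set by the "hardware" when a transfer ends
    bool busy = false;         // true while some other handler is running
    int handler_runs = 0;      // how many times our handler has been entered

    void raise() { status_flag = true; }  // the event is remembered

    // One dispatch opportunity. `clear_in_handler` models a handler that
    // correctly clears the flag before returning.
    void dispatch(bool clear_in_handler) {
        if (busy || !status_flag) return;  // stays pending while busy
        ++handler_runs;
        if (clear_in_handler) status_flag = false;
    }
};
```

With `busy` set, `raise()` followed by `dispatch(true)` does nothing yet, and the handler runs once as soon as `busy` is cleared. Passing `false` instead makes the handler run on every dispatch, which is the "fire forever" case.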

Quote
Would you happen to know what the hardware buffer size limit is for the DMA controller? What if we are reading/writing a big chunk of data, say 1kB worth. Would we have a possible buffer overflow?

Again, thanks for the great explanation. It does help to understand how everything is really working internally.
I don't remember what the size limit is. But the DMA controller doesn't have buffers. It transfers from one place to another. There are specific connections to the DMA controller and you can only do transfers along those connections. One such connection is from the ADC hardware to RAM. I don't know what connections the USB hardware has, only that it does support DMA in some capacity. I guess I have to agree with ghlawrence2000 that this is a bigger, more complicated job than you might think. USB is terribly complicated and not that many people understand it. I'd love for the USB stack to use DMA but I'm not sure I really care to put the time in to understand the hardware well enough to make it a reality. So, it seems we could be stuck unless someone wants to really put some hard time in on this.

GaM3r2Xtreme

The USB protocol is definitely a hard one to understand. I have learned a little bit thus far, but nowhere near well enough to recall everything about it. All those different tokens that need to be passed before any data can be sent/received are all over the place! Not only that, but then the syncing, the error checking, the handshakes...

I agree with both of you 100% that this is not an easy task by far. Ghlawrence and Collin, I have to give you guys some credit for knowing how all this works and sharing it with me. Most of what was mentioned in the thread is all new to me, mainly the DMA. Still, it sparks that interest and desire to learn more about it, seeing as it all could be used for future projects. And heck, the harder something is to learn, the more likely it is to stick around as knowledge.

From what I have grasped, the USB protocol sends/receives chunks of data to/from dedicated USB hardware buffers (I believe DPRAM?), and the DMAC then kicks in to move that data to/from RAM, with the processor telling the DMAC what to do next. That's a start!

On an off-topic note, there has to be separation between UART hardware and USB hardware, right? I ask this because I did a little digging through all the different communication protocols to find which allowed for a quicker bit rate before stumbling across this topic.

ghlawrence2000

I am really happy you wish to learn, and seemingly are doing quite well  ;D. If you have read some of the articles I have linked to here, you will have seen how DMA can link with certain USB functions. This is a very complicated subject. I found another series of articles earlier, again from Atmel/ASF, and it appears the Arduino team actually removed significant functionality from the Arduino core... :( I think this is going to run for some time... Nevertheless, all input and ideas are welcome!! Keep up the good work!


Regards,

Graham

ghlawrence2000

Just an interesting factoid to give you some inspiration, as if you needed any! ;) I read somewhere yesterday during my browsings on the subject that although 480 Mbit/s is the limit, approx 38MB/s is achievable on a DUE... But there was no information to substantiate the claim, nor any software to do so!

One thing I am not sure of, and I have not been able to find an answer, even if we (us/you/arduino team) can produce something workable, will the Windows usbser.sys driver work in HS mode? That has got to be a good start point, as I would have NO IDEA how to start to write a Windows driver....

 Regards,

Graham

Collin80

As far as I know it already is working in HS mode. The electrical hook up on a Due is specifically done to allow for HS mode. The pipe buffer size is set to a larger size than you can do in non-HS mode, and when I tried to use a Due-compatible board through a USB isolator that doesn't support HS mode, it killed the connection and locked up the board. I'm pretty sure I saw somewhere deep in the bowels of the USB code a call to enable HS mode and disable low speed mode.

So, as far as I know it's already in HS mode if at all possible (obviously if you hook a Due up to a USB 1.1 port it would not be in HS mode).

38MB/s is about 304 Mbit/s, so that's a little lower than the USB maximum but not by too much. It should be obvious to pretty much everyone that an 84MHz processor is unlikely to be able to sustain 38MB/s unless you use DMA or spend every single processor cycle moving data to the USB buffers.
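The arithmetic behind those two statements, as a minimal sketch. The function names are made up for illustration, and the rates are the figures quoted in this thread, not measurements.

```cpp
// Convert megabytes per second to megabits per second (1 byte = 8 bits).
double mbytes_to_mbits(double mbytes_per_s) {
    return mbytes_per_s * 8.0;
}

// CPU clock cycles available for each byte moved at a given data rate.
double cycles_per_byte(double clock_hz, double bytes_per_s) {
    return clock_hz / bytes_per_s;
}
```

`mbytes_to_mbits(38.0)` gives 304, and `cycles_per_byte(84e6, 38e6)` comes out at a little over 2 cycles per byte, which is why a CPU-driven copy loop has essentially no headroom at that rate.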
