SAMD21 / Zero PWM with DMA for DSHOT

I’m looking for help getting DSHOT set up on a SAMD21 microcontroller, Arduino Zero clone. DSHOT is a brushless motor control protocol and uses PWM at fixed frequency and a variable duty cycle of either 1/3 to 2/3 to signify 0s and 1s. Here I’m using a DSHOT test frequency of 200KHz as pulses line up clearly on a scope. There are 16 bits in a DSHOT packet but in the following I’m using 7 data bits to make images clearer.

I believe that the way to do this is to use a lookup table of duty cycles with DMA, as described by MartinL in:
https://forum.arduino.cc/t/ideas-for-pwm-with-repeating-pattern/847923

This page has been a great starting point. I’ve taken this as a basis entering a DMA test lookup table of 1/3 and 2/3 PWM duty cycles as:

[ 2/3 1/3 2/3 2/3 1/3 1/3 1/3 0 0 0]

Note that I’ve padded the lookup table to 10 elements with 0 duty cycle pulses so that one DSHOT packet can be discriminated from the next. Eventually I’ll need to stop and restart the PWM to accomplish this in a more versatile way (using a timer as I think interrupts are unreliable at 200kHz). This is implemented in the code:

//  Set up DSHOT on SAMD21 using DMA with PWM at 300000Hz                                                                                                                                              
#define FREQUENCYDSHOT 200000  // Frequency of DSHOT PWM                                                                                                                                               

// Time PWM was started and stopped                                                                                                                                                                    
uint32_t TStart=0, TStop=0;

volatile uint16_t LookUp[10]; // DMA PWM lookup table                                                                                                                                                  

typedef struct                                                                // DMAC descriptor structure                                                                                             
{
  uint16_t btctrl;
  uint16_t btcnt;
  uint32_t srcaddr;
  uint32_t dstaddr;
  uint32_t descaddr;
} dmacdescriptor ;

volatile dmacdescriptor wrb[12] __attribute__ ((aligned (16)));               // Write-back DMAC descriptors                                                                                           
dmacdescriptor descriptor_section[12] __attribute__ ((aligned (16)));         // DMAC channel descriptors                                                                                              
dmacdescriptor descriptor __attribute__ ((aligned (16)));                     // Place holder descriptor                                                                                               

uint16_t PERIOD=48000000/FREQUENCYDSHOT-1;
uint8_t NELEM=10; // Number of elements in DMA lookup table: 7 plus 3 zero pads                                                                                                                        

void setup()
{
  // Set the PWM lookup table [2/3 1/3 2/3 2/3 1/3 1/3 1/3 0 0 0]                                                                                                                                      
  //  PWM should look like this:                                                                                                                                                                       
  // _--__-_--_--__-__-__-_________                                                                                                                                                                    
  LookUp[0]=2*PERIOD/3;
  LookUp[1]=  PERIOD/3;
  LookUp[2]=2*PERIOD/3; LookUp[3]=2*PERIOD/3;
  LookUp[4]=  PERIOD/3; LookUp[5]=  PERIOD/3; LookUp[6]=  PERIOD/3;
  LookUp[7]=  0; LookUp[8]=0; LookUp[9]=0;

  DMAC->BASEADDR.reg = (uint32_t)descriptor_section;                // Set the descriptor section base address                                                                                         
  DMAC->WRBADDR.reg = (uint32_t)wrb;                                // Set the write-back descriptor base adddress                                                                                     
  DMAC->CTRL.reg = DMAC_CTRL_DMAENABLE | DMAC_CTRL_LVLEN(0xf);      // Enable the DMAC and priority levels                                                                                             
  DMAC->CHID.reg = DMAC_CHID_ID(0);                                 // Select DMAC channel 0                                                                                                           
  // Set DMAC channel 0 to priority level 0 (lowest), to trigger on TCC0 overflow and to trigger every beat                                                                                            
  DMAC->CHCTRLB.reg = DMAC_CHCTRLB_LVL(0) | DMAC_CHCTRLB_TRIGSRC(TCC0_DMAC_ID_OVF) | DMAC_CHCTRLB_TRIGACT_BEAT;
  descriptor.descaddr = (uint32_t)&descriptor_section[0];                 // Set up a circular descriptor                                                                                              
  descriptor.srcaddr = (uint32_t)&LookUp[0] + NELEM * sizeof(uint16_t);   // Read the current value in the LookUp                                                                                      
  descriptor.dstaddr = (uint32_t)&TCC0->CC[3].reg;                        // Copy it into the TCC0 counter comapare 0 register                                                                         
  descriptor.btcnt = NELEM;                                                 // This takes the number of LookUp entries = NELEM beats                                                                   
  descriptor.btctrl = DMAC_BTCTRL_BEATSIZE_HWORD | DMAC_BTCTRL_SRCINC | DMAC_BTCTRL_VALID;   // Copy 16-bits (HWORD), increment the source and flag discriptor as valid                                
  memcpy(&descriptor_section[0], &descriptor, sizeof(dmacdescriptor));  // Copy to the channel 0 descriptor                                                                                            

  GCLK->GENDIV.reg = GCLK_GENDIV_DIV(1) |          // Divide the 48MHz clock source by divisor 1: 48MHz/1=48MHz                                                                                        
    GCLK_GENDIV_ID(4);            // Select Generic Clock (GCLK) 4                                                                                                                                     

  GCLK->GENCTRL.reg = GCLK_GENCTRL_IDC |           // Set the duty cycle to 50/50 HIGH/LOW                                                                                                             
    GCLK_GENCTRL_GENEN |         // Enable GCLK4                                                                                                                                                       
    GCLK_GENCTRL_SRC_DFLL48M |   // Set the 48MHz DFLL48M clock source                                                                                                                                 
    GCLK_GENCTRL_ID(4);          // Select GCLK4                                                                                                                                                       
  while (GCLK->STATUS.bit.SYNCBUSY);               // Wait for synchronization                                                                                                                         

  GCLK->CLKCTRL.reg = GCLK_CLKCTRL_CLKEN |         // Select GCLK4 as a clock source for TCC0 and TCC1                                                                                                 
    GCLK_CLKCTRL_GEN_GCLK4 |     // Select GCLK4                                                                                                                                                       
    GCLK_CLKCTRL_ID_TCC0_TCC1;   // Select GCLK4 as a co TCC0 and TCC1                                                                                                                                 

  // Enable the port multiplexer for the digital pins D7                                                                                                                                               
  PORT->Group[g_APinDescription[7].ulPort].PINCFG[g_APinDescription[7].ulPin].bit.PMUXEN = 1;

  // Connect the TCC0 timer to the port output D7                                                                                                                                                      
  PORT->Group[g_APinDescription[7].ulPort].PMUX[g_APinDescription[7].ulPin >> 1].reg |= PORT_PMUX_PMUXO_F; //| PORT_PMUX_PMUXE_F;                                                                      

  // Normal (single slope) PWM operation: timers countinuously count up to PER register value and then is reset to 0                                                                                   
  TCC0->WAVE.reg = TCC_WAVE_WAVEGEN_NPWM;         // Setup single slope PWM on TCC0                                                                                                                    
  while (TCC0->SYNCBUSY.bit.WAVE);                // Wait for synchronization                                                                                                                          

  TCC0->PER.reg = PERIOD;                           // Set the frequency of the PWM on TCC0 to 25kHz                                                                                                   
  while(TCC0->SYNCBUSY.bit.PER);                  // Wait for synchronization                                                                                                                          

  // Divide the 48MHz signal by 1 giving 48MHz (20.83ns) TCC0 timer tick and enable the outputs                                                                                                        
  TCC0->CTRLA.reg = //TCC_CTRLA_PRESCSYNC_PRESC |   // Set timer to overflow on the prescaler rather than GCLK                                                                                         
    TCC_CTRLA_PRESCALER_DIV1;     // Divide GCLK4 by 1                                                                                                                                                 

  TCC0->CTRLA.bit.ENABLE = 1;                     // Enable the TCC0 timer                                                                                                                             
  while (TCC0->SYNCBUSY.bit.ENABLE);              // Wait for synchronization                                                                                                                          

  DMAC->CHID.reg = DMAC_CHID_ID(0);               // Select DMAC channel 0                                                                                                                             
  DMAC->CHCTRLA.reg |= DMAC_CHCTRLA_ENABLE;       // Enable DMAC channel 0                                                                                                                             
  TStart=micros(); // Time PWM started                                                                                                                                                                 
}

void loop(){}

I’m having a couple of problems right now. First I notice that the very first PWM pulse does not include the 1st lookup table value. Instead the DMA transfer seems to start on the second value. I can easily work around this but I’m obviously doing something wrong.

The 2nd problem is that I seem to be getting 100% duty cycle pulses (5.0uS in plot) when the trace should be zero.

I’d appreciate any pointers to help me address or understand the cause of these problems.
sjwalker66

Hi @sjwalker66

Yes, I encountered this issue as well. The workaround is to use two padding 0 pulse width timer cycles before the DShot packet and two after. This appears to give enough slack DMA to timer timing, to allow timer to output the whole DShot packet to be transmitted.

Using my own implementation, I found that the SAMD21 was only capable of reliably implementing DShot150 and Dshot300. The SAMD51 on the other hand could go up to DShot1200. Above these rates the timers started to occasionally fail to output all the bits in the packet. In any case, this would have caused the ESCs to drop these erroneous packets as they'd have failed the CRC check.

A single motor channel was set up by running a TCC timer from a 48MHz generic clock, I set its period (PER) register to 319 (6.6us) for DShot150 and 159 (3.3us) for DShot300.

Generating the DShot output involved left shifting the 11-bit data by 1, to make way for the the 1-bit telemetry then calculating the 4-bit CRC. Once calculated the data and telemetry were shift up 4-bits to the left and the CRC appended to create the full 16-bit DShot packet.

I then loaded a simple array with two leading and trailing padding 0s (as mentioned above) followed by the timer pulse width values for each of the 16 binary bits in the DShot packet (data, telemetry & CRC). Binary 0 = short pulse, binary 1 = long pulse. For DShot150 the pulse widths are 120 and 240, for Dshot300 they're 60 and 120 respectively.

Finally, I configured the DMA to fetch the data from the array and load the TCC timer's buffered CC register (CCBx) everytime the TCC timer overflowed (OVF). Subsequently enabling the DMA, caused the timer to automatically fire off the corresponding salvo of DShot packet pulses on the PWM output pin without any further CPU intervention.

This process was repeated for each motor channel, each requiring their own separate TCC timer and DMA channel.

Hi @MartinL
That's a great sanity check - I'd noticed worse behavior at 600kHz but at least got close at lower speeds. Concerning the padding with two 0 pulse widths rather than just one, I guess that means you've also noticed that the first 0 pulse seems to come out as 100% duty.

I was intending to move this to the SADM51 anyway soon, for speed reasons. It may be time for me to do that.
Thanks!

sjwalker66

I've now switched to using the SAMD51 in order to be able to run DSHOT600. I'm again testing at PWM frequency of 200KHz, and have that running fine on pin7 / PB12 (this is a Metro M4). However I'm having a difficult time getting any response from DMA.

Looking at the datasheet the configuration is not surprisingly different from on the SAMD21. The SetupDMA section of the code below is my best understanding of the configuration, but it's clearly not right as it's giving no response. Any pointers would be appreciated as SAMD51 DMA write-ups seem quite scarce.

sjwalker66

/  Set up DSHOT on SAMD21 using DMA with PWM at 300000HzA                                                                                                                                                                                    
#define FREQUENCYDSHOT 200000  // Frequency of DSHOT PWM                                                                                                                                                                                      

#define PERIOD 600 // 600 corresponds to 200KHz @120MHz clock                                                                                                                                                                                 
uint16_t PERIODShort=200;
uint16_t PERIODLong=400;

#define NELEM 12 // Number of elements in DMA lookup table: 7 plus 3 zero pads                                                                                                                                                                
volatile uint16_t LookUp[NELEM]; // DMA PWM lookup table                                                                                                                                                                                      

static DmacDescriptor descriptor1 __attribute__((aligned(16)));

void setup(){
  // Set the PWM lookup table [2/3 1/3 2/3 2/3 1/3 1/3 1/3 0 0 0]                                                                                                                                                                             
  //  PWM should look like this:                                                                                                                                                                                                              
  // _--__-_--_--__-__-__-_________                                                                                                                                                                                                           
  LookUp[0]=0; LookUp[1]=0;
  LookUp[2]=PERIODLong;
  LookUp[3]=  PERIODShort;
  LookUp[4]=PERIODLong; LookUp[5]=PERIODLong;
  LookUp[6]=  PERIODShort; LookUp[7]=  PERIODShort; LookUp[8]=  PERIODShort;
  LookUp[9]= 0; LookUp[10]=0; LookUp[11]=0;

  SetupDMA();
  SetupPWM();
}

void SetupDMA( void ){
  // SAMD21 has 12 DMA channels, SAMD51 has 32                                                                                                                                                                                                
  static DmacDescriptor wrb __attribute__ ((aligned (16)));               // Write-back DMAC descriptors                                                                                                                                      
  static DmacDescriptor descriptor1 __attribute__ ((aligned (16)));         // DMAC channel descriptors                                                                                                                                       
  static DmacDescriptor descriptor __attribute__ ((aligned (16)));                     // Place holder descriptor                                                                                                                             

  // see ~page 379 of datasheet                                                                                                                                                                                                               
  // https://forum.arduino.cc/t/samd51-dac-using-dma-seems-too-fast/678418                                                                                                                                                                    
  DMAC->CTRL.bit.DMAENABLE = 0;  // Force disable so registers are accessible                                                                                                                                                                 
  DMAC->BASEADDR.reg = (uint32_t)&descriptor1;                // Set the descriptor section base address                                                                                                                                      
  DMAC->WRBADDR.reg = (uint32_t)&wrb;                                // Set the write-back descriptor base adddress                                                                                                                           
  DMAC->Channel[0].CHCTRLA.reg = DMAC_CHCTRLA_TRIGSRC(TCC0_DMAC_ID_OVF) | 0x2; // !! Check TRIGACT 0x0, 0x2 0x3 (0x1 reserved) should be beat mode                                                                                            
  DMAC->CTRL.reg = DMAC_CTRL_DMAENABLE | DMAC_CTRL_LVLEN(0xf);      // Enable the DMAC and priority levels                                                                                                                                    
  DMAC->Channel[0].CHPRILVL.reg = DMAC_CHPRILVL_PRILVL(0); // Set the channel priority level                                                                                                                                                  

  descriptor1.DESCADDR.reg = (uint32_t) &descriptor1;
  descriptor1.BTCTRL.bit.VALID    = 1;  // pg 379                                                                                                                                                                                             
  descriptor1.BTCTRL.bit.BEATSIZE = 1;  //                                                                                                                                                                                                    
  descriptor1.BTCTRL.bit.SRCINC   = 1;   //Source increment is enabled                                                                                                                                                                        
  descriptor1.BTCTRL.bit.DSTINC   = 0x0;   //Destination increment disabled                                                                                                                                                                   
  //descriptor1.BTCTRL.bit.BLOCKACT = 0x2;   //Suspend after block complete.                                                                                                                                                                  
  // ("Burst" size will be 1 "beat" = 1 HWORD, by default)                                                                                                                                                                                    
  descriptor1.BTCNT.reg           = NELEM;   //Number of elements to send                                                                                                                                                                     
  descriptor1.SRCADDR.reg         = (uint32_t)(&LookUp[NELEM]); //send from the data vevtor                                                                                                                                                   
  descriptor1.DSTADDR.reg         = (uint32_t)&TCC0->CC[3].reg;                        // Copy it into the TCC0 counter comapare 0 register                                                                                                   

  DMAC->Channel[0].CHCTRLA.reg |= DMAC_CHCTRLA_ENABLE; // Enable the DMAC channel                                                                                                                                                             

  DMAC ->Channel[0].CHCTRLA.bit.ENABLE = 1;     //Enable                                                                                                                                                                                      
  DMAC->CTRL.bit.DMAENABLE = 1;
}

void SetupPWM( void ){
  // Set up the generic clock (GCLK7) to clock timer TCC0                                                                                                                                                                                     
  GCLK->GENCTRL[7].reg = GCLK_GENCTRL_DIV(1) |       // Divide the clock source by divisor 1: 120MHz/1 = 120MHz                                                                                                                               
                         GCLK_GENCTRL_IDC |          // Set the duty cycle to 50/50 HIGH/LOW                                                                                                                                                  
                         GCLK_GENCTRL_GENEN |        // Enable GCLK7                                                                                                                                                                          
                         GCLK_GENCTRL_SRC_DPLL0;     // Select 120MHz DPLL clock source                                                                                                                                                       
  while (GCLK->SYNCBUSY.bit.GENCTRL7);               // Wait for synchronization                                                                                                                                                              

  GCLK->PCHCTRL[25].reg = GCLK_PCHCTRL_CHEN |        // Enable the TCC0 peripheral channel                                                                                                                                                    
                          GCLK_PCHCTRL_GEN_GCLK7;    // Connect generic clock 7 to TCC0                                                                                                                                                       

  // Enable the peripheral multiplexer on pin D7                                                                                                                                                                                              
  PORT->Group[g_APinDescription[7].ulPort].PINCFG[g_APinDescription[7].ulPin].bit.PMUXEN = 1;

  // Set the D7 (PORT_PB12) peripheral multiplexer to peripheral (even port number) E(6): TCC0, Channel 0                                                                                                                                     
  PORT->Group[g_APinDescription[7].ulPort].PMUX[g_APinDescription[7].ulPin >> 1].reg |= PORT_PMUX_PMUXE(6);

  TCC0->CTRLA.reg = TC_CTRLA_PRESCALER_DIV1 |        // Set prescaler to 1, 120MHz/1 = 120MHz                                                                                                                                                 
                    TC_CTRLA_PRESCSYNC_PRESC;        // Set the reset/reload to trigger on prescaler clock                                                                                                                                    

  TCC0->WAVE.reg = TC_WAVE_WAVEGEN_NPWM;             // Set-up TCC0 timer for Normal (single slope) PWM mode (NPWM)                                                                                                                           
  while (TCC0->SYNCBUSY.bit.WAVE)                    // Wait for synchronization                                                                                                                                                              

  TCC0->PER.reg = PERIOD;                            // Set-up the PER (period) register PWM                                                                                                                                                  
  while (TCC0->SYNCBUSY.bit.PER);                    // Wait for synchronization                                                                                                                                                              

  TCC0->CC[0].reg = PERIOD/2;                           // Set-up the CC (counter compare), channel 0 register for 50% duty-cycle                                                                                                             
  while (TCC0->SYNCBUSY.bit.CC0);                    // Wait for synchronization                                                                                                                                                              

  TCC0->CTRLA.bit.ENABLE = 1;                        // Enable timer TCC0                                                                                                                                                                     
  while (TCC0->SYNCBUSY.bit.ENABLE);                 // Wait for synchronization                                                                                                                                                              
}

void loop(){
}


Hi @sjwalker66

The DMAC's CHCTRLA register burst mode needs to be set with the DMAC_CHCTRLA_TRIGACT_BURST definition. This definition takes the 0x2 and left shifts it 20 places, so that it sets the CHCTRL register's TRIGACT bitfield at bits positions 20 and 21:

DMAC->Channel[0].CHCTRLA.reg = DMAC_CHCTRLA_TRIGSRC(TCC0_DMAC_ID_OVF) | DMAC_CHCTRLA_TRIGACT_BURST;

Register definitions for the DMAC (and other SAMD51 peripherals) can be found (on my Windows machine at least) in the DMAC.h file located here:

C:\Users\Computer\AppData\Local\Arduino15\packages\arduino\tools\CMSIS-Atmel\1.2.0\CMSIS\Device\ATMEL\samd51\include\component

Largely from trial and error, the DMAC's destination address descriptor should be the TCC0 timer's buffered CCBUF[3].reg register:

descriptor1.DSTADDR.reg = (uint32_t)&TCC0->CCBUF[3].reg;

The buffered CCBUF registers are automatically loaded into their corresponding CC registers at the beginning of each timer cycle, thereby preventing glitches from occuring on the timer's PWM output. Changes to the CC registers by contrast take effect immediately at the output.

In theory, loading the CC or CCBUF shouldn't really make any difference since the DMAC is triggered by the timer's overflow at the beginning of the timer cycle anyway, but CCBUF seemed to be the one that worked.

I suspect that this CCBUF to CC timing delay probably has something to do with having to add the padding 0s to the data array.

Finally, in the end I drove the TCC0 timer with a 48MHz clock source, since the 120MHz didn't really add any practical benefit, but I guest it doesn't really matter which clock source, so long as it's fast enough:

GCLK->GENCTRL[6].reg = GCLK_GENCTRL_DIV(1) |         // Divide the 48MHz clock source by divisor 1: 48MHz/1 = 48MHz
                         GCLK_GENCTRL_IDC |          // Set the duty cycle to 50/50 HIGH/LOW
                         GCLK_GENCTRL_GENEN |        // Enable GCLK6
                         //GCLK_GENCTRL_SRC_DPLL0;     // Generate from 120MHz DPLL clock source
                         //GCLK_GENCTRL_SRC_DPLL1;     // Generate from 100MHz DPLL
                         GCLK_GENCTRL_SRC_DFLL;      // Generate from 48MHz DFLL clock source
while (GCLK->SYNCBUSY.bit.GENCTRL6);                 // Wait for synchronization

All the preparation the CPU has to do just to get the DMAC to send DShot packet, makes me wonder whether if it has any real benefit over pulse width modulated protocols such as Oneshot125, Oneshot42 and Multishot. In this case all the CPU has to do is load the timer's pulse width and re-trigger to send it.

Hi @MartinL

Thanks again for your helpful response. My mistake with not bit shifting was dumb, fried brain. I will continue to jigger this, hoping to get DSHOT going.

I only moved from regular PWM (1-2msec) to OneShot125 6 months ago and realized a dramatic improvement in response. Guess there will be diminishing returns going to DSHOT, but it's a great (if very frustrating) learning experience, trying to get DMA working. And I guess that not having to calibrate ESCs with DSHOT is a plus. But if you have found DSHOT hard work/complicated, then maybe I'll end up happy with OneShot.

Hi @sjwalker66

I just meant that DShot takes quite a bit more CPU processing power to prepare everything for the DMA, but with modern microcontrollers these days, I guess that's not an issue.

I enjoyed working on implementing DShot. It's quite satifying when the motors start responding to the binary coded pulses.

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.