Faster SPI communication?

One solution to keeping the SD selected is to use a serial port as dedicated SPI bus as mentioned above but still write whole blocks at a time. The SD card has a controller with a wear map of the flash and it can take a little extra time addressing a block so extra buffer space for incoming data is a good idea.

Nick Gammon has serial port as master mode SPI and slave mode SPI for AVR's on his blog pages, maybe by now he's done ARM as well.

Greetings

ard_newbie:
I will analyse later your code, but I can already see weird things:

Lines 265 and 266:

Timer Counter 0 Channel 2 triggers DAC conversions internally through TIOA2 (you are no longer in free-running mode!). However, you set TC_CPRD to timerperiod = 0 (?) and TC_CDTY to timerperiod = 0 (?). The CDTY range should be 0 < CDTY < CPRD, and obviously timerperiod must not be equal to 0.

TC_CPRD and TC_CDTY are not 0; they are defined in setup(), taking into account the sample rate of the WAV file, before tc_setup() is called. Anyway, I’ve changed the default value.
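For reference, here is a minimal sketch of how a period value could be derived from the WAV sample rate and checked against the 0 < duty < period rule quoted above. The helper names periodForSampleRate and dutyIsValid are hypothetical (they are not from the attached sketch), and the timer clock is passed in as a parameter, e.g. MCK/8 = 10.5 MHz on an 84 MHz Due.

```cpp
#include <cstdint>

// Hypothetical helper: period (compare value) for a timer clocked at
// timerClockHz so that it wraps about sampleRateHz times per second.
// Returns 0 when the request cannot be met.
uint32_t periodForSampleRate(uint32_t timerClockHz, uint32_t sampleRateHz) {
  if (sampleRateHz == 0 || sampleRateHz > timerClockHz) return 0;
  // Round to nearest so the real rate is as close as possible to the request.
  return (timerClockHz + sampleRateHz / 2) / sampleRateHz;
}

// Hypothetical helper: a duty value is usable only when 0 < duty < period.
bool dutyIsValid(uint32_t duty, uint32_t period) {
  return duty > 0 && duty < period;
}
```

With a 10.5 MHz timer clock, periodForSampleRate(10500000, 44100) gives 238, and a period of 0 is never valid for any duty value.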

It seems that enabling the NVIC interrupt for the DACC disables all the other interrupts, such as SPI (SD) and USART (Serial). Is that normal?

GoForSmoke:
One solution to keeping the SD selected is to use a serial port as dedicated SPI bus as mentioned above but still write whole blocks at a time. The SD card has a controller with a wear map of the flash and it can take a little extra time addressing a block so extra buffer space for incoming data is a good idea.

Nick Gammon has serial port as master mode SPI and slave mode SPI for AVR’s on his blog pages, maybe by now he’s done ARM as well.

I don’t understand, which serial port are you referring to? The USART?

GoForSmoke:
I don’t have a Due but I do wonder about some things,

Does the Due have an SD or SDFat library like the AVR-duinos? It uses a buffer to write to SD in burst mode.

The Due has SD and SDFat libraries, like the AVR-based Arduinos.

ard_newbie:
GoForSmoke is right, there are faster SD libraries for DUE (SDFAT) or (more tricky) you can use High Speed Multimedia Interface peripheral (HSMCI). It has been rarely used for some reason, but IMO this is super fast reading/Writing to an SD card.

I have to check if the reading and writing functions are similar, otherwise, I’ll have to rewrite the reading code. Improving the SD reading speed could be helpful.

ard_newbie:
Since you are not using Free Running mode, to debug, I would first try to read the SD card then output the result at a relatively slow frequency (e.g. 44.1 KHz) far from 1 MHz and see what happens with higher frequencies.

I need to read the whole WAV file and not just a few samples; it’s very hard to tell whether the signal is OK with only 100 or 200 ms of samples (I can’t store more than that in the Due’s RAM). I need to read the file from the SD card and feed the samples to the DAC continuously.

ard_newbie:
Edit: TurboSpi is by far the best way to go to speed up SPI (42 MHz)

I don’t need to go that fast, I think somewhere between 10 MHz and 20 MHz is ok.

Anyway, my goal is to make something similar to the Audio library, that can read wav files, but with a recording feature added.

The code should:

  • Read a WAV file sampled at up to 44.1 kHz or 48 kHz and feed the samples to a DAC (preferably external, because a resolution of at least 16 bits is desirable). Both the DAC and the SD card communicate through SPI.
  • Record a WAV file to the SD card at the same sampling frequency, 44.1 kHz or 48 kHz, with samples from an ADC (also external and on SPI, with at least 16-bit resolution).
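As a rough sanity check on what these two requirements imply, the raw PCM payload can be computed as below. This is only a sketch: the function names are made up for illustration, and SD command latency and file system overhead are ignored.

```cpp
#include <cstdint>

// Payload bytes per second of uncompressed PCM audio.
uint32_t pcmBytesPerSecond(uint32_t sampleRateHz, uint32_t channels,
                           uint32_t bitsPerSample) {
  return sampleRateHz * channels * (bitsPerSample / 8u);
}

// Bytes of RAM needed to hold `ms` milliseconds of audio at that rate.
uint32_t bufferBytesForMs(uint32_t bytesPerSecond, uint32_t ms) {
  return (uint32_t)(((uint64_t)bytesPerSecond * ms) / 1000u);
}
```

For one 16-bit channel at 48 kHz this gives 96000 bytes per second in each direction, so a 100 ms buffer needs 9600 bytes.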

The circuit is meant to be used with ECG frontends to perform closed-loop tests, for example adding noise to the ECG signal stored in a WAV file to perform noise-rejection tests on ECG frontends. The result is saved on the SD card in WAV format.

The circuit is made of:

SD card <-> Arduino Due → DAC → signal conditioning circuits → ECG frontend → ADC → Arduino Due (recorded back to the SD card)

For now I’m trying only to make the first step, reading a wav file sampled at 44.1 kHz or 48 kHz, but it’s very difficult.

Regards,
Daniel

DumpAudioFile5_optimized_dma.ino (23.5 KB)

IMO Audio projects are amongst the most difficult ones and probably require a lot of debugging.

There is an audio library for DUE which leverages PDC DMA to transfer SD card files to a DAC at 44.1 KHz:

However, when I look at this library's code, I don’t understand why it uses the DAC PDC DMA by polling PDC registers rather than using the interrupts generated by the DMA…

A useful Wav File tutorial to fully understand a Wav file structure:
http://www.topherlee.com/software/pcm-tut-wavformat.html
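To make the header layout concrete, here is a minimal parser sketch. It assumes the canonical 44-byte PCM header (a "fmt " chunk of 16 bytes immediately followed by the "data" chunk); files with extra chunks such as LIST would need a real chunk walker. WavInfo and parseWavHeader are illustrative names, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct WavInfo {
  uint16_t audioFormat;    // 1 = uncompressed PCM
  uint16_t numChannels;
  uint32_t sampleRate;     // samples per second
  uint16_t bitsPerSample;
  uint32_t dataSize;       // bytes of sample data after the header
};

// WAV fields are little-endian; read them byte by byte to stay portable.
static uint16_t readLe16(const uint8_t *p) {
  return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t readLe32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Parses only the canonical 44-byte layout; returns false for anything else.
bool parseWavHeader(const uint8_t *h, size_t len, WavInfo &out) {
  if (len < 44) return false;
  if (std::memcmp(h, "RIFF", 4) != 0) return false;
  if (std::memcmp(h + 8, "WAVE", 4) != 0) return false;
  if (std::memcmp(h + 12, "fmt ", 4) != 0) return false;
  if (std::memcmp(h + 36, "data", 4) != 0) return false;
  out.audioFormat   = readLe16(h + 20);
  out.numChannels   = readLe16(h + 22);
  out.sampleRate    = readLe32(h + 24);
  out.bitsPerSample = readLe16(h + 34);
  out.dataSize      = readLe32(h + 40);
  return true;
}
```

On the Due, the 44 bytes would typically come from a single read of the opened file before any sample data is consumed.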

A/ Option 1 : use the 2 builtin DACs

Providing the Wav file is an audio signal sampled at 44.1 KHz, and you are using your builtin DACs, stepwise I would try this:

1/ Read the SD card at 2 × 44.1 KHz for a stereo DAC output, and read the WAV file header to store the file parameters (file length, etc.). I guess your current code already does that, more or less.

2/ Declare two identical buffers: uint16_t AudioBuffer[2][1024], i.e. 1024 half-words each (or more). Remember you have 96 KB of SRAM.

3/ Program your 2 DACs to convert 32-bit audio signals at 44.1 KHz with e.g. Timer Counter 0 Channel 2 and a PDC DMA (see an example sketch below).

4/ Once you have filled two AudioBuffer of Half Words, reformat the 16-bit values retrieved from the Wav file to 12-bit values adapted for your DACs, and alternate the bit 12 value to dispatch on DAC0 and DAC1. Set a pointer of uint32_t words at the beginning of the buffer and pass this pointer of 512 words to the PDC DMA.

5/ In DACC_Handler(), each time you set the DMA to the next buffer pointer, set a flag that the loop uses to load and reformat the buffer after next (alternately AudioBuffer[0] and AudioBuffer[1]). Add to the count of file bytes already consumed and compare it with the actual length.
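The reformatting in step 4 could look roughly like this. It is a sketch that assumes 16-bit signed PCM samples and the DACC word mode with TAG enabled, where bit 12 of each half-word selects the channel (0 for DAC0, 1 for DAC1). The function names are hypothetical.

```cpp
#include <cstdint>

// Signed 16-bit PCM -> unsigned 12-bit DAC code:
// shift the origin to mid-scale, then drop the 4 least significant bits.
uint16_t pcm16ToDac12(int16_t s) {
  return (uint16_t)(((int32_t)s + 32768) >> 4);   // result is 0..4095
}

// Pack one stereo frame into a 32-bit word for the PDC:
// low half-word  = sample for DAC0 (tag bit 12 = 0),
// high half-word = sample for DAC1 (tag bit 12 = 1, i.e. bit 28 of the word).
uint32_t packDacWord(int16_t left, int16_t right) {
  uint32_t ch0 = pcm16ToDac12(left);
  uint32_t ch1 = (uint32_t)pcm16ToDac12(right) | (1u << 12);
  return ch0 | (ch1 << 16);
}
```

Silence (0, 0) maps to mid-scale on both channels, which is the word 0x18000800. For a mono file the same sample can simply be passed as both arguments.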

B/ Option 2 : use 2 external DACs thru an SPI bus

A major interest of TurboSpi is the use of AHB DMA. In addition to the potentially high speed transfer, the CPU utilization is minimal as the DMA is handled off core, you can do another task in parallel in loop() while reading or writing on the SPI bus. You can also consider using 2 other SPI buses on USART0 and USART1.

An example sketch to output a sin wave on DAC0 and DAC1 thru a PDC DMA:

/***********************************************************************************/
/*   DAC0 and DAC1 output of a sin wave - Frequency of sin = 44.1 KHz / sinsize    */
/***********************************************************************************/

const uint32_t sinsize  = 256 ;   // Size of buffer must be a power of 2
uint32_t sinus[2][sinsize];
  
volatile uint32_t bufn;

void dac_setup () {

  PMC->PMC_PCER1 = PMC_PCER1_PID38;     // DACC power ON
  DACC->DACC_CR = DACC_CR_SWRST ;       // Reset DACC

  DACC->DACC_MR = DACC_MR_TRGEN_EN                   // Hardware trigger select
                  | DACC_MR_TRGSEL(0b011)            // Trigger by TIOA2
                  | DACC_MR_TAG_EN                   // enable TAG to set channel in CDR
                  | DACC_MR_WORD_WORD                // write to both channels
                  | DACC_MR_REFRESH (1)
                  | DACC_MR_STARTUP_8
                  | DACC_MR_MAXS;

  DACC->DACC_IER = DACC_IER_TXBUFE;                  // Interrupt used by PDC DMA (IER is write-only, so assign directly)
                
  DACC->DACC_ACR = DACC_ACR_IBCTLCH0(0b10)
                   | DACC_ACR_IBCTLCH1(0b10)
                   | DACC_ACR_IBCTLDACCORE(0b01);

  NVIC_EnableIRQ(DACC_IRQn);                         // Enable DACC interrupt

  DACC->DACC_CHER = DACC_CHER_CH0                    // enable channel 0 = DAC0
                    | DACC_CHER_CH1;                 // enable channel 1 = DAC1

  /*************   configure PDC/DMA  for DAC *******************/

  DACC->DACC_TPR  = (uint32_t)sinus[0];         // DMA buffer
  DACC->DACC_TCR  = sinsize;
  DACC->DACC_TNPR = (uint32_t)sinus[1];         // next DMA buffer (circular buffer)
  DACC->DACC_TNCR = sinsize;
  bufn = 1;
  DACC->DACC_PTCR = DACC_PTCR_TXTEN;            // Enable PDC Transmit channel request

}

void DACC_Handler() {
  
  uint32_t status = DACC->DACC_ISR;   // Read and save DAC status register
  if (status & DACC_ISR_TXBUFE) {     // move DMA pointer to next buffer
    bufn = (bufn + 1) & 1;
    DACC->DACC_TNPR = (uint32_t)sinus[bufn];
    DACC->DACC_TNCR = sinsize;
  }
}

void tc_setup() {

  PMC->PMC_PCER0 = PMC_PCER0_PID29;                       // TC2 power ON : Timer Counter 0 channel 2 IS TC2 (PCER0 is write-only)
  TC0->TC_CHANNEL[2].TC_CMR = TC_CMR_TCCLKS_TIMER_CLOCK2  // MCK/8, clk on rising edge
                              | TC_CMR_WAVE               // Waveform mode
                              | TC_CMR_WAVSEL_UP_RC        // UP mode with automatic trigger on RC Compare
                              | TC_CMR_ACPA_CLEAR          // Clear TIOA2 on RA compare match
                              | TC_CMR_ACPC_SET;           // Set TIOA2 on RC compare match


  TC0->TC_CHANNEL[2].TC_RC = 238;  //<*********************  Frequency = (Mck/8)/TC_RC = 10.5 MHz / 238 ≈ 44.1 kHz
  TC0->TC_CHANNEL[2].TC_RA = 40;  //<********************   Any Duty cycle in between 1 and TC_RC

  TC0->TC_CHANNEL[2].TC_CCR = TC_CCR_SWTRG | TC_CCR_CLKEN; // Software trigger TC2 counter and enable
}

void setup() {

  for (uint32_t i = 0; i < sinsize; i++)
    {
   const uint32_t chsel = (0 << 12) | (1 << 28);                 // tag bits: low half-word on DAC0, high half-word on DAC1
   const uint32_t sample = (uint32_t)(2047 * sin(i * 2 * PI / sinsize) + 2047);  // 0 <= sample <= 4094
   sinus[0][i] = sample | (sample << 16) | chsel;                // MSB [31:16] on channel 1
                                                                 // LSB [15:0] on channel 0
   sinus[1][i] = sinus[0][i];                                    // second buffer is an identical copy
    }
  tc_setup();
  dac_setup();
}

void loop() {

}

The point of reading and writing buffers is to get the fastest speed possible. SD is already way faster than WAV playback.

The highest possible bit rate is the 84 MHz CPU clock. Dividing by 8 bits, I get 10.5 MB/s at most from there. How is there 42 MHz SPI on a Due?

Greetings,

I’m trying to use some code from the Audio library, but the sketch is not working: the audio is really poor and there’s a strange periodic noise on top of it. I think the problem is in the onTransmitEnd functions, which use cb and data variables whose purpose I don’t understand.

Regards,
Daniel

sketch_mar10a.ino (13.6 KB)

Audio.cpp (2.07 KB)

Audio.h (1.69 KB)

DAC.cpp (3.5 KB)

DAC.h (1.05 KB)

GoForSmoke: How is there 42MHz SPI on a Due?

See this thread, reply #36:

https://forum.arduino.cc/index.php?topic=437243.30

ard_newbie: See this thread, reply #36:

https://forum.arduino.cc/index.php?topic=437243.30

Karma for you! Speed there referred to SCLK not byte rate. The default AVR divider is 4 giving 32 cpu cycles between SPI needing read/written events, I guess the default is > 2 for ARM as well. :)

84MHz divide by 2 for SPI then divide by 8 for bytes for that and add overhead for ECC?

5.25 MB/s with no overhead (42 MHz / 8 bits). Compare that to the original 4.77MHz PC ISA bus. ISA cards are vintage salvage now.
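The arithmetic above, written out as a small sketch. The function names are illustrative, and the byte rate is an upper bound that assumes back-to-back bytes with no gaps or protocol overhead.

```cpp
#include <cstdint>

// SPI clock for a given master clock and integer divider (divider >= 1).
uint32_t spiClockHz(uint32_t mckHz, uint32_t divider) {
  return divider ? mckHz / divider : 0;
}

// Upper bound on throughput: one byte needs 8 SCLK cycles.
uint32_t spiMaxBytesPerSecond(uint32_t sclkHz) {
  return sclkHz / 8u;
}
```

For the Due, spiClockHz(84000000, 2) is 42 MHz and spiMaxBytesPerSecond(42000000) is 5250000 bytes per second.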

If you really want speed, google teensy 3.6, it's an Arduino-compatible with a 180MHz (OC to 240) M4F (FPU) ARM chip with 256KB RAM, a meg of flash, and SD on the board. If you're not quick at soldering then pay the extra $4 for one with pins.

Remember that the core speed is not the only parameter to do the math to obtain the maximum theoretical SPI speed:

from https://forum.pjrc.com/threads/46819-Teensy-3-6-SPI-transaction-speed-problem reply #10:

If my math is correct, the highest SPI speed you can get is 60mhz. You can get this by running at 240mhz, plus edit Kinetisk.h to change the F_BUS speed to 120mhz instead of default 60mhz. Now SPI can run at F_BUS/2 so you can set the max to 60mhz on a Teensy 3.6

The Sam3x (DUE) embeds an AHB DMA, and this AHB DMA is further optimized for HSMCI, SSC (I2S) and SPI peripherals. If you run a DUE at 114 MHz, transfers on SPI bus should peak at close to 57 MHz.

As I understand reply #10, a Teensy 3.6 with its default clock speed at 180 MHz has a slower SPI bus (30 MHz max) than a DUE with its default clock speed :)

What stops the 180MHz user editing F_BUS speed to 90MHz instead of the default 60? Can the F_BUS divider be 1?

Can someone please take a look at my code? Why is it not working properly, and what are cb and cb(Data)?

You should understand that the worst method to debug is to write a complete sketch and then try to make it work. Once you have a global vision of where you want to go, make baby steps, and check everything you can at each step before following the next one.

IMO the AUDIO Library doesn't work properly. Did you try the example sketch which comes with this Library (output a Wav file on a single DAC), and what is the result ?

ard_newbie:
You should understand that the worst method to debug is to write a complete sketch and then try to make it work. Once you have a global vision of where you want to go, make baby steps, and check everything you can at each step before following the next one.

IMO the AUDIO Library doesn’t work properly. Did you try the example sketch which comes with this Library (output a Wav file on a single DAC), and what is the result ?

Yes, I think you’re right. I’ve made a simple sketch, without DMA and without the SD card and WAV file, and it’s working: it can output a square wave, a sawtooth or a sine wave (using a samples array). The interrupts are working fine, and I can sample the signal at up to 200 kHz, maybe a bit more, but I haven’t tested it.
The serial is printing points to check if it’s stuck, and seems ok.

Now, I will try to use the PDC (DMA) but that’s a lot more difficult. :o

Yes, the Audio library is not very well developed and makes a lot of noise.

Edit:
Your sine generator code works fine!
It uses DMA and works! I’ve made a few changes to make it easier to change the sampling frequency, and I’ve also added the possibility to queue the samples to the first PDC buffer in the DACC_Handler function. Why do you say the buffer size must be a power of 2?

Regards,
Daniel

tcTriggeredDAC.ino (4.61 KB)

sineDMADAC.ino (3.71 KB)

Why do you say the buffer size must be a power of 2?

You can consider that comment unnecessary.

IMO your modification in DACC_Handler(), as done in AUDIO Library, is not useful and may only slow down the PDC DMA process because:

DACC_Handler() is called before the DAC runs out of conversion data (somewhere between the middle and the end of the current buffer). Inside DACC_Handler(), you re-configure the PDC channel with the next buffer and transfer length. The processor finishes this DACC re-configuration before the DACC runs out of conversion data, and the PDC obviously waits for the end of the current buffer before swapping to the next one.

Conclusion: a re-configuration is prepared during the conversion of the last buffer values !

BTW, to be sure that the core is not stuck, it's better to toggle the built-in LED, e.g. every second, rather than use a serial print, because a toggle takes only a few clock cycles compared to a Serial print (tens of µs).
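A minimal sketch of such a heartbeat, written as a plain timing helper so it does not block loop(). intervalElapsed is a hypothetical name; the Arduino calls in the comment (millis, digitalRead, digitalWrite, LED_BUILTIN) are the standard ones.

```cpp
#include <cstdint>

// Returns true once every periodMs milliseconds and updates lastMs.
// Unsigned subtraction keeps it correct when the millisecond counter wraps.
bool intervalElapsed(uint32_t nowMs, uint32_t &lastMs, uint32_t periodMs) {
  if ((uint32_t)(nowMs - lastMs) >= periodMs) {
    lastMs = nowMs;
    return true;
  }
  return false;
}

// Possible use inside loop() on the Due (after pinMode(LED_BUILTIN, OUTPUT)):
//   static uint32_t last = 0;
//   if (intervalElapsed(millis(), last, 1000)) {
//     digitalWrite(LED_BUILTIN, !digitalRead(LED_BUILTIN));
//   }
```

If the LED stops blinking, the core is stuck somewhere, and the check itself costs almost nothing on each pass through loop().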

Greetings,

I’m still stuck. I can play files (without using DMA) sampled at up to 32 kHz; above that, the playback keeps repeating samples.

I’m trying to use DMA based on your sine wave example, but I can’t get the code working. My goal is to have a circular buffer divided into 2 halves controlled by 2 flags: when the DAC finishes a half (DACC_TXBUFE), the readFlagX of that half is set to false, and the loop should then use that half to store samples from the file.

The code should:
Store samples to the entire buffer from the SD card
The DAC converts the first half of the buffer
The loop stores samples on the first half of the buffer (after the DAC finishes the first buffer half)
The DAC reads the second half
The loop stores samples on the second half of the buffer (after the DAC finishes the second buffer half)
And the cycle goes on until EOF is reached
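The sequence above can be modelled as a small state machine that is independent of the hardware. This is only a sketch under stated assumptions: the names are hypothetical, the "half done" function would be called from the DAC interrupt, and the other two from loop(). The rule it enforces is that only the half the DAC is not currently reading may be refilled.

```cpp
#include <cstdint>

struct PingPong {
  volatile bool filled[2];    // true = half holds fresh samples not yet played
  volatile uint8_t playing;   // index of the half the DAC/DMA is reading now
};

// Call once after both halves have been pre-filled from the file.
void pingPongInit(PingPong &p) {
  p.filled[0] = true;
  p.filled[1] = true;
  p.playing = 0;
}

// Call from the "half finished" interrupt. The half just played becomes free
// and playback moves to the other half. Returns false on underrun, i.e. when
// the other half was not refilled in time.
bool pingPongOnHalfDone(PingPong &p) {
  p.filled[p.playing] = false;
  p.playing = (uint8_t)(p.playing ^ 1u);
  return p.filled[p.playing];
}

// Call from loop(): index of the half that may be refilled, or -1 if none.
int pingPongHalfToFill(const PingPong &p) {
  uint8_t idle = (uint8_t)(p.playing ^ 1u);
  return p.filled[idle] ? -1 : (int)idle;
}

// Call from loop() after the SD read into that half has completed.
void pingPongMarkFilled(PingPong &p, int half) {
  p.filled[half] = true;
}
```

An underrun reported by pingPongOnHalfDone means the SD read could not keep up, which is worth detecting explicitly (for example by lighting an LED) rather than letting stale samples play again.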

The problem is that my code is not working properly: the WAV files always sound like a helicopter, and both readFlags are set to true at the same time. It’s confusing, because the two buffers can’t be read at the same time. Can you please help me solve this?

This code is based on two other implementations that work at up to 32 kHz. I’m currently testing with 32 kHz files.

Code attached.

File name ending in dma → not working; SD samples buffer (12.5 KiB) read in less than 5 ms; 100 ms samples buffer (50 ms per half buffer)

File name ending in db → uses a circular buffer accessed by the processor; works up to 32 kHz; large SD samples buffer (12.5 KiB)

File name ending in dac → the same as db but with a very small SD samples buffer (1 byte)

I guess the SD card is running with an SCK of 10.5 MHz or less.

Regards,
Daniel

DumpAudioFile3_tc_dac_dma.ino (18.4 KB)

DumpAudioFile3_tc_dac_db.ino (16.4 KB)

DumpAudioFile3_tc_dac.ino (13.2 KB)

Greetings,

I've made these modifications to the code:

const uint32_t sdCLK = VARIANT_MCK / 3; SD.begin(sdCLK, chipSelect) instead of simply SD.begin(chipSelect)

And now playback of WAV files at up to 48 kHz is possible. The problem is that I still don't know how to use the DMA, and I wanted to take some load off the processor. I also wanted to use external DACs, but I can't drive them with the SPI or TurboSpi libraries, because if I use either, the SD card won't initialize and I can't read the file. I need to implement a software-defined SPI, using the USART for example. Can you help me solve this?
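For context on the VARIANT_MCK / 3 value: on the SAM3X the SPI clock is the master clock divided by an 8-bit integer, so a requested clock maps to a divider roughly as sketched below. spiDividerFor is a hypothetical helper written for illustration, not the SD or SPI library's own code.

```cpp
#include <cstdint>

// Smallest divider in 1..255 whose SCLK (mckHz / divider) does not exceed
// the requested clock. A request of 0 falls back to the slowest setting.
uint32_t spiDividerFor(uint32_t mckHz, uint32_t requestedHz) {
  if (requestedHz == 0) return 255;
  uint32_t div = (mckHz + requestedHz - 1) / requestedHz;  // ceiling division
  if (div < 1) div = 1;
  if (div > 255) div = 255;
  return div;
}
```

With an 84 MHz master clock, a 28 MHz request (VARIANT_MCK / 3) gives a divider of 3, while a 20 MHz request gives 5, i.e. an actual SCLK of 16.8 MHz.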

Regards, Daniel