Background:
I know that some/most of this is covered in my thread:
But I wondered whether some of the things I discovered or tried along the way might be helpful to others.
As I mentioned in the other thread, there are not many examples of doing SPI with DMA on the forum, and how to do SPI operations using DMA is not easy to discern from the documentation.
We did find a couple of links that helped:
For a moment, I thought that there was some support for this, contained within the MBED:: stuff, in particular:
/** Start non-blocking SPI transfer using 8bit buffers.
*
* This function locks the deep sleep until any event has occurred.
*
* @param tx_buffer The TX buffer with data to be transferred. If NULL is passed,
* the default SPI value is sent.
* @param tx_length The length of TX buffer in bytes.
* @param rx_buffer The RX buffer which is used for received data. If NULL is passed,
* received data are ignored.
* @param rx_length The length of RX buffer in bytes.
* @param callback The event callback function.
* @param event The event mask of events to modify. @see spi_api.h for SPI events.
*
* @return Operation result.
* @retval 0 If the transfer has started.
* @retval -1 If SPI peripheral is busy.
*/
template<typename Type>
int transfer(const Type *tx_buffer, int tx_length, Type *rx_buffer, int rx_length, const event_callback_t &callback, int event = SPI_EVENT_COMPLETE)
{
if (spi_active(&_peripheral->spi)) {
return queue_transfer(tx_buffer, tx_length, rx_buffer, rx_length, sizeof(Type) * 8, callback, event);
}
start_transfer(tx_buffer, tx_length, rx_buffer, rx_length, sizeof(Type) * 8, callback, event);
return 0;
}
However, when you look at the sources for the STM target code, you see:
// asynchronous API
// DMA support for SPI is currently not supported, hence asynchronous SPI does not support high speeds(MHZ range)
void spi_master_transfer(spi_t *obj, const void *tx, size_t tx_length, void *rx, size_t rx_length, uint8_t bit_width, uint32_t handler, uint32_t event, DMAUsage hint)
{
struct spi_s *spiobj = SPI_S(obj);
SPI_HandleTypeDef *handle = &(spiobj->handle);
// TODO: DMA usage is currently ignored
(void) hint;
...
First internal questions:
- How do you cleanly mix and match? Much of your code will probably still be set up to work at the SPI or SPI1 wrapper level, but in the cases where you wish to access the hardware directly or through mbed, how do you do that without just knowing that SPI maps to hardware SPI1 and SPI1 maps to hardware SPI5?
- How do you manage knowing which DMA objects are in use by some other device or the system, and which ones are available for your sketch/device?
Current State of my implementation:
Note: All of the code is up in my github project:
KurtE/ILI9341_GIGA_n: Converted from ILI9341_t3n for the Arduino GIGA (github.com)
Choosing a DMA Stream: Currently my display driver class allows me to simply pass in which DMA stream to use; it defaults to DMA1_Stream1. The constructor allows me to specify which SPI object to use as well.
Accessing Hardware SPI information: Currently I have a hardware table built into my class that looks like:
typedef struct {
SPIClass *pspi; // Which SPI is this (SPI, SPI1)
SPI_TypeDef *pgigaSpi; // What is the underlying hardware SPI object
void (*txdmaisr)(void); // Which callback should we use for this one
uint8_t tx_dmamux1_req_id; // DMAMUX request ID for this SPI object's TX
uint8_t rx_dmamux1_req_id; // DMAMUX request ID for this SPI object's RX
} SPI_Hardware_info_t;
static SPI_Hardware_info_t s_spi_hardware_mapping[2];
...
ILI9341_GIGA_n::SPI_Hardware_info_t ILI9341_GIGA_n::s_spi_hardware_mapping[2] = {
{&SPI, (SPI_TypeDef *) SPI1_BASE, &dmaInterrupt, 38, 37},
{&SPI1, (SPI_TypeDef *) SPI5_BASE, &dmaInterrupt1, 86, 85}
};
Most of this information is pretty straightforward, except maybe the DMAMUX request numbers. These come from the DMAMUX chapter of the reference manual,
like Table 126 in chapter 18, about page 733.
In my test sketch I had set up an enum for this:
typedef enum {
DMAMUX1_REQ_GEN0 = 1, DMAMUX1_REQ_GEN1 = 2, DMAMUX1_REQ_GEN2 = 3, DMAMUX1_REQ_GEN3 = 4, DMAMUX1_REQ_GEN4 = 5,
DMAMUX1_REQ_GEN5 = 6, DMAMUX1_REQ_GEN6 = 7, DMAMUX1_REQ_GEN7 = 8, ADC1_DMA = 9, ADC2_DMA = 10,
TIM1_CH1 = 11, TIM1_CH2 = 12, TIM1_CH3 = 13, TIM1_CH4 = 14, TIM1_UP = 15,
TIM1_TRIG = 16, TIM1_COM = 17, TIM2_CH1 = 18, TIM2_CH2 = 19, TIM2_CH3 = 20,
TIM2_CH4 = 21, TIM2_UP = 22, TIM3_CH1 = 23, TIM3_CH2 = 24, TIM3_CH3 = 25,
TIM3_CH4 = 26, TIM3_UP = 27, TIM3_TRIG = 28, TIM4_CH1 = 29, TIM4_CH2 = 30,
TIM4_CH3 = 31, TIM4_UP = 32, I2C1_RX_DMA = 33, I2C1_TX_DMA = 34, I2C2_RX_DMA = 35,
I2C2_TX_DMA = 36, SPI1_RX_DMA = 37, SPI1_TX_DMA = 38, SPI2_RX_DMA = 39, SPI2_TX_DMA = 40,
USART1_RX_DMA = 41, USART1_TX_DMA = 42, USART2_RX_DMA = 43, USART2_TX_DMA = 44, USART3_RX_DMA = 45,
USART3_TX_DMA = 46, TIM8_CH1 = 47, TIM8_CH2 = 48, TIM8_CH3 = 49, TIM8_CH4 = 50,
TIM8_UP = 51, TIM8_TRIG = 52, TIM8_COM = 53, RESERVED = 54, TIM5_CH1 = 55,
TIM5_CH2 = 56, TIM5_CH3 = 57, TIM5_CH4 = 58, TIM5_UP = 59, TIM5_TRIG = 60,
SPI3_RX_DMA = 61, SPI3_TX_DMA = 62, UART4_RX_DMA = 63, UART4_TX_DMA = 64, UART5_RX_DMA = 65,
UART5_TX_DMA = 66, DAC_CH1_DMA = 67, DAC_CH2_DMA = 68, TIM6_UP = 69, TIM7_UP = 70,
USART6_RX_DMA = 71, USART6_TX_DMA = 72, I2C3_RX_DMA = 73, I2C3_TX_DMA = 74, DCMI_DMA = 75,
CRYP_IN_DMA = 76, CRYP_OUT_DMA = 77, HASH_IN_DMA = 78, UART7_RX_DMA = 79, UART7_TX_DMA = 80,
UART8_RX_DMA = 81, UART8_TX_DMA = 82, SPI4_RX_DMA = 83, SPI4_TX_DMA = 84, SPI5_RX_DMA = 85,
SPI5_TX_DMA = 86, SAI1A_DMA = 87, SAI1B_DMA = 88, SAI2A_DMA = 89, SAI2B_DMA = 90,
SWPMI_RX_DMA = 91, SWPMI_TX_DMA = 92, SPDIFRX_DAT_DMA = 93, SPDIFRX_CTRL_DMA = 94, HR_REQ_1 = 95,
HR_REQ_2 = 96, HR_REQ_3 = 97, HR_REQ_4 = 98, HR_REQ_5 = 99, HR_REQ_6 = 100,
DFSDM1_DMA0 = 101, DFSDM1_DMA1 = 102, DFSDM1_DMA2 = 103, DFSDM1_DMA3 = 104, TIM15_CH1 = 105,
TIM15_UP = 106, TIM15_TRIG = 107, TIM15_COM = 108, TIM16_CH1 = 109, TIM16_UP = 110,
TIM17_CH1 = 111, TIM17_UP = 112, SAI3_A_DMA = 113, SAI3_B_DMA = 114, ADC3_DMA = 115
} DMAMUX1_CxCR_DMAREQ_ID;
It would be good to have something like this, so instead of seeing 86 you would
see: DMAMUX1_CxCR_DMAREQ_ID::SPI5_TX_DMA
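As a rough sketch of that idea, the hardware table could carry named request IDs instead of bare numbers. The names `DmaMuxReq`, `SpiDmaInfo`, `kSpiDmaInfo` and `txRequestId` below are hypothetical and not part of the driver; only the four numeric values are taken from the enum above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Subset of the DMAMUX1 request IDs from the enum above (values copied as-is).
enum class DmaMuxReq : uint8_t {
  SPI1_RX = 37, SPI1_TX = 38,
  SPI5_RX = 85, SPI5_TX = 86,
};

// Hypothetical, trimmed-down version of the driver's hardware table.
struct SpiDmaInfo {
  const char *name;   // Arduino-level name, for illustration only
  DmaMuxReq tx_req;   // request ID routed to the TX stream
  DmaMuxReq rx_req;   // request ID routed to the RX stream
};

constexpr SpiDmaInfo kSpiDmaInfo[] = {
  {"SPI",  DmaMuxReq::SPI1_TX, DmaMuxReq::SPI1_RX},  // SPI  -> hardware SPI1
  {"SPI1", DmaMuxReq::SPI5_TX, DmaMuxReq::SPI5_RX},  // SPI1 -> hardware SPI5
};

// Returns the raw number that would be shifted into the DMAMUX CCR register.
inline uint8_t txRequestId(std::size_t spi_num) {
  assert(spi_num < sizeof(kSpiDmaInfo) / sizeof(kSpiDmaInfo[0]));
  return static_cast<uint8_t>(kSpiDmaInfo[spi_num].tx_req);
}
```

With a scoped enum the table rows read as SPI5_TX rather than 86, and the cast back to a number happens in exactly one place.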
One Time Initialize DMA
Not sure if anyone would find this useful, but my one-shot initialization code currently looks like:
void ILI9341_GIGA_n::initDMASettings(void) {
if (_dma_state & ILI9341_DMA_INIT) return;
// See which DMA stream we are using. There is probably a cleaner way to do this...
uint32_t denable = 0;
switch ((uint32_t)_dmaStream) {
default: return; // not in our list.
case DMA1_Stream0_BASE: _dmamux = DMAMUX1_Channel0; _pdma = DMA1; _dma_channel = 0; _dmaTXIrq = DMA1_Stream0_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream1_BASE: _dmamux = DMAMUX1_Channel1; _pdma = DMA1; _dma_channel = 1; _dmaTXIrq = DMA1_Stream1_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream2_BASE: _dmamux = DMAMUX1_Channel2; _pdma = DMA1; _dma_channel = 2; _dmaTXIrq = DMA1_Stream2_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream3_BASE: _dmamux = DMAMUX1_Channel3; _pdma = DMA1; _dma_channel = 3; _dmaTXIrq = DMA1_Stream3_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream4_BASE: _dmamux = DMAMUX1_Channel4; _pdma = DMA1; _dma_channel = 4; _dmaTXIrq = DMA1_Stream4_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream5_BASE: _dmamux = DMAMUX1_Channel5; _pdma = DMA1; _dma_channel = 5; _dmaTXIrq = DMA1_Stream5_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream6_BASE: _dmamux = DMAMUX1_Channel6; _pdma = DMA1; _dma_channel = 6; _dmaTXIrq = DMA1_Stream6_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
case DMA1_Stream7_BASE: _dmamux = DMAMUX1_Channel7; _pdma = DMA1; _dma_channel = 7; _dmaTXIrq = DMA1_Stream7_IRQn; denable = RCC_AHB1ENR_DMA1EN; break;
// DMA2 streams 0-7 are served by DMAMUX1 channels 8-15 (DMAMUX2 belongs to BDMA).
case DMA2_Stream0_BASE: _dmamux = DMAMUX1_Channel8; _pdma = DMA2; _dma_channel = 0; _dmaTXIrq = DMA2_Stream0_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream1_BASE: _dmamux = DMAMUX1_Channel9; _pdma = DMA2; _dma_channel = 1; _dmaTXIrq = DMA2_Stream1_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream2_BASE: _dmamux = DMAMUX1_Channel10; _pdma = DMA2; _dma_channel = 2; _dmaTXIrq = DMA2_Stream2_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream3_BASE: _dmamux = DMAMUX1_Channel11; _pdma = DMA2; _dma_channel = 3; _dmaTXIrq = DMA2_Stream3_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream4_BASE: _dmamux = DMAMUX1_Channel12; _pdma = DMA2; _dma_channel = 4; _dmaTXIrq = DMA2_Stream4_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream5_BASE: _dmamux = DMAMUX1_Channel13; _pdma = DMA2; _dma_channel = 5; _dmaTXIrq = DMA2_Stream5_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream6_BASE: _dmamux = DMAMUX1_Channel14; _pdma = DMA2; _dma_channel = 6; _dmaTXIrq = DMA2_Stream6_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
case DMA2_Stream7_BASE: _dmamux = DMAMUX1_Channel15; _pdma = DMA2; _dma_channel = 7; _dmaTXIrq = DMA2_Stream7_IRQn; denable = RCC_AHB1ENR_DMA2EN; break;
}
_dma_state |= ILI9341_DMA_INIT | ILI9341_DMA_EVER_INIT;
// Enable the clock for the selected DMA controller (DMA1 or DMA2)
SET_BIT(RCC->AHB1ENR, denable);
delay(50);
_dmaStream->M0AR = (uint32_t)_pfbtft;
_dmaStream->PAR = (uint32_t)&_pgigaSpi->TXDR;
uint32_t cr = 0;
cr |= 1 << DMA_SxCR_DIR_Pos; // Memory to peripheral
cr |= (2 << DMA_SxCR_MSIZE_Pos); // Memory size 32 bits (experiment; the peripheral side is 16 bits)
cr |= DMA_SxCR_MINC; // Memory increment
cr |= 1 << DMA_SxCR_PSIZE_Pos; // Peripheral size 16 bits
//cr |= DMA_SxCR_PFCTRL; // Peripheral in control of flow control
cr |= DMA_SxCR_TCIE; // Interrupt on completion
cr |= DMA_SxCR_TEIE; // Interrupt on error
cr |= (3u << DMA_SxCR_PL_Pos); // Very high priority
cr |= (0x1 << DMA_SxCR_MBURST_Pos); // Memory burst INCR4
_dmaStream->CR = cr;
// Experiment with FIFO not direct
_dmaStream->FCR |= (0x3 << DMA_SxFCR_FTH_Pos) | DMA_SxFCR_DMDIS; // disable Direct mode
_dmaStream->FCR |= DMA_SxFCR_FEIE; // Enable interrupt on FIFO error
/******************MEM Address to buffer**************/
/*******************Number of data transfer***********/
_dmaStream->NDTR = 38400; // (320*240/2); NDTR range is 0 - 65535
/******************Periph request**********************/
// Point to our SPI
_dmaActiveDisplay[_spi_num] = this;
_dmamux->CCR = (_dmamux->CCR & ~(DMAMUX_CxCR_DMAREQ_ID_Msk))
| (s_spi_hardware_mapping[_spi_num].tx_dmamux1_req_id << DMAMUX_CxCR_DMAREQ_ID_Pos);
// Transfer complete and error interrupts
clearDMAInterruptStatus(DMA_LIFCR_CTCIF0);
NVIC_SetVector(_dmaTXIrq, (uint32_t)s_spi_hardware_mapping[_spi_num].txdmaisr);
NVIC_EnableIRQ(_dmaTXIrq);
}
As I noted in the code, there are probably easier/cleaner ways to glean information about the stream that was passed in.
Currently I need to know which DMAMUX channel it maps to, whether it is on DMA1 or DMA2, its stream number within that controller (the _dma_channel value), which IRQ number is assigned to the stream, and which flag I need to set to enable the DMA object.
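One possible alternative to the long switch is to compute most of these values from the stream address. The sketch below is only an illustration and runs on a plain host compiler: the base addresses, the 0x10 offset of stream 0 and the 0x18 stride are assumptions as read from the STM32H7 CMSIS header, and it assumes that DMA2 streams are served by DMAMUX1 channels 8 to 15, as the reference manual appears to describe. `StreamInfo` and `decodeStream` are hypothetical names. The IRQ numbers are not contiguous, so they would still need a small table.

```cpp
#include <cassert>
#include <cstdint>

// Addresses as read from the STM32H7 CMSIS header; treat them as assumptions.
constexpr uint32_t kDma1Base = 0x40020000u;     // DMA1_BASE
constexpr uint32_t kDma2Base = 0x40020400u;     // DMA2_BASE
constexpr uint32_t kFirstStreamOffset = 0x10u;  // stream 0 registers start here
constexpr uint32_t kStreamStride = 0x18u;       // bytes between stream blocks

struct StreamInfo {
  bool valid;
  uint8_t controller;    // 1 for DMA1, 2 for DMA2
  uint8_t stream;        // 0..7 within the controller
  uint8_t dmamux1_chan;  // 0..7 for DMA1, 8..15 for DMA2 (assumed mapping)
};

// Decodes a DMAx_Streamy base address into controller, stream and mux channel.
inline StreamInfo decodeStream(uint32_t stream_addr) {
  StreamInfo info{false, 0, 0, 0};
  for (uint8_t ctrl = 0; ctrl < 2; ctrl++) {
    uint32_t first = (ctrl == 0 ? kDma1Base : kDma2Base) + kFirstStreamOffset;
    if (stream_addr < first) continue;
    uint32_t off = stream_addr - first;
    if (off % kStreamStride != 0 || off / kStreamStride > 7) continue;
    info.valid = true;
    info.controller = static_cast<uint8_t>(ctrl + 1);
    info.stream = static_cast<uint8_t>(off / kStreamStride);
    info.dmamux1_chan = static_cast<uint8_t>(ctrl * 8 + info.stream);
    assert(info.dmamux1_chan < 16);
    return info;
  }
  return info;  // address is not a DMA1/DMA2 stream
}
```

On the board itself the argument would be `(uint32_t)_dmaStream`, and an invalid result would correspond to the `default: return;` case of the switch.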
With this I enable the DMA (DMA1 or DMA2) object.
I set up the pointer to the peripheral (the TXDR register of the hardware SPI object).
I set up the memory pointer here as well, but that will probably be removed, as it is set each time I start the DMA operation. Likewise for the count.
It also initializes the DMAMUX entry, with this line:
_dmamux->CCR = (_dmamux->CCR & ~(DMAMUX_CxCR_DMAREQ_ID_Msk))
| (s_spi_hardware_mapping[_spi_num].tx_dmamux1_req_id << DMAMUX_CxCR_DMAREQ_ID_Pos);
where it passes in that magic number, which maps the output to the SPI object.
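The read-modify-write on CCR can be tried out on a host compiler with a plain integer standing in for the register. This is a minimal sketch, assuming the request ID sits in the low 7 bits of CCR for DMAMUX1; real code should keep using `DMAMUX_CxCR_DMAREQ_ID_Msk` and `DMAMUX_CxCR_DMAREQ_ID_Pos` from the CMSIS header. `kReqIdPos`, `kReqIdMask` and `withRequestId` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed field layout: request ID in the low 7 bits of CCR (DMAMUX1).
constexpr uint32_t kReqIdPos = 0;
constexpr uint32_t kReqIdMask = 0x7Fu << kReqIdPos;

// Returns the new CCR value with only the request-ID field replaced;
// all other bits of the old value are preserved.
inline uint32_t withRequestId(uint32_t ccr, uint8_t req_id) {
  assert(req_id <= 0x7F);
  return (ccr & ~kReqIdMask) | ((uint32_t{req_id} << kReqIdPos) & kReqIdMask);
}
```

For example, `withRequestId(0, 86)` yields 86, the SPI5 TX request from the table.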
SPI and DMA changes each call:
Each time you call updateScreenAsync, its initialization code makes changes to the SPI and DMA objects.
Not sure if showing the code here helps or not, but:
bool ILI9341_GIGA_n::updateScreenAsync(bool update_cont) {
if (!_use_fbtft) return false;
if (_dma_state & ILI9341_DMA_ACTIVE) { return false; }
initDMASettings();
SCB_CleanInvalidateDCache_by_Addr( _pfbtft, CBALLOC);
// reset the buffers.
_dmaStream->M0AR = (uint32_t)_pfbtft;
_dmaStream->NDTR = 38400; // (320*240/2); NDTR range is 0 - 65535
// Let's set up the transfer ... everything before the fill screen.
beginSPITransaction(_SPI_CLOCK);
// Doing full window.
setAddr(0, 0, width() - 1, height() - 1);
writecommand_cont(ILI9341_RAMWR);
setDataMode();
setSPIDataSize(16);
_dma_sub_frame_count = 0;
// enable TX in the stream
SET_BIT(_dmaStream->CR, DMA_SxCR_EN_Msk);
// Enable TXDMA in SPI
SET_BIT(_pgigaSpi->CFG1, SPI_CFG1_TXDMAEN); // enable SPI TX
// finally enable SPI
SET_BIT(_pgigaSpi->CR1, SPI_CR1_SPE); // enable SPI
SET_BIT(_pgigaSpi->CR1, SPI_CR1_CSTART); // start the transfer
if (update_cont) _dma_state |= ILI9341_DMA_CONT;
_dma_state |= ILI9341_DMA_ACTIVE;
#ifdef DEBUG_ASYNC_UPDATE
dumpDMASettings();
#endif
return true;
}
A lot of the above code is specific to this driver, but there are some interesting issues or hints like:
- The DMA system appears to output what is actually stored in physical memory and does not go through the cache, so you need to make sure the cached data is written out to memory. This is done with the call:
SCB_CleanInvalidateDCache_by_Addr( _pfbtft, CBALLOC);
- As I mentioned earlier, I set up the memory pointer to the start of the frame buffer, plus I set up the initial count:
// reset the buffers.
_dmaStream->M0AR = (uint32_t)_pfbtft;
_dmaStream->NDTR = 38400; // (320*240/2); NDTR range is 0 - 65535
Note on the count: the display is 320*240 words in size, which does not fit in the NDTR range of up to 65535, so this is one half of the display. When this completes, the ISR restarts the DMA to do the second half.
- Sort of outside the scope here, but the call setSPIDataSize(16) switches the SPI from 8-bit transfers to 16-bit transfers, which I use to output the frame buffer.
- Then the code enables the DMA with the SPI TX. RM section 53.4.14 describes the order in which to start and stop the communication.
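The half-frame count of 38400 can also be derived rather than hard-coded. The following host-side sketch splits a transfer into the fewest near-equal chunks that each fit in the 16-bit NDTR counter; `splitForNdtr` and `kMaxNdtr` are hypothetical names, not part of the driver, and the driver itself simply restarts the stream from its ISR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kMaxNdtr = 65535;  // NDTR is a 16-bit counter

// Splits total_items into the fewest near-equal chunks that each fit in NDTR.
inline std::vector<uint32_t> splitForNdtr(uint32_t total_items) {
  std::vector<uint32_t> chunks;
  if (total_items == 0) return chunks;
  // Number of chunks = ceil(total_items / kMaxNdtr), written without overflow.
  uint32_t count = total_items / kMaxNdtr + (total_items % kMaxNdtr != 0 ? 1 : 0);
  uint32_t base = total_items / count;
  uint32_t extra = total_items % count;  // the first `extra` chunks get one more
  assert(base + (extra != 0 ? 1 : 0) <= kMaxNdtr);
  for (uint32_t i = 0; i < count; i++) {
    chunks.push_back(base + (i < extra ? 1 : 0));
  }
  return chunks;
}
```

For the 320*240 frame buffer this gives two chunks of 38400 items each, matching the value written to NDTR above.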
Processing the Interrupt:
Here is the start of the ISR Code:
void ILI9341_GIGA_n::dmaInterrupt(void) {
if (_dmaActiveDisplay[0]) {
_dmaActiveDisplay[0]->process_dma_interrupt();
}
}
void ILI9341_GIGA_n::dmaInterrupt1(void) {
if (_dmaActiveDisplay[1]) {
_dmaActiveDisplay[1]->process_dma_interrupt();
}
}
uint8_t ILI9341_GIGA_n::getDMAInterruptStatus() {
switch (_dma_channel) {
case 0: return (_pdma->LISR >> 0) & 0x3f;
case 1: return (_pdma->LISR >> 6) & 0x3f;
case 2: return (_pdma->LISR >> 16) & 0x3f;
case 3: return (_pdma->LISR >> 22) & 0x3f;
case 4: return (_pdma->HISR >> 0) & 0x3f;
case 5: return (_pdma->HISR >> 6) & 0x3f;
case 6: return (_pdma->HISR >> 16) & 0x3f;
case 7: return (_pdma->HISR >> 22) & 0x3f;
default: return 0; // unknown stream: report no flags
}
}
void ILI9341_GIGA_n::clearDMAInterruptStatus(uint8_t clear_flags) {
switch (_dma_channel) {
case 0: _pdma->LIFCR = (clear_flags << 0); break;
case 1: _pdma->LIFCR = (clear_flags << 6); break;
case 2: _pdma->LIFCR = (clear_flags << 16); break;
case 3: _pdma->LIFCR = (clear_flags << 22); break;
case 4: _pdma->HIFCR = (clear_flags << 0); break;
case 5: _pdma->HIFCR = (clear_flags << 6); break;
case 6: _pdma->HIFCR = (clear_flags << 16); break;
case 7: _pdma->HIFCR = (clear_flags << 22); break;
}
}
void ILI9341_GIGA_n::process_dma_interrupt(void) {
txIRQCount++;
DBGdigitalToggleFast(LED_BLUE);
uint8_t istatus = getDMAInterruptStatus();
if (istatus & DMA_LISR_TCIF0) {
clearDMAInterruptStatus(DMA_LIFCR_CTCIF0);
if (_dma_sub_frame_count == 0) {
...
- Routing interrupt to object: There are cleaner ways to do some of this, like lambda functions and the like, but I am using a simple array in which the display on SPI uses index 0 and the one on SPI1 uses index 1.
- Checking and clearing interrupt state: I wish there were a cleaner way to programmatically handle the 16 DMA streams (8 on DMA1 and 8 on DMA2), as the status and clear registers are a bit screwy on each of the DMAs: streams 0-3 are handled by the LISR and LIFCR registers and streams 4-7 by HISR/HIFCR.
Each stream has 6 bits (actually one bit is skipped). I thought of trying to set up a typedef for this register, but the fields are packed sort of strangely.
So I used those two functions to map to and from the first 6 bits, which is working.
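The two switch statements could be collapsed into a small shift table, since the offsets 0, 6, 16 and 22 repeat for the low and high registers. This is a host-side sketch with plain integers standing in for LISR/HISR; `kFlagShift`, `streamFlags` and `clearMask` are hypothetical names, and the offsets are the ones used in the two functions above.

```cpp
#include <cassert>
#include <cstdint>

// Bit offset of each stream's flag group inside LISR/HISR (and LIFCR/HIFCR).
// Streams 0-3 and 4-7 use the same offsets in the low and high register.
constexpr uint8_t kFlagShift[4] = {0, 6, 16, 22};

// Extracts the 6-bit flag group for `stream` from the two status values.
inline uint8_t streamFlags(uint32_t lisr, uint32_t hisr, uint8_t stream) {
  assert(stream < 8);
  uint32_t reg = (stream < 4) ? lisr : hisr;
  return static_cast<uint8_t>((reg >> kFlagShift[stream & 3]) & 0x3Fu);
}

// Builds the value to write to LIFCR (streams 0-3) or HIFCR (streams 4-7).
inline uint32_t clearMask(uint8_t flags, uint8_t stream) {
  assert(stream < 8);
  return static_cast<uint32_t>(flags & 0x3Fu) << kFlagShift[stream & 3];
}
```

The caller still has to pick the low or high clear register based on whether the stream number is below 4, just as the switch does.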
This is all for now
Not sure if anyone will find any of this helpful or not, but I hope so. Let me know if you have any suggestions or comments.
Edit:
Will be curious to see if any of this works with the Portenta H7... (Have one coming)
And I should mention that we have some things working with this, like a version of uncanny eyes I ported over that works with one display on SPI and the other on SPI1
Thanks
Kurt