My next step is to remind myself how to do direct port manipulation for the CS LOW/HIGH, since there is a long delay between the CS pin flipping and the actual transfer, both before and after it.
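IIRC it should be something like PORTD |= 0b00100000 for HIGH and PORTD &= ~0b00100000 for LOW (a plain PORTD = PORTD & 0b00100000 would clear every other pin on the port) ....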
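To double-check the mask logic off the board, here is a minimal sketch that runs on a PC. It is only an illustration: csHigh, csLow and CS_MASK are made-up names, the uint8_t argument stands in for a real 8-bit port register such as PORTD, and putting CS on bit 5 is an assumption taken from the mask above.

```cpp
#include <cstdint>

// Assumption for illustration: CS is wired to bit 5 of an 8-bit port.
constexpr uint8_t CS_MASK = 0b00100000;

// Drive CS HIGH: OR-ing with the mask sets bit 5 and leaves the other bits alone.
inline void csHigh(uint8_t &port) {
    port |= CS_MASK;
}

// Drive CS LOW: AND-ing with the inverted mask clears bit 5 and leaves the other bits alone.
inline void csLow(uint8_t &port) {
    port &= static_cast<uint8_t>(~CS_MASK);
}
```

On a real AVR board the same two operations would be applied to the port register itself instead of a plain variable.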
There is also a digitalWriteDirect function (from another thread) that I think is about 9x faster than digitalWrite:
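// example: digitalWriteDirect(10, HIGH)
// Uses the SAM core's g_APinDescription table to look up the pin's PIO port and bit mask.
inline void digitalWriteDirect(int pin, boolean val) {
  if (val) {
    // PIO_SODR (Set Output Data Register): writing the mask drives the pin HIGH
    g_APinDescription[pin].pPort->PIO_SODR = g_APinDescription[pin].ulPin;
  } else {
    // PIO_CODR (Clear Output Data Register): writing the mask drives the pin LOW
    g_APinDescription[pin].pPort->PIO_CODR = g_APinDescription[pin].ulPin;
  }
}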