Whenever you need a short delay (from a few ns up to 1 or 2 us), a simple option is to insert some NOPs.
A NOP is an assembler instruction which does... nothing, but takes 1 clock cycle. However, an ARM uC inserts its own wait states here and there between instructions, so if you add, e.g., 50 NOPs, you can be sure that the core will add a few wait states on top. AFAICT the duration of a wait state is equal to the duration of a NOP, i.e. 1 clock cycle.
If you clock your DUE at 84 MHz, 1 NOP = 1 / 84 MHz ≈ 11.9 ns.
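To make the arithmetic above concrete, here is a minimal sketch of how one might estimate the cycle time and a starting NOP count for a given delay. The names kCpuHz, cycle_ps and nops_for_ns are hypothetical helpers of mine, not part of the Arduino core, and the result ignores the extra wait states, so treat it as a first guess to be tuned by measurement:

```cpp
#include <cstdint>

// Assumed core clock of the DUE (84 MHz, as stated in the post).
constexpr uint32_t kCpuHz = 84000000UL;

// Duration of one clock cycle in picoseconds, rounded to the nearest ps.
constexpr uint32_t cycle_ps(uint32_t cpu_hz) {
  return static_cast<uint32_t>((1000000000000ULL + cpu_hz / 2) / cpu_hz);
}

// Smallest number of 1-cycle NOPs covering at least delay_ns.
// Wait states are ignored here, so the real delay will be somewhat longer.
constexpr uint32_t nops_for_ns(uint32_t delay_ns, uint32_t cpu_hz) {
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(delay_ns) * cpu_hz + 999999999ULL) / 1000000000ULL);
}
```

For example, nops_for_ns(1000, kCpuHz) gives 84 NOPs for 1 us, and 50 NOPs come out at roughly 50 x 11.9 ns ≈ 595 ns before any wait states are added.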
Once you have inserted some NOPs (and the core has added its own wait states), you can measure the actual duration with SysTick->VAL and fine-tune the number of NOPs accordingly.
Here is an assembler macro I use to insert some NOPs:
// Define an assembler macro NOPX that expands to P consecutive NOPs
__asm__ __volatile__(
  ".macro NOPX P \n\t"   // Start of macro, P = number of NOPs
  ".rept &P      \n\t"   // Repeat the block below P times
  "  NOP         \n\t"
  ".endr         \n\t"   // End of repeat
  ".endm         \n\t"   // End of macro
);

void setup() {
}

void loop() {
  // Insert 50 NOPs.
  // The uC will insert its own wait states between the NOPs,
  // so the delay is a bit longer than 50 NOPs.
  __asm__ __volatile__("NOPX 50");
}
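For the SysTick->VAL check mentioned above, here is a rough sketch of the bookkeeping. SysTick is a down-counter that reloads from SysTick->LOAD after reaching 0, so the elapsed count is "start minus end", with a correction if the counter wrapped in between. The helper systick_elapsed is a hypothetical name of mine, and I am assuming SysTick runs from the core clock (1 tick = 1 CPU cycle), which I believe is how the Arduino core sets it up for the 1 ms tick:

```cpp
#include <cstdint>

// Ticks elapsed between two readings of a down-counter that reloads to
// `reload` after reaching 0. Only valid if the counter wrapped at most once.
constexpr uint32_t systick_elapsed(uint32_t start, uint32_t end, uint32_t reload) {
  return (start >= end) ? (start - end) : (start + (reload + 1U) - end);
}

// Intended use on the board (shown as comments, not compiled here):
//   uint32_t t0 = SysTick->VAL;
//   __asm__ __volatile__("NOPX 50");
//   uint32_t t1 = SysTick->VAL;
//   uint32_t cycles = systick_elapsed(t0, t1, SysTick->LOAD);
```

The two reads of SysTick->VAL cost a few cycles themselves, so it is worth measuring once with nothing in between and subtracting that overhead. Interrupts firing during the measurement will also inflate the result.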