How to implement a delay using low-level code?

Post content lost due to vandalism by author

The SAM3 (and most bigger ARM chips) has relatively non-deterministic instruction timing (flash wait states, accelerators, pipeline stalls, and the like get in the way), so you don't usually see the "counted-cycles instruction loop" sort of code that is common on 8-bit chips.
The usual way to implement a delay is to use the SysTick timer, a 24-bit down-counter that can time up to 2**24 main clock cycles (about 0.2 s at 84 MHz) before it wraps around. You can also set it up to interrupt periodically and keep a count similar to millis() on the Arduino, then build delay calls on top of that count the way Arduino does.
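
Here is a rough sketch of both approaches, not a drop-in implementation. It assumes SysTick is free-running with the maximum reload value (0xFFFFFF); with any other reload value the mask arithmetic below doesn't apply. All the function names are made up for illustration. On real hardware with the CMSIS headers, the counter-reading hook would simply return SysTick->VAL and tick_handler() would be called from SysTick_Handler; here a simulated counter stands in so the logic can be tried on a PC.

```c
#include <stdint.h>

#define SYSTICK_MASK 0x00FFFFFFu   /* SysTick is 24 bits wide */

/* Cycles elapsed between two readings of a free-running 24-bit
   down-counter. Valid as long as fewer than 2^24 cycles have passed,
   even if the counter wrapped once in between. */
uint32_t systick_elapsed(uint32_t start, uint32_t now)
{
    return (start - now) & SYSTICK_MASK;
}

/* Hypothetical hook that returns the current counter value.
   On hardware (assumption: CMSIS) it would return SysTick->VAL. */
typedef uint32_t (*read_counter_fn)(void);

/* Busy-wait until at least `cycles` (< 2^24) counts have passed.
   Returns the number of counts actually measured. */
uint32_t delay_cycles(read_counter_fn read_counter, uint32_t cycles)
{
    uint32_t start = read_counter();
    uint32_t elapsed;
    do {
        elapsed = systick_elapsed(start, read_counter());
    } while (elapsed < cycles);
    return elapsed;
}

/* Stand-in for the hardware counter: drops by 7 counts per read and
   wraps at 24 bits, so the sketch also runs on a desktop. */
static uint32_t sim_value = 20u;

uint32_t sim_read_counter(void)
{
    sim_value = (sim_value - 7u) & SYSTICK_MASK;
    return sim_value;
}

/* millis()-style count, meant to be bumped from a 1 kHz tick interrupt. */
static volatile uint32_t tick_ms;

void tick_handler(void)            /* call this from the tick ISR */
{
    tick_ms++;
}

uint32_t millis_now(void)
{
    return tick_ms;
}

/* Nonzero once at least `ms` ticks have passed since `start_ms`.
   Unsigned subtraction keeps this correct when tick_ms rolls over. */
int ms_elapsed(uint32_t start_ms, uint32_t ms)
{
    return (uint32_t)(millis_now() - start_ms) >= ms;
}
```

A polled delay would then look like delay_cycles(read_hw_counter, 84000) for roughly 1 ms at 84 MHz, where read_hw_counter is your own one-line wrapper around the register read. Anything longer than about 0.2 s needs the interrupt-driven count (or repeated short delays), since a single 24-bit span can't cover it.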