There is SoC-specific code in the Arduino libs; all I did was git clone the repo and "git grep micros":https://github.com/arduino/Arduino/blob/1.5.2/hardware/arduino/sam/cores/arduino/wiring.c#L31
If you want more precise timing, you are better off using:
It's equal to 1/84000000 seconds.
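To put the 1/84000000 s figure in perspective, here is a minimal sketch of the arithmetic, assuming the core really runs at 84 MHz. The helper names @cycles_to_ns@ and @ns_to_cycles@ are made up for illustration and are not part of the Arduino libs.

```c
#include <stdint.h>

/* Core clock assumed to be 84 MHz, matching the 1/84000000 s figure. */
#define CORE_HZ 84000000ULL

/* Hypothetical helper: nanoseconds covered by `cycles` core clock ticks,
 * rounded down. */
static uint64_t cycles_to_ns(uint64_t cycles)
{
    return cycles * 1000000000ULL / CORE_HZ;
}

/* Hypothetical helper: core clock ticks needed to cover at least `ns`
 * nanoseconds, rounded up. */
static uint64_t ns_to_cycles(uint64_t ns)
{
    return (ns * CORE_HZ + 999999999ULL) / 1000000000ULL;
}
```

So one tick is a bit under 12 ns, and 84 ticks make exactly 1 us. This only converts tick counts; it says nothing about how many ticks a given instruction actually takes.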
Err... assuming that a NOP instruction takes exactly 1 core tick is really brave, and opcodes surely shouldn't be used for timing purposes without calibrating them against some timer. A NOP might even get dropped by the core at runtime (the ARM docs say the processor might remove it from the pipeline before it reaches the execution stage), or it can take a few CPU cycles to fetch and "execute" it. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489e/Cjafcggi.html
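A rough sketch of what "calibrating against a timer" could look like: read a hardware counter before and after a known number of loop iterations and work out the average cost from the difference. The helpers below are hypothetical names and only do the arithmetic; they assume a down-counting timer that reloads after reaching 0 (SysTick works that way) and that it wrapped at most once between the two reads.

```c
#include <stdint.h>

/* Ticks elapsed between two reads of a down-counting timer that reloads
 * from `reload` after reaching 0, so its period is reload + 1 ticks.
 * Only valid if the counter wrapped at most once between the reads. */
static uint32_t downcounter_elapsed(uint32_t start, uint32_t end, uint32_t reload)
{
    if (start >= end)
        return start - end;              /* no wrap between the reads */
    return start + (reload + 1u) - end;  /* wrapped once through 0 */
}

/* Average cost of one loop iteration, in hundredths of a timer tick,
 * so that fractional costs are not lost to integer division. */
static uint32_t ticks_x100_per_iteration(uint32_t elapsed, uint32_t iterations)
{
    return (uint32_t)(((uint64_t)elapsed * 100u) / iterations);
}
```

On the target you would take @start@ and @end@ from the timer's current-value register (for SysTick that is @SysTick->VAL@ in CMSIS) around the loop being measured. The reload value depends on how the timer is configured; 83999 would correspond to a 1 ms period at 84 MHz, which is an assumption here, so check the actual setup.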
So yeah - been there, done that, so you don't have to.
//edit, as a sidenote:
While micros reads the system clock tick counter in a loop, there's also a Sleep function for longer delays (1 ms resolution), which uses Wait For Interrupt (WFI) - https://github.com/arduino/Arduino/blob/1.5.2/hardware/arduino/sam/system/libsam/source/timetick.c
On embedded targets, using the second one is better practice whenever you don't need better resolution than 1 ms: WFI puts the ARM core to sleep, which decreases power consumption, because the core is only woken up every 1 ms and then put back to sleep until the required delay has passed.
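To illustrate the idea (this is not the libsam code, just a sketch in the same spirit), here is a sleep loop that dozes between tick interrupts instead of spinning. The tick counter and the WFI call are replaced by host-side stand-ins so it can be run off-target; all names are made up.

```c
#include <stdint.h>

/* Host-side stand-ins (assumptions): on real hardware the tick would be a
 * millisecond counter bumped by the tick interrupt, and
 * wait_for_interrupt() would be the WFI instruction. */
static volatile uint32_t fake_tick_ms = 0;
static uint32_t wfi_calls = 0;

static uint32_t get_tick_ms(void)
{
    return fake_tick_ms;
}

/* Pretend the core slept until the next 1 ms tick interrupt woke it. */
static void wait_for_interrupt(void)
{
    fake_tick_ms++;
    wfi_calls++;
}

/* Sleep for about `ms` milliseconds (1 ms resolution), waking up once
 * per tick to check whether the delay has passed. */
static void sleep_ms(uint32_t ms)
{
    uint32_t start = get_tick_ms();

    /* Unsigned subtraction stays correct when the tick counter wraps. */
    while ((uint32_t)(get_tick_ms() - start) < ms)
        wait_for_interrupt();
}
```

With the stand-ins, @sleep_ms(5)@ ends up calling @wait_for_interrupt()@ five times. On the real core any interrupt ends a WFI, not just the tick, which is why the loop re-checks the counter after every wake-up instead of counting wake-ups.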