How to generate a delay of less than a microsecond?

Check the microcontroller datasheet. For the Arduino Uno with the ATmega328, the datasheet (www.atmel.com/Images/doc8161.pdf#page=429) says the NOP instruction takes 1 clock cycle.

For a 16 MHz clock, 1 NOP takes 1/(16 MHz) = 1/(16*10^6) s = 62.5 ns to complete. Chaining several NOPs gives delays in multiples of 62.5 ns, so 16 NOPs make up 1 microsecond.
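As a rough sketch of that arithmetic, here is some C that works out how many NOPs a given delay needs and strings a few together. The names F_CPU_HZ, NOP(), cycle_time_ps(), nops_for_delay_ns() and delay_250ns() are made up for illustration, the 16 MHz value assumes a stock Uno, and the inline assembly assumes a GCC-style compiler such as avr-gcc.

```c
#include <stdint.h>

/* Assumption: 16 MHz clock, as on a stock Arduino Uno. */
#define F_CPU_HZ 16000000UL

/* One NOP costs one clock cycle. GCC/Clang inline-assembly syntax;
   on other compilers this illustrative macro does nothing. */
#if defined(__GNUC__)
#define NOP() __asm__ __volatile__("nop")
#else
#define NOP() ((void)0)
#endif

/* Length of one clock cycle in picoseconds (integer math, so
   62.5 ns is represented exactly as 62500 ps). */
uint32_t cycle_time_ps(uint32_t f_cpu_hz) {
    return (uint32_t)(1000000000000ULL / f_cpu_hz);
}

/* Smallest number of NOPs that waits at least delay_ns (rounds up). */
uint32_t nops_for_delay_ns(uint32_t f_cpu_hz, uint32_t delay_ns) {
    uint64_t scaled = (uint64_t)f_cpu_hz * delay_ns; /* cycles * 1e9 */
    return (uint32_t)((scaled + 999999999ULL) / 1000000000ULL);
}

/* Hypothetical helper: 4 NOPs, i.e. 4 * 62.5 ns = 250 ns at 16 MHz. */
static inline void delay_250ns(void) {
    NOP(); NOP(); NOP(); NOP();
}
```

Keep in mind the NOP count is only the minimum: if the helper is not inlined, the call and return add extra cycles, and so does any surrounding code such as a port write, so check the real timing with a scope or logic analyzer if it matters.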