I'm kinda confused by the fact that the MCU doesn't know that at 8 MHz, one time unit takes twice as long as at 16 MHz...
The MCU doesn't know how fast it's running. It thinks in clock cycles. Regardless of the clock rate, each instruction takes the same number of cycles, so to the MCU it's all the same.
The Arduino core uses the F_CPU value set in boards.txt to convert between clock cycles and real time.
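To make that concrete, here is a rough sketch of the arithmetic. The helper names are made up for illustration and are not the core's own functions, and the 160-cycle figure is just an example. The same stretch of code takes the same number of cycles at either clock rate; only the conversion to microseconds changes with F_CPU.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only (not taken from the Arduino core).

// Clock cycles that elapse in one microsecond at a given clock frequency.
constexpr std::uint32_t cycles_per_micro(std::uint32_t f_cpu_hz) {
    return f_cpu_hz / 1000000UL;
}

// Real time, in microseconds, taken by a fixed number of clock cycles.
constexpr std::uint32_t cycles_to_micros(std::uint32_t cycles,
                                         std::uint32_t f_cpu_hz) {
    return cycles / cycles_per_micro(f_cpu_hz);
}

// The same 160 cycles of code, run at two clock rates:
static_assert(cycles_to_micros(160, 16000000UL) == 10, "10 us at 16 MHz");
static_assert(cycles_to_micros(160, 8000000UL) == 20, "20 us at 8 MHz");
```

The MCU only ever sees "160 cycles". It's the F_CPU constant baked in at compile time that tells the timing code how long that really is.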
I'm going to go home and change the count > 6 to count > 3 for giggles and see what happens.
Remember that "delayMicroseconds(1)" automatically adjusts when you switch from a 16 MHz Arduino to an 8 MHz one. It's all the other instructions that will take twice as long, so the right count won't necessarily be exactly half. Try values from 5 to 2 to see which ones give the desired results. If more than one value gives the desired results, pick one in the middle.
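Here is a rough back-of-the-envelope sketch of why the answer probably lands somewhere between 3 and 6. Everything in it is an assumption for illustration: the function names are made up, the delay is taken to be exactly 1 us, and the 8 cycles of per-iteration overhead is a guess, not a measurement of your loop.

```cpp
#include <cstdint>

// Hypothetical model: each loop iteration is a 1 us delay plus some
// fixed number of overhead cycles (compare, increment, branch, ...).
// The delay stays at 1 us; only the overhead scales with the clock.
constexpr std::uint32_t iteration_ns(std::uint32_t overhead_cycles,
                                     std::uint32_t f_cpu_hz) {
    return 1000UL + overhead_cycles * 1000UL / (f_cpu_hz / 1000000UL);
}

// Loop count at new_hz that covers roughly the same real time as
// old_count iterations did at old_hz (rounded down).
constexpr std::uint32_t equivalent_count(std::uint32_t old_count,
                                         std::uint32_t overhead_cycles,
                                         std::uint32_t old_hz,
                                         std::uint32_t new_hz) {
    return old_count * iteration_ns(overhead_cycles, old_hz)
           / iteration_ns(overhead_cycles, new_hz);
}

// With an assumed 8 cycles of overhead, a count of 6 at 16 MHz
// maps to about 4 at 8 MHz, not 3.
static_assert(equivalent_count(6, 8, 16000000UL, 8000000UL) == 4,
              "count lands between 3 and 6");
```

Since the real overhead isn't known without looking at the compiled code, trying the values on the hardware is still the practical way to settle it.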