You might want to read my blog:
http://blog.blinkenlight.net/experiments/counting/fast-counter/ and, if that is not fast enough,
http://blog.blinkenlight.net/experiments/counting/faster-counter/.
The only way to go even faster is to use the hardware timers / PWM capabilities of the Arduino. For software approaches I am very close to the speed limit, and I am definitely faster than the "obvious" approach:
void loop()
{
    PORTD = HighMask;  // write the "high" pattern to all of port D
    PORTD = LowMask;   // write the "low" pattern to all of port D
}
Note that my examples are deliberate attempts to push the limits, not examples of structured / layered / clean code. They use almost all (but not quite all) of the tricks I could think of. Anyone who has ideas for pushing my examples closer to the maximum is very welcome to explain them to me. Preparing the stack so that subroutine calls can be replaced by jumps is something I have already thought of, but the additional speed gain is negligible.
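
For comparison, here is a rough sketch of the hardware timer route mentioned above. It is not taken from the blog posts. It assumes an ATmega328P based board (e.g. an Uno) running at 16 MHz, uses Timer1 in CTC mode, and toggles OC1A (Arduino pin 9) on every compare match. The helper ctc_toggle_hz is a name I made up for illustration; it just evaluates the datasheet formula f = f_clk / (2 * N * (1 + OCR1A)).

```cpp
#include <stdint.h>

// Hypothetical helper: output frequency of a pin toggled by a timer in CTC
// mode, per the ATmega328P datasheet formula f = f_clk / (2 * N * (1 + OCR)).
constexpr uint32_t ctc_toggle_hz(uint32_t f_cpu, uint32_t prescaler, uint32_t ocr)
{
    return f_cpu / (2UL * prescaler * (1UL + ocr));
}

#ifdef __AVR__
#include <avr/io.h>

void setup()
{
    DDRB  |= _BV(DDB1);               // OC1A is PB1 (pin 9 on an Uno, assumed board)
    TCCR1A = _BV(COM1A0);             // toggle OC1A on compare match
    TCCR1B = _BV(WGM12) | _BV(CS10);  // CTC mode (TOP = OCR1A), prescaler 1
    OCR1A  = 0;                       // compare match on every timer clock
    TCNT1  = 0;
}

void loop()
{
    // Nothing to do: the timer hardware toggles the pin without CPU help.
}
#endif
```

With a 16 MHz clock, prescaler 1 and OCR1A = 0 the formula gives 8 MHz on that single pin. The price is that this only produces a square wave on the timer's output pin and cannot write arbitrary patterns to a whole port the way the PORTD loop does.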
