Here's my thread on the same question:
http://arduino.cc/forum/index.php/topic,93737.0.html
I've been using the digitalWriteFast library, which somehow manages to be faster, smaller, and easier to use than direct port manipulation.
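
For anyone wondering how that can work: as far as I understand it, when the pin number is a compile-time constant, the pin-to-port lookup can be resolved by the compiler instead of at run time, so the call reduces to the same single-bit register write you would do by hand. Below is a rough host-side sketch of that idea in C++. It is not the library's actual code. The names `simPORTB`, `simPORTC`, `simPORTD`, `pinToBit`, `pinToPort`, `writeFastSketch`, and `demoPin13` are made up for illustration, the "registers" are plain variables standing in for the real ones, and the pin mapping assumes an Uno-style ATmega328P board.

```cpp
#include <cassert>
#include <cstdint>

// Plain variables standing in for the AVR output registers (assumption:
// host-side simulation only, not real hardware registers).
static volatile uint8_t simPORTB = 0;
static volatile uint8_t simPORTC = 0;
static volatile uint8_t simPORTD = 0;

// Uno-style mapping: pins 0-7 -> PORTD, 8-13 -> PORTB, 14-19 (A0-A5) -> PORTC.
constexpr uint8_t pinToBit(uint8_t pin) {
    return pin < 8 ? pin : (pin < 14 ? pin - 8 : pin - 14);
}

inline volatile uint8_t& pinToPort(uint8_t pin) {
    return pin < 8 ? simPORTD : (pin < 14 ? simPORTB : simPORTC);
}

// Hypothetical stand-in for a digitalWriteFast-style call. The pin is a
// template parameter, so the bit mask is computed at compile time.
template <uint8_t Pin>
inline void writeFastSketch(bool high) {
    static_assert(Pin < 20, "pin out of range for this sketch");
    constexpr uint8_t mask = static_cast<uint8_t>(1u << pinToBit(Pin));
    volatile uint8_t& port = pinToPort(Pin);
    if (high) {
        port = static_cast<uint8_t>(port | mask);   // set the bit
    } else {
        port = static_cast<uint8_t>(port & ~mask);  // clear the bit
    }
}

// Sets pin 13 once by hand and once through the wrapper, and checks that
// both leave the simulated register in the same state.
inline uint8_t demoPin13() {
    simPORTB = 0;
    simPORTB = static_cast<uint8_t>(simPORTB | (1u << 5));  // manual: pin 13 is PORTB bit 5
    const uint8_t manual = simPORTB;

    simPORTB = 0;
    writeFastSketch<13>(true);                              // same write, by pin number
    assert(simPORTB == manual);
    return simPORTB;
}
```

The point of the sketch is only that the pin-number version and the hand-written port write end up doing the same register update, while the pin-number version is easier to read and harder to get wrong.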