Why use software when you have fast onboard hardware - the SPI port?
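#include <SPI.h>
// in setup(): pinMode(RCKpin, OUTPUT); SPI.begin();
digitalWrite(RCKpin, LOW);     // take the latch low before shifting
SPI.transfer(your_data_byte);  // hardware clocks out 8 bits on MOSI/SCK
digitalWrite(RCKpin, HIGH);    // rising edge on RCK moves the byte to the outputs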
SCK goes to SRCK (shift register clock)
MOSI goes to serial data in
MISO not used
SS goes to RCK (output register clock)
OE/ to GND if not using for PWM
MCLR to +5 if not being used
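
If it helps to see why RCK has to go low before the transfer and high after it, here is a rough host-side sketch in plain C++ (not Arduino code, and not taken from any datasheet). ShiftRegister595, spiTransfer and writeByte are made-up names for illustration, and the model assumes the default MSB-first bit order and a latch that copies on the rising edge of RCK.

```cpp
#include <cstdint>

// Hypothetical model of a '595-style part: a shift stage plus a latched output stage.
struct ShiftRegister595 {
    uint8_t shift = 0;    // internal shift stage, clocked by SRCK
    uint8_t outputs = 0;  // output stage, updated only by RCK
    bool rck = false;     // current level on the RCK pin

    // One rising edge on SRCK: shift everything up and take in one bit.
    void clockBit(bool serIn) {
        shift = static_cast<uint8_t>((shift << 1) | (serIn ? 1 : 0));
    }

    // Drive RCK; only a low-to-high transition copies shift -> outputs.
    void setRck(bool level) {
        if (level && !rck) {
            outputs = shift;
        }
        rck = level;
    }
};

// Stand-in for SPI.transfer(): eight clocks, most significant bit first (assumed).
void spiTransfer(ShiftRegister595& sr, uint8_t data) {
    for (int bit = 7; bit >= 0; --bit) {
        sr.clockBit(((data >> bit) & 1) != 0);
    }
}

// Same three steps as the snippet above: latch low, shift, latch high.
uint8_t writeByte(ShiftRegister595& sr, uint8_t data) {
    sr.setRck(false);
    spiTransfer(sr, data);
    sr.setRck(true);
    return sr.outputs;
}
```

In this model, calling spiTransfer() on its own changes the shift stage but leaves the outputs alone; they only change when RCK sees a rising edge, which is what the LOW/HIGH pair around SPI.transfer() provides.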