PaulRB:
If you use 3 Arduino pins as CLK pins, you would fix that problem too. Each would have a fan-out of 10. But then you could not use the hardware SPI. You would need to use the shiftOut() function, which is much slower. This may not be a problem for your project, depending on what speed is required and what else the Arduino is doing.
That got me thinking. You could run SCK to, say, the enable input of a 1-of-8 decoder (74LS138) and use two Arduino pins to drive the decoder's address-select inputs. The decoder would provide the fan-out, and you could clock each bank of shift registers independently while still using hardware SPI.
Possible?
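To sanity-check the idea, here is a rough logic-level model of the decoder acting as a clock gate. It is only a sketch under assumed wiring that is not part of the post above: SCK on one of the active-low enables (/G2A), G1 tied high, /G2B tied low, and address input C tied low so two pins pick banks 0 to 3. The names `decoderOutputs`, `risingEdges` and `PinState` are made up for illustration. Since the '138 outputs are active-low, the selected output follows SCK and the unselected ones sit high.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a 74LS138-style decoder used as a clock gate.
// Assumed wiring: SCK -> /G2A, G1 high, /G2B low, C low.
// Outputs are active-low: every output idles high, and only the
// addressed output goes low while the chip is enabled (SCK low).
std::vector<bool> decoderOutputs(bool sck, uint8_t addr) {
    assert(addr < 8);
    std::vector<bool> y(8, true);
    if (!sck) {
        y[addr] = false;  // selected output mirrors SCK
    }
    return y;
}

// One snapshot of the pins the Arduino drives.
struct PinState {
    bool sck;
    uint8_t addr;
};

// Count the rising edges a given bank's clock input would see
// across a sequence of pin states.
int risingEdges(const std::vector<PinState>& seq, uint8_t bank) {
    if (seq.empty()) {
        return 0;
    }
    int edges = 0;
    bool prev = decoderOutputs(seq[0].sck, seq[0].addr)[bank];
    for (std::size_t i = 1; i < seq.size(); ++i) {
        bool now = decoderOutputs(seq[i].sck, seq[i].addr)[bank];
        if (!prev && now) {
            ++edges;
        }
        prev = now;
    }
    return edges;
}
```

With this wiring, the model suggests one thing to watch: if the address pins change while SCK is low, the bank being deselected sees its clock line go from low to high, which a rising-edge-clocked shift register would treat as an extra clock. Changing the address only while SCK is high (for example, by using an SPI mode where the clock idles high) avoids that in the model. Real parts also have propagation delay and possible glitches on address changes, which this sketch ignores.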