Shift Registers and Arrays?

2 74HC165 for shifting in, 2 74HC595 or TPIC6B595 for shifting out.
Use SPI.transfer to read & write at the same time:

digitalWrite (csInPin, LOW);
digitalWrite (csInPin, HIGH); // capture inputs at '165s with low pulse
byte1in = SPI.transfer(byte1out); // read in while writing out
byte2in = SPI.transfer(byte2out);
digitalWrite (csOutPin, LOW);
digitalWrite (csOutPin, HIGH); // move newly shifted data to '595 outputs
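
If it helps to see why one transfer can read and write at once, here is a rough host-side model in plain C++ (not Arduino code). It assumes two '165s and two '595s daisy-chained on the same clock, shifting MSB first. The ShiftChain struct and its member names are made up for this illustration and are not part of the SPI library.

```cpp
#include <cstdint>

// Hypothetical model of 2 daisy-chained '165s and 2 daisy-chained '595s
// sharing one SPI clock. Names are illustrative only.
struct ShiftChain {
    uint16_t inputPins  = 0;  // levels on the 16 parallel inputs of the '165s
    uint16_t inShift    = 0;  // shift stages inside the '165 chain
    uint16_t outShift   = 0;  // shift stages inside the '595 chain
    uint16_t outputPins = 0;  // output latches of the '595s

    // Models the low pulse on the '165 load pin: snapshot the inputs.
    void loadInputs() { inShift = inputPins; }

    // Models the rising edge on the '595 latch pin: update the outputs.
    void latchOutputs() { outputPins = outShift; }

    // Models one full-duplex byte transfer, MSB first: on every clock
    // one bit leaves the '165 chain while one bit enters the '595 chain.
    uint8_t transfer(uint8_t out) {
        uint8_t in = 0;
        for (int bit = 7; bit >= 0; --bit) {
            // Bit read by the master: top stage of the '165 chain.
            in = static_cast<uint8_t>((in << 1) | ((inShift >> 15) & 1u));
            inShift = static_cast<uint16_t>(inShift << 1);
            // Bit sent by the master: pushed into the '595 chain.
            outShift = static_cast<uint16_t>((outShift << 1) | ((out >> bit) & 1u));
        }
        return in;
    }
};
```

In this model, loading inputPins = 0xA55A and then sending 0x12 followed by 0x34 returns 0xA5 and 0x5A, and the outputs only change to 0x1234 once latchOutputs() is called. So the first byte sent ends up in the '595 at the far end of the chain, and nothing shows on the output pins until the latch pulse.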