Common latch pins with 74HC595 shift registers

I can see how cascading the shift registers would dramatically reduce performance.

From my limited understanding of SPI, the only thing faster is a direct connection, but that limits the number of LEDs to the number of I/O pins.

By fast, I mean the fastest way to change all the LEDs at one time. That may not be what you need to do in every situation, but it does seem to be the most common way to do it.
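To make "change all the LEDs at one time" concrete, here is a rough Python simulation of a daisy chain with a common latch. It is only a sketch for reasoning on a PC, not microcontroller code, and the names `shift_chain` and `common_latch` are made up for this post.

```python
def shift_chain(values):
    """Clock one byte per register into a daisy chain, MSB first.

    Returns the shift-stage contents, one byte per register, where
    index 0 is the register nearest the microcontroller.
    """
    n = len(values)
    stages = [0] * n
    for value in values:
        for i in range(7, -1, -1):
            carry = (value >> i) & 1  # bit entering the first register
            for r in range(n):
                overflow = (stages[r] >> 7) & 1  # bit leaving on the serial output
                stages[r] = ((stages[r] << 1) | carry) & 0xFF
                carry = overflow  # feeds the next register in the chain
    return stages


def common_latch(stages):
    """One pulse on the shared latch line copies every shift stage to its outputs."""
    return list(stages)


frame = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80]
outputs = common_latch(shift_chain(frame))
print([hex(v) for v in outputs])
```

The first byte sent ends up in the register farthest from the microcontroller, and nothing the LEDs show changes until the single latch pulse, so all 64 outputs switch together.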

Suppose you had a situation where the bit patterns didn't change often, but groups of LEDs needed to be turned on and off individually. In that case, you could control each latch individually, either with another shift register or directly from I/O pins.
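Here is how I picture the individual-latch idea, again as a Python simulation rather than real firmware. I'm assuming the data and clock lines are shared by all registers (wired in parallel, not cascaded), and `ShiftRegister595` and `write_group` are hypothetical names for illustration.

```python
class ShiftRegister595:
    """Minimal model of one 74HC595: an 8-bit shift stage plus a storage stage."""

    def __init__(self):
        self.shift = 0    # contents of the shift stage
        self.outputs = 0  # contents of the storage stage, i.e. what the LEDs show

    def clock_in(self, bit):
        # Shift clock edge: move everything up one place, new bit enters at the bottom.
        self.shift = ((self.shift << 1) | (bit & 1)) & 0xFF

    def latch(self):
        # Latch pulse: copy the shift stage to the outputs.
        self.outputs = self.shift


def write_group(registers, pattern, selected):
    """Shift `pattern` into every register, then latch only the selected ones."""
    for i in range(7, -1, -1):  # MSB first
        bit = (pattern >> i) & 1
        for reg in registers:   # shared data and clock: every register sees the bit
            reg.clock_in(bit)
    for index in selected:      # only these registers update their LEDs
        registers[index].latch()


registers = [ShiftRegister595() for _ in range(8)]
write_group(registers, 0b10101010, selected=[0, 3])
write_group(registers, 0b00001111, selected=[5])
print([format(r.outputs, "08b") for r in registers])
```

Registers 0 and 3 show the first pattern, register 5 shows the second, and the rest keep their old outputs even though their shift stages received the same bits.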

Just for the sake of example, it's easier to reduce this from 10 to 8 shift registers (since these registers are 8-bit, not 10-bit).

It seems the total throughput will always be higher with SPI, but if control is more important than performance, this might be viable.
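A back-of-the-envelope way to compare the two is to count pulses. This sketch assumes a daisy chain must be reshifted in full for any change, and that the individual-latch layout shares data and clock so one pattern costs 8 clocks plus one latch step. The function names are my own, and it ignores clock speed, which is where hardware SPI really helps.

```python
BITS_PER_REGISTER = 8  # the 74HC595 is 8 bits wide


def daisy_chain_cost(n_registers):
    """(clock pulses, latch pulses) to refresh a daisy chain with a common latch."""
    return n_registers * BITS_PER_REGISTER, 1


def individual_latch_cost(n_patterns):
    """(clock pulses, latch pulses) with shared data/clock and one latch per register."""
    return n_patterns * BITS_PER_REGISTER, n_patterns


print(daisy_chain_cost(8))       # (64, 1)
print(individual_latch_cost(1))  # (8, 1)
print(individual_latch_cost(8))  # (64, 8)
```

Under those assumptions, changing one group is cheaper with individual latches (8 clocks instead of 64), while refreshing all 8 registers with different patterns costs the same 64 clocks plus extra latch steps.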

Let's say you have 64 LEDs connected to 8 shift registers. Each shift register's latch is connected to a latch shift register, each shift register's clock is connected to a clock shift register, and each shift register's data is connected to a data shift register. The data, clock, and latch shift registers could be run from 9 pins, or maybe they could be daisy-chained the typical way, possibly using SPI.

Does having a separate shift register for the clock make sense? Could controlling the clock independently on each chip be useful?

I can only imagine that it would be difficult to program.