Maximum number of shift registers cascading (TPIC6B595)

I don't see the logic in replacing 40mA capable Arduino output pins with ~4mA capable output ports.
Is OP using SPI or shiftOut()? Can it be slowed down in code?
How long is the wiring, and is it low-capacitance cable (Cat-5 or Cat-6)?
Maybe reflections are a problem, and the wires need to be terminated.
Or the propagation delay of the data needs to be compensated for by slowing down the other signals as well.
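
To show what "slowing it down in code" could look like, here is a rough sketch of a bit-banged replacement for shiftOut() with an adjustable delay on each clock phase. The name slowShiftOut, the halfPeriodUs parameter and the two callback arguments are my own invention for illustration, not part of any Arduino library, and I have not tried this on a long TPIC6B595 chain.

```cpp
#include <stdint.h>

// Shift one byte out MSB-first, holding each clock phase for halfPeriodUs.
// writePin(pin, level) and delayUs(us) are supplied by the caller
// (hypothetical hooks; on an Arduino they would wrap digitalWrite()
// and delayMicroseconds()).
template <typename WritePin, typename DelayUs>
void slowShiftOut(uint8_t dataPin, uint8_t clockPin, uint8_t value,
                  unsigned int halfPeriodUs,
                  WritePin writePin, DelayUs delayUs) {
    for (int bit = 7; bit >= 0; --bit) {
        writePin(dataPin, (value >> bit) & 1);  // set data while the clock is low
        delayUs(halfPeriodUs);                  // let the data line settle
        writePin(clockPin, 1);                  // rising edge shifts the bit in
        delayUs(halfPeriodUs);                  // keep the clock high for a while
        writePin(clockPin, 0);                  // clock back low for the next bit
    }
}
```

On an Arduino the hooks could be lambdas like `[](uint8_t p, int v) { digitalWrite(p, v); }` and `[](unsigned int us) { delayMicroseconds(us); }`. The latch (RCK) pulse is left to the caller, after all bytes have gone out. If OP is on hardware SPI instead, the clock can be lowered with SPI.beginTransaction(SPISettings(...)), with the first SPISettings argument being the clock speed in Hz.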
No experience with that number of shift registers (yet).
Maybe someone who has done proper experiments with a logic analyser could enlighten us.
Leo..