The TPIC6B595 is capable of 150 mA per output, not the 7-8 mA of a 74HC595.
Yes, I understand you mean buffering between banks of chips, not between individual chips.
I've connected 5 of these boards with short ribbon cables between them and clocked in data at 8 MHz (the fastest SPI rate) with no problems.
The Arduino connects to the first board, then it's board to board after that. Each board had power/GND from the 5V source.
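
For reference, here is a rough sketch of how a transfer like that could look. It is not the exact code I ran, just a minimal example under some assumptions: one 8-bit register per board, five boards, a 16 MHz AVR Arduino (so 8 MHz is the top SPI clock), and the latch line (RCK) of every board tied to pin 10. The names `BOARD_COUNT`, `LATCH_PIN`, and `writeChain` are made up for illustration.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumption: five boards, one 8-bit shift register each.
const size_t BOARD_COUNT = 5;

// Send one byte per board through sendByte (SPI.transfer on real hardware).
// The first byte shifted out ends up in the register farthest from the
// Arduino, so the array is walked from the last board back to the first.
void writeChain(const uint8_t *outputs, size_t count, void (*sendByte)(uint8_t)) {
  for (size_t i = count; i > 0; --i) {
    sendByte(outputs[i - 1]);
  }
}

#ifdef ARDUINO
#include <SPI.h>

// Assumption: RCK (latch) of every board is wired to pin 10.
const uint8_t LATCH_PIN = 10;

// outputs[0] is the board nearest the Arduino.
uint8_t outputs[BOARD_COUNT] = {0x01, 0x02, 0x04, 0x08, 0x10};

void setup() {
  pinMode(LATCH_PIN, OUTPUT);
  digitalWrite(LATCH_PIN, LOW);
  SPI.begin();
}

void loop() {
  // 8 MHz, MSB first, mode 0: data is clocked in on the rising SRCK edge.
  SPI.beginTransaction(SPISettings(8000000, MSBFIRST, SPI_MODE0));
  writeChain(outputs, BOARD_COUNT, [](uint8_t b) { SPI.transfer(b); });
  SPI.endTransaction();

  // A rising edge on RCK copies the shifted bits to the output drivers.
  digitalWrite(LATCH_PIN, HIGH);
  digitalWrite(LATCH_PIN, LOW);

  delay(500);
}
#endif
```

The byte order is the part that usually trips people up: since data ripples from board to board, the byte for the last board has to go out first. Keeping the send loop separate from the SPI calls also makes it easy to check that order on a PC before wiring anything up.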
