Massive Parallel SPI

I don't think it really matters how it is arranged: he needs 115,200 BPS to each of 24 strings, so 24 x 115,200 = 2,764,800 BPS total.
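To put a number on that, here is a rough sketch of the arithmetic in plain C++. The constant and function names are made up for illustration. It assumes the 115,200 BPS figure is the raw bit rate each string must receive, and it ignores the per-byte software overhead around each SPI.transfer() call, so treat the result as a lower bound on the SPI clock rather than a guarantee.

```cpp
#include <cstdint>

// Figures taken from the post above.
constexpr uint32_t kBitsPerSecondPerString = 115200;
constexpr uint32_t kStringCount = 24;

// Total bit rate that has to leave the processor across all strings.
constexpr uint32_t aggregateBitsPerSecond() {
  return kBitsPerSecondPerString * kStringCount;
}

// Hypothetical helper: true if one SPI bus clocked at spiClockHz could move
// the aggregate bit rate, assuming one bit per clock and zero overhead
// between bytes (optimistic on real hardware).
constexpr bool spiClockCoversAggregate(uint32_t spiClockHz) {
  return spiClockHz >= aggregateBitsPerSecond();
}

static_assert(aggregateBitsPerSecond() == 2764800,
              "24 strings x 115,200 BPS");
```

Under those assumptions a 4 MHz SPI clock clears the bar and a 2 MHz clock does not, before counting any time spent toggling select lines or reading the serial buffer.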
With a bigger processor, the 24 slave select lines could be generated internally.
Need a message to start up & get in sync, then a tight loop going:

void loop(){
// do some serial data buffering, maybe 32 bytes, then let the data start ripping out:

PORTC = PORTC & B11111110;  // drive slave select 0 low
SPI.transfer (Serial.read());
SPI.transfer (Serial.read());
SPI.transfer (Serial.read());
// :
// repeat for 32 transfers in total
// :
SPI.transfer (Serial.read());
PORTC = PORTC | B00000001;  // release slave select 0

// maybe do some data buffering again, then
// next port
PORTC = PORTC & B11111101;  // drive slave select 1 low
SPI.transfer (Serial.read());
SPI.transfer (Serial.read());
SPI.transfer (Serial.read());
// :
// repeat for 32 transfers in total
// :
SPI.transfer (Serial.read());
PORTC = PORTC | B00000010;  // release slave select 1
// next port

// continue with the remaining select lines on this port, then the 2nd and 3rd ports, until all 24 strings are hit
}
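The select-line bookkeeping above can also be written as a loop over a string index, which makes the bit and port arithmetic easier to check. This is only a sketch in plain C++ so it can be tested off the board. `SelectLines`, `sendAll`, `readByte` and `spiTransfer` are hypothetical names standing in for the real port registers, Serial.read() and SPI.transfer(), and the layout of 24 active-low selects across three 8-bit ports is an assumption.

```cpp
#include <cstdint>

constexpr uint8_t kStrings = 24;        // from the post: 24 strings
constexpr uint8_t kBytesPerBurst = 32;  // from the post: 32 transfers per select

// Stand-in for three 8-bit output port registers (assumed layout:
// string s uses bit s % 8 of port s / 8). Selects idle high.
struct SelectLines {
  uint8_t port[3] = {0xFF, 0xFF, 0xFF};

  // Pull one select line low, leaving the other bits untouched.
  void select(uint8_t s) {
    port[s / 8] &= static_cast<uint8_t>(~(1u << (s % 8)));
  }
  // Return that select line high again.
  void deselect(uint8_t s) {
    port[s / 8] |= static_cast<uint8_t>(1u << (s % 8));
  }
};

// Send one 32-byte burst to each string in turn.
// readByte and spiTransfer are placeholders supplied by the caller.
template <typename Read, typename Transfer>
void sendAll(SelectLines& lines, Read readByte, Transfer spiTransfer) {
  for (uint8_t s = 0; s < kStrings; ++s) {
    lines.select(s);
    for (uint8_t i = 0; i < kBytesPerBurst; ++i) {
      spiTransfer(readByte());
    }
    lines.deselect(s);
  }
}
```

A loop like this is easier to read, but it adds index and branch overhead per byte. The unrolled version in the post avoids that at the cost of flash, which may matter when the goal is to keep the data ripping out.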