Controlling TLC5940s on Arduino from Processing?

Would anyone have any advice on the best-performing way to control several TLC5940s from Processing when many channels need updating at a high rate - for example, interpreting a video stream or animation at 20 fps onto 64/512/1024 outputs?
It seems that plain serial would be too slow, and I can't find a way to get to the TLC5940 with Firmata, so I am not sure if that is any quicker - although it looks much the same.

I'm also puzzled by the best way to deal with serial - either sending each output update one by one, or sending the whole field as a chunk and letting the Arduino decipher it - as it seems quicker to send 3 numbers in one message than 3 small single messages.
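
To make the "whole field as a chunk" option concrete, here is a rough sketch of one possible framing, written in Python only to show the byte layout. Everything in it is an assumption for illustration: the two-byte `0xFF 0xFF` start marker, the big-endian two-bytes-per-output encoding, and the names `pack_frame` and `unpack_frame`. It relies on the TLC5940 grayscale values being 12-bit (0 to 4095), so the high byte of each value is never 0xFF and the marker cannot appear inside the payload.

```python
import struct

# Hypothetical start-of-frame marker. With 12-bit values sent big-endian,
# the high byte is at most 0x0F, so two 0xFF bytes never occur back to back
# inside the payload.
HEADER = b"\xff\xff"


def pack_frame(values):
    """Pack a list of 12-bit channel values into one serial frame."""
    for v in values:
        if not 0 <= v <= 4095:
            raise ValueError("channel value out of 12-bit range: %r" % (v,))
    # ">" = big-endian, "H" = unsigned 16-bit, one per channel
    return HEADER + struct.pack(">%dH" % len(values), *values)


def unpack_frame(frame):
    """Reverse of pack_frame; shows what the receiving side has to do."""
    if frame[:2] != HEADER:
        raise ValueError("frame does not start with the header")
    payload = frame[2:]
    count = len(payload) // 2
    return list(struct.unpack(">%dH" % count, payload))


if __name__ == "__main__":
    frame = pack_frame([0] * 64)
    print(len(frame))  # 2 header bytes + 64 * 2 payload bytes
```

Sent this way, a 64-output frame costs 130 bytes in one write, with only two bytes of overhead for the whole field rather than some addressing overhead on every single output.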

Was wondering if reaching the Arduino through an Ethernet shield could perhaps be quicker than serial, with large amounts of data. Hmm...

Many thanks

20 fps to 64/512/1024 outputs?

Sounds too fast.

1024 outputs at two bytes per output need a buffer of 2048 bytes, which is more RAM than the Arduino has available.

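64 outputs require 128 bytes per frame; at 20 frames per second that is 128 * 20 = 2560 bytes per second. In terms of baud rate, multiply this by ten (each byte takes about ten bits on the wire once you count the start and stop bits) to get a minimum of 25600 baud. At that minimum the link is busy the whole time, which gives you no time for actually getting the data into the TLC.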
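
The same arithmetic can be run for all three sizes in the question. A minimal sketch, using Python purely as a calculator; the function name `serial_budget` is made up for illustration, and it assumes two bytes per output and ten bits on the wire per byte, as in the estimate above:

```python
def serial_budget(outputs, fps, bytes_per_output=2, wire_bits_per_byte=10):
    """Return (frame_bytes, bytes_per_second, min_baud) for one setup.

    wire_bits_per_byte=10 assumes 8 data bits plus one start and one stop bit.
    """
    frame_bytes = outputs * bytes_per_output
    bytes_per_second = frame_bytes * fps
    min_baud = bytes_per_second * wire_bits_per_byte
    return frame_bytes, bytes_per_second, min_baud


if __name__ == "__main__":
    for outputs in (64, 512, 1024):
        frame_bytes, rate, baud = serial_budget(outputs, fps=20)
        print(outputs, "outputs:", frame_bytes, "bytes/frame,",
              rate, "bytes/s,", baud, "baud minimum")
```

For 512 outputs this comes to 204800 baud and for 1024 outputs to 409600 baud, both as bare minimums with the link saturated and nothing left over for processing.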

The problem with using Ethernet is that it is not real time, nor does it give any guaranteed delivery time, and there is not enough buffer space in the Arduino.