Exploding RGB button matrix!

Ok so lets talk theory here. I have a matrix of RGB buttons. Lets say it's 9x6 for 54 buttons. My concept is that if you press a button it lights up either red, green, or blue (random, cycled, whatever). 50-100ms later the buttons in the 4 main cardinal directions light up that color with the original going back dark. The 4 lights continue out until shooting "off the edge" of the matrix. If any off shoots overlap then we get the combined secondary color (easy). The issue here is that any number of any combination of buttons can be pressed at any time AND a button may be pressed before its previous "off shoots" reach the edge of the matrix. I can't even begin to write code for this as I'm not sure how to approach keeping track of the lights and where they are going. I am also trying to reduce the amount of variables needed as this needs to be scalable.

My best idea right now is to have two bytes per button where I use at least 12 bits. Each of the 12 bits represents a unique combination of direction and color. Bit 0 is red going left, bit 1 is red+up, bit 4 is green+left, etc. This means it doesn't matter what space generated the "explosion" of color. The status bits get updated every delay time and then the LEDs are updated accordingly and if an LED is asked to be green three times it's just fine.

What I don't like about this is that I would need 216 bytes just for this small scale test. 2 bytes per button, 54 buttons, and a copy of it all for previous and new. I guess that is not a ton compared to the mega's 8k SRAM but there are still a lot of other variables to store.

At this time, I don't see any issues with this setup (thanks for rubber ducking it for me!) but if anyone has a better suggestion, I would love to hear it!

I think you need to consider all of these moving LED's as if they were particles

Each particle has a color and a direction.

When you press a button, initialize all the associated particles (position, color and direction) and add them to a particles store.(Array)
For each particle you need

color (2 bits)
Horizontal of vertical direction (1 bit)
Up or down / left or right (1 bit )
Current position (54 positions = 6 bits)

So you'd need 10 bits per particle

To start with I'd put this in 2 bytes, but I guess you could pack it and save space i.e as you are only using 10/16 = 62.5% of the space
(however it would slow things down having to find and unpack the data)

Then on each time tick, process the movement of all the particles, then iterate though all the display positions (9 x 6) and for each position check whether the any particles are on that display position.

BTW. You won't have 8K of ram, the core of the Arduino eats quite a lot especially on the mega as it allocates I think 64 bytes per serial channel (possibly is 64 bytes * 2 per channel , but I can't remember if it has a separate input and output buffer)

So assume perhaps you have 6k

This still gives you 3000 sprites.

Or have I got my maths wrong?

The shame with the Arduino is that you can't really use Malloc to dynamically allocate memory for the sprites, as apparently you get big issues with fragmentation.