You could achieve major RAM savings by shrinking that to process 1 line at a time, or even 1 LED at a time.
1 LED at a time won't work. When multiplexing, the lines light up one after the other, so it wouldn't make sense to light each LED in a line individually and then do the same for each LED in the next line, would it? Either it would take far too long to get through all the lines and show an image on the matrix, or the scan frequency would have to be so high that each LED's on-time is very short and the LEDs look quite dark. On an 8x8 matrix, for example, each LED would be on for only 1/64 of the time instead of 1/8.
But I think calculating 1 line at a time makes sense. So you mean I calculate the first line, save it in an array, and finally let that line light up. Then I calculate the second line, save it in the same array, and so on, right?
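
To check that I understood it, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions: an 8x8 matrix, a made-up pattern function `pixel_on()`, and a `show_row()` that just stores what it would send to the pins in two simulated variables (`sim_active_row`, `sim_col_bits`) instead of touching real hardware. None of these names come from the actual project.

```c
#include <assert.h>
#include <stdint.h>

#define ROWS 8   /* assumed matrix size, adjust to the real hardware */
#define COLS 8

/* The only pixel storage: one row, reused for every line. */
static uint8_t line_buf[COLS];

/* Simulated outputs standing in for the real row/column pins. */
static uint8_t sim_active_row;
static uint8_t sim_col_bits;

/* Hypothetical pattern: a diagonal that shifts with the frame counter. */
static uint8_t pixel_on(uint8_t row, uint8_t col, uint8_t frame)
{
    return (uint8_t)(((row + frame) % COLS) == col);
}

/* Calculate one line and store it in the shared buffer. */
static void compute_row(uint8_t row, uint8_t frame)
{
    assert(row < ROWS);
    for (uint8_t col = 0; col < COLS; col++) {
        line_buf[col] = pixel_on(row, col, frame);
    }
}

/* Output the buffered line. On real hardware this would blank the
   display, write the column pins, enable the row driver and then
   wait for the row's on-time. Here it only records the values. */
static void show_row(uint8_t row)
{
    uint8_t bits = 0;
    for (uint8_t col = 0; col < COLS; col++) {
        if (line_buf[col]) {
            bits |= (uint8_t)(1u << col);   /* bit 0 = column 0 */
        }
    }
    sim_active_row = row;
    sim_col_bits = bits;
}

/* One full refresh: calculate a line, show it, move to the next. */
static void refresh_frame(uint8_t frame)
{
    for (uint8_t row = 0; row < ROWS; row++) {
        compute_row(row, frame);
        show_row(row);
    }
}
```

With this layout the pixel data needs only COLS bytes of RAM instead of ROWS * COLS, at the cost of recalculating every line on every refresh, so the calculation for one line has to be fast enough to fit into the time slot of one row.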