Smooth, really cool.
I think it must be possible to get the SRAM use lower. Maybe you should check the games on Hackvision or Wayne and Layne's Video Game Shield. But you probably already did.
I expect you're right about reducing the SRAM. My understanding is that the Hackvision and Wayne and Layne's shield both use the TVout library by Myles Metzger, and their games are far more complex than anything I am producing.
You can store graphics in the 32K of flash memory on the Arduino, so moving graphics there reduces the SRAM usage to some extent.
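As a rough illustration of keeping graphics in flash, here is a minimal sketch using PROGMEM and pgm_read_byte from avr-libc. The sprite data and the names GEM_SPRITE, spriteRow and spritePixelCount are made up for the example and are not from the game. The #else branch is only there so the same code compiles on a PC.

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>   // real PROGMEM / pgm_read_byte on the Arduino
#else
// Host fallback (assumption, for trying the code off-target): plain memory reads.
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

// Hypothetical 8x8 sprite, one byte per row, stored in flash rather than SRAM.
const uint8_t GEM_SPRITE[8] PROGMEM = {
  0x18, 0x3C, 0x7E, 0xFF, 0xFF, 0x7E, 0x3C, 0x18
};

// Read one row of the sprite directly from flash.
uint8_t spriteRow(uint8_t row) {
  return pgm_read_byte(&GEM_SPRITE[row]);
}

// Count the lit pixels in the sprite, just to show the data being used.
uint8_t spritePixelCount() {
  uint8_t count = 0;
  for (uint8_t row = 0; row < 8; ++row) {
    uint8_t bits = spriteRow(row);
    while (bits) {
      count += bits & 1;
      bits >>= 1;
    }
  }
  return count;
}
```

On the Arduino the eight bytes above cost flash only. Without PROGMEM the array would normally be copied into SRAM at startup.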
My understanding is that a resolution of 120x96 uses 1,440 bytes (120x96/8), leaving 608 of the 2,048 bytes of SRAM for all other variables.
There are two byte arrays for the 13x6 playing area (78 bytes each): one for the display and the other for when I'm checking for 3s. That's another 156 bytes, leaving 452.
Any text on screen takes up SRAM as well.
There are about 50 bytes used for the opening text and 27 for the GAME OVER screen, so roughly another 77 bytes.
Leaving about 375 bytes.
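The running tally above can be written down as compile-time arithmetic. This is only a sketch of the sums, using the figures quoted in the post. The 2,048-byte total is inferred from 1,440 + 608, the text figure is simply 50 + 27, and all the names are made up for the example.

```cpp
#include <stdint.h>

// Figures quoted in the post; the total is implied by 1,440 + 608.
constexpr uint16_t SRAM_TOTAL   = 2048;
constexpr uint16_t FRAME_BUFFER = 120 * 96 / 8;  // 1 bit per pixel
constexpr uint16_t GRID_ARRAYS  = 2 * 13 * 6;    // display grid + match-check grid
constexpr uint16_t SCREEN_TEXT  = 50 + 27;       // opening text + GAME OVER (approximate)

// Bytes left after each item is accounted for.
constexpr uint16_t afterFrameBuffer() { return SRAM_TOTAL - FRAME_BUFFER; }
constexpr uint16_t afterGrids()       { return afterFrameBuffer() - GRID_ARRAYS; }
constexpr uint16_t afterText()        { return afterGrids() - SCREEN_TEXT; }

// Checked at compile time, so a wrong sum fails the build.
static_assert(FRAME_BUFFER == 1440, "frame buffer size");
static_assert(afterFrameBuffer() == 608, "after frame buffer");
static_assert(afterGrids() == 452, "after both grids");
```

Whatever the TVout library itself needs, plus the stack, still has to come out of the last figure.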
Then there is the SRAM used by the TVout library just to function. I have no idea how much it uses.
So, it got pretty tight in there.
Looking at the Hackvision games, the closest program to mine I would expect is Tetris. In that case all blocks, once placed, are the same, so there is no need to record the different types of blocks at each position. It would be possible to use the get_pixel(x,y) command to check if a block has been placed at a location.
That said, I know my code is not optimised. I learned to code on a VIC-20/C64 back in the 80s, and I still use the same coding practices now. One of these days I'll figure out how to program in C++ properly. I looked at some of the Hackvision code and it is way beyond what I have done.
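To make the get_pixel idea concrete, here is a small sketch that treats the screen itself as the record of which cells are filled. It is not taken from the Hackvision code. The frame array, setPixel and getPixel are PC-side stand-ins for the buffer TVout manages. On the Arduino you would call TV.get_pixel(x, y) instead of getPixel and would not declare a second buffer. The 8-pixel cell size and the names fillCell and cellOccupied are assumptions for the example.

```cpp
#include <stdint.h>

// Stand-in 1-bit-per-pixel frame buffer, 120x96 as in the post.
const uint8_t SCREEN_W = 120;
const uint8_t SCREEN_H = 96;
uint8_t frame[SCREEN_W / 8 * SCREEN_H];

// Set one pixel (leftmost pixel of each byte is the top bit).
void setPixel(uint8_t x, uint8_t y) {
  frame[y * (SCREEN_W / 8) + x / 8] |= 0x80 >> (x & 7);
}

// Return 1 if the pixel is lit, 0 otherwise.
uint8_t getPixel(uint8_t x, uint8_t y) {
  return (frame[y * (SCREEN_W / 8) + x / 8] >> (7 - (x & 7))) & 1;
}

// Hypothetical layout: square cells of 8x8 pixels, grid starting at the top-left.
const uint8_t CELL = 8;

// Draw a solid block into a grid cell.
void fillCell(uint8_t col, uint8_t row) {
  for (uint8_t dy = 0; dy < CELL; ++dy)
    for (uint8_t dx = 0; dx < CELL; ++dx)
      setPixel(col * CELL + dx, row * CELL + dy);
}

// A cell counts as occupied if its centre pixel is lit, so no grid array is needed.
bool cellOccupied(uint8_t col, uint8_t row) {
  return getPixel(col * CELL + CELL / 2, row * CELL + CELL / 2) != 0;
}
```

This only works when every placed block lights the probed pixel and nothing else is drawn there, which is why it suits a game where all placed blocks look the same.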
That's the great thing about the Arduino. Even basic programmers like me can put together something that works. Then there is the opportunity to learn and refine and get even better without it being overwhelming at any time.