Once again, I think this is an incredible feat of code optimization to fit all this into an ATmega328P.
Not only do you have 256-colour output, but also sound, an introductory screen, and explanatory text. It runs on both NTSC and PAL, as I proved. It also interfaces with the NES controller, so you can have A, B, SELECT, START, UP, DOWN, LEFT, and RIGHT buttons.
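For anyone curious how those eight buttons fit into one byte, here is a minimal sketch of the decoding step. It assumes the standard NES pad behaviour: the bits arrive in the order A, B, SELECT, START, UP, DOWN, LEFT, RIGHT, and the data line is active low (LOW means pressed). The names `nesPackButtons`, `nesPressed` and `nesDescribe` are mine, not from the game's source, and I have not checked how the game itself reads the pad.

```cpp
#include <cstdint>
#include <string>

// Bit positions in the order the pad shifts them out (first bit = A).
// Hypothetical names, chosen for this illustration only.
enum NesButton : uint8_t {
    NES_A = 0, NES_B, NES_SELECT, NES_START,
    NES_UP, NES_DOWN, NES_LEFT, NES_RIGHT
};

// Pack eight sampled data-line levels into a mask where 1 = pressed.
// levels[i] is the raw level of the i-th bit (true = HIGH).
// The data line is active low, so a LOW level means "pressed".
uint8_t nesPackButtons(const bool levels[8]) {
    uint8_t mask = 0;
    for (int i = 0; i < 8; ++i) {
        if (!levels[i]) {
            mask |= static_cast<uint8_t>(1u << i);
        }
    }
    return mask;
}

// True if the given button is set in the mask.
bool nesPressed(uint8_t mask, NesButton b) {
    return ((mask >> b) & 1u) != 0;
}

// Space-separated list of pressed buttons, for debugging on a PC.
std::string nesDescribe(uint8_t mask) {
    static const char* const names[8] = {
        "A", "B", "SELECT", "START", "UP", "DOWN", "LEFT", "RIGHT"
    };
    std::string out;
    for (int i = 0; i < 8; ++i) {
        if ((mask >> i) & 1u) {
            if (!out.empty()) out += ' ';
            out += names[i];
        }
    }
    return out;
}
```

This version uses `std::string`, so it is meant for trying out on a PC. On the Arduino itself you would fill `levels` by pulsing the latch line once and then sampling the data pin with `digitalRead()` after each clock pulse.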
The output is jitter-free. I personally see a bit of noise, but that is probably from the breadboard and the lack of shielding.
The code just fits:
Binary sketch size: 31,986 bytes (of a 32,256 byte maximum)
If you ran it on a larger processor like the ATmega1284P (for example on the Bobuino, or just the bare chip, which is available in DIP format), you would have access to a lot more program memory (128 kB), which would allow for a lot more sprites, rooms, game logic, etc.
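To put numbers on "just fits", here is the headroom arithmetic as a small sketch. The two sizes come from the IDE output above. The 128 kB figure is the raw flash size of the larger chip with no bootloader subtracted, so treat that result as an upper bound. `flashFreeBytes` is a made-up helper name.

```cpp
#include <cstdint>

// Bytes left over once the sketch is loaded (0 if it does not fit).
constexpr uint32_t flashFreeBytes(uint32_t usableFlash, uint32_t sketchSize) {
    return usableFlash > sketchSize ? usableFlash - sketchSize : 0;
}

constexpr uint32_t kSketchSize   = 31986;          // from the IDE output
constexpr uint32_t kSmallUsable  = 32256;          // IDE maximum on the 328P
constexpr uint32_t kLargeRawSize = 128UL * 1024UL; // 131072, bootloader NOT subtracted

// Only 270 bytes are free on the 328P.
static_assert(flashFreeBytes(kSmallUsable, kSketchSize) == 270,
              "headroom on the 32 kB part");

// Up to 99086 bytes would be free on a 128 kB part (upper bound).
static_assert(flashFreeBytes(kLargeRawSize, kSketchSize) == 99086,
              "headroom on the 128 kB part");
```

So the current build has 270 bytes to spare, while the same code on a 128 kB part would leave something under 97 kB for extra rooms and sprites, depending on the bootloader.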
It would be nice if you could document the room data layout. A lot of us (including me) would find it challenging to modify or improve the code, but no doubt would have fun adding extra rooms or changing the layout.
If this sort of stuff was documented:
const PROGMEM prog_uchar rooms[] = {
0x10,0xc1,0x10,0xc1,0x10,0xc1,0x10,0xc1,0x10,0xc2,0x10,0x11,0x13,0x10,0x51,0x10,0x31,0x10,0x14,0x15,
0x10,0x51,0x10,0x31,0x10,0x14,0x12,0x10,0x11,0x16,0x31,0x17,0x31,0x10,0x14,0x20,0x92,0xf1,0x61,0x18,
0x91,0x32,0x91,0x12,0x10,0x15,0x61,0x42,0x20,0x72,0x51,0x20,0xb1,0x10,0x21,0x18,0x19,0x61,0x14,0x11,
0x10,0x11,0x15,0x51,0x32,0x14,0x12,0x10,0x72,0xf1,0xf1,0xf1,0x71,0x52,0x11,0x42,0x11,0x22,0x31,0x20,
0x11,0x30,0x31,0x10,0x81,0x10,0x11,0x12,0x11,0x10,0x41,0x3a,0x51,0x10,0xa2,0x14,0x12,0x10,0xc1,0x10,
...
That would be great! And how did you draw the graphics? I can imagine using Photoshop (or similar) in indexed colour mode, and outputting raw data, to draw tiles and suchlike.
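Until the rooms table is documented, one way to start poking at it is to print each byte split into its high and low nibble and look for patterns. To be clear, I do not know the real format. The idea that each byte packs two 4-bit fields is purely a guess on my part, and `roomsNibbleDump` is a hypothetical helper, not something from the game's source.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Format each byte as "high:low" hex nibbles, e.g. 0xc1 -> "c:1".
// ASSUMPTION: splitting into nibbles is only an exploration aid;
// the actual room encoding is undocumented.
std::string roomsNibbleDump(const uint8_t* data, size_t len, size_t perLine = 10) {
    std::string out;
    char buf[8];
    for (size_t i = 0; i < len; ++i) {
        std::snprintf(buf, sizeof buf, "%x:%x",
                      static_cast<unsigned>(data[i] >> 4),
                      static_cast<unsigned>(data[i] & 0x0f));
        out += buf;
        // Newline after every perLine bytes and after the last byte.
        out += ((i + 1) % perLine == 0 || i + 1 == len) ? '\n' : ' ';
    }
    return out;
}
```

For example, feeding it the first four bytes of the table (0x10, 0xc1, 0x10, 0xc1) with `perLine` set to 2 gives two lines of `1:0 c:1`. This runs on a PC with the bytes copied into an ordinary array. On the AVR the table lives in flash, so each byte would have to be fetched with `pgm_read_byte()` first.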