Congrats on getting it working!
I still don't understand why the 2x3 unit was causing this problem. Could it be that the 2x3 unit was splitting the current between too few bases, so each base was getting more current than the transistor could take before breaking down? I'm not even sure that's a technically sensible thought, but anyway, I can live with using 16 pins.
There is also a diode connection from base to collector. Normally the collector is at a higher potential (voltage) than the base, so no current flows this way. However, in your circuit, you pull the collector low when you don't use the column of LEDs connected to it. This is why it works: the base-collector diode is in parallel with any LEDs and has a lower forward voltage drop than the LED, so the current goes through the diode instead (hence the LEDs not glowing).
However, you added that "high level" 2x3 transistor grid. While this grid will get (most of) its unused transistors' collectors low, those transistors are switched off, so their emitters no longer connect through to the second-level transistor grid's collectors. I.e. there is no diode connection from base to collector here in parallel with any LEDs, so the base current from a row also flowed through LEDs in other columns.
Unless your LEDs have a lower forward voltage than about 1.4V (plus some more for the saturation voltage of the "cathode" transistors and VOL from the atmega), I think you can keep your "high level" grid if you put a diode in anti-parallel with the collector-emitter of each high-level grid transistor. Maybe anti-parallel is not quite the right term. I mean a diode from emitter to collector (backwards). That's 6 additional diodes. But as you say, it only saves one more output pin.
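
To put rough numbers on that threshold, here is a quick sketch in Python. I'm reading the 1.4V as two silicon junction drops in series (the added diode plus the base-collector junction), and every default value below is an assumption for illustration, not something measured on your board. The function names are made up for this example. Check your transistor and atmega datasheets for real figures.

```python
# Rough check: does the added diode path clamp below the LED forward voltage?
# All default values are illustrative assumptions, not measurements.

def clamp_path_voltage(v_diode=0.7, v_bc=0.7, v_ce_sat=0.2, v_ol=0.3):
    """Approximate voltage needed to push current through the diode path.

    v_diode  : forward drop of the added emitter-to-collector diode (assumed)
    v_bc     : forward drop of the base-collector junction (assumed)
    v_ce_sat : saturation voltage of the "cathode" transistor (assumed)
    v_ol     : output-low voltage of the microcontroller pin (assumed)
    """
    return v_diode + v_bc + v_ce_sat + v_ol


def led_stays_dark(v_led_forward, **drops):
    """True if the LED needs more voltage than the diode path, so the
    stray base current should prefer the diodes over the LED."""
    return v_led_forward > clamp_path_voltage(**drops)


if __name__ == "__main__":
    for v_f in (1.6, 1.8, 2.0, 3.2):
        state = "dark" if led_stays_dark(v_f) else "may glow"
        print(f"LED Vf = {v_f:.1f} V -> {state}")
```

With these assumed drops, the margin is thin for low-voltage red LEDs and comfortable for blue or white ones, so it is worth measuring the actual forward voltage at the small stray current rather than trusting the nominal figure.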