Lots of independent LEDs

Haha, whoops....

Anyway, I'm looking for maximum brightness, at least for the brightest ones, then a few levels darker if possible, sort of to create a comet-trail look. I wouldn't want to split them up into more than 4 columns/rows, because it seems like they would begin to get pretty dark after that. So maybe 4x32? Can I shiftOut to 32 registers quickly enough for a matrix? And even better, could I cycle the matrix quickly enough to fake PWM on some of the LEDs? For instance, have four cycles: one LED high for all four, one for 3/4, the next for 2/4, and the last for 1/4. Could I shiftOut 32 bits 4 times and have the 1/4 LED not appear to strobe?
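
To make the fake-PWM idea concrete, here is a rough sketch in plain C++ (not tied to any particular board). The names buildPass, pwmCycleHz and kPwmSteps are made up for illustration, and secondsPerBit is a pure assumption that would have to be measured on the real hardware. It only shows which bits would go out on each of the four passes and how to estimate how often the whole cycle repeats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of passes per PWM cycle (the "four cycles" described above).
constexpr int kPwmSteps = 4;

// Build the on/off bits for one pass of the PWM cycle.
// levels[i] is the brightness of LED i, from 0 (off) to kPwmSteps (full).
// An LED with level n is lit during passes 0..n-1, so level 1 is on for
// 1/4 of the cycle and level 4 is on for all of it.
std::vector<std::uint8_t> buildPass(const std::vector<int>& levels, int pass) {
    assert(pass >= 0 && pass < kPwmSteps);
    std::vector<std::uint8_t> bytes((levels.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < levels.size(); ++i) {
        if (levels[i] > pass) {
            // LED i maps to bit (i % 8) of byte (i / 8).
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
    }
    return bytes;
}

// Rough estimate of how often the whole PWM cycle repeats, in Hz.
// secondsPerBit is an assumed figure, not a measured one.
double pwmCycleHz(int rows, int bitsPerRow, double secondsPerBit) {
    double secondsPerCycle =
        static_cast<double>(rows) * kPwmSteps * bitsPerRow * secondsPerBit;
    return 1.0 / secondsPerCycle;
}
```

With levels of {4, 3, 2, 1} for four neighbouring LEDs, each pass lights one fewer LED, which is the comet trail. Whether the 1/4 LED appears to strobe comes down to whether the number from pwmCycleHz, with a measured per-bit time plugged in, stays comfortably above the rate where flicker becomes visible.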

Thanks