1,000 LED Control

You can move up a couple of dollars to an ATmega1284-based design and have a lot more RAM to play with. Having more IO is nice but not critical here if you are just going to use SPI to send data to the PWM controllers quickly. The extra IO will let you have more chip select lines, though, so you can address the shift registers directly.
I can picture 10 IO pins with 10 current sources for the rows, with a '5940 to sink the current for the column for a layer.
10 pins to select the OE for the '5940 for the active layer.
10 pins to select the CS for the 10 '5940s.
3 SPI pins
2 serial pins.
Hmm, that's 35.
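
To make the tally above easy to check, here is a minimal sketch in C. The constant and function names are made up for illustration. The 32 figure is the number of programmable I/O lines on a 40-pin ATmega1284, which is why 35 directly driven pins won't fit without some help.

```c
#include <assert.h>

/* Pin groups as listed above; the names are illustrative only. */
enum {
    ROW_SOURCE_PINS = 10, /* one per row current source        */
    OE_PINS         = 10, /* output enable, one per '5940      */
    CS_PINS         = 10, /* chip select, one per '5940        */
    SPI_PINS        = 3,  /* the three SPI pins                */
    SERIAL_PINS     = 2,  /* the two serial pins               */
    TOTAL_PINS      = ROW_SOURCE_PINS + OE_PINS + CS_PINS
                    + SPI_PINS + SERIAL_PINS,
    AVR_IO_LINES    = 32  /* I/O lines on a 40-pin ATmega1284  */
};

/* Compile-time check that the groups add up to the 35 in the post. */
static_assert(TOTAL_PINS == 35, "pin groups should sum to 35");

/* Total pins needed if every line is driven straight from the MCU. */
unsigned direct_drive_pin_count(void)
{
    return TOTAL_PINS;
}

/* How many pins over the MCU's budget the direct-drive layout is. */
int pins_over_budget(void)
{
    return (int)TOTAL_PINS - (int)AVR_IO_LINES;
}
```
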
Okay, so use two shift registers for the first set of 10 pins, just walking a 1 across to enable one row at a time.
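
Walking a 1 across the two registers could look something like the sketch below. It is only the bit pattern logic, with no hardware access, and it assumes two daisy-chained 8-bit registers (74HC595-style) with row 0 on bit 0 and the high byte shifted out first. All the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ROWS 10u /* only 10 of the 16 register outputs are used */

/* 16-bit pattern with exactly one bit set: the bit for `row`. */
uint16_t row_select_pattern(uint8_t row)
{
    assert(row < NUM_ROWS);
    return (uint16_t)(1u << row);
}

/* Advance to the next row, wrapping from row 9 back to row 0. */
uint8_t next_row(uint8_t row)
{
    return (uint8_t)((row + 1u) % NUM_ROWS);
}

/*
 * Split the pattern into the two bytes to send over SPI.
 * Assumption: the first byte sent ends up in the far register of the
 * chain, so the high byte (rows 8..9) goes out first.
 */
void pattern_to_bytes(uint16_t pattern, uint8_t out[2])
{
    out[0] = (uint8_t)(pattern >> 8);    /* rows 8..9 */
    out[1] = (uint8_t)(pattern & 0xFFu); /* rows 0..7 */
}
```

In the refresh loop you would send the two bytes, latch the registers, light that row, then call `next_row()` and repeat. The actual SPI and latch calls depend on your setup, so they are left out here.
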
So 27 IO. The '1284 will give you lots of RAM, all the needed IO, and minimal external hardware.
The shift registers will control a 200 mA current source per row.