Graduating to 8x8x8 Cube....Requesting Help on layer switching

I have a board, see my signature link, that I think would be great for this. It has 12 open drain shift registers. You would need to add 8 P-channel MOSFETs with gate pullup resistors, and 64 column resistors for current limiting.
To run it, you would send 9 bytes of data via SPI.transfer for each layer, and cycle through the layers fast enough that the display doesn't flicker.
8 of the shift registers would pull the individual cathodes low to allow them to turn on.
The 9th shift register would be used to walk a 0 across the 8 P-channel MOSFET gates to turn on 1 layer of anodes at a time.
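
As a rough sketch of the scheme above, here is one way the 9 bytes for a single layer could be put together. The names buildLayerFrame and gateLevels are made up for illustration. It assumes a register bit of 1 makes the open-drain output pull low (check the datasheet for your shift registers and invert if yours work the other way), and that the layer-select register gets the last byte in the array, which really depends on how the chain is wired.

```cpp
#include <cassert>
#include <cstdint>

const int kLayers = 8;      // 8 layers of anodes, one P-channel MOSFET each
const int kFrameBytes = 9;  // 8 column bytes + 1 layer-select byte

// Hypothetical helper: fill the 9-byte frame for one layer.
// columns[i] holds 8 cathode columns; a 1 bit means "LED on" (output sinks).
// Assumption: a 1 in the register turns the open-drain output on (pulls low).
void buildLayerFrame(const uint8_t columns[8], uint8_t layer,
                     uint8_t frame[kFrameBytes]) {
    assert(layer < kLayers);
    for (int i = 0; i < 8; ++i) {
        frame[i] = columns[i];
    }
    // Exactly one output on: it pulls that layer's gate low, the pullups
    // hold the other 7 gates high, so only one layer of anodes is powered.
    frame[8] = static_cast<uint8_t>(1u << layer);
}

// What the 8 MOSFET gates actually see for a given layer-select byte:
// the selected gate is 0, all others are 1 (the "walking 0").
uint8_t gateLevels(uint8_t layerByte) {
    return static_cast<uint8_t>(~layerByte);
}
```

In an Arduino sketch you would then call SPI.transfer(frame[i]) for each of the 9 bytes and latch the registers, then move on to the next layer. The byte order and the latch step depend on your wiring, so treat this as a starting point only.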
There are 96 outputs, so you could make a 9x9x9 array (81 columns, 9 layers) if you wanted as well.
81 LEDs on at once could draw 1.62A (81 x 20mA if you went for full brightness), so 9 decent P-channel MOSFETs would be needed.
For the 8x8x8 cube, 64 LEDs on at once would be 1.28A.
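
The worst-case numbers above are just LEDs per layer times current per LED, since only one layer is on at a time. A small sketch of that arithmetic, with layerCurrentmA being a made-up name and 20mA per LED taken from the full-brightness figure above:

```cpp
#include <cstdint>

// Hypothetical helper: worst-case current through one layer's MOSFET,
// in milliamps, when every LED in that layer is lit at once.
uint32_t layerCurrentmA(uint32_t ledsPerLayer, uint32_t mAPerLed) {
    return ledsPerLayer * mAPerLed;
}
```

layerCurrentmA(64, 20) gives 1280 (1.28A) for the 8x8x8 cube and layerCurrentmA(81, 20) gives 1620 (1.62A) for 9x9x9, so that is the figure each layer MOSFET would need to handle comfortably.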