Here's the difference:
"With this setup, we only need 64 (for the anodes) + 8 (for each layer) IO ports to control the LED cube."
That design drives the columns high and pulls a layer low to enable it.
Mine does the opposite: it pulls the columns low and drives a layer high to enable it.
My idea is to use 9 shift registers, one to enable the selected layer and the other 8 to control the columns,
instead of a demux chip and 8 shift registers.
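To make the 9-register idea concrete, here is a rough sketch of how the 9 bytes for one layer could be assembled. The names (Cube, buildLayerFrame), the memory layout, and the byte order are my assumptions for illustration; the real order depends on how the registers are chained. It follows the polarity described above: the layer bit is high to enable it, and a column bit is low to light an LED.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical layout: cube[layer][row] holds 8 column bits, 1 = LED lit.
using Cube = std::array<std::array<std::uint8_t, 8>, 8>;

// Build the 9 bytes for one layer.
// Byte 0 selects the layer (active high: exactly one bit set).
// Bytes 1..8 carry the column data (active low, so the bits are inverted).
// Assumption: byte 0 ends up in the layer register; adjust for your wiring.
std::array<std::uint8_t, 9> buildLayerFrame(const Cube& cube, unsigned layer) {
    assert(layer < 8);
    std::array<std::uint8_t, 9> frame{};
    frame[0] = static_cast<std::uint8_t>(1u << layer);
    for (unsigned row = 0; row < 8; ++row) {
        frame[1 + row] = static_cast<std::uint8_t>(~cube[layer][row]);
    }
    return frame;
}
```

On an Arduino, each of those 9 bytes would then go out with SPI.transfer(), followed by a pulse on the registers' latch pin.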
With my idea, SPI is used to blast out the 9 bytes, and a really high refresh rate can be achieved. 9 bytes at an 8 MHz SPI clock take about 9 µs to send. At a 30 Hz refresh rate, a full cube update takes 33.3 ms, so each of the 8 layers can be left on for about 4,200 µs (33.3 ms / 8), minus the 9 µs transfer time. During that ~4.2 ms you can be updating the data to be displayed next, via serial, button pushes, reading pots, etc.
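The timing numbers can be checked with a few lines of arithmetic. This is only a sketch of the calculation; the function names are made up, and it ignores any overhead between bytes or for the latch pulse.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds needed to clock out `bytes` bytes at `spiHz` (8 bits per byte).
double transferTimeUs(unsigned bytes, double spiHz) {
    assert(spiHz > 0.0);
    return bytes * 8.0 * 1e6 / spiHz;
}

// Microseconds each layer can stay lit when the whole cube is
// refreshed `refreshHz` times per second across `layers` layers.
double layerSlotUs(double refreshHz, unsigned layers) {
    assert(refreshHz > 0.0 && layers > 0);
    return 1e6 / refreshHz / layers;
}
```

With 9 bytes at 8 MHz the transfer comes out at 9 µs, and 30 Hz over 8 layers gives a slot of roughly 4,167 µs per layer, so the transfer uses only about 0.2% of each slot.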
It takes 512 bits, i.e. 64 bytes, to represent the whole cube in memory at one bit per LED (512 bytes if you spend a full byte per LED); you can have 2 copies, one that is being cycled through for display and one that is being updated for the next display cycle.
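The two-copy scheme could look roughly like this. It is a sketch only: DoubleBuffer, setLed, and the packed one-bit-per-LED layout are assumptions of mine, not anything fixed by the hardware.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical packed layout: cube[layer][row] holds 8 column bits, 1 = LED lit.
using Cube = std::array<std::array<std::uint8_t, 8>, 8>;

struct DoubleBuffer {
    Cube buffers[2]{};   // both copies start with all LEDs off
    unsigned front = 0;  // index of the copy currently being displayed

    const Cube& display() const { return buffers[front]; }  // scanned out layer by layer
    Cube& draw() { return buffers[front ^ 1u]; }            // updated in the idle time

    // Call between full cube scans so a half-updated frame is never shown.
    void swap() { front ^= 1u; }
};

// Turn a single LED on or off in the given copy.
void setLed(Cube& cube, unsigned layer, unsigned row, unsigned col, bool on) {
    assert(layer < 8 && row < 8 && col < 8);
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << col);
    if (on) {
        cube[layer][row] |= mask;
    } else {
        cube[layer][row] &= static_cast<std::uint8_t>(~mask);
    }
}
```

The display loop only ever reads display(), everything else writes to draw(), and swap() is called once the next frame is ready and the current scan has finished.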
Or control it with a 1284P chip: with 16K of SRAM, you can keep lots of copies to rotate through and take more time to update them.