I wrote a library to PWM many outputs with shift registers. Computing the bits and sending them out over SPI takes 43 clock cycles per shift register (8 pins). The limit is 768 pins at 32 brightness levels.
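To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes every shift register is rewritten once per brightness step and that the board runs at 16 MHz. Both are assumptions for illustration only, and the constant names are made up for this example.

```cpp
#include <cstdint>

// Figures from the post.
constexpr std::uint32_t kCyclesPerRegister = 43;  // compute + send 8 bits
constexpr std::uint32_t kPinsPerRegister   = 8;
constexpr std::uint32_t kPins              = 768;
constexpr std::uint32_t kBrightnessLevels  = 32;

// Assumption: a 16 MHz clock, as on many Arduino boards.
constexpr std::uint32_t kAssumedClockHz = 16000000;

// Number of shift registers needed for all pins.
constexpr std::uint32_t kRegisters = kPins / kPinsPerRegister;

// Cycles to rewrite every register once.
constexpr std::uint32_t kCyclesPerUpdate = kRegisters * kCyclesPerRegister;

// Assumption: one full rewrite per brightness step, so one PWM period
// costs kBrightnessLevels full rewrites.
constexpr std::uint32_t kCyclesPerPwmPeriod = kCyclesPerUpdate * kBrightnessLevels;

// Upper bound on the PWM frequency if the CPU did nothing else.
constexpr std::uint32_t kMaxPwmHz = kAssumedClockHz / kCyclesPerPwmPeriod;
```

Under these assumptions, 96 registers cost 4128 cycles per update and 132096 cycles per PWM period, which caps the PWM frequency at roughly 121 Hz with the CPU fully loaded.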
Could it be added to the Arduino libraries?
Something else, maybe a change for the standard Arduino include files: this version uses SPI, but a previous version bit-banged normal pins. For that version, I replaced pins_arduino.h with a file I created myself: pins_arduino_compile_time.h. In this version the compiler knows the pin-to-port conversion at compile time. The standard version stores the lookup arrays in program memory, so the compiler has to generate code that fetches the values at run time. Even if the pin numbers are constant, the compiler cannot see the result of the lookup and does not compile to fast code: digitalWrite takes 53 clock cycles.
In my redefinition, the arrays are defined as constants. Now the compiler can evaluate the lookup at compile time and compile to an sbi or cbi instruction, which takes only 2 clock cycles. Check out my version here: http://code.google.com/p/shiftpwm/source/browse/trunk/ShiftPWM/pins_arduino_compile_time.h
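As a rough illustration of the idea (this is not the content of the linked file), the sketch below keeps the lookup tables as compile-time constants and takes the pin number as a template parameter, so the port and bit mask are fixed when the code is compiled. The names Port, kPinToPort, kPinToBit, simulatedPorts and fastWrite are hypothetical, the pin layout assumes an ATmega328-style board (pins 0-7 on port D, 8-13 on port B, 14-19 on port C), and a plain array stands in for the hardware registers so the example also builds on a PC.

```cpp
#include <cstdint>

// Hypothetical port identifiers; real AVR code would refer to PORTB, PORTC, PORTD.
enum Port : std::uint8_t { PORT_B = 0, PORT_C = 1, PORT_D = 2 };

// Lookup tables as compile-time constants instead of PROGMEM data.
// Assumed layout: pins 0-7 -> port D, 8-13 -> port B, 14-19 -> port C.
constexpr Port kPinToPort[20] = {
    PORT_D, PORT_D, PORT_D, PORT_D, PORT_D, PORT_D, PORT_D, PORT_D,
    PORT_B, PORT_B, PORT_B, PORT_B, PORT_B, PORT_B,
    PORT_C, PORT_C, PORT_C, PORT_C, PORT_C, PORT_C};

constexpr std::uint8_t kPinToBit[20] = {
    0, 1, 2, 3, 4, 5, 6, 7,
    0, 1, 2, 3, 4, 5,
    0, 1, 2, 3, 4, 5};

// Stand-in for the hardware output registers, so the sketch runs off-target.
static std::uint8_t simulatedPorts[3] = {0, 0, 0};

// The pin is a template parameter, so both lookups are resolved at compile time.
template <std::uint8_t Pin>
inline void fastWrite(bool high) {
    static_assert(Pin < 20, "pin out of range");
    constexpr std::uint8_t mask = static_cast<std::uint8_t>(1u << kPinToBit[Pin]);
    if (high) {
        simulatedPorts[kPinToPort[Pin]] |= mask;
    } else {
        simulatedPorts[kPinToPort[Pin]] &= static_cast<std::uint8_t>(~mask);
    }
}

// The tables can be checked by the compiler itself.
static_assert(kPinToPort[13] == PORT_B, "pin 13 is expected on port B");
static_assert(kPinToBit[13] == 5, "pin 13 is expected on bit 5");
```

A call such as fastWrite<13>(true) then sets bit 5 of the port B stand-in. On real hardware, with the register address and mask both known at compile time, the compiler has the chance to reduce such a write to a single bit instruction.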
Is there a good reason that the arrays are defined in PROGMEM?