How do you know it's too slow? And how much too slow IS it? Can you do the sort of dimming you're looking for, with your existing code, on 16 pins? 2 pins?
Hmm. It looks like the biggest problem you have at the moment is trying to run the timer interrupt too often. Trying to interrupt in fewer cycles than your service routine takes to execute is a recipe for disaster; you'll essentially end up executing one main program instruction, doing the entire ISR, one more main program instruction, etc...
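To put rough numbers on that, here is a small sketch of the arithmetic. The 200-cycle ISR cost in the comments is a made-up figure for illustration, not something measured from your code, and isrLoadPercent is just a hypothetical helper name.

```cpp
#include <stdint.h>

// Rough percentage of CPU time spent inside an ISR that takes
// isrCycles clock cycles and fires every periodCycles clock cycles.
// Capped at 100: once the ISR takes as long as the interrupt period,
// the main program gets essentially nothing.
constexpr uint32_t isrLoadPercent(uint32_t isrCycles, uint32_t periodCycles) {
  return isrCycles >= periodCycles ? 100 : isrCycles * 100 / periodCycles;
}

// Example with a hypothetical 200-cycle ISR:
//   isrLoadPercent(200, 160)   -> 100 (interrupting faster than the ISR runs)
//   isrLoadPercent(200, 256)   -> 78
//   isrLoadPercent(200, 1040)  -> 19
```

So slowing the interrupt down by a modest factor can free up most of the processor, assuming the ISR itself stays about the same length.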
Here are some details and other comments.
There's an intermediate-speed digitalWrite that might be useful. Have arrays indexed by your channel_pin (whatever is sent over the serial port) for Port and bitmask, just like the real digitalWrite does. Except populate them at initialization time instead of every time you want to write to the pin:
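// Lookup tables, filled in once at startup.
// (32 entries assumed, to match the indices 0..31 used in setup() below.)
volatile uint8_t *fdw_port[32];
uint8_t fdw_ONbitmask[32];
uint8_t fdw_OFFbitmask[32];

void populate_fdw_tab (uint8_t myPinNum, uint8_t arduinoPinNum) {
  // portOutputRegister() turns the port number into a pointer to the PORTx register
  fdw_port[myPinNum] = portOutputRegister(digitalPinToPort(arduinoPinNum));
  fdw_ONbitmask[myPinNum] = digitalPinToBitMask(arduinoPinNum);
  fdw_OFFbitmask[myPinNum] = ~digitalPinToBitMask(arduinoPinNum);
  pinMode(arduinoPinNum, OUTPUT);
  digitalWrite(arduinoPinNum, LOW);
}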
:
void setup() {
populate_fdw_tab( 0, 22); // lights are on pins 22-53
populate_fdw_tab( 1, 31); // and can be mapped "however"
:
populate_fdw_tab(31, 53);
}
Then, turning a pin off is just:
*fdw_port[pin] &= fdw_OFFbitmask[pin];
and turning it on is just:
*fdw_port[pin] |= fdw_ONbitmask[pin];
(Hmm. You can probably scratch all that, since you're always dealing with all the pins, and the current fastdigitalWrite is doing single instructions. It would give you faster random/variable access to your pins, but I guess that's unnecessary.)
Similarly, since serial messages are only occasional (no more often than every 50 ms), while AC half-cycles happen every 1/120 s (8.3 ms), get rid of the powerDelay[] calculations in ZeroCrossDetected and move them into the serial receive code.
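Something along these lines, as a rough sketch. I don't know how you actually compute powerDelay[], so the linear brightness-to-delay mapping, the 128 ticks per half-cycle, the channel count, and the names brightnessToDelay and onSerialMessage are all assumptions for illustration.

```cpp
#include <stdint.h>

const uint8_t NUM_CHANNELS = 32;           // assumption
const uint8_t TICKS_PER_HALF_CYCLE = 128;  // assumption: timer ticks per 8.3 ms half-cycle

// Read by the timer ISR, written only when a serial message arrives.
volatile uint8_t powerDelay[NUM_CHANNELS];

// Map brightness 0 (off) .. 255 (full on) to the number of ticks to wait
// after the zero crossing before switching the channel on.
// Linear mapping, purely as a placeholder for whatever curve you want.
constexpr uint8_t brightnessToDelay(uint8_t brightness) {
  return (uint8_t)((uint16_t)(255 - brightness) * TICKS_PER_HALF_CYCLE / 255);
}

// Call this from the serial receive code, once per received message.
// The division happens here, at most every 50 ms, instead of 120 times a second.
void onSerialMessage(uint8_t channel, uint8_t brightness) {
  if (channel < NUM_CHANNELS) {
    powerDelay[channel] = brightnessToDelay(brightness);
  }
}
```

With this split, ZeroCrossDetected only has to reset the tick counter, and the timer ISR only compares bytes.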
How many brightness levels are you trying to get? Make sure that you're not running your timer interrupt too often, or it will eat up cycles in overhead that might be needed elsewhere. (It looks like this is really bad at the moment! This is probably the biggest change that needs to happen!) tickCounter should probably be MUCH smaller than 32 bits; ideally you should divide the 8.3 ms half-cycle time into exactly N ticks by adjusting the timer prescaler and count, which will give you up to N brightness levels and simplify a LOT of the math that is happening, assuming that N < 256 so that the counter fits in a single byte.
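Here is a sketch of the timer arithmetic, assuming a 16 MHz clock, 60 Hz mains, and an 8-bit timer running in CTC (clear on compare match) mode. The helper names are made up; only the arithmetic matters.

```cpp
#include <stdint.h>

const uint32_t CPU_HZ = 16000000UL;        // assumption: 16 MHz board
const uint32_t HALF_CYCLES_PER_SEC = 120;  // 60 Hz mains, two zero crossings per cycle

// Timer counts per tick for a given prescaler and number of ticks per
// half-cycle, rounded to the nearest whole count.
constexpr uint32_t countsPerTick(uint32_t prescaler, uint32_t ticks) {
  return (CPU_HZ / HALF_CYCLES_PER_SEC + prescaler * ticks / 2) / (prescaler * ticks);
}

// An 8-bit timer can count at most 256 steps between compare matches.
constexpr bool fitsIn8BitTimer(uint32_t prescaler, uint32_t ticks) {
  return countsPerTick(prescaler, ticks) >= 1 && countsPerTick(prescaler, ticks) <= 256;
}

// In CTC mode the timer counts 0..compare inclusive, so the register
// value is one less than the number of counts per tick.
constexpr uint8_t compareValue(uint32_t prescaler, uint32_t ticks) {
  return (uint8_t)(countsPerTick(prescaler, ticks) - 1);
}

// Example: 128 ticks per half-cycle with a /8 prescaler
//   countsPerTick(8, 128) -> 130
//   compareValue(8, 128)  -> 129
```

With those example numbers each tick is 130 x 8 = 1040 clock cycles (65 us), and 128 ticks add up to 8.32 ms, slightly short of the 8.33 ms half-cycle. That small error shouldn't accumulate if the zero-cross interrupt resets the tick counter every half-cycle. On an AVR the compare value would go into a compare register such as OCR2A, but check the datasheet for the timer you actually use.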
This code is interesting:
if(powerDelay[CHANNEL_PIN_1] <= tickCounter) customDigitalWrite(CHANNEL_PIN_1, HIGH); // turn the channel on here
if(powerDelay[CHANNEL_PIN_2] <= tickCounter) customDigitalWrite(CHANNEL_PIN_2, HIGH); // turn the channel on here
if(powerDelay[CHANNEL_PIN_3] <= tickCounter) customDigitalWrite(CHANNEL_PIN_3, HIGH); // turn the channel on here
if(powerDelay[CHANNEL_PIN_4] <= tickCounter) customDigitalWrite(CHANNEL_PIN_4, HIGH); // turn the channel on here
if(powerDelay[CHANNEL_PIN_5] <= tickCounter) customDigitalWrite(CHANNEL_PIN_5, HIGH); // turn the channel on here
It's interesting because all the array index calculations are done at compile time, so each line is just load memory, load memory, compare, port write.
However, the memory loads are currently 32 bits wide, and since tickCounter is volatile, it is reloaded for every comparison (at significant expense). With tickCounter and powerDelay[] shrunk to 8 bits, replace it with something like:
register uint8_t tick = tickCounter; // get a local copy!
if(powerDelay[CHANNEL_PIN_1] <= tick) customDigitalWrite(CHANNEL_PIN_1, HIGH);
if(powerDelay[CHANNEL_PIN_2] <= tick) customDigitalWrite(CHANNEL_PIN_2, HIGH);
if(powerDelay[CHANNEL_PIN_3] <= tick) customDigitalWrite(CHANNEL_PIN_3, HIGH);
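To show how the pieces could fit together, here is a stripped-down sketch that compiles on a PC. It assumes 8-bit tick and delay values, only three channels, and that the outputs are switched off at each zero crossing. fakePort is a plain variable standing in for a real PORTx register, and zeroCrossDetected and timerTick are placeholder names for your two interrupt handlers.

```cpp
#include <stdint.h>

const uint8_t NUM_CHANNELS = 3;             // assumption: kept small for the example
volatile uint8_t tickCounter = 0;           // ticks since the last zero crossing
volatile uint8_t powerDelay[NUM_CHANNELS] = {0, 64, 200};  // example delays
uint8_t fakePort = 0;                       // stand-in for a PORTx register

// Zero-cross interrupt: everything off, start counting ticks again.
void zeroCrossDetected() {
  fakePort = 0;
  tickCounter = 0;
}

// Timer interrupt, once per tick.
void timerTick() {
  uint8_t tick = tickCounter;               // read the volatile exactly once
  if (powerDelay[0] <= tick) fakePort |= (1 << 0);
  if (powerDelay[1] <= tick) fakePort |= (1 << 1);
  if (powerDelay[2] <= tick) fakePort |= (1 << 2);
  tickCounter = tick + 1;                   // and write it back exactly once
}
```

Each comparison is now a single-byte load and compare, and a channel with a delay of 0 comes on at the very first tick after the zero crossing.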