Two more comments.
Code like your setupTimers() requires me to read the datasheet to understand it.
For some reason I can not be bothered.
Perhaps, because when I write code that contains obscure special values in combinations, I add comments about what the values are intended to achieve, and on what page of the datasheet I can confirm the correctness of my assumptions.
(OK, no, I don't actually record the datasheet page references, but I should.)
So - I have no idea what the timer is set up to do...
I guess you do, but I cannot validate it from the code alone, and neither can anyone else.
void setupTimers()
{
  TCCR1A = 0;
  TCCR1B = 0;
  TCCR1B = _BV(CS10);
  TIMSK1 = _BV(OCIE1A) | _BV(TOIE1);
  OCR1AH = 255;
  OCR1AL = 255;
  TIFR1 = 1;
}
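To show the kind of commenting I mean, here is the same register sequence annotated. This is only my reading, assuming an ATmega328P-class AVR (Uno or similar) and the bit meanings from the 16-bit Timer/Counter1 chapter of its datasheet; check it against your actual chip. The name setupTimersCommented() and the host stand-ins for the registers are mine, added only so the sketch compiles off-target. I also dropped the redundant first write to TCCR1B.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-ins so the sketch compiles off-target. Bit positions are
   assumed from the ATmega328P datasheet. */
static volatile uint8_t TCCR1A, TCCR1B, TIMSK1, TIFR1, OCR1AH, OCR1AL;
#define _BV(bit) (1 << (bit))
#define CS10   0
#define TOIE1  0
#define OCIE1A 1
#define TOV1   0
#endif

void setupTimersCommented(void)
{
    /* Normal port operation on OC1A/OC1B, waveform bits WGM11:10 = 0. */
    TCCR1A = 0;
    /* WGM13:12 = 0 (normal mode, counts 0..0xFFFF), clock = clk/1, no prescaler. */
    TCCR1B = _BV(CS10);
    /* Enable the compare-match A and the overflow interrupts. */
    TIMSK1 = _BV(OCIE1A) | _BV(TOIE1);
    /* 16-bit register: write the high byte first, the low-byte write latches both. */
    OCR1AH = 255;
    OCR1AL = 255;   /* OCR1A = 0xFFFF */
    /* Writing 1 to TOV1 clears a pending overflow flag. */
    TIFR1 = _BV(TOV1);
}
```

With comments like these, a reader can check each line against one page of the datasheet instead of reverse-engineering the intent.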
And the second comment is that I missed the indirection through pwmOrder in all the code in the ISR().
You could reorder the data outside the ISR() so that the access order is the native order.
The ISR() code would be about 2x faster for that, especially as the array pwmOrder is declared volatile.
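The reordering could look roughly like this. It is a sketch, not your code: tempCompValL, tempCompValH and pwmOrder are the names from your sketch, while NUM_PINS, sortedCompValL, sortedCompValH, sortedPin and reorderForIsr() are names I made up, and I am assuming pwmOrder[i] holds the pin that fires i-th.

```c
#include <stdint.h>

#define NUM_PINS 9   /* nine pins, as in your sketch */

/* Written by main-line code, indexed by pin. */
uint8_t tempCompValL[NUM_PINS];
uint8_t tempCompValH[NUM_PINS];
uint8_t pwmOrder[NUM_PINS];      /* assumed: pwmOrder[i] = pin that fires i-th */

/* Hypothetical ISR-side copies, already stored in firing order. */
uint8_t sortedCompValL[NUM_PINS];
uint8_t sortedCompValH[NUM_PINS];
uint8_t sortedPin[NUM_PINS];

/* Called from main-line code (with interrupts off) after pwmOrder changes.
   The lookup through pwmOrder is paid once here instead of on every interrupt. */
void reorderForIsr(void)
{
    for (uint8_t i = 0; i < NUM_PINS; i++) {
        uint8_t pin = pwmOrder[i];
        sortedCompValL[i] = tempCompValL[pin];
        sortedCompValH[i] = tempCompValH[pin];
        sortedPin[i] = pin;
    }
}
```

The ISR() then reads sortedCompValL[i] and sortedCompValH[i] directly, with no second array lookup.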
Actually, I think that the code has WAY too many 'volatile' variables for reasonable speed.
As far as I can see, you only refer to the non-temp arrays inside the ISR().
So they do not need to be volatile.
Also, the temp arrays are read by the ISR() and are supposedly built in the main line with interrupts disabled.
Except that they are not: tempCompValL[pin] and tempCompValH[pin] are changed while interrupts are still enabled, so the ISR() could copy them before the sort order is updated. This is a bug that should trigger relatively rarely and be hard to find even when you are looking for it. It would NOT cause the errors you report.
In any case, them being volatile does not help.
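One way to close that window is to change the values and redo the ordering inside a single interrupts-off section. The following is a sketch under my own assumptions: setPinValue(), sortOrder(), compareValue() and the ENTER_CRITICAL/EXIT_CRITICAL macros are hypothetical names, the insertion sort is just a placeholder for whatever sort you use, and the host branch of the macros exists only so it compiles off-target.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
/* Save the status register, disable interrupts, restore afterwards. */
#define ENTER_CRITICAL() uint8_t savedSreg = SREG; cli()
#define EXIT_CRITICAL()  SREG = savedSreg
#else
/* Host stand-in: only counts nesting so the sketch can be exercised. */
static int criticalDepth;
#define ENTER_CRITICAL() (criticalDepth++)
#define EXIT_CRITICAL()  (criticalDepth--)
#endif

#define NUM_PINS 9

uint8_t tempCompValL[NUM_PINS];
uint8_t tempCompValH[NUM_PINS];
uint8_t pwmOrder[NUM_PINS];

static uint16_t compareValue(uint8_t pin)
{
    return (uint16_t)(((uint16_t)tempCompValH[pin] << 8) | tempCompValL[pin]);
}

/* Placeholder insertion sort: orders pwmOrder by ascending compare value. */
static void sortOrder(void)
{
    for (uint8_t i = 1; i < NUM_PINS; i++) {
        uint8_t pin = pwmOrder[i];
        uint8_t j = i;
        while (j > 0 && compareValue(pwmOrder[j - 1]) > compareValue(pin)) {
            pwmOrder[j] = pwmOrder[j - 1];
            j--;
        }
        pwmOrder[j] = pin;
    }
}

/* The value change and the re-sort happen in one critical section, so the
   ISR never sees new values paired with the old order. */
void setPinValue(uint8_t pin, uint16_t value)
{
    ENTER_CRITICAL();
    tempCompValL[pin] = (uint8_t)(value & 0xFF);
    tempCompValH[pin] = (uint8_t)(value >> 8);
    sortOrder();
    EXIT_CRITICAL();
}
```

The critical section is longer than before because it now contains the sort, so keep an eye on how long interrupts stay off.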
Then beginIndex, tempOCH and tempOCL are only used in the ISR() - so should be local variables to the ISR and not volatile. (beginIndex should be a static local to the ISR()).
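For the scoping, something like the following is what I have in mind. I do not know your full ISR(), so the body here is a deliberately trivial stand-in: onCompareMatch() is a hypothetical plain function playing the role of the ISR body, compValL and compValH are my guess at the names of the non-temp arrays, and the only point is where beginIndex, tempOCL and tempOCH are declared.

```c
#include <stdint.h>

#define NUM_PINS 9

/* Referenced only inside the ISR, so plain non-volatile arrays are enough. */
uint8_t compValL[NUM_PINS];
uint8_t compValH[NUM_PINS];

/* Stand-in for the body of the compare-match ISR.
   Returns the 16-bit compare value for the current slot. */
uint16_t onCompareMatch(void)
{
    /* static local: keeps its value between interrupts, invisible elsewhere. */
    static uint8_t beginIndex = 0;

    /* Ordinary locals: the compiler is free to keep these in registers. */
    uint8_t tempOCL = compValL[beginIndex];
    uint8_t tempOCH = compValH[beginIndex];

    /* Advance from where the last interrupt stopped rather than
       restarting at the front of the array. */
    beginIndex++;
    if (beginIndex >= NUM_PINS) {
        beginIndex = 0;
    }

    return (uint16_t)(((uint16_t)tempOCH << 8) | tempOCL);
}
```

Because none of these are volatile, the compiler no longer has to reload and store them on every access inside the handler.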
These changes are all simple multiples of the code execution time...
So the code will be perhaps 6 x quicker - 2x for removing the indirection by pwmOrder and 3x for removing the surplus volatiles.
The change to the search from the entire array to only incremental will make a difference based on the number of pins (9), and even more if you add more pins.
Again, good luck.