Library for TLC5940 16-channel PWM chip

I'm trying to tidy up the library, and I keep running into this weird bug with Tlc.update() where the grayscale data is only occasionally latched.

First, an explanation of the serial interface:

         _   _   _   _   _ 
GSCLK  _| |_| |_| |_| |_| |_ ... (x4095) 
                                                  ___
BLANK ___________________________________________|   |___
                                                   _
XLAT _____________________________________________| |_____

SIN / SCLK (shifting data in for
           next cycle after BLANK)

Here's how the code currently works:

  1. GSCLK is run off TIMER2 in fast PWM mode
  2. BLANK and XLAT are run off TIMER1 in fast PWM mode
    a. BLANK pulse width is OCR1B = 1
    b. XLAT pulse width is OCR1A = 0. This works perfectly: XLAT is inside the BLANK pulse (0 is the shortest pulse width)
    c. XLAT is not usually pulsing; the compare output bits, COM1A1 and COM1A0, are set to normal port operation (0, 0)
  3. When Tlc.update() is called:
    a. check _needXLAT to see if we are still waiting on a previous Tlc.update() to latch, return 1 if we are
    b. shift in the data
    c. set _needXLAT = 1
    d. enable the XLAT compare output to give us a pulse at the end of the current grayscale cycle: COM1A1 = 1, COM1A0 = 0 (non-inverting mode)
    e. enable the output compare A match interrupt, OCIE1A = 1
What should happen at this point, but doesn't:
  4. XLAT (OC1A) is pulsed at the end of the current grayscale cycle like the diagram above
  5. the OC1A output compare match interrupt is generated and executed a few clocks after XLAT should have pulsed
  6. the code in the interrupt turns off the XLAT pulsing (COM1A1 and COM1A0 = 0) and the interrupt is disabled (OCIE1A = 0), and _needXLAT is cleared to 0.
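
For reference, steps 3 through 6 boil down to roughly the following. This is only a sketch, not the library's actual code: the registers are plain variables standing in for the AVR ones so the flow can be compiled and traced on a PC, the bit positions assume an ATmega168/328, and tlcUpdate(), onCompareA() and shiftInGrayscaleData() are made-up names for illustration.

```cpp
#include <stdint.h>

// Stand-ins for the AVR registers (on the chip these come from <avr/io.h>).
// Bit positions are the ATmega168/328 ones (assumption); other parts may differ.
enum { COM1A0 = 6, COM1A1 = 7, OCIE1A = 1 };
uint8_t TCCR1A = 0;
uint8_t TIMSK1 = 0;

volatile uint8_t needXLAT = 0;  // plays the role of _needXLAT

// Hypothetical placeholder for clocking the data out on SIN / SCLK.
void shiftInGrayscaleData() {}

// Step 3: returns 1 if a previous update has not latched yet, 0 otherwise.
uint8_t tlcUpdate() {
    if (needXLAT) {
        return 1;                 // 3a: still waiting on the last latch
    }
    shiftInGrayscaleData();       // 3b
    needXLAT = 1;                 // 3c
    // 3d: COM1A1 = 1, COM1A0 = 0 (non-inverting output on OC1A)
    TCCR1A = static_cast<uint8_t>((TCCR1A & ~(1 << COM1A0)) | (1 << COM1A1));
    // 3e: enable the compare A match interrupt
    TIMSK1 = static_cast<uint8_t>(TIMSK1 | (1 << OCIE1A));
    return 0;
}

// Step 6: what the body of the compare A match ISR does.
void onCompareA() {
    // Back to normal port operation, so XLAT stops pulsing.
    TCCR1A = static_cast<uint8_t>(TCCR1A & ~((1 << COM1A1) | (1 << COM1A0)));
    // Disable the interrupt again and mark the latch as done.
    TIMSK1 = static_cast<uint8_t>(TIMSK1 & ~(1 << OCIE1A));
    needXLAT = 0;
}
```

On the real chip, onCompareA() would be the body of the TIMER1 compare A ISR, and steps 4 and 5 happen in hardware between the two functions.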

I know the interrupt is getting called successfully because _needXLAT is getting cleared, but apparently XLAT isn't actually pulsing. I worked around this by waiting two interrupts before clearing _needXLAT and COM1A1 / COM1A0, but Arduino 0012 seems to have broken that: I can wait 10 interrupts and the same problem still occurs. I've also tried increasing the pulse width for BLANK and XLAT, to no avail. Ugh!
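
The "wait a couple of interrupts" workaround amounts to something like the sketch below. Same caveats as any PC-side mock-up: the registers are stand-in variables with ATmega168/328 bit positions assumed, and XLAT_WAIT_INTERRUPTS, interruptsSeen and onCompareA() are illustrative names, not what is in the library.

```cpp
#include <stdint.h>

// Stand-ins for the AVR registers; bit positions assume an ATmega168/328.
enum { COM1A0 = 6, COM1A1 = 7, OCIE1A = 1 };
uint8_t TCCR1A = (1 << COM1A1);  // pretend update() already enabled OC1A
uint8_t TIMSK1 = (1 << OCIE1A);  // ... and the compare A interrupt

volatile uint8_t needXLAT = 1;   // an update is waiting to be latched

// Hypothetical tuning knob: how many compare matches to sit through
// before turning the XLAT output back off.
const uint8_t XLAT_WAIT_INTERRUPTS = 2;
uint8_t interruptsSeen = 0;

// Body of the compare A match ISR with the workaround applied.
void onCompareA() {
    interruptsSeen++;
    if (interruptsSeen < XLAT_WAIT_INTERRUPTS) {
        return;                  // leave OC1A enabled for another cycle
    }
    interruptsSeen = 0;
    TCCR1A = static_cast<uint8_t>(TCCR1A & ~((1 << COM1A1) | (1 << COM1A0)));
    TIMSK1 = static_cast<uint8_t>(TIMSK1 & ~(1 << OCIE1A));
    needXLAT = 0;
}
```

The first call leaves everything armed; only the second one disables the output and clears the flag.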

Regardless, I would like to use TIMER1 to generate the XLAT pulses so that they're always within the BLANK pulse (as opposed to toggling the pin manually in an interrupt generated by BLANK, which would put the pulse a few clocks after the BLANK pulse). Does anyone with experience with the timers know what I'm doing wrong?
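
For contrast, the manual alternative mentioned in the parentheses would look something like this. It is a rough PC-side sketch: PORTB is a stand-in variable, XLAT on PB1 (the OC1A pin on an ATmega168/328) is an assumption, and onBlank() is a made-up name for the body of whatever interrupt the BLANK event fires.

```cpp
#include <stdint.h>

// Stand-in for the real port register (from <avr/io.h> on the chip).
volatile uint8_t PORTB = 0;
enum { XLAT_BIT = 1 };  // assumption: XLAT wired to PB1

volatile uint8_t needXLAT = 0;  // plays the role of _needXLAT

// Body of a hypothetical ISR triggered by the timer event that drives BLANK.
void onBlank() {
    if (!needXLAT) {
        return;                  // nothing shifted in, nothing to latch
    }
    PORTB = static_cast<uint8_t>(PORTB | (1 << XLAT_BIT));   // XLAT high
    PORTB = static_cast<uint8_t>(PORTB & ~(1 << XLAT_BIT));  // XLAT low
    needXLAT = 0;
}
```

The catch is the one already noted: the pulse lands wherever interrupt latency puts it, rather than at a fixed point inside the BLANK pulse.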

The relevant code is in resetTimers(), update() and the ISR (the latest version is TLC5940LEDv005_.zip).