Can't make Timer2 trigger SPI transmissions

OS: Windows Vista (SP2)
Board: Arduino Uno (ATMega328p-pu)

I have 2 sketches that both work nicely on their own. One uses SPI to drive a pair of chained 74HC595 8-bit shift registers. The other sketch uses Timer2 to call a short ISR function about 8000 times a second. I have verified that both sketches (separately) do the correct things with the correct timings.

However, I want to make a hybrid of these that uses the ISR to call SPI.transmit(...) and do a bit of integer math to calculate what data to transmit. BUT when I try to combine the SPI sketch with the Timer2 sketch, it seems that the timing is badly disrupted and I'm getting incorrect output on the shift register (indicated by 16 leds), it simply looks like nothing is happening when I try to run the hybrid sketch.

On the Arduino, I'm using digital pins 10, 11 and 13 (and +5v and gnd) to drive the 595 chips, otherwise the circuit is identical to the one in the "ShiftOut()" tutorials on the main arduino site. The circuit is verified as correct.

I'll post my code here, but truncated quite a bit...

#include <avr/io.h>
#include <avr/interrupt.h>
#include <SPI.h>

#define SHIFT_REGISTER_LATCH_SS (10)

#define PHYSICAL_PANEL_WIDTH (24) //columns of LEDs
#define PHYSICAL_PANEL_HEIGHT (8) //rows of LEDs
#define BITS_PER_PIXEL        (8) //1 bit in the memory image maps to a single LED on the panel

//starting values are the initial valid bit position for each bit field...
#define AMU_COLL_START (0x04)
#define AMU_BASE_START (0x10)
#define CMU_EMIT_START (0x01)
#define CMU_BASE_START (0x01)

//resets are one left shift further than we want for each bit field...
#define AMU_COLL_RESET (0x100)
#define AMU_BASE_RESET (0x100)
#define CMU_EMIT_RESET (0x10)
#define CMU_BASE_RESET (0x04)

volatile int amu_Coll = AMU_COLL_START;
volatile int amu_Base = AMU_BASE_START;
volatile int cmu_Emit = CMU_EMIT_START;
volatile int cmu_Base = CMU_BASE_START;

volatile byte hiByte = 0x00;
volatile byte loByte = 0x0f;
unsigned int tcnt2;

ISR(TIMER2_OVF_vect)
{
//  if((*pixelPtr & pixelBit) == 0)
//    UpdateDisplay(0, 0x0f);
//  else
    UpdateDisplay(hiByte, loByte);

  amu_Coll <<= 1;
  if(amu_Coll == AMU_COLL_RESET) //the 1 bit was shifted right out of the byte leaving all zeroes
  {
    amu_Coll = AMU_COLL_START;
    amu_Base <<= 1;
    if(amu_Base == AMU_BASE_RESET)
    {
      amu_Base = AMU_BASE_START;
      cmu_Emit <<= 1;
      if(cmu_Emit == CMU_EMIT_RESET)
      {
        cmu_Emit = CMU_EMIT_START;
        cmu_Base <<= 1;
        if(cmu_Base == CMU_BASE_RESET)
        {
          cmu_Base = CMU_BASE_START;          
        }
        else
        {
          hiByte = amu_Coll | cmu_Base;
        }
      }
      else
      {
        loByte = amu_Base | (cmu_Emit ^ 0x0f);
      }
    }
  }
}


static void FailSafe()
{
  amu_Coll = AMU_COLL_START;
  amu_Base = AMU_BASE_START;
  cmu_Emit = CMU_EMIT_START;
  cmu_Base = CMU_BASE_START;
  
  UpdateDisplay(0x00, 0x0f);
}

int UpdateDisplay(byte hiByte, byte loByte)
{
  SPI.transfer(loByte);
  SPI.transfer(hiByte);
  digitalWrite(SHIFT_REGISTER_LATCH_SS, LOW);
  digitalWrite(SHIFT_REGISTER_LATCH_SS, HIGH);
}

void InitScreenRefreshInterrupt()
{
   /* First disable the timer overflow interrupt while we're configuring */
  TIMSK2 &= ~(1<<TOIE2);

  /* Configure timer2 in normal mode (pure counting, no PWM etc.) */
  TCCR2A &= ~((1<<WGM21) | (1<<WGM20));
  TCCR2B &= ~(1<<WGM22);

  /* Select clock source: internal I/O clock */
  ASSR &= ~(1<<AS2);

  /* Disable Compare Match A interrupt enable (only want overflow) */
  TIMSK2 &= ~(1<<OCIE2A);

  /* Now configure the prescaler to CPU clock divided by 128 */
  TCCR2B |= (1<<CS22)  | (1<<CS20); // Set bits
  TCCR2B &= ~(1<<CS21);             // Clear bit

  tcnt2 = 238;

  /* Finally load end enable the timer */
  TCNT2 = tcnt2;
  TIMSK2 |= (1<<TOIE2);
}


void setup()
{
  FailSafe();

  InitScreenRefreshInterrupt();
  
  //initialize SPI:  
  SPI.setBitOrder(LSBFIRST);
  SPI.begin();
  //Serial.begin(9600);
  pinMode(SHIFT_REGISTER_LATCH_SS, OUTPUT);
}

void loop()
{
  delay(1000);
  noInterrupts();
}

Is Timer2 hardware interfering with the SPI stuff (or vice versa) or am I simply doing something stupid?

First observation: you use digitalWrite() in your spi transfer function, which is called from your ISR. Compared to direct port manipulation, this function is at least 10x slower. Don't use it.

Set the hardware SPI unit to use the fastest transfer clock, which is F_CPU/2. This will greatly reduce the time spi.transfer blocks the ISR.

What makes you believe your timer2 overflow ISR runs at 8kHz? Setting TCNT2 to 238 just once doesn't do it.

16,000,000MHz / ( 256 * 128 ) = 488Hz

I suspect your ISR takes too long to execute, that is longer than about 2ms, and locks up your code. Increase the prescaler, so it has more time before it gets called the next time.

I thought 238 was fine because in my Timer2 sketch (the simplistic one that just flashes an LED) my Oscilloscope was showing a square wave, on for slightly over one HDiv, off for slightly over one HDiv when set to 0.1ms/HDiv. So if the entire screen width represents 1ms and I can see 8 edges then that must be 8kHz right? The exact frequency isn't critical just as long as it's between 7000 and 9000hz then I'm happy.

I'll google those other things you mentioned, thanks.

Well, you set 238 in your setup routine, but where do you set it to 238 AGAIN?

Once the timer has overflown, it starts at 0 and not 238.

Either add TCNT2=238 at the end of your ISR, or use CTC mode (WGM mode 2 for timer2). There you can set the top value of counter2 using the OCR2A register and it will count from 0 to top.

A Timer1 CTC looks something like this. You can adapt it to Timer2:

ISR(TIMER1_COMPA_vect)
{
 /* your code */
}

void setup_timer1_ctc(void)
{
	uint8_t _sreg = SREG;	/* save SREG */
	cli();			/* disable all interrupts while messing with the register setup */

	/* set prescaler to 1024 */
	TCCR1B |= (_BV(CS10) | _BV(CS12));
	TCCR1B &= ~(_BV(CS11));

	/* set WGM mode 4: CTC using OCR1A */
	TCCR1A &= ~(_BV(WGM11) | _BV(WGM10));
	TCCR1B |= _BV(WGM12);
	TCCR1B &= ~_BV(WGM13);

	/* normal operation - disconnect PWM pins */
	TCCR1A &= ~(_BV(COM1A1) | _BV(COM1A0) | _BV(COM1B1) | _BV(COM1B0));

	/* set top value for TCNT1 */
         /* the ISR is called, once TCNT1 reaches OCR1A */
         /* and TCNT1 is reset to 0 */
	OCR1A = 0x0050;

	/* enable COMPA isr */
	TIMSK |= _BV(OCIE1A);

	/* restore SREG with global interrupt flag */
	SREG = _sreg;
}