Odd. It seems that no NOPs are needed after the loop for it to work properly.
I was worried my register access using a temp variable might be optimized out so I checked the assembly, and sure enough it's there:
tmp = SPDR;
6c6: 8e b5 in r24, 0x2e ; 46
And when I read SPSR after that, but before I re-enabled interrupts (waiting until after that to output my serial debug data), the correct bit was 0 as it should be. (And my 2x speed bit was set to 1 as it also should be.)
So, that's odd. I know the instructions to put the value into SPSR add a few clocks, but it seems like it would be one short.
SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
6c0: 8d b5 in r24, 0x2d ; 45
6c2: 8f 77 andi r24, 0x7F ; 127
6c4: 8d bd out 0x2d, r24 ; 45
tmp = SPDR;
6c6: 8e b5 in r24, 0x2e ; 46
But regardless of the reason, it seems to work fine, so final version of the loop I guess:
/*
This function shifts all the LED data out to the TLC5947's.
*/
#define NOP __asm__ __volatile__ ("nop\n\t");
#define WAIT NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP NOP // 11 NOPs
void updateLEDs() {
const int LEDFrameTime = 1000 / 30; // The number of milliseconds between each update of the LEDs. 1000 / 24 = 24fps.
static unsigned long nextLEDFrameTime = 0;
byte tmp;
byte *thisLED;
byte *lastLED;
if (time >= nextLEDFrameTime) { // If it is time for an LED update...
//unsigned int time = micros();
thisLED = &led::data[led::modules*36]; // The first operation is to decrement this pointer, so the first time we write thisLED it will be array index LEDMODULES*36-1
lastLED = &led::data[0];
cli(); // Halt interrupts.
do {
SPDR = *--thisLED; WAIT;
} while (thisLED != lastLED); // thisLED is decremented one last time before we hit the end of the loop, so after byte 0 transfers, the loop exits.
SPSR = SPSR & ~_BV(SPIF); // Clear transfer flag.
tmp = SPDR;
// Move data from shift register into latch register:
LATPORT &= ~(1 << LATPIN); // Set LATCH pin low. It is important to do this first, just before transitioning high! I don't know why, but the LED modules won't work right otherwise!
LATPORT |= (1 << LATPIN); // Set LATCH pin high. Transitioning from low to high moves all data from the shift register into the latch registers.
ENAPORT |= (1 << ENAPIN); // Set BLANK pin high. Turn all outputs off, and reset grayscale timer. Blanking reduces flicker, but it does not seem to matter if it is before, after, or in the middle of latching.
ENAPORT &= ~(1 << ENAPIN); // Set BLANK pin low. Turn all outputs on.
sei(); // Reenable interrupts.
//time = micros() - time;
//Serial1.println(time,DEC);
nextLEDFrameTime = time + LEDFrameTime; // Set the time of the next update.
}
}