Unstable timing

The following code prints, over the Serial connection, the time difference between when the ISR runs and when the loop function runs. I would expect this delta to be low, but it can be up to a millisecond. Can someone explain why this is happening? BTW, this is test code. I wrote it to track down and isolate a problem I'm having with a bigger program. Right now, I'm only concerned with how it runs on the Uno board. Also, my tests have been on version 1.0.1 of the IDE.

void setup()
{
   Serial.begin(115200);
   pinMode(9, OUTPUT);   // for testing (toggling)

#if defined(USBCON)
   TCCR1A = 0;
   TCCR1A = _BV(COM1A0) | _BV(COM1B0) | _BV(COM1C0); // for testing (toggling)
   TCCR1B = _BV(WGM12) // CTC mode (TOP = OCR1A)
          | _BV(CS10); // clk/1
#else
   TCCR1B =
      0b01 << 3 | // WGM13:2 = 1, CTC (Top = OCR1A) (see TCCR1A)
      0b001;      // CS12:0  = 1, clkI/O/1 (No prescaling)

   TCCR1A =
      0b01 << 6 | // COM1A1:0 = 1, Toggle OC1A on Compare Match. for testing (toggling)
      0b00;       // WGM11:0 = 0, CTC (Top = OCR1A) (see TCCR1B)
#endif

   OCR1A  = 15999;       // 1 ms at 16 MHz (the CTC period is OCR1A + 1 clocks)
   TIMSK1 = _BV(OCIE1A); // enable interrupt
}

unsigned long max_delta = 0;
volatile unsigned long last_isr = 0;   // written in the ISR, read in loop()

#define BEGIN_CRITICAL_SECTION  \
{ unsigned char oldSREG = SREG; cli(); {
#define END_CRITICAL_SECTION    \
} SREG = oldSREG; }

void loop() {
  unsigned long delta;
  BEGIN_CRITICAL_SECTION
  delta = micros() - last_isr;
  END_CRITICAL_SECTION
  max_delta = max(max_delta, delta);  
  if (!(millis() % 1000)) {
     Serial.println(max_delta);
     max_delta = 0;
  }
}

// called once per mS
ISR(TIMER1_COMPA_vect) {
  last_isr = micros();
}

Let's say the timer interrupt is signaled during your critical section. The interrupt is then serviced just after END_CRITICAL_SECTION, and last_isr gets set.

Then let's say that millis() is a multiple of 1000. The micros() counter keeps counting through the serial output. max_delta gets set to 0.

The next time through loop() you calculate delta = micros() - last_isr; which comes out to about a millisecond because of the time spent on the serial output. max_delta takes that value and, since it is larger than the values from all the other passes through loop(), that maximum is kept until it is displayed the next time millis() is a multiple of 1000.
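To make that sequence concrete, here is a rough replay of it on a simulated clock. It is plain C++ for a desktop compiler, not Arduino code. The two durations are made-up assumptions, not measurements from an Uno, and replay_scenario and Replay are hypothetical names used only for this illustration.

```cpp
#include <algorithm>

// All durations are in microseconds and are assumptions, not measurements.
constexpr unsigned long kLoopOverheadUs = 8;   // hypothetical cost of reaching micros() again
constexpr unsigned long kSerialPrintUs  = 900; // hypothetical time spent in Serial.println()

struct Replay {
    unsigned long delta_after_print; // delta computed on the pass after the print
    unsigned long max_delta;         // value held until the next report
};

// Replays the steps described above using a simulated micros() clock.
Replay replay_scenario() {
    unsigned long now = 5000000;  // simulated micros(); millis() is a multiple of 1000 here
    unsigned long max_delta = 3;  // some small maximum collected on earlier passes

    // 1. The pending timer interrupt runs right after END_CRITICAL_SECTION.
    unsigned long last_isr = now;

    // 2. millis() % 1000 == 0, so loop() prints and then resets the maximum.
    now += kSerialPrintUs;        // the clock keeps running during the serial output
    max_delta = 0;

    // 3. Next pass through loop(): less than one timer period has elapsed,
    //    so the ISR has not updated last_isr again.
    now += kLoopOverheadUs;
    unsigned long delta = now - last_isr;
    max_delta = std::max(max_delta, delta);

    return Replay{delta, max_delta};
}
```

With these assumed numbers, replay_scenario().max_delta comes out as 908, so the value reported a second later is close to a full millisecond even though the passes through loop() that did not print were short.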

Thank you for the help.