Improving millis()

I've been looking at the implementation of the millis() function and, unless I've missed something, I've noticed that it does not account for the possibility of a timer0 overflow interrupt occuring in the middle of the retrieval of timer0_overflow_count. Since this variable is a long, accessing it requires a non-atomic operation that becomes interruptable.

For example, imagine you call millis() when timer0_overflow_count is 0x0000FFFF, and in the middle of retrieving timer0_overflow_count the timer0 overflow interrupt occurs. If the interrupt happens after you've loaded the two highest bytes but before you've loaded the two lowest bytes, millis() will return 0.

The odds of this happening are quite low, so this bug would manifest itself as an exceedingly rare and almost impossible-to-duplicate run-time error in code that relies on millis() for timing. An example of a simple fix for this would be:

// new global variable to flag when an overflow has occurred
unsigned char timer0_overflow_flag;

SIGNAL(SIG_OVERFLOW0)
{
timer0_overflow_count++;
timer0_overflow_flag = 1; // flag the overflow
}

unsigned long millis()
{
unsigned long count;
do
{
timer0_overflow_flag = 0; // clear the overflow flag
count = timer0_overflow_count; // read timer0_overflow_counter
}
while (timer0_overflow_flag); // if an overflow occurred during the read, try again

return count * 64UL * 2UL / (F_CPU / 128000UL);
}

You should use a similar construction if you ever set timer0_overflow_flag equal to a value since an interrupt in the middle of the setting process can cause problems.

  • Ben

Hi Ben, thats a neat way of handling the problem without disabling interrupts but I think you need to declare the timer0_overflow_flag volatile or the compiler will optimise out the test to see if its set;

// new global variable to flag when an overflow has occurred
volatile unsigned char timer0_overflow_flag;   //  <-  this needs to be declared volatile

SIGNAL(SIG_OVERFLOW0)
{
    timer0_overflow_count++;
    timer0_overflow_flag = 1;  // flag the overflow
}

unsigned long millis()
{
    unsigned long count;
    do
    {
        timer0_overflow_flag = 0;  // clear the overflow flag
        count = timer0_overflow_count;  // read timer0_overflow_counter
    }
    while (timer0_overflow_flag);  // if an overflow occurred during the read, try again

    return count * 64UL * 2UL / (F_CPU / 128000UL);
}

Hi Ben, thats a neat way of handling the problem without disabling interrupts but I think you need to declare the timer0_overflow_flag volatile or the compiler will optimise out the test to see if its set;

Very true, thanks for pointing that out!

  • Ben

We've implemented something similar in SVN, which will be in Arduino 0012. From wiring.c:

volatile unsigned long timer0_clock_cycles = 0;
volatile unsigned long timer0_millis = 0;

SIGNAL(SIG_OVERFLOW0)
{
        // timer 0 prescale factor is 64 and the timer overflows at 256
        timer0_clock_cycles += 64UL * 256UL;
        while (timer0_clock_cycles > clockCyclesPerMicrosecond() * 1000UL) {
                timer0_clock_cycles -= clockCyclesPerMicrosecond() * 1000UL;
                timer0_millis++;
        }
}

unsigned long millis()
{
        unsigned long m;
        uint8_t oldSREG = SREG;
        
        cli();
        m = timer0_millis;
        SREG = oldSREG;
        
        return m;
}

How's that seem? Do you think it's better to use the overflow flag as in your code instead of temporarily disabling interrupts?

Thanks for the detailed response, Mellis. I have a few comments about these changes (one major gripe and two minor gripes):

  1. You've increased the time spent in the time overflow ISR by a tremendous factor, especially when you note that every operation is dealing with four-byte longs. I like that in 0011 you keep the ISR as short as possible and do the time-consuming computation in millis(), which is called only when the user needs it. By moving the computations to the ISR, you're forcing the CPU to evaluate them every timer overflow and occuping valuable CPU processing time. This could become especially devastating if someone runs timer0 with a lower prescaler, such as 8 or 1, and still wants to catch timer overflows. Additionally, spending a lot of time in the timer0 overflow interrupt means that other more important interrupts are potentially delayed, which could hurt certain applications.

My advice is to stick to the principle of "get in and get out" when dealing with interrupts. Let the interrupt set up the calculation, but actually perform the calculation outside the interrupt if at all possible.

I'm currently writing a motor-control Arduino library for the Orangutan and Baby Orangutan robot controllers. These use timer0 and timer2 to generate the PWMs that control the motor drivers, but those timers need to run using a prescaler of 8 if they are to produce a 10 kHz PWM. I had intended to provide my own version of millis() as part of the library that would account for the faster timer0 clock, but I think with the above changes I would just have to disable timer0 interrupts altogether to keep the CPU from getting overly monopolized by the interrupt. With a prescaler of 8, the interrupt will happen every 2048 clock cycles, and I estimate that the current (0011) interrupt probably requires around 51 clock cycles, so I lose around 2.5% of my CPU processing time. Your new ISR could be anywhere from twice to maybe five times that, depending on whether clockCyclesPerMicrosecond() is a macro or an actual function call and depending on how many times the loop iterates.

  1. This is a pretty minor gripe, but I don't like disabling global interrupts because someone else might be relying on them for some extremely time-critical application. I think the more friendly approach would be to only disable the interrupt that could potentially cause the problem:

uint8_t temp = TIMSK0; // store current state of TIMSK0
TIMSK0 &= ~(1 << TIOE0); // disable timer0 overflow interrupts
m = timer0_millis;
TIMSK = temp; // restore TIMSK0 to previous state (i.e. enable interrupts if they were enabled)

On the other hand, globally disabling interrupts for 9 cycles probably isn't going to be too much of a hardship on people given that they already have to deal with having interrupts disabled for much longer time periods by the timer0 overflow interrupt itself. Either way, I think this is a fine equivalent to my suggested overflow-flag approach and it has the added benefit of being harder to screw up (e.g. forgetting to make the flag volatile) and easier to understand.

  1. In your proposed new construction, you probably no longer need to declare timer0_clock_cycles as volatile (assuming it's only used inside the timer0 overflow ISR). Any compiler optimizations will be valid since there's no chance of an ISR changing its value as you're using it. By keeping it volatile, you're forcing the ISR to load all four bytes from RAM (8 cycles) and save all four bytes to RAM (8 cycles) every time you do something like

timer0_clock_cycles -= something;

  • Ben

We wanted to eliminate the rollover in the millis() values, and increasing the work done in the interrupt seemed like the easiest way to do that. Changing the timer 0 prescale factor or overflow frequency isn't "supported" functionality and most people never do it, but they do run into the millis() overflow problem. Do you want to try eliminating the overflow by improving the millis() function and leaving the interrupt handler as it was? I'd love to do it that way, but don't really have time to write the code myself.

Sure, I'll see if I can come up with something that meets with your approval. Also, the millis() overflow doesn't have to be much of a problem if you do your timing using differences rather than absolute numbers. For example, if I want to see if one hour has passed since the last "event" (i.e. the last time I checked and stored millis()), I can do it in two different ways:

if (millis() >= lastEventMillis + 3600000UL)
do something;

if (millis() - lastEventMillis >= 3600000UL)
do something;

Technique number 1 will fail if millis() has overflowed, but technique number 2 will always work. The only problem I can see that arises from the millis() overflow is that you can't time durations longer than 9 hours, but how often does something like this come up?

  • Ben

That's what I thought, too, the first few times people brought up the overflow problem. Unfortunately, that doesn't work, because millis() doesn't overflow on an even data size boundary, but instead as the result of an intermediate calculation. So the workarounds get uglier.

The following doesn't solve the millis problem but it does avoid overflows for time intervals measured in seconds or more, and it also adds a useful real time clock capability.

A few lines of code added to the overflow counter in wiring.c can provide an accurate count of seconds that doesn't overflow. It uses the Unix like unsigned long seconds count since Jan 1 1970, so it does actuallly overflow in 2038, but this is unlikely to be an inconvenience.

volatile unsigned long timer0_overflow_count; // the timer overflow counter
volatile byte timer0_overflow_flag; // Bens' flag to indicate timer overflow

volatile unsigned long sysTime;   // seconds since midnight Jan 1 1970
  // two counters used to determine 976.5625 counts without using time consuming math
volatile static unsigned int overflowCounter;
volatile static byte fracOverflows;       

SIGNAL(SIG_OVERFLOW0)
{
  timer0_overflow_count++;

  timer0_overflow_flag = 1;  // flag the overflow – Bens idea :>)
  // code added below increments Time counter every 976.5625 overflows (i.e. exactly once per second)
  ++fracOverflows;      // this counter tells us fractional overflows without using floating point   
  if( ++overflowCounter >= 976) {       
    if( (fracOverflows >= 144)||(overflowCounter >= 977) )  // 144 divided by 256 equals 0.5625 
    {
      overflowCounter = 0; 
      sysTime++;  // increment the seconds counter
    }
  }
}

// functions to get and set the system time
unsigned long getTime() 
{ 
  unsigned long _time;
  // return the system time
  do
  { 
    timer0_overflow_flag = 0;  // clear the overflow flag 
    _time = sysTime;
  } 
  while (timer0_overflow_flag);  // if an overflow occurred during the read, try again  
  return _time; 
} 

void setTime(unsigned long time)
{
  // set the system time to the given time value (as seconds since Jan 1 1970)
  cli();             //disable interrupts
  sysTime = time;
  sei();             // enable interrupts    
}

This approach allows the addition and subtraction of time (in seconds) without overflow problems. It also makes it easy to get clock time information in an arduino sketch. I have a library for managing clock time that would be useful if you wanted to adopt this approach, here are some of the methods:

void Sync(time_t time); // sync clock (typically uses messages on the serial port)
time_t now(); // return the current time as seconds since Jan 1 1970
boolean available(); //Refreshes the time properties below, returns true if clock is synched.

// read only time properties.
// These are accurate to the nearest second of the most recent call to available();
byte Hour;
byte Minute;
byte Second;
byte Day;
byte DayofWeek;
byte Month; // Jan is month 0
byte Year; // year minus 1900

I would be happy for any of this to be included in a future Arduino release if it would be useful.

That's what I thought, too, the first few times people brought up the overflow problem. Unfortunately, that doesn't work, because millis() doesn't overflow on an even data size boundary, but instead as the result of an intermediate calculation. So the workarounds get uglier.

Hmm, I was hoping things would just happen to work out with the overflow, but I guess I should have actually tried to work out the details. In that case, unless there are details I'm still neglecting, you could just force a timer0_overflow_count on an even data size boundary. For example:

SIGNAL(SIG_OVERFLOW0)
{
unsigned long temp = timer0_overflow_count;
temp++;
if (temp == 33554432UL) // if (temp == 0x10000 / 128)
temp = 0;
timer0_overflow_count = temp;
}

I believe this would allow differential timing to work beyond the overflow while adding fewer than 10 cycles to the ISR.

  • Ben

The following doesn't solve the millis problem but it does avoid overflows for time intervals measured in seconds or more, and it also adds a useful real time clock capability.

I like it, especially since it doesn't bog down the ISR with too many extra calculations.

void setTime(unsigned long time)
{
// set the system time to the given time value (as seconds since Jan 1 1970)
cli(); //disable interrupts
sysTime = time;
sei(); // enable interrupts
}

The version of this function that relies on the overflow flag would be:

void setTime(unsigned long time)
{
  do
  {
    timer0_overflow_flag = 0;
    sysTime = time;
  }
  while (timer0_overflow_flag);
}

But it might just be simpler to either disable interrupts or disable timer0 overflow interrupts for the duration of the write.

  • Ben

Hi Ben,

If millis() overflows cleanly then I can increment the time outside the ISR

time_t DateTimeClass::now()  // return the current time
{
  while( millis() - prevMillis >= 1000){
    this->sysTime++;
    this->prevMillis += 1000;
  }
  return sysTime;
}

void DateTimeClass::setTime(time_t time)
{

  this->sysTime = time;  
  this->prevMillis = millis();
}

But I would much prefer if the timer0 overflow interrupt incremented sysTime as per my earlier post, this eliminate the extra four bytes for the prevMillis counter.

BTW, i would expect that setTime would be called infrequently (typically once a day, perhaps once an hour at most) so disabling interrupts for that call should not be a problem.

There is a thread on the DateTime library here: http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1211215328

bens: are you sure that will make the return values of millis() overflow on an even boundary? The problem is that there's a division at the end of the millis() calculation, so an overflow at a nice variable size boundary in the intermediate calculation ends up as a weird overflow in the final result. Again, the priority is making millis() work well, not necessarily optimizing the ISR.

bens: are you sure that will make the return values of millis() overflow on an even boundary? The problem is that there's a division at the end of the millis() calculation, so an overflow at a nice variable size boundary in the intermediate calculation ends up as a weird overflow in the final result. Again, the priority is making millis() work well, not necessarily optimizing the ISR.

I understand the priority, I'm just looking for a solution that would accomplish both.

If you change the overflow-compare number to 33554375, it will overflow on an even boundary if F_CPU is 16 MHz (since this number is divisible by 125). If F_CPU is not 16 MHz, the divisor is a truncated float and millis won't work right anyway.

Just out of curiosity, have you ever considered using timer1 as your millis() generator, or is timer1 being used for something else? You could have init() configure timer1 to overflow every millisecond and then you wouldn't have to do any calculations at all. For example:

// configure timer1 for CTC mode, TOP is OCR1A, clock source is I/O clock (F_CPU)
TCCR1A = 0;
TCCR1B = 0x09;
OCR1A = F_CPU / 1000;  // set TOP so TCNT1 overflows every 1ms
TIMSK1 |= 1 << TOIE1;  // enable timer1 overflow interrupt

The two plus sides to this:

  1. It will work for pretty much any F_CPU (so long as it's a multiple of 1000).
  2. The ISR simply updates a millisecond counter (unsigned long) that can time for 49.7 days without overflowing, and when it does overflow, there are no uneven data boundary issues.

You'd still want to implement a getMillis() and setMillis() function to ensure uninterrupted reads and writes of the millisecond counter, though. Just a thought.

  • Ben

Just out of curiosity, have you ever considered using timer1 as your millis() generator, or is timer1 being used for something else?

  • Ben

I would not be happy if timer1 was lost to millis. Its too useful for other things like input capture and precision timing. Half of my Arduino code would break :frowning:

bens: I'd like to find a solution that allows both too.

I'm a bit confused about your function:

SIGNAL(SIG_OVERFLOW0)
{
unsigned long temp = timer0_overflow_count;
temp++;
if (temp == 33554432UL) // if (temp == 0x10000 / 128)
temp = 0;
timer0_overflow_count = temp;
}

Would millis() be the same as it is now (in Arduino 0011)? In that case, wouldn't the value return from millis() go from 34359737 to 0? Or if we used 33554375, wouldn't millis() go from 34359678 to 0? Either way, you can't just do millis() - lastEventMillis to measure an interval.

Or am I missing something?

I was also thinking about using timer 1 for the millis(), since, as you point out, it's (relatively) easy to make it accurate. Unfortunately, there do seem to be a lot of other uses for the timer (like the Servo library). It seems better to take over timer 0 than timer 1.

bens: I'd like to find a solution that allows both too.

I'm a bit confused about your function:

You're right to be confused: I'm being a moron. I was focusing solely on the fact that the overflow was happening somewhere other than on an even millisecond boundary, thereby causing that overflowed millisecond to have an erroneous duration. It would seem my function completely neglects the much more significant problem of the overflow not occuring at 0xFFFFFFF. I'm beginning to understand now how the workarounds just become messier and messier. I keep feeling like there should be some supremely elegant way of doing this well, but maybe that's not the case.

  • Ben

Hey, what about just giving the user a resetMillis() function that would set timer0_overflow_count to zero? This way, the user can accomplish differential timing by resetting the overflow count and then waiting for the desired number of milliseconds to ellapse. This lets the user keep himself safely away from the region where the overflow counter overflows.

Are there any down sides to this that I'm not taking into account?

  • Ben

I did a timing measurement on the difference in performance with the new millisecond ISR code, here are the results:

Arduino 011 version interrupt takes around 4.5us including saving and restoring registers. Adding my code to increment the seconds counters adds 1.5us

David's proposed 012 code as posted here takes 9us. Half that time is in the while loop.

The timings were done using an HP 16500C logic analyzer.

My view is that using 1% of the processing power to service the millisecond interrupt is ok unless someone can come up with a more efficient solution.

mem: Thank you for doing that measurement. It's much easier to make these kinds of decisions with good data behind them.

I think it's okay to double the ISR time in order to keep millis() from overflowing until it hits the limits of an unsigned long. I'd love to have a better solution (and I think there probably is one), but I think this is reasonable. Feel free to take a shot at it, though.

mem: do we need to add code to the ISR to do the seconds calculations? Or can you just base your functions / library on millis()?

bens: resetting the millis() counter seems like an extra complication that most people shouldn't need to worry about. If you really want to do it, you can always set the internal variable directly, but that's more of a hack.