Hi all,
I've just discovered a strange problem that I can't seem to solve.
I've got an Arduino MEGA2560 with a 22.1184 MHz crystal, which works just fine.
However, any kind of delay that I try to use is off. I tried these three:
__builtin_avr_delay_cycles (22118400UL);
_delay_ms (1000);
delay (1000);
All three of those should give me a 1 second delay. However, if I do this:
uint32_t tm = micros(); // get current microsecond value
delay (1000); // any one of the three delays
tm = micros() - tm; // end minus start = time
Serial.print (tm / 1000); // print it as milliseconds
I get values around 696 (which is VERY close to the expected 1000 divided by (22.1184 / 16))! In other words, it seems as though the delay code thinks the clock is at 16 MHz (which it should not, because F_CPU is set properly).
The Arduino delays account for F_CPU (which is, of course, 22118400UL), and the AVR built-in __builtin_avr_delay_cycles() simply delays one CPU cycle per count, so calling it with 22118400 should give me a 1 second delay.
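(To rule out a bad board definition, a compile-time guard like this would catch a wrong F_CPU at build time; it's just a plain preprocessor check, nothing core-specific:)

#if F_CPU != 22118400UL
#error "F_CPU is not 22118400UL -- check boards.txt / the board definition"
#endif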
Trying the exact same test code on a 16 MHz MEGA2560 yields a result of 1007 (which is the expected 1000 with a touch of overhead).
Anyone have an idea why I should be seeing this kind of error? (and please no "you shouldn't be running at 22 MHz that's your problem" kind of baloney because that's not the problem and not the answer).
Thanks!
Have you modified the core (specifically the timer 0 overflow handler)?
When I test frequency, I compare with GPS PPS captured with a simple ISR on an interrupt pin. It removes a lot of doubt and guesswork about overhead.
No, I have not modified the core. And if anything, the plain "__builtin_avr_delay_cycles()" function should yield the correct result, even if there were something wrong with the core functions.
I've been thinking of testing this with a huge bunch of NOPs, but I doubt I can compile 22.1 million of them, and putting a few inside a loop will surely be inaccurate due to the loop-counting overhead.
I might try it in ASM... I just need to see that 22.1 million CPU cycles do indeed take 1 second to complete.
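Actually, a simpler check along those lines (just a sketch; pin 13 is an assumption, any pin visible on the scope would do) is to toggle a pin around the cycle delay and measure the period on the scope, which takes micros() out of the picture entirely:

void setup() {
  pinMode(13, OUTPUT);
}

void loop() {
  digitalWrite(13, HIGH);
  __builtin_avr_delay_cycles(22118400UL);  // 1 second of CPU cycles at 22.1184 MHz
  digitalWrite(13, LOW);
  __builtin_avr_delay_cycles(22118400UL);
  // the scope should show very close to a 0.5 Hz square wave
  // if the crystal really is 22.1184 MHz
}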
(By the way, I wondered if the crystal was wrong. I tuned it in on my shortwave radio and heard a carrier at 22.1184 MHz, and at 30576480 Hz (which is 22.1184 MHz times 22.1184/16) there is nothing, so I am 100% sure the crystal is right.)
aarg:
When I test frequency, I compare with GPS PPS captured with a simple ISR on an interrupt pin. It removes a lot of doubt and guesswork about overhead.
I'm not sure how this applies to my question, but it DOES give me an idea...
I'll set up a timer to just toggle at a known rate and see what I actually get. If I set a timer for, say, 1000 Hz and look at the pin on my scope, it should read right around 1000. If I see 999 or 1001, that's obviously fine. But if I see something like 696, then we've got a problem. Here's roughly what I have in mind:
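(A sketch, assuming Timer1 in CTC mode on the MEGA2560. Note that 22118400 / (2 * 8 * 1000) is not an integer, so the output lands at about 1000.3 Hz rather than exactly 1000:)

void setup() {
  pinMode(11, OUTPUT);               // OC1A is digital pin 11 on a MEGA2560
  TCCR1A = _BV(COM1A0);              // toggle OC1A on each compare match
  TCCR1B = _BV(WGM12) | _BV(CS11);   // CTC mode (TOP = OCR1A), prescaler 8
  OCR1A = F_CPU / (2UL * 8UL * 1000UL) - 1;  // 1381: f = F_CPU / (2 * 8 * 1382)
}

void loop() { }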
As far as my original test, I'm sure that reading millis() or micros(), then doing a delay(1000), then reading again should give me CLOSE to 1000 ms (or 1000000 us) plus a little for overhead.
On a 16 MHz Arduino, I expect 1000 and get 1007 (exactly correct as far as I'm concerned). But on a 22.1 MHz Arduino, I get around 696 which is obviously wrong... and curiously off by a factor of "22.1 crystal / 16 crystal".
Don't need GPS-accurate timing for things like this...
micros() is wrong at non-standard clock speeds; you need to modify the micros() implementation to get correct results. (Note that millis() is fine.)
See ATTinyCore/wiring.c on the "OldMaster---DO-NOT-SUBMIT-PRs-here-Use-2.0.0-dev" branch of SpenceKonde/ATTinyCore on GitHub; the fun starts at line 218.
The default implementation just integer-divides the millis timer's prescaler by the clock cycles per microsecond (derived from the clock speed). This works fine where the millis timer prescaler (normally 64) divides evenly by the clock cycles per microsecond, so for 8 and 16 MHz it works great.
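(For reference, this is roughly the stock arithmetic, paraphrased from Arduino.h and wiring.c; the exact code varies by core version:)

// paraphrased from the stock core:
#define clockCyclesPerMicrosecond() ( F_CPU / 1000000L )  // 22118400 -> 22 (truncated)
// micros() then scales raw Timer0 counts by (64 / clockCyclesPerMicrosecond()):
// at 16 MHz that is 64/16 = 4, exact; at 22 it truncates to 64/22 = 2,
// where the true factor is 64/22.1184 = 2.894 microseconds per tick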
But what do you do for other cases? You can't do division with longs, because that's so slow that it takes a significant time to run (that's my fallback case at the bottom, though). So what I did was use inexpensive bitshift operations to get the right number of cycles for the cases where crystals of a given speed are readily available.
You can see at the bottom all the other things I tried there, commented out.
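(To illustrate the idea, though this is not DrAzzy's actual code: for 22.1184 MHz the true ticks-to-microseconds factor is 64/22.1184 = 2.8935, and 2 + 1 - 1/8 + 1/64 = 2.890625 gets within about 0.1% of it using only shifts and adds:)

// hypothetical shift-only ticks -> microseconds conversion for F_CPU = 22118400
unsigned long ticksToMicros(unsigned long t) {
  return (t << 1) + t - (t >> 3) + (t >> 6);  // t * 2.890625 vs. the true 2.893518...
}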
krupski:
No I have not modified the core.
Which means that micros will not be correct, millis will not be correct, and they are likely not consistent (meaning that 1000 micros counts may not equal one millis count).
And if anything, the plain "__builtin_avr_delay_cycles()" function should yield the correct result...
No, it will not. It does not account for time consumed by interrupt service routines. As long as the timer 0 overflow interrupt handler is allowed to run the actual time delayed will be longer. If I remember correctly the actual delay is about 0.6% longer. Ditto for _delay_ms.
I've been thinking of testing this with a huge bunch of NOPs, but I doubt I can compile 22.1 million of them...
That won't accomplish anything. __builtin_avr_delay_cycles should be within two NOPs of the precise number of clock cycles.
By the way, I wondered if the crystal was wrong.
The way you are measuring the actual crystal frequency won't make a difference. You are using a single clock source for both the delay and the measurement; when one is wrong, the other will be off by the same amount, canceling the error.
The problem is not the clock. The problem is millis / micros was designed to work correctly with a specific short list of frequencies (1, 2, 4, 8, 16, 32 MHz). Other frequencies, as you have found, can give surprisingly inaccurate behaviour.
A version for 20 MHz that will hopefully give you a sense of the issue and one potential solution...
http://forum.arduino.cc/index.php?topic=70475.0
There are very few values of F_CPU which will work correctly with the Arduino AVR Core. We know that 8 and 16 MHz are fully supported. 20 MHz is very close to correct and is mostly supported. Other values will be off by a factor that is hard to calculate.
Timer0 uses a prescale of 64, so on a 16 MHz machine it runs at 250 kHz. It is set up to interrupt every 256 ticks, so the interrupt rate is 976.5625 Hz, or 1024 microseconds per interrupt. That is split into 1 millisecond and 24 microseconds. To keep track of the extra 24/1000ths of a millisecond, both sides are divided by 8 to get 3/125ths. On each interrupt, 3 is added to a counter; when it reaches 125, one is added to the millisecond count and 125 is subtracted from the fraction counter. The divide by 8 is used to fit the fraction accumulator into a byte.
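(Paraphrasing the stock wiring.c bookkeeping, for anyone following along; details vary by core version:)

#include <avr/interrupt.h>

volatile unsigned long timer0_millis = 0, timer0_overflow_count = 0;
volatile unsigned char timer0_fract = 0;

#define MICROSECONDS_PER_TIMER0_OVERFLOW (clockCyclesToMicroseconds(64 * 256))
#define MILLIS_INC (MICROSECONDS_PER_TIMER0_OVERFLOW / 1000)        // 1 at 16 MHz
#define FRACT_INC ((MICROSECONDS_PER_TIMER0_OVERFLOW % 1000) >> 3)  // 24 >> 3 = 3
#define FRACT_MAX (1000 >> 3)                                       // 125

ISR(TIMER0_OVF_vect) {
  timer0_millis += MILLIS_INC;      // whole milliseconds per overflow
  timer0_fract  += FRACT_INC;       // plus 3/125ths of a millisecond
  if (timer0_fract >= FRACT_MAX) {  // carry the accumulated fraction
    timer0_fract -= FRACT_MAX;
    timer0_millis++;
  }
  timer0_overflow_count++;
}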
If the system clock is 22118400 Hz, then after the 64 prescale the timer clock rate is 345600 Hz. It is set up to interrupt every 256 ticks, so the interrupt rate is 1350 Hz, or 740.740740... microseconds per interrupt. Right there you have a truncation error. That is split into 0 milliseconds and 740 microseconds. To keep track of the extra 740/1000ths of a millisecond, both sides are divided by 8 to get 92.5/125ths. On each interrupt, 92 (not 92.5, another roundoff error) is added to the counter, with the same carry at 125 as before. Adding 92*8 microseconds 1350 times per second gets you 993600 microseconds per second, so the two truncations result in a clock about 0.64% slow.
Note that micros(), delay() and delayMicroseconds() make different use of the clock. The micros() function uses a macro called 'clockCyclesPerMicrosecond' which comes out to 22 after truncation. The delay() function uses micros(), so it has the same truncation error. The delayMicroseconds() function uses the same code for clock rates from 20 MHz to 24 MHz which, I believe, delays one instruction cycle for each 1/5th of a microsecond; that will make delayMicroseconds() come out about 10% short.
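(To tie that to the 696 reading, assuming the classic core's micros() implementation; newer cores differ:)

// paraphrased: micros() scales Timer0 ticks like this:
//   return ((overflows << 8) + TCNT0) * (64 / clockCyclesPerMicrosecond());
// with F_CPU = 22118400, clockCyclesPerMicrosecond() truncates to 22, and the
// integer division 64/22 gives 2, where a 64-prescaled tick really lasts
// 64/22.1184 = 2.894 us. So micros() advances at 2/2.894 = 69.1% of real time,
// a true 1-second busy-wait reads as about 691 ms, and the ~0.6% that the
// Timer0 ISR stretches _delay_ms() lands right around the observed 696.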
Thank you DrAzzy, Coding Badly and johnwasser. I understand the problem now.
I need to think of a clever way to solve the problem. Hopefully, the error caused by non-standard clock speeds is a linear function, so that I can take the requested delay, run it through a "y = mx + b" correction, then do the delay. But I'm sure it won't be that easy.
Simply doing a correction for 22.1184 is a kludge since it won't correct any other non-standard clock.
I'll work on it and post the results...
30-some years ago I had a frequency counter (a kludgy thing made of all TTL stuff). It had an 8 MHz crystal and a trimmer cap to fine-adjust it.
Being clever, I thought I would trim the crystal and get it exactly right on, so I connected the counter input to the oscillator buffer output. It read 8000.00 kHz. I thought "wow that's pretty darn good" and just to see if it would change, I turned the trimmer. The reading stayed exactly the same.
Then it dawned on me that I was seeing the frequency count based on the clock and ANY crystal frequency would show 8000.00 kHz.
I couldn't zero-beat it against WWV... they don't transmit 8 MHz. I had to borrow a friend's counter and set the trimmer correctly (hoping that the other counter was close in calibration).
Arduino isn't the first time I've made this mistake! 
krupski:
Thank you DrAzzy, Coding Badly and johnwasser. I understand the problem now.
You are welcome.
Hopefully, the error caused by non-standard clock speeds is a linear function...
The micros error will be linear until the overflow interrupt.
If you call micros twice in quick succession and the interrupt did not occur, the error is linear.
If you call micros twice and the interrupt did occur, the error is linear plus a constant. There is a step error across each interrupt.
The way I handle such things (e.g. fiscal calendars are like that) is to calculate the number of interrupts that occurred between the two readings (or since the device was powered), then multiply that count by the per-interrupt error, and add that to the simple linear error. A sketch of the idea follows.
If I remember correctly this raises the possibility of a negative elapsed time across interrupts.
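(A sketch of that correction with made-up names; SLOPE and STEP_ERROR_US are hypothetical placeholders you would have to work out for your particular clock:)

// hypothetical post-correction of an elapsed-micros measurement
unsigned long correctedElapsedMicros(unsigned long elapsedRaw,
                                     unsigned long overflowDelta) {
  const float SLOPE = 1.0f;         // placeholder: linear factor for your clock
  const float STEP_ERROR_US = 0.0f; // placeholder: error added at each overflow
  return (unsigned long)(elapsedRaw * SLOPE + overflowDelta * STEP_ERROR_US);
}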
(Reply #9: good story.)
Nice! It looks like you can use the code from my link for millis.
This should cover your crystal with zero error...
#elif (F_CPU == 22118400L)
#define MILLIS_INC (0)    // each overflow is only 740.74 us, no whole millisecond
#define FRACT_INC (20)    // 740.740740... us is exactly 20/27 of a millisecond
#define FRACT_MAX (27)    // carry a millisecond whenever the fraction reaches 27
typedef uint8_t timer0_fract_t;
#endif
20/27ths of a millisecond per timer overflow comes out exactly, so this will work for millis().
micros() will be a little off because it calculates timer clock cycles and then divides by 22 instead of 22.1184 to get time in microseconds.
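(If it helps, 64/22.1184 reduces exactly to 625/216, so the scaling step in micros() could be patched with an exact ratio. This is my sketch, and the multiply needs a 64-bit intermediate or it overflows after roughly 20 seconds' worth of ticks:)

#include <stdint.h>

// exact ticks -> microseconds for F_CPU = 22118400 (64/22.1184 == 625/216)
unsigned long ticksToMicrosExact(unsigned long ticks) {
  return (unsigned long)(((uint64_t)ticks * 625) / 216);
}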
delay() uses micros() so it will also be off.
delayMicroseconds() will use the 20 MHz version.