Hi all, I've been doing some work on that recurring delay/timing issue many have been experiencing. (Including myself!)
Long story short - I've written a new delay function that seems to be very close to spot on. I think it'll need a bit of explaining, but the base code is below. In a nutshell, millisecond delays are fine, but microsecond delays may need increasing by 20% depending on your clock speed.
There are two new functions, "delay_us" and "delay_ms", and both take an unsigned integer for the time period. You can put the functions in your sketch file, or add them to "wiring.c" in the core13 directory.
A few issues to consider:
- "delay_ms" just reuses "delay_us", but note that the "ms_loops" variable can be 1200 rather than 1000. This is due to the ATTiny internal base frequency being 1.2/4.8/9.6MHz. It's only 1000 if using 1/2/4/8/16MHz. This is done automatically for millisecond delays, so nothing to worry about.
However, such an adjustment is NOT done inside "delay_us". It would require a *1.2 multiplier, and maths functions eat up memory. As a result, if you're running at 1.2/4.8/9.6MHz then the value passed to "delay_us" MUST be increased by 20% to get the correct delay period. So if you want a 50us delay, you actually need to call "delay_us(60);" (This is a minor annoyance, but I'm working on it...)
- The F_CPU parameters in "boards.txt" need to be the actual CPU frequency. The stock file has both 4.8MHz and 9.6MHz set to 1200000L, and this won't work. They need to be changed to 4800000L and 9600000L for these functions to work correctly. (Changing them may mess up the existing delay functions if you use those as well - but I haven't yet checked.)
- There are some slight overheads which make the delays very slightly longer than expected. This is just a few microseconds, so isn't a problem for long delays, but it starts creeping in below 100us. (Actually called as "120us" - see point 1.)
- The maximum delay in microseconds is 16383, the minimum is 4. Due to point 1, in practice this translates as 13650us if you're running at the awkward 1.2/4.8/9.6 clock speeds.
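As a possible stopgap for the +20% adjustment in point 1, here is a sketch only. When the delay is a compile-time constant, the x1.2 can be written as an integer expression that the compiler folds away, so no maths code should end up on the chip. The macro name US_SCALE is made up for this example, and the F_CPU default is only there so the snippet builds off-target.

```c
#include <stdint.h>

#ifndef F_CPU
#define F_CPU 9600000L /* assumption for this example only; normally set via boards.txt */
#endif

/* US_SCALE is a hypothetical helper, not part of the functions in this post. */
#if (F_CPU == 1200000L || F_CPU == 4800000L || F_CPU == 9600000L)
#define US_SCALE(us) ((uint16_t)(((uint32_t)(us) * 6UL) / 5UL)) /* x1.2 in integer maths */
#else
#define US_SCALE(us) ((uint16_t)(us)) /* 1/2/4/8/16MHz: no adjustment needed */
#endif

/* Intended use: delay_us(US_SCALE(50)); which passes 60 at 9.6MHz. */
```

If the argument is a variable rather than a constant, the multiply and divide happen at run time and the memory cost comes back, so this only helps for fixed delays.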
I've done a reasonable amount of testing with some ATTiny13As at 1.2MHz and 9.6MHz, and also with a standard Arduino (ATMega328) at 16MHz. At the higher clock speeds the timing is pretty accurate, but not quite so good at 1.2MHz. I think this is due to the overheads being longer in real-time at the slower clock speed.
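To put rough numbers on those overheads, here is a small model that can be run on a PC. It simply adds up the per-instruction clock counts: 4 clocks of setup, 4 clocks per inner-loop count, and 5 clocks per outer loop (two MOVs, SUBI and a taken BRNE, with SEI taking the place of the branch on the last pass). The function names are made up for this sketch, and it ignores the function call and the shift, so treat the results as estimates rather than measurements.

```c
#include <stdint.h>

/* Estimated clock cycles spent inside the asm block of delay_us().
   us    : value passed to delay_us(), must be 4 or more
   loops : the us_loops value for the clock (1, 2, 4, 8 or 16) */
uint32_t delay_us_cycles(uint16_t us, uint8_t loops)
{
    uint32_t count = us >> 2; /* inner-loop count, as in delay_us() */
    return 4UL + (uint32_t)loops * (4UL * count + 5UL);
}

/* Convert a cycle count to nanoseconds for a clock given in kHz. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t f_cpu_khz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000ULL) / f_cpu_khz);
}
```

On this model, delay_us(100) at 16MHz takes 1684 cycles, or about 105.25us, while delay_us(120) at 1.2MHz takes 129 cycles, or about 107.5us. That fits the pattern of the slower clock being further off.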
I'll try and fix the annoying +20% issue when I get some free time. I suspect it may require changing the number of clock cycles used by the loop.
Comments and feedback appreciated. Someone may also wish to check the code as it's my first attempt at assembly!
void delay_us(uint16_t us_delay) // Works OK between 10us and 16382us. NOTE: Send x1.2 for ATTiny @ 1.2/4.8/9.6MHz
{
    us_delay = us_delay >> 2; // the loops below count in units of 4, so divide by 4
    uint8_t us_low = us_delay & 255;
    uint8_t us_high = us_delay >> 8;
    uint8_t us_loops; // define the number of outer loops based on CPU speed (defined in boards.txt)
#if (F_CPU == 16000000L)
    us_loops = 16;
#elif (F_CPU == 8000000L || F_CPU == 9600000L)
    us_loops = 8;
#elif (F_CPU == 4000000L || F_CPU == 4800000L)
    us_loops = 4;
#elif (F_CPU == 2000000L)
    us_loops = 2;
#elif (F_CPU == 1000000L || F_CPU == 1200000L)
    us_loops = 1;
#else
#error This CPU frequency is not defined
#endif
    // total is 4 + us_loops x (4 x count + 5) clock cycles, where count = us_delay / 4
    // the "4 +" and "+ 5" terms are where the overheads occur
    // each clock cycle is 62.5ns @ 16MHz
    // each clock cycle is 833.3ns @ 1.2MHz
    asm volatile(
        "CLI\n"          // turn off interrupts : 1 clock
        "MOV r28,%0\n"   // Store low byte into register Y : 1 clock
        "MOV r29,%1\n"   // Store high byte into register Y : 1 clock
        "MOV r30,%2\n"   // Set number of loops into register Z : 1 clock
        // note branch labels MUST be numerical (ie. local) with BRNE 1b (ie. backwards)
        "1:\n"           // = 5 clock cycles of overhead for each outer loop
        "MOV r26,r28\n"  // Copy low byte into register X : 1 clock
        "MOV r27,r29\n"  // Copy high byte into register X : 1 clock
        "2:\n"           // = 4 clock cycles for each inner loop
        "SBIW r26,1\n"   // subtract one from word : 2 clocks
        "BRNE 2b\n"      // Branch back unless zero flag was set : 1 clock to test or 2 clocks when branching
        "NOP\n"          // add an extra cycle if not branching
        "SUBI r30,1\n"   // subtract one from loop counter : 1 clock
        "BRNE 1b\n"      // Branch back unless zero flag was set : 1 clock to test or 2 clocks when branching
        "SEI\n"          // turn on interrupts : 1 clock (adds extra clock cycle when not branching)
        : // no outputs
        : "r" (us_low), "r" (us_high), "r" (us_loops) // inputs
        : "r26", "r27", "r28", "r29", "r30" // registers overwritten above, so the compiler keeps clear of them
    );
}
void delay_ms(uint16_t ms_delay) // reuse delay_us routine
{
    uint16_t ms_loops = 1000; // define number for us cycles
#if (F_CPU == 1200000L || F_CPU == 4800000L || F_CPU == 9600000L)
    ms_loops = 1200; // Need to compensate for 1.2/4.8/9.6MHz
#endif
    // unsigned counter so it matches ms_delay and can't overflow above 32767
    for (uint16_t ms_loop = 0; ms_loop < ms_delay; ms_loop++) {
        delay_us(ms_loops);
    }
}
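On the minimum of 4 mentioned earlier: values below 4 shift down to a count of 0, and SBIW on a zero word wraps round to 65535 before BRNE tests it, so the inner loop would run 65536 times instead of not at all. Here is a minimal sketch of a guard. delay_us_count is a made-up helper name and not part of the code above.

```c
#include <stdint.h>

/* Hypothetical helper: the inner-loop count delay_us() would use,
   clamped so it can never be 0 (a count of 0 would wrap in SBIW). */
uint16_t delay_us_count(uint16_t us)
{
    uint16_t count = us >> 2;        /* same divide-by-4 as delay_us() */
    return (count == 0) ? 1 : count; /* treat 0..3 as the shortest delay */
}
```

The compare adds a few clocks of its own, so if something like this were folded into delay_us the overhead figures would shift slightly.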