ATtiny 1 MHz workaround for incorrect delayMicroseconds()

In case anyone is interested …

For some reason best known to itself, this code works properly on an ATtiny at 1 MHz and gives a delay of 8 seconds:

    unsigned long startMicros = micros();
    while (micros() - startMicros < 8000000) {
    }

BUT … delayMicroseconds(8000000) gives a strange result that isn’t quite 1/8th of 8 seconds.

This problem has been identified elsewhere and reported as a bug but I have seen no sign of a solution.

The following workaround seems to work as an alternative to delayMicroseconds() and should work at any CPU speed on any device:

    void delayUsecs(unsigned long waitUsecs) {
        unsigned long startUsecs = micros();
        while (micros() - startUsecs < waitUsecs) {
            // busy-wait until the requested time has elapsed
        }
    }
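One reason the subtraction form in this loop is robust: unsigned arithmetic wraps modulo 2^32, so the difference between two readings is still the elapsed time even if the counter rolled over in between. The following is only a host-side sketch of that arithmetic; `elapsedUsecs()` and `delayExpired()` are hypothetical helpers and the timestamps are made up, none of it is part of the Arduino core.

```cpp
#include <cstdint>

// Hypothetical helper: elapsed microseconds between two readings of a
// 32-bit counter (the width of unsigned long on AVR).
// Unsigned subtraction wraps modulo 2^32, so the result stays correct
// even if the counter rolled over between the two readings.
constexpr uint32_t elapsedUsecs(uint32_t start, uint32_t now) {
    return now - start;
}

// True once at least waitUsecs have passed since start.
constexpr bool delayExpired(uint32_t start, uint32_t now, uint32_t waitUsecs) {
    return elapsedUsecs(start, now) >= waitUsecs;
}

// Start just before rollover, read again just after it: 0x200 usec elapsed.
static_assert(elapsedUsecs(0xFFFFFF00u, 0x00000100u) == 0x200u,
              "difference is correct across rollover");
```

Comparing `now >= start + wait` instead would not have this property, which is presumably why the subtraction form is the usual idiom.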

…R

Hi Robin,

The problem with your code is that even on a 16 MHz UNO the step size of micros() is 4 usec. I have very little experience with 1 MHz ATtinys, but I expect the step size of micros() there to be at least as large. For large values your code will work fine; for small values that are not a multiple of the step size (4 usec, or whatever it turns out to be) it will not be accurate. Still, if this works for you, it is OK.
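To make the granularity point concrete, here is a rough host-side model of a micros()-based wait loop. It assumes micros() only ever returns multiples of a fixed step and ignores loop overhead; `quantizedWait()` is a hypothetical name, and the step of 64 used in the second example is purely illustrative, since the real step on a 1 MHz ATtiny is not established here.

```cpp
#include <cstdint>

// Hypothetical model: the counter difference at which a loop of the form
//   while (micros() - start < waitUsecs) {}
// exits, when micros() only advances in multiples of stepUsecs.
// The real elapsed time lies within one step below this value, depending
// on where inside a step the wait began. Loop overhead is ignored.
constexpr uint32_t quantizedWait(uint32_t waitUsecs, uint32_t stepUsecs) {
    // round up to the next multiple of the step size
    return ((waitUsecs + stepUsecs - 1) / stepUsecs) * stepUsecs;
}

static_assert(quantizedWait(8, 4) == 8, "a multiple of the step is unchanged");
static_assert(quantizedWait(10, 4) == 12, "10 usec exits at a difference of 12");
static_assert(quantizedWait(100, 64) == 128, "illustrative 64 usec step");
```

For waits of many milliseconds the rounding is negligible, which matches the remark that large values work fine.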

The goal of delayMicroseconds() is to provide delays in the order of 0..10000 usec, and it is most often used in the 0..50 usec range, e.g. for handshakes and IO protocols. For that application delayMicroseconds() works quite well, as it can make steps of approximately 1 usec.

BUT ... delayMicroseconds(8000000) gives a strange result that isn't quite 1/8th of 8 seconds.

Note that the signature expects a uint16_t, not a uint32_t, so a larger argument is truncated to its low 16 bits (65535 max):

void delayMicroseconds(unsigned int us)
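Strictly speaking, an out-of-range argument is not clamped to 65535; converting to a 16-bit unsigned type keeps only the low 16 bits, i.e. the value modulo 65536. A small host-side sketch of what the function would receive (`asReceived()` is a hypothetical helper used only for illustration):

```cpp
#include <cstdint>

// Hypothetical helper: the value a 16-bit unsigned parameter ends up
// holding when a larger number is passed. The conversion discards the
// upper bits, which is the same as taking the value modulo 65536.
constexpr uint16_t asReceived(uint32_t requestedUsecs) {
    return static_cast<uint16_t>(requestedUsecs);
}

static_assert(asReceived(8000000UL) == 4608, "8000000 mod 65536 is 4608");
static_assert(asReceived(65535UL) == 65535, "largest value that survives");
static_assert(asReceived(65536UL) == 0, "one more wraps to zero");
```

So a call with 8000000 would behave like a request for roughly 4.6 ms, before any internal scaling is applied.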

Can you determine the step size of micros() on your 1Mhz attiny?
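One possible way to measure the step, sketched here so that it can be tried on a PC first: spin until the clock value changes, then report the size of the next change. `measureStep()` and `fakeMicros()` are hypothetical names; the fake clock simply advances by 4 on every third call so the logic can be exercised without hardware.

```cpp
#include <cstdint>

using ClockFn = uint32_t (*)();

// Hypothetical helper: returns the size of one observed clock step.
// Assumes the clock is read often enough that no step is missed; on
// real hardware it would be safer to repeat this and keep the minimum.
uint32_t measureStep(ClockFn clock) {
    uint32_t first = clock();
    uint32_t edge = clock();
    while (edge == first) edge = clock();   // align to a step boundary
    uint32_t next = clock();
    while (next == edge) next = clock();    // wait for the following step
    return next - edge;
}

// Stand-in clock for host testing only: advances by 4 every third call.
uint32_t fakeMicros() {
    static uint32_t calls = 0;
    return (calls++ / 3) * 4;
}
```

On a board one would presumably pass `micros` itself (on AVR its return type, unsigned long, is 32 bits wide) and print the result over serial.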

Robin2:
BUT ... delayMicroseconds(8000000) gives a strange result that isn't quite 1/8th of 8 seconds.

See this comment in the documentation for delayMicroseconds():

Currently, the largest value that will produce an accurate delay is 16383.

If you're trying to delay for millions of microseconds, why aren't you using delay()?
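If a long wait really has to be specified in microseconds, one option is to split it into a whole-millisecond part and a sub-millisecond remainder. This is only a sketch of the arithmetic; `SplitDelay` and `splitDelay()` are hypothetical names, not Arduino API.

```cpp
#include <cstdint>

// Hypothetical result type: the two parts of a long wait.
struct SplitDelay {
    uint32_t millisPart;  // whole milliseconds, e.g. for delay()
    uint16_t microsPart;  // remainder below 1000, e.g. for delayMicroseconds()
};

// Split a wait given in microseconds into milliseconds plus a remainder
// that always fits comfortably in a 16-bit argument.
constexpr SplitDelay splitDelay(uint32_t totalUsecs) {
    return SplitDelay{ totalUsecs / 1000u,
                       static_cast<uint16_t>(totalUsecs % 1000u) };
}

static_assert(splitDelay(8000000UL).millisPart == 8000, "8 s is 8000 ms");
static_assert(splitDelay(8000000UL).microsPart == 0, "no remainder");
```

On a board the two parts would presumably be passed to delay() and delayMicroseconds() in turn.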

The Arduino API documentation is pretty useless about details like this because it doesn't bother to document the actual signature of these functions, which you need to know to use them correctly. However, if you dig into the code you will see that the signature of the function you're using is:

void delayMicroseconds(unsigned int us);

This means that regardless of whatever else may be going on inside the function, it would never be able to receive an argument greater than 65535 microseconds. That's unintuitive given that micros() returns an unsigned long, but presumably somebody somewhere thought that was a good idea.

PeterH:
The Arduino API documentation is pretty useless about details like this because it doesn’t bother to document the actual signature of these functions, which you need to know to use them correctly. However, if you dig into the code you will see that the signature of the function you’re using is:

void delayMicroseconds(unsigned int us);

This means that regardless of whatever else may be going on inside the function, it would never be able to receive an argument greater than 65535 microseconds. That’s unintuitive given that micros() returns an unsigned long, but presumably somebody somewhere thought that was a good idea.

The extra code to read, decrement and store a larger value would add a noticeable amount of time to the function. It is quite accurate in the range most uses of the function need; once you are over 1000 microseconds, why not consider using delay()?

So to me, it is quite right to use an unsigned int as the argument to delayMicroseconds().

Rob Tillaart has suggested some improvements to the function to allow more accurate times, and I took his ideas and changed the code to allow a larger range. Hint: the code multiplies the passed delay time and thus overflows the unsigned int. On a 16 MHz Uno it multiplies the value by 4, which is why 16383 is the maximum you can use. Slower CPUs can use higher values, and faster ones even lower.

I extended the delay loop so that I can pass higher values to my own version of the code.
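The relationship between the internal multiplier and the largest usable argument can be sketched as below. `maxSafeArg()` is a hypothetical helper; it ignores the small offset (the `us - 1` or `us - 2` adjustment) that a real implementation subtracts first, so actual limits may be one or two higher.

```cpp
#include <cstdint>

// Hypothetical helper: largest value that can be multiplied by
// `multiplier` without the product overflowing a 16-bit unsigned int.
constexpr uint16_t maxSafeArg(uint16_t multiplier) {
    return static_cast<uint16_t>(65535u / multiplier);
}

static_assert(maxSafeArg(4) == 16383, "x4 scaling gives the documented 16383");
static_assert(maxSafeArg(2) == 32767, "x2 scaling roughly doubles the range");
static_assert(maxSafeArg(1) == 65535, "no scaling uses the full 16 bits");
```

This is why lengthening the loop to 8 cycles, and so halving the multiplier at 16 MHz, raises the limit.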

/* Delay for the given number of microseconds ( 1% accurate of clock rate >20 )
   max we can pass on 8MHz -> 65535, on 16MHz its 32768 and for 20MHz its
   only 13107 due to having to times 5, then divide by 2 :-(
*/
void delayMicroseconds(uint16_t us)
{
    // playing around with altering _us_ means we top out early on the max value we can pass.
#if F_CPU >= 20000000L
    // for a zero  or one-microsecond delay, simply wait 2 cycle and return. The overhead
    // of the function call yields a delay of approx 0.8 microsecond.
    __asm__ volatile (
        "nop" "\n\t"
        "nop"
    );
    if (us < 2) return;

    // the busy loop takes a 2/5 of a microsecond (8 cycles)
    // per iteration, so execute it 2.5 times for each microsecond
    us = (( us - 1 ) * 5 ) >>1;
#elif F_CPU >= 16000000L
    // for a zero or one-microsecond delay, simply return.  the overhead
    // of the function call yields a delay of approximately 1 us.
    if (us < 2) return;

    // the busy loop takes a half of a microsecond (8 cycles)
    // per iteration, so execute it twice for each microsecond of
    // delay requested. offset by time for above check
    us = ( us - 1 ) <<1;
#else
    // for a zero to two microsecond delay, simply return.  the overhead of
    // the function calls takes that. then each loop per microsecond :-)
    if (us < 3) return;
    us = us - 2;
#endif

    // busy wait ( 8 cycles = 1/2 microsecond on 16MHz )
    __asm__ volatile (
        "1: sbiw %0,1" "\n\t"	// 2 cycles
        "nop" "\n\t"			// 1 cycle
        "nop" "\n\t"			// 1 cycle
        "nop" "\n\t"			// 1 cycle
        "nop" "\n\t"			// 1 cycle
        "brne 1b" "\n\t"		// 2 cycles
        : "=w" (us)
        : "0" (us)
    );
}

Hi Darryl,

good to see some ideas implemented :wink:

For the 20 MHz case, one extra change might be useful:

 us = (( us - 1 ) * 5 ) >>1;

change to

 us = (( us - 1 ) * 2 ) + (( us - 1 ) >> 1 );

The max value for 20 MHz becomes ~26214, so ~10 ms more than the specified value of 16383.
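The two scalings can be compared on a PC by forcing the intermediate results down to 16 bits, which mimics what happens on AVR where unsigned int is 16 bits wide. The function names are hypothetical and `x` stands for `us - 1`. Note that the second term needs its own parentheses, because `+` binds more tightly than `>>` in C and C++.

```cpp
#include <cstdint>

// (x * 5) >> 1 with the product truncated to 16 bits, as on AVR.
constexpr uint16_t scaleMulThenShift(uint16_t x) {
    return static_cast<uint16_t>(static_cast<uint16_t>(x * 5u) >> 1);
}

// (x * 2) + (x >> 1): same result, but no intermediate exceeds 2.5 * x.
constexpr uint16_t scaleSplit(uint16_t x) {
    return static_cast<uint16_t>(static_cast<uint16_t>(x * 2u) + (x >> 1));
}

static_assert(scaleSplit(13107) == scaleMulThenShift(13107),
              "both forms agree while x * 5 still fits in 16 bits");
static_assert(scaleMulThenShift(20000) == 17232,
              "x * 5 overflows: 100000 wraps to 34464, halved to 17232");
static_assert(scaleSplit(20000) == 50000, "split form gives the intended 2.5 * x");
static_assert(scaleSplit(26214) == 65535, "26214 is the largest x that still fits");
```

Whether the compiler then emits the same instructions for both forms is a separate question that this sketch does not answer.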

Sorry if I have caused confusion. I didn't try delayMicroseconds(8000000). I actually repeated delayMicroseconds(1000) for 8000 times - but I forgot that when I wrote the post.

I was using 8 seconds so I could time an LED with a stopwatch. My actual code uses 416 usecs.

Anyway, as far as my experiments show the delayMicroseconds doesn't return a "sensible" value at 1MHz. I could easily understand if it was off by a round number factor such as 8 or 16 - and that would be easy to compensate for. It was the fact that my compensation didn't work (though obviously wasn't ridiculously wrong) that got me to write a short test script and look for a reliable solution.

I am aware that there must be some (perhaps a lot of) granularity using micros() on an Attiny, but my code now works without needing any strange fudge factors.

...R

robtillaart:
Hi Darryl,

good to see some ideas implemented :wink:

For the 20Mhz one extra that might be useful

 us = (( us - 1 ) * 5 ) >>1;

change to

 us = (( us - 1 ) * 2 ) + (( us - 1 ) >> 1 );

max value for 20Mhz becomes ~ 26214 so ~10 millies larger than the specified value of 16383

Yes, that looks like it works better. Does the assembly code keep it in range, though?

I seem to remember seeing *5 compiled as a double shift, an add and then the final shift anyway on unsigned variables, certainly with GCC 4.3.2, the standard bundled compiler. I mostly use 4.8.0 myself now.

Seeing this code reminds me: I mentioned (in one of the other threads where I seem to follow you :slight_smile: ) that I changed the sbiw, test and branch loop to include the extra 4 nops, purely to get the longer delay. As to which code needed that, I don't know now! Hehe.

PS. Sorry for diverting the thread.

darryl:

yes, looks to work better, does the assembly code keep it in range though ?

Not checked, don’t have a 20Mhz board available so I had no need…

robtillaart:

Not checked, don’t have a 20Mhz board available so I had no need…

likewise :wink: