Author Topic: realtime clock, microseconds, etc.  (Read 9780 times)
0
God Member
*****
Karma: 1
Posts: 513

Updated micros(); it passes all tests thus far:

Code:
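unsigned long micros()
{
  unsigned long m, t;
  uint8_t oldSREG = SREG;

  // read the counter and the overflow count with interrupts off
  cli();
  t = TCNT0;
  // overflow pending and the counter has just wrapped: count the full 256 ticks
  if ((TIFR0 & _BV(TOV0)) && (t == 0))
    t = 256;
  m = timer0_tics;
  SREG = oldSREG;

  // one tick is 64 clock cycles: 4 us at 16 MHz, 8 us at 8 MHz
#if F_CPU >= 16000000L
  return ((m << 8) + t) << 2;
#else
  return ((m << 8) + t) << 3;
#endif
}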
Logged

Austin, TX USA
God Member
*****
Karma: 5
Posts: 997
Arduino rocks

@dcb--

If you just change your hpticks derivative in reply 24 from

Code:
 unsigned long t0_ticks = (clock_cycles / 64) + (millis * (1000L * clockCyclesPerMicrosecond() / 64)) + t0;
  return ((t0_ticks) * 64L / (F_CPU / 1000000L));

to

Code:
 unsigned long t0_ticks = (clock_cycles / 64UL) + t0;
  return 1000 * millis + t0_ticks * 64UL / clockCyclesPerMicrosecond();

the 268-second micros() overflow problem goes away (overflows at the 32-bit boundary).  (This change avoids the unnecessary translation of millis into the "tick" domain and then back into the "micro" domain.)
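
To see where the 268-second limit comes from, here is a minimal host-side sketch in plain C (not the Arduino code itself).  It assumes a 16MHz clock and a Timer0 prescaler of 64.  The function names and the `sub_ticks` parameter are placeholders for illustration, not identifiers from reply 24, and the explicit `uint32_t` casts stand in for the AVR's 32-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES_PER_US ((uint32_t)16) /* assumed: 16 MHz clock */
#define PRESCALE      ((uint32_t)64) /* assumed: Timer0 prescaler of 64 */

/* Old form: fold milliseconds into the tick domain, then scale back to us. */
uint32_t micros_via_ticks(uint32_t ms, uint32_t sub_ticks)
{
    uint32_t ticks_per_ms = (uint32_t)1000 * CYCLES_PER_US / PRESCALE; /* 250 */
    uint32_t t0_ticks = ms * ticks_per_ms + sub_ticks;
    /* t0_ticks * 64 wraps at 2^32, i.e. after 2^32 / 16 us (about 268 s). */
    return (uint32_t)(t0_ticks * PRESCALE) / CYCLES_PER_US;
}

/* New form: only the sub-millisecond remainder goes through the tick scaling. */
uint32_t micros_direct(uint32_t ms, uint32_t sub_ticks)
{
    assert(sub_ticks < 250); /* the remainder is less than one millisecond */
    return (uint32_t)1000 * ms
         + (uint32_t)(sub_ticks * PRESCALE) / CYCLES_PER_US;
}
```

At 200 seconds both forms agree; at 300 seconds the first has already wrapped while the second still returns 300,000,000.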

This seems ideal to me.  No changing wiring.c and all the benefits of Don's and dcb's work.  It also has the added benefit of working with the 20MHz clock (I think).  Do you agree?

Mikal
« Last Edit: November 11, 2008, 09:31:54 am by mikalhart » Logged

0
God Member
*****
Karma: 1
Posts: 513

Mikal, I do appreciate the investigation, but let me get a read from you on the function in reply 30 first.

Having seen this approach work 5 times faster than the version that leaves wiring.c unchanged, you can imagine I want the fast one :)
Logged

Austin, TX USA
God Member
*****
Karma: 5
Posts: 997
Arduino rocks

On the surface it looks good!  I'll study it enthusiastically later on.  (At some point I need to pretend to be doing "real" work today. :) )

Nice work!  This is fun.

Mikal
Logged

0
God Member
*****
Karma: 1
Posts: 513

fyi, timer0_tics appears to be identical to the former timer0_overflow_count  :)
Logged

Portland, OR, USA
Jr. Member
**
Karma: 0
Posts: 78

Quote
#if F_CPU >= 16000000L
I understand what you're trying to do here but keep in mind that with -Os the avr-gcc compiler is going to replace multiplication by powers of two with left shifting so it isn't necessary to code it explicitly with shifting.  Moreover, the result won't be correct for 20MHz because the multiplication factor should be 3.2 rather than 4.

I would propose the alternate implementation of the cycles-to-microseconds conversion shown below.  It handles CPU speeds whose value in MHz divides 64 evenly (1, 2, 4, 8 and 16MHz, for example) as one case, handles 20MHz as a special case, and reports an error at compile time otherwise.

Note that the code for 20MHz will be somewhat slower due to the divide-by-10 operation (in addition to one more shift cycle to perform the multiplication) and it will have a slightly smaller dynamic range.  If desired, the calculation for 20MHz could include rounding by adding 5 to the result prior to dividing by 10.
Code:
#define F_CPU_MHZ   (F_CPU / 1000000L)
  unsigned long us;
#if ((64 / F_CPU_MHZ) * F_CPU_MHZ) == 64
  us = ((m << 8) + t) * (64 / F_CPU_MHZ);
#elif F_CPU_MHZ == 20
  us = (((m << 8) + t) * 32) / 10;
#else
  #error clock speed not supported
#endif
  return(us);
#undef F_CPU_MHZ
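
As a rough illustration of the rounding option for 20MHz, the following host-side sketch compares the truncating and rounding forms.  It assumes one tick is 64 CPU cycles (3.2 microseconds at 20MHz); the function names are made up for this example and are not part of wiring.c.

```c
#include <assert.h>
#include <stdint.h>

/* ticks stands for (m << 8) + t; one tick is assumed to be 3.2 us at 20 MHz. */

/* Truncating form: ticks * 32 / 10. */
uint32_t ticks_to_us_20mhz_trunc(uint32_t ticks)
{
    assert(ticks < ((uint32_t)1 << 27)); /* ticks * 32 must fit in 32 bits */
    return (ticks * (uint32_t)32) / 10;
}

/* Rounding form: add 5 before dividing by 10. */
uint32_t ticks_to_us_20mhz_round(uint32_t ticks)
{
    assert(ticks < ((uint32_t)1 << 27)); /* ticks * 32 + 5 still fits */
    return (ticks * (uint32_t)32 + 5) / 10;
}
```

For 3 ticks (9.6 microseconds) the truncating form gives 9 and the rounding form gives 10.  The asserts make the smaller dynamic range explicit: the input is limited to 27 bits.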
Logged

Don

ZBasic Microcontrollers
http://www.zbasic.net

0
God Member
*****
Karma: 1
Posts: 513

I'm ok with a special case for 20 MHz as long as millis() and elapsedMillis() are fast.

FYI, delayMicroseconds currently supports only 8 or 16 MHz.  It is in wiring.c if you want to take a stab at figuring out how to make it 20 MHz compatible.

Also might want to think about 1 MHz compatibility a bit.  I think the AVR Butterfly (runs on a button cell) is a really neat device and worthy of some consideration as well @ 1 MHz.

Obviously it may not be practical to work with every possible frequency and get good performance though.  I would generally assert that where microseconds is concerned, performance is also a concern.
Logged

Austin, TX USA
God Member
*****
Karma: 5
Posts: 997
Arduino rocks

@Don, one possible objection to your proposal for the 20MHz case is that micros() will not overflow at 32-bits because of the final division by 10, right?

@dcb: I'm playing (illicitly) with your code at work, and the more I see the more I like!  I've been doing some experiments with consecutive calls to (your) micros() and it does indeed seem much, much faster on a 16MHz Arduino.  And the wiring.c mods are really quite minimal, aren't they?

It doesn't seem like 1MHz support would be too hard because 1 divides 64.

@mellis: I measured 1 million deltas between consecutive calls to dcb's micros and got these results:

Delta | Count
0us   | 35.7%
4us   | 64.1%
12us  | 0.2%
16us  | 0.02%


Mikal
« Last Edit: November 11, 2008, 12:46:33 pm by mikalhart » Logged

Portland, OR, USA
Jr. Member
**
Karma: 0
Posts: 78

Quote
one possible objection to your proposal for the 20MHz case is that micros() will not overflow at 32-bits because of the final division by 10, right?
Quite so.  However, if you don't multiply by 3.2 the result won't have units of microseconds.  If, for example, you choose to multiply by 4 (like the 16MHz case) the return value will represent units of 0.8uS.

This issue is why I've favored a routine that returns Timer0 clock cycles instead of microseconds.  It is quite simple to implement microsecond-based timing by looking for the equivalent number of Timer0 clock cycles at the prevailing CPU speed.  This strategy has the added convenience of allowing either truncation or rounding behavior to be employed (when a difference exists) depending on which is better for a particular application.
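
A minimal sketch of that strategy, for illustration only: convert the desired microsecond interval to ticks once, choosing truncation or rounding, and then compare elapsed ticks.  The helper names are hypothetical, the prescaler of 64 is an assumption, and the tick values would come from something like hpticks().

```c
#include <assert.h>
#include <stdint.h>

#define TIMER0_PRESCALE 64u /* assumed Timer0 prescaler */

/* Convert a microsecond interval to Timer0 ticks, truncating. */
uint32_t us_to_ticks_trunc(uint32_t us, uint32_t cpu_mhz)
{
    assert(cpu_mhz != 0 && us <= UINT32_MAX / cpu_mhz);
    return us * cpu_mhz / TIMER0_PRESCALE;
}

/* Same conversion, rounded to the nearest tick. */
uint32_t us_to_ticks_round(uint32_t us, uint32_t cpu_mhz)
{
    assert(cpu_mhz != 0 && us <= (UINT32_MAX - TIMER0_PRESCALE / 2) / cpu_mhz);
    return (us * cpu_mhz + TIMER0_PRESCALE / 2) / TIMER0_PRESCALE;
}

/* Nonzero once at least `interval` ticks have passed since `start`.
 * The unsigned subtraction stays correct across a 32-bit wrap. */
int ticks_elapsed(uint32_t now, uint32_t start, uint32_t interval)
{
    return (uint32_t)(now - start) >= interval;
}
```

At 16MHz, 10 microseconds is 2.5 ticks, so the truncating form waits for 2 ticks and the rounding form for 3.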
Logged

Don

ZBasic Microcontrollers
http://www.zbasic.net

Austin, TX USA
God Member
*****
Karma: 5
Posts: 997
Arduino rocks

Quote
However, if you don't multiply by 3.2 the result won't have units of microseconds.
I understand the dilemma.  You're going to have to lose some bits either at the high end or the low end one way or another.

What about replacing (for 20MHz only)
Code:
 us = (((m << 8) + t) * 32) / 10;
with
Code:
 us = (((m << 8) + t) / 5) * 16;

With this expression, we'd lose a bit of resolution (because of doing the divide by 5 first), but still overflow at 32 bits (if my analysis is correct).
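
The trade-off between the two expressions can be checked on a PC with a small sketch like the one below.  The function names are invented for the example, and the casts force 32-bit arithmetic so the host behaves like the AVR's unsigned long.

```c
#include <stdint.h>

/* Multiply first: keeps resolution, but ticks * 32 wraps once ticks >= 2^27. */
uint32_t us_mul_first(uint32_t ticks)
{
    return (uint32_t)(ticks * (uint32_t)32) / 10;
}

/* Divide first: results are always multiples of 16 us,
 * but the intermediate value does not wrap at 2^27 ticks. */
uint32_t us_div_first(uint32_t ticks)
{
    return (uint32_t)((ticks / 5) * (uint32_t)16);
}
```

For 7 ticks (22.4 microseconds) the first form returns 22 and the second returns 16.  For 2^27 ticks the first form has wrapped to 0, while the second returns 429496720.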

Mikal
« Last Edit: November 11, 2008, 01:43:59 pm by mikalhart » Logged

Portland, OR, USA
Jr. Member
**
Karma: 0
Posts: 78

Quote
FYI, delayMicroseconds currently supports only 8 or 16 MHz.  It is in wiring.c if you want to take a stab at figuring out how to make it 20 MHz compatible.
I've modified my version so that it works correctly at 20MHz.  As shown in the code below, an extra cycle is conditionally added to the delay loop for 20MHz.  The code preparing for the delay is slightly different, too, to account for the faster speed.  My measurements put it right on the button.
Code:
/* Delay for the given number of microseconds.  Assumes an 8, 16 or 20 MHz clock.
 * Disables interrupts, which will disrupt the millis() function if used
 * too frequently. */
void delayMicroseconds(unsigned int us)
{
#define EXTRA_CYCLES
    uint8_t oldSREG;

    // calling avrlib's delay_us() function with low values (e.g. 1 or
    // 2 microseconds) gives delays longer than desired.
    //delay_us(us);

#if F_CPU >= 20000000L
    // for a 20 MHz clock add one extra cycle to the delay loop
#undef EXTRA_CYCLES
#define EXTRA_CYCLES    " nop" "\n\t" // 1 cycle

    // for a one-microsecond delay, simply return.  the overhead
    // of the function call yields a delay of approximately 0.9 us.
    if (--us == 0)
        return;

    // the loop below takes 0.25 microseconds (5 cycles)
    // per iteration, so execute it four times for each microsecond of
    // delay requested.
    us <<= 2;

    // partially compensate for the overhead of getting into and out of the loop
    us -= 3;

#elif F_CPU >= 16000000L
    // for the 16 MHz clock on most Arduino boards

    // for a one-microsecond delay, simply return.  the overhead
    // of the function call yields a delay of approximately 1 1/8 us.
    if (--us == 0)
        return;

    // the following loop takes a quarter of a microsecond (4 cycles)
    // per iteration, so execute it four times for each microsecond of
    // delay requested.
    us <<= 2;

    // account for the time taken in the preceding commands.
    us -= 2;
#else
    // for the 8 MHz internal clock on the ATmega168

    // for a one- or two-microsecond delay, simply return.  the overhead of
    // the function calls takes more than two microseconds.  can't just
    // subtract two, since us is unsigned; we'd overflow.
    if (--us == 0)
        return;
    if (--us == 0)
        return;

    // the following loop takes half of a microsecond (4 cycles)
    // per iteration, so execute it twice for each microsecond of
    // delay requested.
    us <<= 1;
    
    // partially compensate for the time taken by the preceding commands.
    // we can't subtract any more than this or we'd overflow w/ small delays.
    us--;
#endif

    // disable interrupts, otherwise the timer 0 overflow interrupt that
    // tracks milliseconds will make us delay longer than we want.
    oldSREG = SREG;
    cli();

    // busy wait
    __asm__ __volatile__ (
        "1: sbiw %0,1" "\n\t" // 2 cycles
        EXTRA_CYCLES
        "brne 1b" : "=w" (us) : "0" (us) // 2 cycles
    );

    // reenable interrupts.
    SREG = oldSREG;
}
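
To make the setup arithmetic easier to follow, here is a host-side sketch that mirrors only the loop-count calculation for each clock speed.  The function name and the runtime `f_cpu` parameter are assumptions made for the example; the real code selects the branch at compile time with F_CPU, and nothing here measures actual timing.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the number of busy-wait loop passes for a requested delay,
 * or 0 where the function above would return early without looping. */
uint16_t delay_loop_count(uint16_t us, uint32_t f_cpu)
{
    assert(us != 0); /* in the code above, 0 would wrap around after --us */
    if (f_cpu >= 20000000UL) {
        if (--us == 0) return 0;
        us <<= 2;    /* 5-cycle loop: four passes per microsecond */
        us -= 3;     /* partial compensation for call overhead */
    } else if (f_cpu >= 16000000UL) {
        if (--us == 0) return 0;
        us <<= 2;    /* 4-cycle loop: four passes per microsecond */
        us -= 2;
    } else {
        if (--us == 0) return 0;
        if (--us == 0) return 0;
        us <<= 1;    /* 4-cycle loop at 8 MHz: two passes per microsecond */
        us--;
    }
    return us;
}
```

For a 10-microsecond request this gives 33 passes at 20MHz, 34 at 16MHz and 15 at 8MHz.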
Logged

Don

ZBasic Microcontrollers
http://www.zbasic.net

Portland, OR, USA
Jr. Member
**
Karma: 0
Posts: 78

Quote
With this expression, we'd lose a bit of resolution (because of doing the divide by 5 first), but still overflow at 32 bits (if my analysis is correct).
I'm not sure that I understand what you mean about overflowing at 32 bits.  I believe that you either lose range, resolution or both.  With the scaling factors that you mentioned the maximum value of ((m << 8) + t) is 0x4FFFFFFB as compared to 0xFFFFFFFF in the other cases.  While it is true that the maximum return value will be 0xFFFFFFFF you still need to know the range in order to compute elapsed time when the second data point has a lower value than the first, i.e., when the value has wrapped.

Besides that, having micros() return a value with different units also causes portability issues that may be more bothersome than the inherent difference in range.

I reiterate my support for hpticks() (returning Timer0 ticks) because it avoids these problems entirely and, as I indicated earlier, it is easy to convert the result to microseconds using the CPU speed if desired.
Logged

Don

ZBasic Microcontrollers
http://www.zbasic.net

0
God Member
*****
Karma: 1
Posts: 513

Ok, I haven't absorbed the preceding entirely yet, but wondered when the last time someone tried avrlib's delay_us() function was?

And is it long because we call it from a function?  In that case a #define delayMicroseconds(us) delay_us(us)

might fix it?
Logged

Portland, OR, USA
Jr. Member
**
Karma: 0
Posts: 78

Quote
I [...] wondered when the last time someone tried avrlib's delay_us() function was?
The delay_us() function, which takes a floating point parameter, works great if the parameter to it is a compile-time constant.  In that case, the compiler does the real math and produces integral values to use in the inlined code.  If the parameter is not constant at compile time, your program grows by a large amount due to the floating point math having to be linked in.
Logged

Don

ZBasic Microcontrollers
http://www.zbasic.net

Forum Administrator
Cambridge, MA
Faraday Member
*****
Karma: 12
Posts: 3538

I think the ticks() function is nice, but I think it may be too confusing to include in the core.  A tick can be hard to explain, especially because it varies based on the cpu speed.  We'd probably want to include a microsToTicks() function or something, which also complicates things.  Again, this would be a great function to have a nice implementation of on the playground: http://www.arduino.cc/playground/Main/GeneralCodeLibrary

For the micros() function, is it reasonable to simply count microseconds in the overflow handler?  Or is there another way to avoid overflowing at a weird value?  Especially with micros() that will overflow relatively quickly, I think it's important that people can just do a simple subtraction.  
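
The "simple subtraction" point can be illustrated with a short host-side sketch.  The function names are hypothetical; the second one shows the extra work a caller needs when the counter wraps at some value other than 2^32.

```c
#include <stdint.h>

/* Elapsed time by plain subtraction; correct only if the counter wraps at 2^32. */
uint32_t elapsed_simple(uint32_t now, uint32_t start)
{
    return now - start;
}

/* Elapsed time for a counter that wraps at `modulus` (counts 0 .. modulus - 1). */
uint32_t elapsed_with_modulus(uint32_t now, uint32_t start, uint32_t modulus)
{
    return (now >= start) ? now - start : now + (modulus - start);
}
```

With a full 32-bit counter, a start of 0xFFFFFFF6 and a later reading of 5 subtract to 15 as expected.  If the counter instead wraps at 2^28 (the 268-second case at 16MHz), the same plain subtraction on a start of 0x0FFFFFF6 and a reading of 5 gives 0xF000000F, and only the modulus-aware version returns 15.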
Logged
