Drifting in a for loop with digitalWrite

Forgive my noobness, but I have a question about the basics of how the Arduino works. I have a method that runs a for loop with a digitalWrite() and a delayMicroseconds(). What I'm seeing is that, as the size of the loop boundary grows, the execution time gradually takes longer than expected. Maybe this is just simply how it works, but I was expecting it to be linear.

Here is the sample code:

// JUST FOR TESTING FOR Arduino drift

// This is what calls the method and does the calculation:
unsigned long msecStart = millis();
rotate2(steps, speed);
unsigned long msecFin = millis();

float delta = (float)(msecFin - msecStart);
Serial.println(delta / 1000);

void rotate2(long steps, float speed){

  float usDelay = (1/speed) * 70;  // It's 140 for this example. speed = .5

  for(int i = 0; i < steps; i++){
    digitalWrite(3, HIGH); // There isn't anything connected to pin 3.
    delayMicroseconds(usDelay);

    digitalWrite(3, LOW);
    delayMicroseconds(usDelay);
  }
}

I set the step value (the loop boundary) to 10000, 20000, 40000, and 60000. Here are the results when running the method above:

10000 takes 2.95 secs to execute
20000 takes 5.89 secs to execute (1/100 of a second longer)
40000 takes 11.78 secs to execute (2/100 of a second longer)
60000 takes 17.66 secs to execute (4/100 of a second longer)

What I was expecting is that 20000 should take 2.95 x 2 = 5.90 seconds. Maybe I'm nitpicking, but I need accuracy. So perhaps I'm calculating elapsed time wrong, perhaps millis() isn't accurate, or perhaps there is just lag from executing digitalWrite() so many times.

Thoughts?

Are you, perhaps, calculating your delays as if the rest of the code took no time at all?

And just why?

10000 takes 2.95 secs to execute
20000 takes 5.89 secs to execute (1/100 of a second longer)
40000 takes 11.78 secs to execute (2/100 of a second longer)
60000 takes 17.66 secs to execute (4/100 of a second longer)

Those annotations need fixing; the times come out shorter than linear, not longer:
10000 takes 2.95 secs to execute; x 2 = 5.90 secs, x 4 = 11.80 secs, x 6 = 17.70 secs
20000 takes 5.89 secs to execute (1/100 of a second shorter than 2 x 2.95 secs)
40000 takes 11.78 secs to execute (2/100 of a second shorter than 4 x 2.95 secs)
60000 takes 17.66 secs to execute (4/100 of a second shorter than 6 x 2.95 secs)

Why? Perhaps floating point operations on a processor without an FPU aren't all the same speed?
Or maybe, the FP operation to get usDelay being done once, the resulting integer value is stored in a register for use in the loop delays. That would make the FP operation a one-time overhead, like a single-rate bulk shipping charge: the more you 'buy', the more the overhead is 'spread out'.

By my calculation, each timing pass takes 8 milliseconds of overhead and about 14.2 microseconds per loop iteration in addition to the explicit delays.

10000 takes 2.95 secs to execute (2.8 seconds of delay + 0.142 seconds of loop execution + 0.008 seconds of overhead)
20000 takes 5.89 secs to execute (5.6 seconds of delay + 0.284 seconds of loop execution + 0.008 seconds of overhead)
40000 takes 11.78 secs to execute (11.2 seconds of delay + 0.568 seconds of loop execution + 0.008 seconds of overhead)
60000 takes 17.66 secs to execute (16.8 seconds of delay + 0.852 seconds of loop execution + 0.008 seconds of overhead)

Well, there is certainly more overhead involved when using a float variable in the loop. Here is what I discovered: I modified the method and hardcoded the delay value rather than calculating and assigning it.

// JUST FOR TESTING FOR Arduino drift
void rotate2(long steps, float speed){
  // rotate a specific number of microsteps (8 microsteps per step) - (negative for reverse movement)
  // speed is any number from .01 -> 1 with 1 being fastest - Slower is stronger

  // Wake up!
  if (IsSleeping == LOW) {
     Serial.println("I'm still asleep");
     //Serial.println("** Waking Up! **");
     //digitalWrite(SLEEP_PIN, HIGH);
  }

  int dir = (steps > 0) ? HIGH : LOW;
  steps = abs(steps);

  digitalWrite(2, dir);

  // float usDelay = (1/speed) * 70;

  for(int i = 0; i < steps; i++){
    digitalWrite(3, HIGH);
    delayMicroseconds(140.00);

    digitalWrite(3, LOW);
    delayMicroseconds(140.00);
  }
}

It takes 4/100ths of a second less time to execute the loop with a boundary of 10000.

So 10000 now takes 2.91 seconds to execute using a hardcoded delay of 140.00 microseconds. If I increase the boundary tenfold (to 100,000) it only drifts 2/100ths of a second.

Perhaps I'm approaching this all wrong. I'm trying to calibrate a stepping motor. I want to know the amount of time that has elapsed over the total number of steps I take until the end trigger.

I would try this with a timer interrupt.

Pete

DrWoo:
I have a method that runs a for loop with a digitalwrite and a delayMicroseconds. What I'm seeing is that given the size of the boundary the execution time gradually takes longer. ... Maybe I'm nitpicking, but I need accuracy.

The documentation for delayMicroseconds says:

This function works very accurately in the range 3 microseconds and up.

There is a bit of an implication here that it won't work very accurately for larger delays - which is what you are finding.

    delayMicroseconds(140.00);

So, two problems here: this is a lot more than 3 microseconds, and the argument is an unsigned int, not a float. So the decimal places are not only useless, they (potentially) consume processor time.

You are much better off using the micros() call (possibly in a loop), detecting when the desired number of microseconds is up. One reason for this is that interrupts (ironically, the ones for keeping accurate figures for millis() and micros()) will be happening during the delayMicroseconds() loop. So the longer you delay, the more interrupts, and the more drift.

However, micros() relies on hardware timers, so even if interrupts go off elsewhere (e.g. serial interrupts) it should not drift, unless interrupts are turned off for too long.

If you want drift-proof timing don't use delay/delayMicroseconds, use millis/micros:

  unsigned long usDelay = 140;          // half-period in microseconds, from your calculation
  unsigned long last_tick = micros();
  for (int i = 0; i < steps; i++)
  {
    while (micros() - last_tick < usDelay)
      {}
    digitalWrite(3, HIGH);
    last_tick += usDelay;
    while (micros() - last_tick < usDelay)
      {}
    digitalWrite(3, LOW);
    last_tick += usDelay;
  }

This way you are always waiting for an exact multiple of usDelay microseconds, and it doesn't matter how long digitalWrite() takes (unless it's simply too slow to keep up at all).

Thank you to everyone that responded. Certainly appreciate the help from this great community. I'm going to hack away on this a bit and see what I come up with.

Thanks again.

The Arduino delay routines are not very accurate, and are totally inaccurate if the MPU clock rate is not 16MHz or 8MHz.

If you are looking for a true cycle-accurate spin-loop type delay routine, see this project over on AVR Freaks (you will need to create an account to access it):
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=665&item_type=project

These functions will give you a very accurate delay, accurate to within 1 MPU clock cycle (which is 62.5ns at 16MHz). They work by inlining various AVR instructions to give you the requested delay.

The only thing to keep in mind is that if interrupts are enabled and ISRs are running (and they are on Arduino), they rob cycles away from the delay loop, so the delays can potentially be longer than what was asked for, or can drift in size depending on the interrupt load.

--- bill

bperrybap:
The Arduino delay routines are not very accurate, and are totally inaccurate if the MPU clock rate is not 16MHz or 8MHz.

Why is that, Bill? Doesn't the internal timer fire accurately? That would be the only reason for it to be wrong, surely? Just curious.

bperrybap:
... see this project over on AVR freaks: (you will need to create an account to access this)

Why do you need to create an account just to view information? Is it a secret? Every time I visit a web site (e.g. via Google) that requires me to "create an account" just to find a solution to my problem, I close that window and move on to the next search result.


This test program:

const unsigned long usDelay = 3000;

void setup ()
{
  pinMode (12, OUTPUT);
}

void loop () {
  PORTB = 1 << 4;  // pin D12
  unsigned long last_tick = micros() ;
  while (micros() - last_tick < usDelay)
    {}
  PORTB = 0;
  delay (2);
}

Measuring on the logic analyzer, the on time for D12 was 3001.5625 uS. That's quite a low error amount (about 1.6 uS in 3000, or roughly 0.05%).

For smaller amounts (like 140) I measured 141.9375 uS - that's almost 2 uS out, but there is an unavoidable overhead: the time taken to turn the timing pin on and off, and the time to establish the start time. You could factor those out as constants, to an extent.

bperrybap:
The only thing to keep in mind is that if interrupts are enabled and ISRs are running (and they are on Arduino), they rob cycles away from the delay loop, so the delays can potentially be longer than what was asked for, or can drift in size depending on the interrupt load.

That's true with "count cycles" delays. But the timer delays should be pretty accurate because the hardware timer is doing the timing, and provided the overflow interrupt is serviced in time, other interrupts shouldn't affect the end result.

DrWoo:

...
float usDelay = (1/speed) * 70;  // It's 140 for this example. speed = .5
...

perhaps there is just lag executing digitalWrite so many times.

Perhaps there is lag calculating a floating point division so many times, particularly when delayMicroseconds doesn't take floating point numbers.

DrWoo:
Thank you to everyone that responded. Certainly appreciate the help from this great community. I'm going to hack away on this a bit and see what I come up with.

Thanks again.

Think about ditching the floating point. For one thing, the timing and delay functions deal in unsigned longs.

Here is an example of printing 22/7 (the old pi approximation) to 4 places without FP:

  long scale = 10000;
  long fixedPointValue = 22 * scale / 7;
  Serial.print( fixedPointValue / scale );  // print the digits left of the decimal point
  Serial.print( "." );                      // then the decimal point
  Serial.print( fixedPointValue % scale );  // then the remainder to scale; check the result with a calculator

You are correct in that not using a 16MHz or 8MHz clock is the primary cause of inaccuracy.

I was basing the "inaccuracy" primarily on DrWoo's original sample code, which used floating point and called delayMicroseconds().

If there is a need for delays that are very small (less than a few microseconds, say) or that need to be a fractional microsecond value, the existing routines would not be very accurate.

It wasn't clear if the additional accuracy for fractional microseconds was desired.

In general, it is easy to get a delay that is "at least as long as". But trying to get a delay that is "exactly" a given value, no matter how it is implemented, is very difficult, especially as the needed precision increases.

The Arduino routines like delay(), millis(), and micros() are actually pretty good at being able to time durations, as long as the durations are not too short and the precision you need is not extremely high.

DrWoo:
Maybe I'm nitpicking, but I need accuracy

What is it you want/need when you say that?

Accuracy and precision are not the same thing.
How much of each are you needing?

Yep, I'm with you on that one. Normally, I do the same. In fact, I didn't realize that you needed an account to see the AVR Freaks projects until I made this post and tried the link while I wasn't logged in.

On Arduino, while the delay(), millis(), and micros() functions use the hardware timer, the delayMicroseconds() function uses a "count cycles" type spin loop. It does not use a hardware timer. So it can suffer from "drift" depending on the interrupt load - the same is true for the delay routines from the delay project over on AVR Freaks that I referenced earlier.

DrWoo:
perhaps there is just lag executing digitalwrite so many times.

digitalWrite() does take about 4us on a 16MHz AVR.

--- bill