execution time

Hi,

I need to calculate the exact time for some activity in my application. For that I have to subtract the time taken by the processor from the total time, so I need timing data for every part of my program. Can anybody provide the timing details for the Arduino Uno?

Thanks
Pabel

Are we talking us? ms? ns?

For ms and us you can use millis() and micros() (but note the 4 us resolution on the standard Arduinos).

Set up Timer1 to run at 16 MHz:

TCCR1A = 0;
TCCR1B = CS10;  // Run clock with pre-scale of 1

In your sketch, capture the timer count before and after the part you want to time:

unsigned int StartTime = TCNT1;
Do stuff.
unsigned int StopTime = TCNT1;

The elapsed time in 16ths of a microsecond will be: StopTime - StartTime.

To get the code overhead, take out the "Do stuff.". Your elapsed time should then be just the overhead.
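
A host-side sketch of the arithmetic behind this measurement (plain C++, not Arduino code). It assumes Timer1 ticks at 16 MHz (prescale 1) and that the timed section is shorter than one full 16-bit timer period (65536 ticks, roughly 4.1 ms). The function names are made up for illustration. The point is that unsigned 16-bit subtraction still gives the right tick count when the counter wraps once between the two readings.

```cpp
#include <cstdint>

// Ticks elapsed between two 16-bit timer readings. The cast back to
// uint16_t makes the subtraction wrap modulo 65536, so a single
// counter overflow between the readings is handled correctly.
uint16_t elapsedTicks(uint16_t start, uint16_t stop) {
    return static_cast<uint16_t>(stop - start);
}

// Convert ticks to microseconds, assuming 16 ticks per microsecond.
double ticksToMicros(uint16_t ticks) {
    return ticks / 16.0;
}

// Measured ticks minus the overhead ticks from an empty measurement.
uint16_t netTicks(uint16_t measured, uint16_t overhead) {
    return static_cast<uint16_t>(measured - overhead);
}
```

For example, readings of 65530 and 10 give 16 ticks, which is 1 us at 16 MHz.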

The following code prints the execution time of the instruction digitalWrite(2, HIGH); on the Serial Monitor:

void setup() 
{
  Serial.begin(9600);
  TCCR1A = 0x00;    //TC1 is in Counting Mode
  TCCR1B = 0x00;   //TC1 is OFF
  TCCR1B = 0x01;   //TC1 is ON with clkTC1 = 16 MHz
  digitalWrite(2, HIGH); //code whose execution time is to be measured/presented
  TCCR1B = 0x00;    //TC1 is OFF
  Serial.print(TCNT1*0.0625);  //print execution time in us
  Serial.print("us");

}

void loop() 
{
  
}

@johnwasser

Which one of the following two is the correct syntax/semantics to operate TC1 with the 16 MHz (no prescaling) internal clock frequency?

TCCR1B = CS10; //this is one you have mentioned in your Post#2
TCCR1B = 0x01; // what I have mentioned in my Post#3 as per data sheet.

The program compiles OK with TCCR1B = CS10; but, the execution time of digitalWrite(2, HIGH); shows 0.12 us. Required clock cycles are shown as: 2 (two). (2 x 0.0625 = 0.12 us)

With TCCR1B = 0x01;, the execution time for the said instruction is 2.69 us. Required clock cycles are shown as: 43 (forty three). (43 x 0.0625 = 2.69 us).

I would highly appreciate more information on this issue!

TCCR1B = CS10;  // Run clock with pre-scale of 1

should be

TCCR1B = 1<<CS10;  // Run clock with pre-scale of 1

With TCCR1B = 1<<CS10; , the execution time for the digitalWrite(2, HIGH); is about 52.69 us instead of 2.69 us which comes with TCCR1B = 0x01.

oqibidipo:

TCCR1B = CS10;  // Run clock with pre-scale of 1

should be

TCCR1B = 1<<CS10;  // Run clock with pre-scale of 1

Yes, I forgot that the AVR definitions give bit number, not the bit mask. CS10==0 so 1<<CS10 is correct.
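
To make the bit-number versus bit-mask distinction concrete, here is a small host-side sketch. The CS1x constants are redefined locally with the same values as the ATmega328P clock-select bit positions (CS10 = 0, CS11 = 1, CS12 = 2) so it compiles without the AVR headers; on the real board they come from avr/io.h. The helper name bitMask is hypothetical.

```cpp
#include <cstdint>

// Bit positions within TCCR1B (local copies for host-side use).
constexpr uint8_t CS10 = 0;
constexpr uint8_t CS11 = 1;
constexpr uint8_t CS12 = 2;

// Turn a bit position into a mask with only that bit set.
constexpr uint8_t bitMask(uint8_t bit) {
    return static_cast<uint8_t>(1u << bit);
}

// What "TCCR1B = CS10;" writes: the bit number 0, i.e. no clock
// source selected, so the timer stays stopped.
constexpr uint8_t wrongValue = CS10;

// What "TCCR1B = 1 << CS10;" writes: 0x01, clock with prescale 1.
constexpr uint8_t prescale1 = bitMask(CS10);

// Prescale 64 needs both CS11 and CS10 set: 0x03.
constexpr uint8_t prescale64 = bitMask(CS11) | bitMask(CS10);
```

Since CS10 is 0, writing the bit number instead of the mask clears all clock-select bits, which would explain why the timer appeared not to count in that case.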

These are some of the basic instruction times if that’s what you need.

@DKWatson

Execution time of digitalWrite(arg1, arg2); is 4.56 us in your case and 2.69 us in my case, though we both claim that TC1 is clocked at clkSYS (16 MHz), i.e. one tick per 1/16 us. We need to find the cause of the difference!

The timing of digitalWrite() depends on which pin you use:

void digitalWrite(uint8_t pin, uint8_t val)
{
        uint8_t timer = digitalPinToTimer(pin);
···
        // If the pin that support PWM output, we need to turn it off
        // before doing a digital write.
        if (timer != NOT_ON_TIMER) turnOffPWM(timer);

Hi,
Thank you all for replying. Can you tell me what time it takes to execute loops?

pAbel:
Hi,
Thank you all for replying. Can you tell me what time it takes to execute loops?

That depends on what the loops are doing.

The overhead for a loop is two clock cycles for the actual branch if it's a small loop as in less than a couple hundred hex bytes. Add to that the cost of the test condition which can vary up to four more clock cycles. If you want a really fast loop, set your counter and then count down. Example - counter = n; while(counter--); will loop until counter hits zero. This compiles to a simple decrement and a branch not zero. As the zero flag is affected by the decrement, the test is automatic. If you count up, each iteration requires the test value to be loaded then subtract the current value of count and then check the zero flag to see if they're equal.
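
A minimal host-side sketch of the two loop shapes described above, just to show that they run the same number of iterations. The function names are hypothetical, and this says nothing about cycle counts: what the compiler actually emits depends on the variable types and optimisation settings.

```cpp
#include <cstdint>

// Count-down form: the loop test is simply "was the old value non-zero".
// Returns how many times the body ran.
uint32_t countDownIterations(uint8_t n) {
    uint32_t iterations = 0;
    uint8_t counter = n;
    while (counter--) {
        ++iterations;
    }
    return iterations;
}

// Count-up form: every pass compares the counter against the limit n.
uint32_t countUpIterations(uint8_t n) {
    uint32_t iterations = 0;
    for (uint8_t counter = 0; counter < n; ++counter) {
        ++iterations;
    }
    return iterations;
}
```

Both forms execute the body n times, so the count-down version can replace the count-up one whenever the body does not need the index in ascending order.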

Example - counter = n; while(counter--); will loop until counter hits zero. This compiles to a simple decrement and a branch not zero. As the zero flag is affected by the decrement, the test is automatic.

If by "zero flag" you mean the physical Z-flag of the SREG register of the ATmega328P, then this Z-flag is not affected by the counter--; (decrement counter) statement. The following code demonstrates this:

void setup() 
{
  Serial.begin(9600);
  pinMode(13, OUTPUT);
  digitalWrite(13, LOW);
  byte counter = 0x05;
  bitClear(SREG, 1);  //Z-flag is cleared

  LX: counter--;
  if(counter != 0x00)
    goto LX;

  Serial.print(bitRead(SREG, 1), BIN); //prints 0.
  delay(10000);
  digitalWrite(13, HIGH);

}

void loop() 
{
  
}

Sorry GM, this is straight from the 328P datasheet. Clearly states that DEC triggers Z, N and V.

Capture.JPG

Sorry GM, this is straight from the 328P datasheet. Clearly states that DEC triggers Z, N and V.

DEC is the assembly instruction, and it affects the Z-flag.
counter-- is high-level (C/C++) code, which has no direct access to the bits of the SREG register.

In the following code, the decision is made by looking at the content of counter, not at the Z-flag of the SREG register!

LX: counter--;
if(counter != 0x00)
 goto LX;

I did say that it compiles to a decrement and a branch.

The processor knows nothing about high-level languages. It only understands what that gets compiled to and in the case of the 328P there are 131 instructions to choose from. What do you think it compiles to?

And BTW, try printing SREG from time to time and see what you get. You may be surprised. Make a copy of it first of course as any op may change the flags.

byte counter = 0x05;
LX: counter--;
if(counter != 0x00)
 goto LX;

After compilation, the above code takes the following intermediate assembly form (conceptual), where the content of the r17 register (the counter) is examined to decide whether to execute the loop again.

ldi   r17, 0x05
ldi  r16, 0x01
LX: sub r17, r16     //Z-flag is affected
cpi  r17, 0x00         //decision is taken by looking at the content of r17 (the counter)
brne  LX                  //branch to LX if content of r17 is not equal to zero

The high-level code is not compiled to the following intermediate assembly code, which looks at the Z-flag to make the decision:

ldi   r17, 0x05
ldi  r16, 0x01
LX: sub r17, r16     //Z-flag is affected
brbc  1, LX             //branch to LX if bit-1 (Z-flag) of SREG is cleared (not 1)

‘Conceptual’ intermediate assembly? I’m not quite sure what that means.

How about real empirical data?

volatile a

count = 0;
while(count++ <= 1000000) a++;
takes 33 clock cycles per iteration.

count = 1000000;
while(count--) a++;
takes 27 clock cycles per iteration.
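
One detail worth checking when comparing the two loops: as written, they do not run exactly the same number of iterations. The host-side sketch below counts body executions for each form. The uint32_t types and the function names are assumptions, since the post does not give the declarations, and a++ is written as a = a + 1 only to avoid compiler warnings about incrementing a volatile.

```cpp
#include <cstdint>

// Body executions of: count = 0; while (count++ <= limit) a++;
uint32_t upLoopBodyRuns(uint32_t limit) {
    volatile uint32_t a = 0;   // volatile so the loop is not optimised away
    uint32_t count = 0;
    while (count++ <= limit) a = a + 1;
    return a;
}

// Body executions of: count = start; while (count--) a++;
uint32_t downLoopBodyRuns(uint32_t start) {
    volatile uint32_t a = 0;
    uint32_t count = start;
    while (count--) a = a + 1;
    return a;
}
```

With a limit of 1000000 the count-up loop runs the body once more than the count-down loop. That is negligible for a per-iteration average, but it matters if total times are compared directly.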

It’s just shy of 1AM where I am. Goodnight.