How many clock cycles does digitalRead/Write take?

Hi,

Does anyone know how many clock cycles a digitalRead/digitalWrite takes on an Arduino board?

Sherzaad

Which Arduino? It is actually different on different boards.

The answer is either “not a lot” or “way too much”. If your program is not meeting your performance goals you may consider digitalWriteFast() which skips a few steps that are usually, but not always, unnecessary.

digitalWrite = ~50 cycles
2-3 us on an UNO

MorganS:
It is actually different on different boards.

And different cores. Teensyduino digitalWrite reduces to the smallest / fastest code for a given set of parameters.

Thanks for your replies.

I’m actually looking for the time taken by a digitalRead/digitalWrite in clock cycles, not microseconds, as I will be moving my code from an UNO (16 MHz) to a Pro Mini (8 MHz).

As my code is time critical, knowing how many clock cycles these routines take is essential for me.

Any help would be much appreciated.

He did say "about 50", which was a cycle count. Most Arduinos run at 16 MHz, so 48 cycles is 3 us...
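To put that arithmetic next to the 8 MHz Pro Mini case, here is a minimal sketch of the conversion in plain C++ (it also runs on a PC). The name cyclesToMicros is made up for illustration. It assumes the cycle count itself stays the same on both boards, which is only plausible when the same core code runs on the same AVR chip family.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Arduino core): converts a cycle
// count into microseconds for a given CPU clock in Hz.
double cyclesToMicros(uint32_t cycles, uint32_t cpuHz) {
    assert(cpuHz > 0);  // guard against division by zero
    return static_cast<double>(cycles) * 1e6 / static_cast<double>(cpuHz);
}
```

With these assumptions, cyclesToMicros(48, 16000000UL) gives 3 us and cyclesToMicros(48, 8000000UL) gives 6 us, so the same cycle count costs twice the wall-clock time at 8 MHz.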

There are lots of ways to speed this up. The processor itself can set an output bit in 1 or 2 cycles.

sherzaad:
As my code is time critical, knowing how many clock cycles these routines take is essential for me.

If it's important, I would suggest checking the timings yourself with an oscilloscope.

This will be quick for writing; I do this a lot for really fast transfers with SPI:

PORTD = PORTD & 0b11111011; // clear D2 (low), leave the rest alone

PORTD = PORTD | 0b00000100; // set D2 (high), leave the rest alone

Up to you to have this in setup:
pinMode (2, OUTPUT);

If some of the pins on port D (D7,6,5,4,3,2,1,0) are inputs then their pullup resistors may get turned on or off.

If you know the state of the pin (or don’t care), you can also just toggle it by writing to the input register:

PIND = PIND | 0b00000100;  // toggle D2
or also try this: atomic and is likely fewer instructions and faster.
PIND |= 0b00000100; // toggle D2

This should be quick for Reading:

if ((PIND & 0b00000100) == 0) {  // read all 8 bits, mask for bit 2
  d2State = 0;  // D2 is low
}
else {
  d2State = 1;  // D2 is high
}

Up to you to have this in setup:
pinMode (2, INPUT);
or
pinMode (2, INPUT_PULLUP); // internal pullup turned on

All of these ditch the checking that digitalRead and digitalWrite do to make sure registers are set up correctly.

thank you all for your suggestions.

I am aware that direct port manipulation can be faster, but I want my code to be as x-platform as possible (for Arduinos, that is) and therefore would prefer to use digitalRead and digitalWrite.

So far I understand from this thread that digitalWrite is 48 clock cycles...

what about digitalRead? any ideas...

"As my code is time critical, "
and
"I want my code to be as x-platform as possible "

seem to me to be at cross purposes, somewhat.

There are ways to look at the assembly code generated for digitalRead (Nick Gammon used to show the results), and you could estimate the number of clocks from that.

If you enable verbose output during compilation (File > Preferences), you will see this at the end of compiling:
Linking everything together...
"C:\Users\1403219114E\Documents\other\Arduino 1.8.1\hardware\tools\avr/bin/avr-gcc" -w -Os -g -flto -fuse-linker-plugin -Wl,--gc-sections -mmcu=atmega328p -o
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/sketch_jun28a.ino.elf"
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_74774\sketch\sketch_jun28a.ino.cpp.o"
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/core\core.a"
"-LC:\Users\140321~1\AppData\Local\Temp\arduino_build_747743" -lm
"C:\Users\1403219114E\Documents\other\Arduino 1.8.1\hardware\tools\avr/bin/avr-objcopy" -O ihex -j .eeprom --set-section-flags=.eeprom=alloc,load --no-change-warnings --change-section-lma .eeprom=0 "C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/sketch_jun28a.ino.elf"
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/sketch_jun28a.ino.eep"
"C:\Users\1403219114E\Documents\other\Arduino 1.8.1\hardware\tools\avr/bin/avr-objcopy" -O ihex -R .eeprom
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/sketch_jun28a.ino.elf"
"C:\Users\140321~1\AppData\Local\Temp\arduino_build_747743/sketch_jun28a.ino.hex"
Sketch uses 654 bytes (2%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.

I added some line breaks to try and improve readability a little.
I think you can run avr-objdump (it sits in the same bin folder as avr-gcc) on the .elf file to get the assembly code listing.

sherzaad:
thank you all for your suggestions.

I am aware that direct port manipulation can be faster, but I want my code to be as x-platform as possible (for Arduinos, that is) and therefore would prefer to use digitalRead and digitalWrite.

So far I understand from this thread that digitalWrite is 48 clock cycles…

what about digitalRead? any ideas…

You may need to make your code conditionally compiled, so that you get better performance on the Pro Mini but still have portable code that works on other boards.
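A minimal sketch of what such conditional compilation could look like, using pin D2 as in the earlier port-manipulation posts. The names writeD2 and fakePortD are hypothetical. The third branch is only a stand-in so the snippet can also be compiled on a PC; it is not something a real sketch needs.

```cpp
#include <stdint.h>

#if defined(__AVR_ATmega328P__)
  // UNO / Pro Mini: direct port access on a constant port and bit.
  #include <avr/io.h>
  static inline void writeD2(bool high) {
      if (high) PORTD |= (1 << 2);
      else      PORTD &= ~(1 << 2);
  }
#elif defined(ARDUINO)
  // Any other Arduino board: fall back to the portable call.
  #include <Arduino.h>
  static inline void writeD2(bool high) {
      digitalWrite(2, high ? HIGH : LOW);
  }
#else
  // PC build (assumption, for experimenting only): a fake register.
  static uint8_t fakePortD = 0;
  static inline void writeD2(bool high) {
      if (high) fakePortD = static_cast<uint8_t>(fakePortD | 0x04u);
      else      fakePortD = static_cast<uint8_t>(fakePortD & ~0x04u);
  }
#endif
```

The rest of the sketch only ever calls writeD2(), so the board-specific part stays in one place. pinMode(2, OUTPUT) is still needed in setup() on real hardware.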

You can measure the performance of digitalWrite in a timing loop if you want. Code that has externally visible effects won’t be optimized away:

  unsigned long before = micros () ;
  for (byte i = 0 ; i < 100 ; i++)
  {
    digitalWrite (pin, HIGH) ;
    digitalWrite (pin, LOW) ;
    digitalWrite (pin, HIGH) ;
    digitalWrite (pin, LOW) ;
    digitalWrite (pin, HIGH) ;
    digitalWrite (pin, LOW) ;
    digitalWrite (pin, HIGH) ;
    digitalWrite (pin, LOW) ;
    digitalWrite (pin, HIGH) ;
    digitalWrite (pin, LOW) ;
  }  // ten-fold unrolled, so 1000 calls are timed
  Serial.println (micros() - before) ;
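To turn the printed microsecond figure into cycles per call, something like the following could be used. It is a sketch in plain C++ and cyclesPerCall is a made-up name. The result still includes the loop overhead, and micros() has a resolution of 4 us on 16 MHz boards, so treat it as an upper estimate.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: average CPU cycles per call, given the elapsed
// time in microseconds, the CPU clock in Hz and the number of timed calls.
double cyclesPerCall(uint32_t elapsedMicros, uint32_t cpuHz, uint32_t calls) {
    assert(calls > 0);  // guard against division by zero
    const double cyclesPerMicro = cpuHz / 1e6;  // e.g. 16 at 16 MHz
    return static_cast<double>(elapsedMicros) * cyclesPerMicro / calls;
}
```

As an illustration only (not a measured value): if the loop above printed 3000 on a 16 MHz board, cyclesPerCall(3000, 16000000UL, 1000) would work out to 48 cycles per digitalWrite.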

(1) Results from the experiment setup of Section (3), using an Arduino UNO R3:
//------------------------------------------------------------------------------------------------------------
(a) Execution time of digitalWrite(12, HIGH); instruction with 2.2k load/no-load at the Port-pin is: 59 clock cycles of 16 MHz clkSYS (59 x 1/16000000 = 3.6875 us).

(b) Execution time of digitalWrite(12, LOW); instruction with 2.2k load/no-load at the Port-pin is: 61 clock cycles of 16 MHz clkSYS (61 x 1/16000000 = 3.8125 us).

//-----------------------------------------------------------------------------------------------------------
(c) Execution time of digitalRead(9); instruction (with internal pull-up enabled) is:
50 clock cycles of 16 MHz clkSYS (50 x 1/16000000 = 3.125 us).

(2) Experiment Methodology:
(a) Just before executing digitalWrite(12, HIGH/LOW), we start Timer-1 (T1) so that it counts clkSYS (16 MHz) pulses. Right after the call returns, we stop Timer-1.

(b) The Timer-1 count therefore equals the number of clkSYS cycles that digitalWrite(12, HIGH/LOW) spent executing (plus the few cycles needed to stop the timer).

(c) We then read TCNT1 and display its value on the LCD.

(d) The execution time of digitalRead(9) is found in the same way.

(3) Experiment Setup using Arduino UNO R3

(4) Arduino Program Codes (Prog1.ino)

#include <LiquidCrystal.h>
//LiquidCrystal lcd(RS, E, D4, D5, D6, D7);
LiquidCrystal lcd(5, A0, A1, A2, A3, A4);


void  setup()
{
  lcd.begin(16, 2);
  pinMode(12, OUTPUT);
  pinMode(9, INPUT_PULLUP);
  //Timer-1 as an internal pulse counter----
  TCCR1A = 0x00;   //Normal UpCounting Mode
  TCCR1B = 0x00;  //T1 is OFF
  TCNT1 = 0x00;   //Initial Value
  
  TCCR1B = 0x01;  //T1 ON; running at 16 MHz
  digitalWrite(12, HIGH);
  //digitalWrite(12, LOW);
 
  //digitalRead(9);

  TCCR1B = 0x00;    //Timer-1 OFF
  lcd.setCursor(0, 0);  //cursor position
  lcd.print(TCNT1, 10); //show Counter-1's content
}

void loop()
{
    
}


CrossRoads:
If you know the state of the pin (or don't care), you can also just toggle it by writing to the input register:

PIND = PIND | 0b00000100;  // toggle D2

or also try this: atomic and is likely fewer instructions and faster.
PIND |= 0b00000100; // toggle D2

Your statements toggle the pullup on input pins that are HIGH and clear any HIGH output pins.

PIND = 0b00000100;  // toggle D2

It looks counterintuitive, but works without touching the other bits in PORTD.

sherzaad:
I am aware that direct port manipulation can be faster, but I want my code to be as x-platform as possible (for Arduinos, that is) and therefore would prefer to use digitalRead and digitalWrite.

On other boards, digitalWrite might take another number of cycles.

PIND |= 0b00000100; // toggle D2

Your statements toggle the pullup on input pins that are HIGH and clear any HIGH output pins.

No, it doesn't. Careful analysis might lead you to THINK that it does (it's a read-modify-write statement on the whole port, right? So bits that read as ones will toggle the PORTD values!). But the compiler carefully generates an SBI or CBI instruction, which apparently doesn't behave this way. (Try it! Sketch attached.)

On the other hand, "PIND |= variableBitMask;" will probably behave as you describe. And "PINH |= 0b100;" or even "PINB |= 0b101;" will as well.

"I want my code to be as x-platform as possible "

On the third hand, the PINx "complement" feature isn't present across all AVRs; not even all the ones that have been used in official Arduinos (ATmega8 doesn't do it.) So it wouldn't be a good choice for highly portable code.

The ARM platforms have drastically different timings, with much smaller gains from optimizing for special cases.

digitalWrite() on AVR does approximately:

 portinfo = translate(pinNum);
 if portinfo.timer
    turnOffPWM(portinfo)
 disableInterrupts
 temp = portinfo.port
 if val ==  LOW
    temp &= ~portinfo.bit
 else
    temp |= portinfo.bit
 portinfo.port = temp
 restoreInterrupts

digitalRead is very similar
Note that this means that exact timing is dependent on which pin you access (whether it's associated with a timer.)
Short of "constantPORT |= constantBIT;" (which is a single instruction on AVR, for low-numbered ports), there is a "typical" speedup mechanism that consists of caching the "portinfo" in some other object (and ignoring the timer because the code can assume it was fixed at initialization time.)
Note that while the logic on ARM chips is VERY similar, some of the details are different. Also, ARM lacks "magic IO instructions", so there is never a one-instruction optimization (even with constants.)
Note that some platforms (notably "Teensy" from PJRC) have VERY different implementations that are faster than the Arduino core implementation (even on AVR-based Teensies!)
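The "cache the portinfo" speedup mentioned above could look roughly like this. It is a sketch, not the code of any particular library, and CachedPin is a hypothetical name. It assumes an 8-bit port register as on AVR. Unlike digitalWrite(), it neither turns off PWM nor disables interrupts around the read-modify-write.

```cpp
#include <stdint.h>

// Hypothetical helper: stores the register address and bit mask once,
// instead of translating the pin number on every call.
struct CachedPin {
    volatile uint8_t* reg;  // output register of the pin's port
    uint8_t mask;           // bit of the pin within that port

    void high() const { *reg = static_cast<uint8_t>(*reg | mask); }
    void low()  const { *reg = static_cast<uint8_t>(*reg & ~mask); }
};
```

On the AVR cores the two fields would typically be filled in once in setup(), for example from portOutputRegister(digitalPinToPort(pin)) and digitalPinToBitMask(pin). Other cores use different register widths, so the types here would need adjusting.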

OK, when it's only one bit that is to be toggled, the compiler removes the logic bug by accident, but only on PINx addresses that allow SBI. I'm not sure whether all PINx registers qualify.

Extend the method to more than one bit and you will get exactly what you asked for, with no accidental fix.

And why should you write something that you do not really want to happen?

Just write a one to the bits you want to toggle, without any oring.

westfw:
But the compiler carefully generates an SBI or CBI instruction, which apparently doesn’t behave this way.

Only when possible. The PIN registers are not always in an address range that allows SBI / CBI instructions.

The safe / correct code excludes the bitwise-or.

(Bear in mind the forum’s intended audience.)

See #12 - the correct way to toggle is:

  PIND = 0b00000100 ;

which is a single write to the PIND reg. It must be a single write of a mask containing the pins to toggle. There is special hardware that interprets a write to PINx as meaning "toggle the relevant bits of PORTx".

Trying to do

  PIND |= 0b00000100 ;

is crazy and could toggle any pin on port D that happens to be an output and is currently HIGH.
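The difference between the two forms can be tried out on a PC with a small model of the port. This is a simplified sketch under stated assumptions: every pin is an output (so reading PINx returns the PORTx latch), and the |= form is compiled as load, OR, store rather than as a single SBI. PortModel and both function names are made up for illustration.

```cpp
#include <cstdint>

// Simplified model of one AVR port. Assumption: all pins are outputs,
// so a read of PINx returns the current PORTx latch value.
struct PortModel {
    uint8_t port = 0;  // PORTx output latch

    uint8_t readPin() const { return port; }

    // A write to PINx toggles every PORTx bit whose written bit is 1.
    void writePin(uint8_t value) { port = static_cast<uint8_t>(port ^ value); }
};

// Models "PINx = mask;": one plain write of the mask.
void toggleWithPlainWrite(PortModel& p, uint8_t mask) {
    p.writePin(mask);
}

// Models "PINx |= mask;" when compiled as load, OR, store.
void toggleWithReadModifyWrite(PortModel& p, uint8_t mask) {
    p.writePin(static_cast<uint8_t>(p.readPin() | mask));
}
```

Starting with only D7 HIGH (0b10000000) and a mask for D2, the plain write leaves D7 alone and sets D2, while the read-modify-write form also writes a 1 to D7's bit and so flips D7 LOW.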

PIND = 0b00000100 ;
which is a single write to the PIND reg. It must be a single write of a mask containing the pins to toggle. There is special hardware that interprets a write to PINx as meaning "toggle the relevant bits of PORTx".

I was a bit confused at first sight of PIND = 0b00000100, wondering how we can write data into a port that has been configured to work as an input.

After reading the next two lines, I have found something that is really interesting.

My comments:
(1) I assume that Port-D has been configured to work as input port. If so, Port-D has now taken over the name PIND having unique address and marked as 'Read Only.'

(2) There is a pseudo PORTD register, but it is disconnected from the pins of the PIND register via G2 (Fig-1). (Why do I call it a pseudo port? Because Port-D takes over the name PORTD register when Port-D is configured to work as an output port. Here, we have no output port, but we need PORTD to manipulate the internal pull-up resistors (Fig-1).)

Figure-1: Internal structure of digital port-pin of ATmega328

(3) I have experimentally verified that the execution of the instruction PIND = 0b00000100 does toggle the output of the latch FF2 (Fig-1). This toggling, in fact, affects the internal pull-up resistor associated with PIND2-pin.

(4) Which one of the following is a recommended way to affect the internal pull-up of PIND-2 pin?
(i) pinMode(2, INPUT);
(ii) pinMode(2, INPUT_PULLUP);
(iii) PIND = 0b00000100; //looks as if we are trying to modify data that has arrived
//from an external source. In fact, we cannot: the value written
//(0b00000100) does not get stored in PIND.
//It is applied to the FF2 latch (Fig-1).

(iv) DDRD = 0b11111011; //PIND2 is input
PORTD = 0b00000100; //enables internal pull-up
bitClear(MCUCR, 4); // clear PUD (bit 4 of MCUCR, the global pull-up disable bit) so pull-ups stay enabled

(v) bitWrite(PORTD, 2, !bitRead(PORTD, 2)); //read-modify-write (toggling)

Thanks for the meticulous observation!

Table 14-1. Port Pin Configurations

DDxn  PORTxn  PUD (in MCUCR)  I/O     Pull-up  Comment
0     0       X               Input   No       Tri-state (Hi-Z)
0     1       0               Input   Yes      Pxn will source current if ext. pulled low.
0     1       1               Input   No       Tri-state (Hi-Z)
1     0       X               Output  No       Output Low (Sink)
1     1       X               Output  No       Output High (Source)
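The table can be read as a small decision function. The following is a sketch in plain C++ that mirrors the five rows. PinConfig and decodePin are hypothetical names, and the three arguments stand for the DDxn, PORTxn and PUD bits of one pin.

```cpp
#include <cstdint>

// The four distinct pin states that appear in Table 14-1.
enum class PinConfig { InputHiZ, InputPullUp, OutputLow, OutputHigh };

// Hypothetical helper: maps the DDxn, PORTxn and PUD bits to a pin state.
PinConfig decodePin(bool ddr, bool port, bool pud) {
    if (ddr) {
        // Output: PUD is a don't-care, PORTxn selects the level.
        return port ? PinConfig::OutputHigh : PinConfig::OutputLow;
    }
    // Input: the pull-up is on only if PORTxn = 1 and PUD = 0.
    if (port && !pud) {
        return PinConfig::InputPullUp;
    }
    return PinConfig::InputHiZ;
}
```

For example, decodePin(false, true, false) corresponds to the second row (input with pull-up), which is the state pinMode(2, INPUT_PULLUP) aims for.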