How exactly slow is digitalRead()?

OK, we all know that Arduino sacrifices speed for simplicity and portability, when it comes to reading and writing pin states. I know I can circumvent this with direct port manipulation, but I want to know if I can avoid this to maintain portability to different microcontrollers.

So what I would like to know is exactly how slow digitalRead() is. I can't measure this myself, so I tried to look up numbers on the web and came up with different figures. I know that it can take a different amount of time depending on whether the pin has "recently" been used as PWM or not, so let's suppose a simple case where I set a pin as INPUT and just call digitalRead() on it from time to time. Some sources I found state that it takes ~1 us, while others say ~4 us.

Who is right? A figure in the us range will be enough for me.

Depends on the crystal frequency. And how many instructions it takes in assembly to read a pin. You could check on avr forums.

Let's take a standard Arduino Uno, so clocked at 16 MHz and using the Arduino libraries.

BTW, I have just noticed the stickied I/O Benchmarks post, which says for the Uno:

Digital Pin Read Takes About 4.78 Microseconds.

So it seems I have my answer: almost 5 us! Sorry I didn't notice that before :confused:.

An alternative is to obtain the port and bit mask with some macros. I think these macros are supported in the various cores. It's slightly slower than direct port access because of the architecture of the AVR chips, but it's a lot faster than digitalRead.

    #define MY_PIN    8

    // do this once at setup
    uint8_t myPin_mask = digitalPinToBitMask(MY_PIN);
    volatile uint8_t *myPin_port = portInputRegister(digitalPinToPort(MY_PIN));

    // read the pin
    uint8_t pinValue = (*myPin_port & myPin_mask) != 0;

There's no doubt a nicer way to dress this up.
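
One possible way to dress it up is a small struct that caches the register pointer and the mask. This is only a sketch: `FastInputPin` and `fakeInputRegister` are made-up names, and the fake register is there only so the example also builds on a PC.

```cpp
#include <stdint.h>

// Caches the input register address and bit mask for one pin, so each read
// is one load through the pointer plus an AND. (Hypothetical helper.)
struct FastInputPin {
    volatile const uint8_t *reg;  // address of the port's input register
    uint8_t mask;                 // single-bit mask for the pin

    // True when the pin's bit is set in the input register.
    bool read() const { return (*reg & mask) != 0; }
};

// Stand-in for a hardware input register, used only for testing on a PC.
volatile uint8_t fakeInputRegister = 0;
```

On an Arduino the struct would presumably be filled in once in setup(), for example `FastInputPin myPin = { portInputRegister(digitalPinToPort(MY_PIN)), digitalPinToBitMask(MY_PIN) };`, and then read with `myPin.read()`.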

digitalWrite() is also slow. You can do the same thing there, but with portOutputRegister() instead.

if you know the pin number at compile time you can optimize a lot.
if the pin number is not known ...

Check the thread "digitalWriteFast, digitalReadFast, pinModeFast etc" (Bugs & Suggestions, Arduino Forum).

A single I/O instruction is 0.125us. The ATmega chip is optimized for single I/O bit instructions.
A digitalRead() is about 3.6us. That is without the time for the iteration/loop. The 4.78us is with the iteration/loop included.

Using digitalPinToPort() and the related macros will increase the speed a lot. It is not as fast as 0.125us, because a few variables have to be read from memory.
The digital...Fast functions are more or less portable. For full portability you are stuck with that 3.6us.

Thanks a lot, Peter, that's what I needed to know. One more question though: all those timings are going to double when running at 8 MHz, right?

Naturally.

The complication is of course, that digitalRead() includes the option of using a variable to select the port. That includes the overhead of performing a bit shift according to the variable value (essentially a FOR loop) on the mask bit which is used to select the bit. Then it must perform a zero test in order to determine which value - TRUE or FALSE - to return.

All this is easy but not simple, whereas if you know at compile time exactly which bit you want, and are happy to cope with a zero/non-zero result rather than a strict TRUE or FALSE, it can be programmed in assembler at lightning (sort of) speed. The digitalRead() function is almost like using an interpreter.
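
A rough PC-side model of the difference, not the actual core source. All names and table contents here are invented, and the model uses lookup tables to turn the pin number into a port and mask, where the post above describes a shift. As far as I know the real digitalRead() also checks whether the pin is attached to a PWM timer, which is left out here.

```cpp
#include <stdint.h>

const uint8_t kPinCount = 4;

// Two fake input registers and hypothetical pin-to-port / pin-to-mask tables.
volatile uint8_t modelInputRegs[2] = {0, 0};
const uint8_t modelPinToPort[kPinCount] = {0, 0, 1, 1};
const uint8_t modelPinToMask[kPinCount] = {0x01, 0x02, 0x01, 0x80};

// Run-time version: range check, two table lookups, an indexed load,
// then the result is normalised to exactly 0 or 1.
int modelDigitalRead(uint8_t pin) {
    if (pin >= kPinCount) return 0;
    uint8_t port = modelPinToPort[pin];
    uint8_t mask = modelPinToMask[pin];
    return (modelInputRegs[port] & mask) ? 1 : 0;
}

// Compile-time version for pin 3: register and mask are constants and the
// caller gets a zero / non-zero value instead of a strict 0 or 1.
inline uint8_t readPin3Raw() {
    return modelInputRegs[1] & 0x80;
}
```

The second function leaves the compiler almost nothing to do at run time, which is the point being made above.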

Thanks again. Yes, I understand why it is slow and the various ways readings can be sped up, but before going that way I wanted to make sure it was absolutely necessary.

Indeed, it is. I have a signal that I must sample on a 12 us clock. Of course I use an interrupt: it seems it takes about 3 us to enter the handler and 2 more to leave it. If running at 8 MHz, like I plan, that doubles and doesn't leave much time for the actual job :(.

The benchmark sticky you quoted is not perfect, but I figured you're normally going to want to act on the result of a read, so the for-loop overhead sort of makes up for this.

To sample at that speed I would consider a faster MCU/clock speed as you won't have much time to do anything else. If you supply more details of what you need/want then maybe we can suggest options.

Super extra awesome interrupt tutorial : Gammon Forum : Electronics : Microprocessors : Interrupts
There is a section: "How long does it take to execute an ISR?"
With 8MHz, you have to double the numbers :frowning:

Peter_n:
A single I/O instruction is 0.125us.

Isn't it just one cycle (62.5ns at 16MHz) for reading/writing one of the GPIO ports?

Try it !
Perhaps the other cycle is loading the constant for a byte to read/write to the port.

Okay. I tried it. It took one cycle to read PINB.

In the context of the rest of a sketch of course there will be overhead. You can't just read a value into a register and call it a day. But if you do it differently, for example by using a pointer to the port instead of an immediate value, it will take more cycles.

Thanks for testing that 8)
The two cycles are probably not for loading the constant for the bit, but for storing the value into a variable.

It depends, doesn't it? In isolation it's just one cycle. And if you do the same thing but address the port indirectly instead it's going to take longer, at least with an AVR processor.

I have some code that I originally wrote to use pin numbers defined in the library header file. That way the port accesses would be via immediate values. But in the Arduino environment, having values in the header files makes it hard to go from one sketch to the next where the pins might be different. So I rewrote it to use values passed in the constructor. That ultimately meant that the ports were addressed indirectly, even though technically the pin numbers were known at compile time -- just not library compile time. And it affected the performance.
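
One way to keep the pin choice in the sketch while still making it a compile-time constant is to pass it as a template argument instead of a constructor argument. This is only a sketch of the idea with invented names (`RuntimePin`, `StaticPin`, `fakePinB`), using an ordinary variable as a stand-in for the register so it builds on a PC. I have not checked whether avr-gcc accepts the address of a real register macro as a template argument; it may be necessary to pass the address as an integer or template on the pin number instead.

```cpp
#include <stdint.h>

// Stand-in for a hardware input register.
volatile uint8_t fakePinB = 0;

// Pin details passed to the constructor: every read goes through the
// pointer stored in the object.
class RuntimePin {
public:
    RuntimePin(volatile uint8_t *reg, uint8_t mask) : reg_(reg), mask_(mask) {}
    bool read() const { return (*reg_ & mask_) != 0; }
private:
    volatile uint8_t *reg_;
    uint8_t mask_;
};

// Pin details passed as template arguments: address and mask are constants
// at the point of use, so the compiler is free to address the register
// directly.
template <volatile uint8_t *Reg, uint8_t Mask>
struct StaticPin {
    static bool read() { return (*Reg & Mask) != 0; }
};
```

In the sketch one would then write something like `typedef StaticPin<&fakePinB, 0x01> ClockPin;` and call `ClockPin::read()`, with no pin values in the library header.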

Riva:
To sample at that speed I would consider a faster MCU/clock speed as you won't have much time to do anything else. If you supply more details of what you need/want then maybe we can suggest options.

Sorry for the delay, but I have some news!

First of all, what I'm doing is snooping the NES controller protocol (which is based on a CD4021 shift register as probably everybody knows, see here: http://www.mit.edu/~tarvizo/nes-controller.html). I'm not just trying to drive a pad connected to the Arduino; rather, I'm intercepting the signals while the console is running, and they're pretty fast.

I have no problems doing that on a 16 MHz Arduino, even digitalRead() works fine (provided that I use an interrupt to detect clock transitions), but my final objective is to create a sort of modchip to install in the console, so the fewer components, the better. This means that I would happily avoid the crystal, so I am trying to get things running fine at 8 MHz. It would be even better to use a smaller MCU in the end, my ideal target being the ATtiny84.

But at 8 MHz detection isn't reliable. I have tried different strategies (direct port manipulation, register variables, different interrupts, etc.), but it seems that the code is always too slow and lags behind the strict timing required. The best result I could get was reading only the first 4 buttons correctly (out of 8).

But yesterday I discovered SPI! I spent some time with it and got it working easily on the '328. I used the much neglected (on the Arduino) slave mode, since everything is being driven by the console. Then I switched to the '84 and I found out that there's no ready-made library and that it doesn't have real hardware SPI, then I found Atmel's Technical Note about USI and managed to get something working on the '84 as well. It's not complete yet, and I'm not sure it will work 100% (it's hard to debug without a Serial Console), but results are promising so far.
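
For reference, this is roughly what the shift hardware ends up doing for me, written as plain C++ so it can be tried on a PC. The names are invented, and it assumes the bits arrive MSB first in the usual NES order (A, B, Select, Start, Up, Down, Left, Right) with a low data line meaning "pressed".

```cpp
#include <stdint.h>

// Button order on the wire, first bit first. (Hypothetical enum.)
enum NesButton {
    NES_A, NES_B, NES_SELECT, NES_START,
    NES_UP, NES_DOWN, NES_LEFT, NES_RIGHT
};

// Shift one sampled data-line level into the accumulator, MSB first,
// the way a hardware shift register would.
inline uint8_t shiftInBit(uint8_t acc, bool dataLineHigh) {
    return (uint8_t)((acc << 1) | (dataLineHigh ? 1 : 0));
}

// After 8 bits, button b sits at bit (7 - b). A 0 bit means pressed.
inline bool isPressed(uint8_t frame, NesButton b) {
    return ((frame >> (7 - b)) & 1) == 0;
}
```

With hardware SPI or USI the eight calls to `shiftInBit()` disappear, and only the decoding of the received byte is left for the CPU.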

So I guess we are a bit off topic by now, but anyway... Any hints?

The tiny85 has a PLL clock mode so you can run it at something around 16MHz with no external xtal. Maybe it will do the job or another similar mode chip?

Interesting! But the '84 doesn't have that, right? Too bad, I already had in mind a version for the '85 but I don't think I will be able to squeeze all the features in :confused:.

What do you mean with "another similar mode chip"?

The ATtiny84 doesn't have that PLL, but you can rev up the internal oscillator to around 14MHz, give or take:

The Tiny chips also lack a MUL instruction. Depending on what you're doing that can have a noticeable effect on execution time.