digitalWrite in question ?

max777 · July 19, 2019, 6:49am

Hi

This is about SPI communication.

On the linked screenshot you can see a long delay (about 7 µs) between the end of the transmission and CS going high.

The code is below (regWrite function), and the CSN_HIGH macro is:

#define CSN_HIGH digitalWrite(csnPin,HIGH);

There is no delay between two calls to SPI.transfer, and I can't imagine how digitalWrite could have a latency of 7 µs.

Uno board at 16 MHz; SPI speed 4 MHz.

A look at SPI.h doesn't explain it either:

// Write to the SPI bus (MOSI pin) and also receive (MISO pin)
inline static uint8_t transfer(uint8_t data) {
  SPDR = data;
  /*
   * The following NOP introduces a small delay that can prevent the wait
   * loop from iterating when running at the maximum speed. This gives
   * about 10% more speed, even if it seems counter-intuitive. At lower
   * speeds it is unnoticed.
   */
  asm volatile("nop");
  while (!(SPSR & _BV(SPIF))) ; // wait
  return SPDR;
}

Any ideas?

Robin2 · July 19, 2019, 11:11am

max777:
can't imagine how digitalWrite could have a latency of 7 µs.

The digitalWrite() function is slow because it has to figure out which pin on which I/O port needs to be changed.

There is a digitalWriteFast library which is very much quicker - but it requires the I/O pin to be known at compile time.

...R
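To illustrate why knowing the pin at compile time helps, here is a minimal sketch of the idea. It is not the digitalWriteFast library's actual code; `Port`, `pinToPort` and `pinToMask` are hypothetical names, and the mapping is the one for the Uno (ATmega328P). With a literal pin number the compiler can fold both lookups into constants, whereas digitalWrite() has to do them at run time.

```cpp
#include <stdint.h>

// Uno (ATmega328P) mapping: D0-D7 -> port D, D8-D13 -> port B,
// D14-D19 (A0-A5) -> port C. Names below are illustrative only.
enum class Port : uint8_t { B, C, D };

// Which I/O port a given Arduino pin number belongs to.
constexpr Port pinToPort(uint8_t pin) {
  return pin < 8 ? Port::D : (pin < 14 ? Port::B : Port::C);
}

// Bit mask of the pin inside its port register.
constexpr uint8_t pinToMask(uint8_t pin) {
  return static_cast<uint8_t>(
      1u << (pin < 8 ? pin : (pin < 14 ? pin - 8 : pin - 14)));
}

// With a constant pin, everything is resolved before the program runs.
static_assert(pinToPort(10) == Port::B, "D10 is on port B");
static_assert(pinToMask(10) == 0x04, "D10 is bit 2 of port B");
```

If the pin is held in a variable such as csnPin, none of this can be folded at compile time, and the lookups have to happen on every call.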

Koepel · July 19, 2019, 11:17am

You could measure the time taken by digitalWrite().
You can use my millis_measure_time.ino sketch and replace analogWrite() with digitalWrite() in three places.
I did not test it myself, but I know from the past that it takes about 5 µs.

The ATmega328P needs only 0.125 µs to change a pin. The rest of the 5 µs is Arduino overhead.
The SPI code uses the SPI hardware of the ATmega328P and even uses "inline" code to control the hardware, so it has very little overhead.

So everything seems just as it should be. If you want to know exactly what is going on, then you should read the assembly code that is generated by the compiler.
If you change something in other parts of the sketch, then the compiler could decide to optimize this code in a different way, and that could change the timings.
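As a rough worked example of the figures above, the conversion between clock cycles and time can be sketched as follows. The helper names are made up for illustration, and a 16 MHz clock is assumed as on the Uno: 0.125 µs corresponds to 2 clock cycles, and 5 µs to about 80 cycles.

```cpp
#include <stdint.h>

// Time in microseconds taken by a number of CPU clock cycles.
constexpr double cyclesToMicros(uint32_t cycles, uint32_t cpuHz) {
  return cycles * 1e6 / cpuHz;
}

// Number of CPU clock cycles that fit into a time in microseconds.
constexpr double microsToCycles(double us, uint32_t cpuHz) {
  return us * (cpuHz / 1e6);
}

// Helper for comparing floating-point results.
constexpr bool roughly(double a, double b) {
  return a - b < 1e-6 && b - a < 1e-6;
}

// Changing a pin directly: 2 cycles at 16 MHz.
static_assert(roughly(cyclesToMicros(2, 16000000UL), 0.125), "2 cycles");
// A 5 µs digitalWrite() spends about 80 cycles.
static_assert(roughly(microsToCycles(5.0, 16000000UL), 80.0), "5 us");
```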

max777 · July 20, 2019, 10:48am

Hi !

@Robin2

Thx, I understand what you mean, but...

Since 7 µs is quite a long time for a 16 MHz MCU to fetch 2 addresses and read/modify/write a byte, I spent a few minutes writing the lines below to evaluate the duration.

The function digitalSet() seems to do the same operations as digitalWrite(). digitalSet() is not "pure" assembly language, so it should be far slower than digitalWrite().

That's not the case:

max777 · July 20, 2019, 10:58am

(Oops, typo.)

I get about 8 µs for digitalWrite() and 5 µs for digitalSet()...
digitalWrite() is surely doing very complicated things...

@Koepel

Thx, the Arduino IDE has secrets.

Here I see assembly code that is slower than compiled C++. I had a similar issue with the power-up time on the ESP66, which is far too long to allow real low-power applications.

++

max777 · July 20, 2019, 11:04am

Hmm, I forgot the "lines below":

#define NBPORTS 3
#define NBPPPORT 8
#define NBPIN   (NBPORTS*NBPPPORT)

uint8_t ports[NBPORTS];
byte mask[]={0x01,0x02,0x04,0x08,0x10,0x20,0x40,0x80};

uint8_t v=0,l=0;

uint8_t pin=6;

void setup() {
 Serial.begin(115200);
 Serial.println("ready");
}

void digitalSet(uint8_t pin,bool level)
{
 byte scratchPort;
 // read byte
 scratchPort=ports[pin/NBPPPORT];
 // modify byte
 scratchPort &= ~mask[pin%NBPPPORT];
 if(level){scratchPort |= mask[pin%NBPPPORT];}
 // write it
 ports[pin/NBPPPORT]=scratchPort;
}



char getch()
{
 if(Serial.available()){
   return Serial.read();
 }
 return 0;
}


void loop() {

// show pins
for(int i=0;i<NBPORTS;i++){
 for(int j=0;j<NBPPPORT;j++){
   Serial.print((ports[i]>>(j))&0x01);Serial.print(" ");
}}Serial.println();

// get new value
v=0;l=0;
Serial.print("enter pin number (0-9)");
while(v==0){v=getch();}v-=48;
Serial.println(v);
Serial.print("level (0-1) ");
while(l==0){l=getch();}l-=48;
Serial.println(l);

// get time for 100 loop then
// set new value 100 times
 unsigned long time_beg=micros();
 char a=1;
 for(uint8_t k=0;k<100;k++){
   //digitalSet(v,l); 
   a=(a>>(k&7))+1;   // k&7 keeps the shift count in range
 }
 unsigned long offset=micros()-time_beg;
Serial.print(a);Serial.print(" offset time=");Serial.println(offset);

 time_beg=micros();
 for(uint8_t k=0;k<100;k++){
   digitalSet(v,l); 
   //digitalWrite(pin,HIGH);
   a=(a>>(k&7))+2;
 }
 // read the end time once, before printing, so Serial time is not measured
 unsigned long total=micros()-time_beg;
Serial.print(a);Serial.print("  total time=");Serial.println(total);
Serial.print(a);Serial.print(" durat=");Serial.println(((float)total-(float)offset)/100);
}

Koepel · July 20, 2019, 1:49pm

Arduino is doing "things", not complicated things.
This is the digitalWrite() for the AVR boards: ArduinoCore-avr/wiring_digital.c at master · arduino/ArduinoCore-avr · GitHub.

8 µs is suspicious, because the micros() function has a resolution of 4 µs on an Arduino Uno.
It turns out that there was a flaw in my sketch; I will try to put a new version on GitHub later on.
Below is an accurate test.
I think digitalWrite() has been improved, because I measure 3.2 µs for digitalWrite().

// Test how long digitalWrite is with an Arduino Uno
// Timer0 is used by Arduino and that is still running.

// The macro X1000 executes something 1000 times.
#define X10(a)      a;a;a;a;a;a;a;a;a;a
#define X1000(a)    X10(X10(X10(a)))

// Using global or local variables or constants make a difference for the time.
const byte pin = 13;
byte level = HIGH;

void setup()
{
  Serial.begin( 9600);
}

void loop()
{
  unsigned long t1, t2, result;

  t1 = micros();
  X1000(digitalWrite( pin, level));       // The function under test is called 1000 times
  t2 = micros();
  result = t2 - t1;

  // 'result' is the total in µs for 1000 calls, which is the same number as ns per call.
  Serial.print( result);
  Serial.println( " ns per call");

  delay( 500);
}

max777 · July 20, 2019, 2:56pm

Suspicious, you said...

I updated to 1.8.9 and saw absolutely no change...

Can you have a look at the logic analyser window (attached to my first post)?
About 1 µs from the end of one transmission to the start of the next. I guess the delay from the end of SPI.transfer to the beginning of digitalWrite() is no different.
About 7 µs from the end of transmission to CS high, so digitalWrite takes "at least" 6 µs plus the time to "return" after setting CS. 8 µs does not seem so suspicious...

If there is a way to get down to your 3.2 µs, I'm very interested.

Sometimes compilers optimize repeated sequences; that's why I put variable code inside the loop.

max777 · July 20, 2019, 3:17pm

I'm very sorry, I didn't see the code for digitalWrite...

So many function calls mean so much time... I will write my own digitalWrite().

With digitalWriteFast I get a little less than 7 µs...

thx

Koepel · July 20, 2019, 3:58pm

You can't see this?
ArduinoCore-avr/wiring_digital.c at master · arduino/ArduinoCore-avr · GitHub.

That is the open source code of digitalWrite(). The highlighted line is at the top.

The AVR microcontrollers have a special assembly instruction to change an output pin. That is "direct port manipulation", and it takes 125 ns. You can use it in your code, or use digitalWriteFast, which tries to do the same when everything is known at compile time.

The Arduino macros bitSet() and bitClear() can be used with a register and are translated into the 125 ns direct port manipulation.
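A minimal sketch of what this looks like for a chip-select line, assuming CSN is on Uno pin D10 (port B, bit 2). The names csnInit, csnLow and csnHigh are made up here, and the stand-in variables in the non-AVR branch exist only so the code also compiles off-target. With a constant register and bit, avr-gcc normally reduces each of these one-liners to a single sbi or cbi instruction; bitSet(PORTB, 2) and bitClear(PORTB, 2) expand to the same kind of expression.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real PORTB / DDRB registers on the ATmega328P
#else
// Host stand-ins (assumption, for illustration only).
static volatile uint8_t PORTB = 0;
static volatile uint8_t DDRB = 0;
#endif

// Uno D10 is port B, bit 2. Change both lines if CSN is on another pin.
constexpr uint8_t CSN_BIT = 2;
constexpr uint8_t CSN_MASK = 1u << CSN_BIT;

// Drive the level high first, then make the pin an output,
// so CSN does not glitch low during setup.
inline void csnInit() {
  PORTB |= CSN_MASK;
  DDRB |= CSN_MASK;
}

// Select the SPI slave (CSN low).
inline void csnLow() { PORTB &= static_cast<uint8_t>(~CSN_MASK); }

// Deselect the SPI slave (CSN high).
inline void csnHigh() { PORTB |= CSN_MASK; }
```

In a register-write routine this would replace the CSN_HIGH / CSN_LOW macros that call digitalWrite(): csnLow(), then the SPI.transfer() calls, then csnHigh().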

The SPI.transfer() function has to return, and the Arduino digitalWrite() changes the pin at the end of the function. Perhaps there is an extra microsecond for the variable 'csnPin', and then there are still 2 µs that I can't account for. Perhaps an extra microsecond comes from the Arduino build optimizing for size instead of speed. The remaining microsecond is within the margin of error.

Robin2 · July 20, 2019, 4:28pm

max777:
With digitalWriteFast I get a little less than 7 µs...

If it only appears to be slightly faster than the regular digitalWrite() then there is something wrong with your test procedure.

Are you timing (say) 10,000 repeats of the function call so as to eliminate, or at least minimise, short term timing errors?

...R

WattsThat · July 20, 2019, 5:16pm

Why all the hand wringing over a couple of microseconds at the end of a library function? Either tell us why ~4 microseconds really matters or it’s just an academic exercise.

If those few microseconds matter that much to your application, you should determine how much of the time you think is being wasted is actually due to the SPI transfer and how much is under your control with the toggling of the CS pin. Only then can you decide which bits of code need attention and, most importantly, whether it is even possible to reduce the time required when using a hardware-based function.

Frankly, this thread has all the classic signs of being an x-y problem.

CrossRoads · July 20, 2019, 5:58pm

Why are you not running the SPI clock at 8 MHz?

You can toggle IO pins with a write to the PINx register; that takes only 1 clock cycle.

So:

PINB = 0b00000100; // toggle D10 to low on a 328P, make sure you know it was high before this
SPDR = yourdata; // clocks out the data
NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP; // wait for data to clock out, vs waiting for interrupt
PINB = 0b00000100; // toggle D10 on a 328P // CS back high

Using this method, you send out a byte in about 20 clock cycles, 1.25 µs. You need this at the top of the sketch to define NOP:

#define NOP __asm__ __volatile__ ("nop")
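As a rough check on that cycle budget (a sketch with a made-up helper name, ignoring any synchronisation latency in the SPI hardware): each SPI bit takes as many CPU cycles as the clock divider, so a byte at fosc/2 needs 16 cycles, which together with the register writes lands near the 20 cycles quoted. At fosc/4 a byte needs 32 cycles, so a fixed run of 15 NOPs would presumably be too short there.

```cpp
#include <stdint.h>

// CPU cycles needed to shift out one byte when the SPI clock is
// the CPU clock divided by 'divider' (2, 4, 8, ...).
constexpr uint16_t spiByteCycles(uint8_t divider) {
  return 8u * divider;
}

static_assert(spiByteCycles(2) == 16, "8 MHz SPI on a 16 MHz CPU");
static_assert(spiByteCycles(4) == 32, "4 MHz SPI on a 16 MHz CPU");
```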

I did a project where I blasted out 45 bytes of data, updated every 50 µs (20 kHz rate), using an interrupt from an external source to sync up the data sending.

Nick Gammon has much more detail here.

max777 · July 22, 2019, 10:50am

To close the subject...

I'll use an ATmega328 at 8 MHz for a very low power application, so I can't go faster than 4 MHz for SPI.

Since I'll run at 8 MHz, an 8 µs delay at 16 MHz becomes a 16 µs delay...

Multiple SPI transfers and the bit manipulation around them take more time than the slave device's operations; that's why I had to examine where the time is wasted.

I solved the issue with the excellent advice from Koepel (bitSet and bitClear). I'm a newbie with Atmel, so I didn't know these macros. (@Koepel ... my poor English... I've seen the link and the source code... it explains very well why digitalWrite() is so slow. Thx.)

But... special thx to Robin2! I will remember the lesson (typo... 7 was 1).
With 100 loops I found 4.36 µs for digitalWrite(); with 1000 -> 3.38 µs, and with 10000 -> 3.28 µs.
Same for digitalWriteFast: 1.24 µs / 0.24 µs / 0.14 µs
(toggling a bit)

For me it's a mystery...

Anyway, the "real" time to execute one single digitalWrite after SPI.transfer to put CS high is about 8 µs (logic analyser measurement).

Thx all for your help, and sorry for the time spent.

Robin2 · July 22, 2019, 12:58pm

The change from 3.28 µsecs to 0.14µsecs suggests that digitalWriteFast() is 23 times faster. And 0.14µsecs is the time for 2 instructions.

...R
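One way to read the averages that shrink as the loop count grows is a fixed cost inside the timed region that gets divided by the number of iterations. The sketch below fits total = fixed + n × perCall through the 1000 and 10000 iteration figures from the post above; the struct and function names are made up for illustration. Both fits give a fixed cost of roughly 111 µs, and per-call costs of about 3.27 µs and 0.13 µs, the latter being about two 62.5 ns clock cycles at 16 MHz. The thread does not say where the fixed cost comes from; anything executed once between the two micros() readings would be a candidate.

```cpp
struct Fit {
  double perCallUs;  // cost of one call, in microseconds
  double fixedUs;    // one-off cost inside the timed region
};

// Slope of the line through the two totals (n * average).
constexpr double slope(double n1, double avg1, double n2, double avg2) {
  return (n2 * avg2 - n1 * avg1) / (n2 - n1);
}

// Fit total = fixed + n * perCall through two (n, average) measurements.
constexpr Fit fitTwoPoints(double n1, double avg1, double n2, double avg2) {
  return Fit{ slope(n1, avg1, n2, avg2),
              n1 * avg1 - n1 * slope(n1, avg1, n2, avg2) };
}

// Average per call that the model predicts for a run of n calls.
constexpr double predictedAverage(Fit f, double n) {
  return f.perCallUs + f.fixedUs / n;
}

// Averages reported at 1000 and 10000 iterations.
constexpr Fit dw  = fitTwoPoints(1000, 3.38, 10000, 3.28);  // digitalWrite
constexpr Fit dwf = fitTwoPoints(1000, 0.24, 10000, 0.14);  // digitalWriteFast
```

Evaluating predictedAverage() at n = 100 gives about 4.38 µs and 1.24 µs, close to the 4.36 µs and 1.24 µs that were measured, so the simple model is at least consistent with all three runs.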
