Topic: digitalWrite in question?

max777

Jul 19, 2019, 08:49 am Last Edit: Jul 19, 2019, 11:02 am by max777
Hi

This is about SPI communication.

On the linked screen capture you can see a long delay (about 7 µs) between the end of transmission and CS going high.

The code is below (regWrite function) and the CSN_HIGH macro is

#define CSN_HIGH   digitalWrite(csnPin,HIGH);

There is no delay between two calls to SPI.transfer, and I can't imagine how digitalWrite could have a latency of 7 µs.

UNO board at 16 MHz; SPI speed 4 MHz

A look at SPI.h doesn't explain it either

  // Write to the SPI bus (MOSI pin) and also receive (MISO pin)
  inline static uint8_t transfer(uint8_t data) {
    SPDR = data;
    /*
     * The following NOP introduces a small delay that can prevent the wait
     * loop form iterating when running at the maximum speed. This gives
     * about 10% more speed, even if it seems counter-intuitive. At lower
     * speeds it is unnoticed.
     */
    asm volatile("nop");
    while (!(SPSR & _BV(SPIF))) ; // wait
    return SPDR;
  }

Any ideas?

Robin2

Quote from max777: "can't imagine how digitalWrite could have a latency of 7 µs."
The digitalWrite() function is slow because it has to figure out at run time which pin on which I/O port needs to be changed.

There is a digitalWriteFast library which is very much quicker - but it requires the I/O pin to be known at compile time.

...R
Two or three hours spent thinking and reading documentation solves most programming problems.

Koepel

#2
Jul 19, 2019, 01:17 pm Last Edit: Jul 19, 2019, 03:16 pm by Koepel
You could measure the time that is used by digitalWrite().
You can use my millis_measure_time.ino sketch and replace analogWrite() with digitalWrite() in three places.
I did not test it myself, but I know from the past that it is 5 µs.

The ATmega328p only needs 0.125 µs to change a pin. The 5 µs is Arduino overhead.
The SPI code uses the SPI hardware of the ATmega328p, and uses even "inline" code to control the hardware. That is very little overhead.

So everything seems just as it should be. If you want to know exactly what is going on, then you should read the assembly code that is generated by the compiler.
If you change something in other parts of the sketch, then the compiler could decide to optimize this code in a different way, and that could change the timings.
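
For reference, the cycle arithmetic behind those numbers can be sketched in a few lines of plain C++ (it runs on a PC as well as on the Uno). The helper names are made up for this post, and it assumes that the single-instruction pin change (SBI/CBI) costs 2 clock cycles, as listed in the ATmega328P datasheet.

```cpp
// Convert a number of CPU clock cycles to microseconds.
// cpuMHz is the CPU clock in MHz (16.0 for a standard Uno).
constexpr double cyclesToMicros(unsigned long cycles, double cpuMHz) {
    return cycles / cpuMHz;
}

// Convert a duration in microseconds to CPU clock cycles, rounded to nearest.
constexpr unsigned long microsToCycles(double us, double cpuMHz) {
    return static_cast<unsigned long>(us * cpuMHz + 0.5);
}
```

With these, cyclesToMicros(2, 16.0) gives 0.125 µs, and a 5 µs digitalWrite() corresponds to about 80 clock cycles at 16 MHz.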

max777

Hi !

@Robin2

Thx, I understand what you mean, but...

Since 7 µs is quite a long time for a 16 MHz MCU to fetch 2 addresses and read/modify/write a byte, I spent a few minutes writing some test code to evaluate the duration.

The digitalSet() function seems to do the same operations as digitalWrite(); digitalSet() is not "pure" assembly language, so it should be far slower than digitalWrite().

That's not the case:


max777

(oops, typo)

I get about 8 µs for digitalWrite() and 5 µs for digitalSet()...
digitalWrite() is surely doing very complicated things...


@Koepel

Thx, the Arduino IDE has secrets  :)

Here I see assembly code slower than compiled C++; I had a similar issue with the power-up time of the ESP66, which is far too long to allow real low-power applications.

++

max777

Hmm, I forgot to post the test code I mentioned


Code: [Select]

#define NBPORTS 3
#define NBPPPORT 8
#define NBPIN   (NBPORTS*NBPPPORT)

uint8_t ports[NBPORTS];
byte mask[]={0x01,0x02,0x04,0x08,0x10,0x20,0x40,0x80};

uint8_t v=0,l=0;

uint8_t pin=6;

void setup() {
 Serial.begin(115200);
 Serial.println("ready");
}

void digitalSet(uint8_t pin,bool level)
{
 byte scratchPort;
 // read byte
 scratchPort=ports[pin/NBPPPORT];
 // modify byte
 scratchPort &= ~mask[pin%NBPPPORT];
 if(level){scratchPort |= mask[pin%NBPPPORT];}
 // write it
 ports[pin/NBPPPORT]=scratchPort;
}



char getch()
{
 if(Serial.available()){
   return Serial.read();
 }
 return 0;
}


void loop() {

// show pins
for(int i=0;i<NBPORTS;i++){
 for(int j=0;j<NBPPPORT;j++){
   Serial.print((ports[i]>>(j))&0x01);Serial.print(" ");
}}Serial.println();

// get new value
v=0;l=0;
Serial.print("enter pin number (0-9)");
while(v==0){v=getch();}v-=48;
Serial.println(v);
Serial.print("level (0-1) ");
while(l==0){l=getch();}l-=48;
Serial.println(l);

// time 100 iterations of the loop body alone (baseline), then
// time 100 iterations that also set the new value
 unsigned long time_beg=micros();
 char a=1;
 for(uint8_t k=0;k<100;k++){
   //digitalSet(v,l);
   a=(a>>(k&7))+1;   // varying work so the compiler keeps the loop; shift count kept below the int width
 }
 unsigned long offset=micros()-time_beg;
Serial.print(a);Serial.print(" offset time=");Serial.println(offset);

 time_beg=micros();
 for(uint8_t k=0;k<100;k++){
   digitalSet(v,l);
   //digitalWrite(pin,HIGH);
   a=(a>>(k&7))+2;
 }
 // read the timer once, before any printing, so Serial time is not counted
 unsigned long total=micros()-time_beg;
Serial.print(a);Serial.print("  total time=");Serial.println(total);
Serial.print(a);Serial.print(" durat=");Serial.println((float)((long)total-(long)offset)/100);
}

Koepel

#6
Jul 20, 2019, 03:49 pm Last Edit: Jul 20, 2019, 04:12 pm by Koepel
Arduino is doing "things", not complicated things  ;)
This is the digitalWrite() for the AVR boards: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_digital.c#L138.

8 µs is suspicious, because the micros() function has a resolution of 4 µs on an Arduino Uno.
It turns out that it was a flaw in my sketch, I will try to make a new version on Github later on.
Below is an accurate test.
I think digitalWrite() has been improved, because I measure 3.2 µs for digitalWrite().

Code: [Select]

// Test how long digitalWrite is with an Arduino Uno
// Timer0 is used by Arduino and that is still running.

// The macro X1000 executes something 1000 times.
#define X10(a)      a;a;a;a;a;a;a;a;a;a
#define X1000(a)    X10(X10(X10(a)))

// Using global or local variables or constants make a difference for the time.
const byte pin = 13;
byte level = HIGH;

void setup()
{
  Serial.begin( 9600);
}

void loop()
{
  unsigned long t1, t2, result;

  t1 = micros();
  X1000(digitalWrite( pin, level));       // The function under test is called 1000 times
  t2 = micros();
  result = t2 - t1;

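  // 'result' is the time in microseconds for 1000 calls, which is the
  // same number as the time in nanoseconds for one call.
  Serial.print( result);
  Serial.println( " ns");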

  delay( 500);
}
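
Why 1000 calls: micros() on a 16 MHz Uno advances in steps of 4 µs (Timer0 with a prescaler of 64), so a single call cannot be timed with it, but that step is shared by all calls in the batch. A rough sketch of the arithmetic in plain C++, with helper names that are made up for this post:

```cpp
// Length of one micros() step in microseconds.
// Assumption: Timer0 runs from the CPU clock through the given prescaler
// (64 in the standard Arduino AVR core).
constexpr double microsStepUs(unsigned prescaler, double cpuMHz) {
    return prescaler / cpuMHz;
}

// Approximate worst-case timing error per call, in nanoseconds, when
// 'calls' back-to-back calls are timed with one start and one stop reading.
constexpr double perCallErrorNs(double stepUs, unsigned long calls) {
    return stepUs * 1000.0 / calls;
}
```

So with 1000 calls the 4 µs step contributes only about 4 ns per call. On an 8 MHz board the step would be 8 µs.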

max777

Suspicious, you said...

I updated to 1.8.9 and nothing changed...

Can you have a look at the logic analyser window (attached to my first post)?
About 1 µs from the end of one transmission to the start of the next. I guess the delay from the end of SPI.transfer to the beginning of digitalWrite() is no different.
About 7 µs from the end of transmission to CS high; so digitalWrite() takes "at least" 6 µs, plus the time to return after setting CS. 8 µs doesn't seem so suspicious...

If there is a way to get down to your 3.2 µs, I'm very interested.

Sometimes compilers optimize repeated sequences; that's why I put variable code inside the loop.


max777

I'm very sorry, I didn't see the code for digitalWrite...

So many function calls means so much time... I will write my own digitalWrite()  :)

With digitalWriteFast I get a little less than 7 µs...


thx


Koepel

#9
Jul 20, 2019, 05:58 pm Last Edit: Jul 20, 2019, 06:01 pm by Koepel
You can't see this ?
https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_digital.c#L138.

That is the open source code of digitalWrite(). The highlighted line is at the top.

The AVR microcontrollers have a special assembly instruction to change an output pin. That is the "direct port manipulation" and takes 125 ns. You can use that in your code, or use digitalWriteFast, which does that when everything is known at compile time.

The Arduino macros bitSet() and bitClear() can be used with a register and are translated into the 125 ns direct port manipulation.
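
A minimal sketch of what that could look like for a chip-select line. It assumes CSN is on Uno pin D10, which is bit 2 of PORTB; the names CSN_BIT, csnHigh() and csnLow() are made up for this example. The #else part is only a stand-in so the snippet also compiles on a PC; on the board, Arduino.h provides the macros and the real PORTB register.

```cpp
#include <stdint.h>

#ifdef ARDUINO
#include <Arduino.h>          // provides bitSet(), bitClear() and PORTB
#else
// Stand-ins for building off-target (assumption: same shape as the Arduino macros).
#define bitSet(value, bit)   ((value) |= (1UL << (bit)))
#define bitClear(value, bit) ((value) &= ~(1UL << (bit)))
static uint8_t PORTB = 0;     // fake register, only for PC builds
#endif

const uint8_t CSN_BIT = 2;    // assumption: CSN wired to D10 = PORTB bit 2

// With a constant register and a constant bit, avr-gcc can normally reduce
// each of these to a single SBI or CBI instruction.
inline void csnHigh() { bitSet(PORTB, CSN_BIT); }
inline void csnLow()  { bitClear(PORTB, CSN_BIT); }
```

The pin still has to be configured as an output first, for example with pinMode(10, OUTPUT) in setup().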

The SPI.transfer() function has to return, and the Arduino digitalWrite() changes the pin at the end of the function. Perhaps an extra microsecond for the variable 'csnPin', and then there are still 2 µs that I cannot account for. Perhaps an extra microsecond because the Arduino optimizes for size instead of speed. The remaining microsecond is within the margin of error ;)

Robin2

Quote from max777: "With digitalWriteFast I get a little less than 7 µs..."
If it only appears to be slightly faster than the regular digitalWrite() then there is something wrong with your test procedure.

Are you timing (say) 10,000 repeats of the function call so as to eliminate, or at least minimise, short term timing errors?

...R

WattsThat

Why all the hand wringing over a couple of microseconds at the end of a library function? Either tell us why ~4 microseconds really matters or it's just an academic exercise.

If those few microseconds matter that much to your application, you should determine how much of the time you think is being wasted is actually due to the SPI transfer and how much is under your control with the toggling of the CS pin. Only then can you decide which bits of code need attention but most importantly, is it even possible to realize a reduction in the time required when using a hardware based function.

Frankly, this thread has all the classic signs of being an x-y problem.
Vacuum tube guy in a solid state world

CrossRoads

Why are you not running the SPI clock at 8 MHz?


You can toggle IO pins with a write to the PINx register; it only takes 1 clock cycle.

So:
Code: [Select]

PINB = 0b00000100; // toggle D10 (PB2) to low on a 328P, make sure you know it was high before this
SPDR = yourdata; // clocks out the data
NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP;NOP; // wait for data to clock out (SPI clock at 8 MHz), instead of waiting for the SPIF flag
PINB = 0b00000100; // toggle D10 on a 328P // CS back high


Using this method, you send out a byte in about 20 clock cycles, 1.25 µs. You need this at the top of the sketch to define NOP:
Code: [Select]
#define NOP __asm__ __volatile__ ("nop")


I did a project where I blasted out 45 bytes of data, updated every 50 µs (20 kHz rate), using an interrupt from an external source to sync up the data sending.

Nick Gammon has much more detail here.
Designing & building electrical circuits for over 25 years.  Screw Shield for Mega/Due/Uno,  Bobuino with ATMega1284P, & other '328P & '1284P creations & offerings at  my website.

max777

To close the subject...

I'll use an ATmega328 at 8 MHz for a very low-power application, so I can't go faster than 4 MHz for SPI.

Since I'll use 8 MHz, an 8 µs delay at 16 MHz becomes a 16 µs delay...

The multiple SPI transfers AND the bit manipulation around them take more time than the slave device operations; that's why I had to examine where the time is wasted.

I solved the issue with the excellent advice from Koepel (bitSet and bitClear). I'm a newbie with Atmel so I didn't know these macros. (@Koepel ... my poor English... I did see the link and the source code... it explains very well why digitalWrite() is so slow. Thx.)


But... special thx to Robin2! I will remember the lesson (typo... 7 was 1).
With 100 loops I found 4.36 µs for digitalWrite(); with 1000 -> 3.38 µs and with 10000 -> 3.28 µs.
Same for digitalWriteFast: 1.24 µs / 0.24 µs / 0.14 µs
(toggling a bit)


For me it's a mystery...

Anyway, the "real" time to execute a single digitalWrite() after SPI.transfer to put CS high is about 8 µs (logic analyser measurement).

Thx all for your help, and sorry for the time spent


Robin2

The change from 3.28 µs to 0.14 µs suggests that digitalWriteFast() is 23 times faster. And 0.14 µs is about 2 clock cycles at 16 MHz.
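
On the "mystery" of the average dropping as the loop count grows, one possible explanation, as a rough sketch: if every measurement contains a fixed overhead that does not depend on the loop count (for example time spent before the final micros() reading), the average per call shrinks as the count goes up. Fitting total = fixed + perCall * n to the 100-loop and 1000-loop numbers from the previous post, and then predicting the 10000-loop average, can be written in plain C++. The struct and function names are made up for this example.

```cpp
// Model: total time for n calls = fixedUs + perCallUs * n  (all in microseconds)
struct TimingFit {
    double fixedUs;    // overhead paid once per measurement
    double perCallUs;  // cost of one call
};

// Fit the model to two measurements given as (loop count, average per call).
inline TimingFit fitOverhead(double n1, double avg1, double n2, double avg2) {
    const double total1 = avg1 * n1;
    const double total2 = avg2 * n2;
    TimingFit fit;
    fit.perCallUs = (total2 - total1) / (n2 - n1);
    fit.fixedUs   = total1 - fit.perCallUs * n1;
    return fit;
}

// Average per call that the model predicts for n calls.
inline double predictAverage(const TimingFit& fit, double n) {
    return fit.fixedUs / n + fit.perCallUs;
}
```

With the digitalWrite() numbers (100 -> 4.36, 1000 -> 3.38) this gives roughly 109 µs of fixed overhead and 3.27 µs per call, and predicts about 3.28 µs for 10000 loops. The digitalWriteFast numbers (1.24, 0.24) give roughly 111 µs and 0.13 µs per call, predicting about 0.14 µs. Both predictions match the measured values, so the figures look consistent with a constant offset of about 0.11 ms in each measurement rather than with the functions changing speed.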

...R