Question about ESP32 speed?

Hello,
I am a newbie with the ESP32 and wanted to start with a speed test. I wrote this in the Arduino IDE:
digitalWrite(0, HIGH);
digitalWrite(0, LOW);
Looking at the square-wave signal on an oscilloscope, I see that the low phase is three times as long as the high phase and
the frequency is 300 kHz.
Can someone tell me why the phases are different and why the frequency is so low?
Many thanks for every answer
Hans

Well, digitalWrite and digitalRead are functions. There is some code behind them. They are not a single assembler instruction, so they take hundreds of processor ticks to execute. That's why you see 300 kHz and not 240 MHz.
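
To put a rough number on that, here is a minimal sketch of the arithmetic. It assumes the 240 MHz clock mentioned above; the real clock depends on the ESP32 variant, which has not been stated at this point. The helper name cycles_per_period is made up for illustration and is not part of any Arduino API.

```cpp
#include <cstdint>

// Hypothetical helper: CPU clock cycles that elapse during one period
// of the output signal (integer division, so the result is truncated).
constexpr std::uint64_t cycles_per_period(std::uint64_t cpu_hz, std::uint64_t signal_hz) {
    return cpu_hz / signal_hz;
}

// Assumed figures from the thread: 240 MHz CPU clock, 300 kHz measured signal.
constexpr std::uint64_t kCyclesPerPeriod = cycles_per_period(240000000ULL, 300000ULL);
static_assert(kCyclesPerPeriod == 800, "240 MHz / 300 kHz");
```

One period contains two digitalWrite() calls plus the loop overhead, so under these assumptions each call costs a few hundred cycles at most.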

For the difference in timing, please try an experiment: insert a short delay, like delay(10);, after each digitalWrite and check the timing again.

If you just need to generate a square wave, the ESP32 has a function for that: ledcWriteTone

You could also show your code and say which ESP32 variant you are using.

Is your "low" setting immediately followed by the potentially cache-defeating looping instructions?

if you wrote

void loop() {
  digitalWrite(0, HIGH);
  digitalWrite(0, LOW);
}

then the assembly code that gets generated is

set the pin HIGH
set the pin LOW
go back to the main() function which called loop
check a few things
go back to the loop() again

and so the code does

go to the loop() (save context)
set the pin HIGH
set the pin LOW
go back to the main() function which called loop
check a few things
go to the loop() (save context)
set the pin HIGH
set the pin LOW
go back to the main() function which called loop
check a few things
go to the loop() (save context)
...

as you can see there is lots that happens in between LOW and HIGH whereas there is "nothing" happening between HIGH and LOW ➜ that would explain what you see.
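
For a sense of scale, the observed numbers can be turned into times. This is a rough sketch using only the figures from the original post (300 kHz, low phase three times as long as the high phase). The struct and function names are made up for illustration, and whole nanoseconds are used, so the values are truncated.

```cpp
#include <cstdint>

// Illustrative only: one signal period split into its high and low time.
struct PhaseTimes {
    std::uint32_t period_ns;
    std::uint32_t high_ns;
    std::uint32_t low_ns;
};

// Hypothetical helper: low_to_high_ratio is how many times longer the low
// phase is than the high phase, as read off the oscilloscope.
inline PhaseTimes split_period(std::uint32_t signal_hz, std::uint32_t low_to_high_ratio) {
    const std::uint32_t period = 1000000000u / signal_hz;          // truncated to whole ns
    const std::uint32_t high = period / (1u + low_to_high_ratio);  // one share of the period
    return PhaseTimes{period, high, period - high};                // low gets the rest
}
```

With 300 kHz and a ratio of 3 this gives a period of about 3333 ns, roughly 833 ns high and 2500 ns low. If the explanation above is right, the trip back through main() accounts for roughly the 1.7 µs difference between the two phases.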

Try this in setup().

void setup ()  {
.
.
.
  while ( true )  {
    digitalWrite(0, HIGH);
    digitalWrite(0, LOW);
  }

}

you still have a branch at the end of the while whereas there is no instruction between the HIGH and LOW

it will be faster than the loop() — which does more — but it's also likely to fail because of the watchdog.

True. As suggested above a small delay in-between will most likely smooth that out.

In fact, more like:

  • Save some stuff to stack
  • Look up some stuff about pin and port numbers
  • Check a whole bunch of things
  • Then conditionally set a pin high
  • Return from function

Probably 10-20 lines of assembler at least.

Here is the complete Code:
void setup () {
pinMode(0, OUTPUT);
}

void loop () {
digitalWrite(0, HIGH);
digitalWrite(0, LOW);
}

The board is an ESP32-C3 Super Mini
What will be the code in assembler?
Thanks
Hans

No, not really... Don't forget the optimiser will come into play

the assembly generated on a UNO for

void setup () {
  pinMode(13, OUTPUT);
}

void loop () {
  digitalWrite(13, HIGH);
  digitalWrite(13, LOW);
}

looks like this (for the loop part)

void loop () {
  digitalWrite(13, HIGH);
 2c4:	81 e0       	ldi	r24, 0x01	; 1
 2c6:	0e 94 70 00 	call	0xe0	; 0xe0 <digitalWrite.constprop.0>
  digitalWrite(13, LOW);
 2ca:	80 e0       	ldi	r24, 0x00	; 0
 2cc:	0e 94 70 00 	call	0xe0	; 0xe0 <digitalWrite.constprop.0>
 2d0:	20 97       	sbiw	r28, 0x00	; 0
 2d2:	c1 f3       	breq	.-16     	; 0x2c4 <main+0xc0>

you can actually see that the main() got optimized out as well as the explicit storage of the pin number as the optimizer inlined it directly within the digitalWrite implementation

the sbiw is fun to see (subtracting 0 leaves the register unchanged but does update the zero flag) so that breq .-16 is always taken.

so basically you have the same 2 lines for the digitalWrite — except for one the r24 register is loaded with 1 (HIGH) and 0 (LOW) for the other

the difference in pulse rate will come from the 2 instructions sbiw and breq

The code may be strongly entangled with the RTOS run time system and not really recognisable as a stand alone entity.

I found digitalReadFast and digitalWriteFast. Are these also available for the ESP32?

You should say what you are trying to do and what waveform (frequency, duty cycle etc.) you are expecting to appear on a digital pin. There is almost certainly a better way than simply using digitalWrite() to achieve it.
One suggestion appears in post #2.

Oh! A RISC-V variant!

What will be the code in assembler?

Ugh. extracted from 4MB of slowly generated RISC-V code:
(assorted pithy comments noted with ";;")

;;  main task...

  for (;;) {
    if (loopTaskWDTEnabled) {
      esp_task_wdt_reset();
    }
    loop();
    if (serialEventRun) {
42000c1e:	42001437          	lui	s0,0x42001
    if (loopTaskWDTEnabled) {
42000c22:	3fc8f4b7          	lui	s1,0x3fc8f
    if (serialEventRun) {
42000c26:	00c40413          	addi	s0,s0,12 # 4200100c <serialEventRun()>
    yieldIfNecessary();
42000c2a:	3fa5                	jal	42000ba2 <yieldIfNecessary()>
    if (loopTaskWDTEnabled) {
42000c2c:	9384c783          	lbu	a5,-1736(s1) # 3fc8e938 <loopTaskWDTEnabled>
42000c30:	c399                	beqz	a5,42000c36 <loopTask(void*)+0x46>
      esp_task_wdt_reset();
42000c32:	2ac080ef          	jal	ra,42008ede <esp_task_wdt_reset>
    loop();
42000c36:	bf0ff0ef          	jal	ra,42000026 <loop()>
    if (serialEventRun) {
42000c3a:	d865                	beqz	s0,42000c2a <loopTask(void*)+0x3a>
      serialEventRun();
42000c3c:	2ec1                	jal	4200100c <serialEventRun()>
42000c3e:	b7f5                	j	42000c2a <loopTask(void*)+0x3a>
    }
  }

;; loop(), itself...

42000026 <loop()>:
42000026:	1141                	addi	sp,sp,-16
42000028:	4585                	li	a1,1
4200002a:	4505                	li	a0,1
4200002c:	c606                	sw	ra,12(sp)
4200002e:	28c1                	jal	420000fe <__digitalWrite>
42000030:	40b2                	lw	ra,12(sp)
42000032:	4581                	li	a1,0
42000034:	4505                	li	a0,1
42000036:	0141                	addi	sp,sp,16
42000038:	a0d9                	j	420000fe <__digitalWrite>


;; digitalWrite()

420000fe <__digitalWrite>:

extern void ARDUINO_ISR_ATTR __digitalWrite(uint8_t pin, uint8_t val) {
420000fe:	1141                	addi	sp,sp,-16
42000100:	c422                	sw	s0,8(sp)
42000102:	c606                	sw	ra,12(sp)
42000104:	c226                	sw	s1,4(sp)
#ifdef RGB_BUILTIN
;; Oh!  Special code to make a neopixel behave like a normal LED!

  if (pin == RGB_BUILTIN) {
42000106:	47f9                	li	a5,30
extern void ARDUINO_ISR_ATTR __digitalWrite(uint8_t pin, uint8_t val) {
42000108:	842e                	mv	s0,a1
  if (pin == RGB_BUILTIN) {
4200010a:	02f51063          	bne	a0,a5,4200012a <__digitalWrite+0x2c>
  if (perimanGetPinBus(pin, ESP32_BUS_TYPE_GPIO) != NULL) {
    gpio_set_level((gpio_num_t)pin, val);
  } else {
    log_e("IO %i is not set as GPIO.", pin);
  }
}
4200010e:	4422                	lw	s0,8(sp)
42000110:	40b2                	lw	ra,12(sp)
42000112:	4492                	lw	s1,4(sp)
    const uint8_t comm_val = val != 0 ? RGB_BRIGHTNESS : 0;
42000114:	00b036b3          	snez	a3,a1
    RGB_BUILTIN_storage = val;
42000118:	3fc8f7b7          	lui	a5,0x3fc8f
4200011c:	92b789a3          	sb	a1,-1741(a5) # 3fc8e933 <RGB_BUILTIN_storage>
    const uint8_t comm_val = val != 0 ? RGB_BRIGHTNESS : 0;
42000120:	069a                	slli	a3,a3,0x6
    neopixelWrite(RGB_BUILTIN, comm_val, comm_val, comm_val);
42000122:	8636                	mv	a2,a3
42000124:	85b6                	mv	a1,a3
}
42000126:	0141                	addi	sp,sp,16
    neopixelWrite(RGB_BUILTIN, comm_val, comm_val, comm_val);
42000128:	a38d                	j	4200068a <neopixelWrite>
  if (perimanGetPinBus(pin, ESP32_BUS_TYPE_GPIO) != NULL) {
4200012a:	4585                	li	a1,1
4200012c:	84aa                	mv	s1,a0
4200012e:	2655                	jal	420004d2 <perimanGetPinBus>
42000130:	c909                	beqz	a0,42000142 <__digitalWrite+0x44>
    gpio_set_level((gpio_num_t)pin, val);
42000132:	85a2                	mv	a1,s0
}
42000134:	4422                	lw	s0,8(sp)
42000136:	40b2                	lw	ra,12(sp)
    gpio_set_level((gpio_num_t)pin, val);
42000138:	8526                	mv	a0,s1
}
4200013a:	4492                	lw	s1,4(sp)
4200013c:	0141                	addi	sp,sp,16
    gpio_set_level((gpio_num_t)pin, val);
4200013e:	3ca0306f          	j	42003508 <gpio_set_level>
}


;; gpio_set_level (wow, that's a lot of code for the bottom level function!!)
(Ok, it's not "bottom level" in the source.  There are a chain of inline functions...)

42003508 <gpio_set_level>:
42003508:	1141                	addi	sp,sp,-16
4200350a:	c606                	sw	ra,12(sp)
4200350c:	c422                	sw	s0,8(sp)
4200350e:	c226                	sw	s1,4(sp)
42003510:	00054f63          	bltz	a0,4200352e <gpio_set_level+0x26>
42003514:	862a                	mv	a2,a0
42003516:	842a                	mv	s0,a0
42003518:	00400537          	lui	a0,0x400
4200351c:	84ae                	mv	s1,a1
4200351e:	157d                	addi	a0,a0,-1 # 3fffff <_esp_mmu_block_size+0x3effff>
42003520:	4581                	li	a1,0
42003522:	fdffd097          	auipc	ra,0xfdffd
42003526:	30e080e7          	jalr	782(ra) # 40000830 <__lshrdi3>
4200352a:	8905                	andi	a0,a0,1
4200352c:	e521                	bnez	a0,42003574 <gpio_set_level+0x6c>
4200352e:	fe387097          	auipc	ra,0xfe387
42003532:	ee2080e7          	jalr	-286(ra) # 4038a410 <esp_log_timestamp>  ;;; Really??
42003536:	3c0315b7          	lui	a1,0x3c031   ;; Eww!
4200353a:	3c0318b7          	lui	a7,0x3c031   ;;  doesn't 
4200353e:	3c0317b7          	lui	a5,0x3c031   ;;   seem
42003542:	3c031637          	lui	a2,0x3c031   ;;    efficient!
42003546:	86aa                	mv	a3,a0
42003548:	f4858713          	addi	a4,a1,-184 # 3c030f48 <__func__.6+0x28>
4200354c:	4505                	li	a0,1
4200354e:	f6888893          	addi	a7,a7,-152 # 3c030f68 <__func__.6+0x48>
42003552:	0ee00813          	li	a6,238
42003556:	1b078793          	addi	a5,a5,432 # 3c0311b0 <__FUNCTION__.35>
4200355a:	f5060613          	addi	a2,a2,-176 # 3c030f50 <__func__.6+0x30>
4200355e:	f4858593          	addi	a1,a1,-184
42003562:	1a9100ef          	jal	ra,42013f0a <__wrap_esp_log_write>
42003566:	10200513          	li	a0,258
4200356a:	40b2                	lw	ra,12(sp)
4200356c:	4422                	lw	s0,8(sp)
4200356e:	4492                	lw	s1,4(sp)
42003570:	0141                	addi	sp,sp,16
42003572:	8082                	ret
42003574:	8a81a783          	lw	a5,-1880(gp) # 3fc8bca8 <gpio_context>
42003578:	4398                	lw	a4,0(a5)
4200357a:	04000793          	li	a5,64
4200357e:	00879533          	sll	a0,a5,s0
42003582:	8119                	srli	a0,a0,0x6
42003584:	c889                	beqz	s1,42003596 <gpio_set_level+0x8e>
42003586:	471c                	lw	a5,8(a4)
42003588:	fc0006b7          	lui	a3,0xfc000
4200358c:	8ff5                	and	a5,a5,a3
4200358e:	8fc9                	or	a5,a5,a0
42003590:	c71c                	sw	a5,8(a4)
42003592:	4501                	li	a0,0
42003594:	bfd9                	j	4200356a <gpio_set_level+0x62>
42003596:	475c                	lw	a5,12(a4)
42003598:	fc0006b7          	lui	a3,0xfc000
4200359c:	8ff5                	and	a5,a5,a3
4200359e:	8fc9                	or	a5,a5,a0
420035a0:	c75c                	sw	a5,12(a4)
420035a2:	bfc5                	j	42003592 <gpio_set_level+0x8a>

Suffice to say that it could all be done much faster, if anyone cared very much...

Hello,
my goal is not to generate a waveform. I'm just looking for a way to make setting and reading I/Os faster so that a program runs faster.

Don't use digitalWrite/Read; use PWM instead.

A bit too late to reply, but maybe it will help others:

On a basic ESP32 one can reach 16 MHz. On the ESP32-S3 it is up to 80 MHz, but that is a different story.

The code below switches pin 2 at approximately 16 MHz. I measured it. It takes 12 CPU cycles to toggle a pin. Hopefully this helps. The code works on any ESP32.

To reach the 80 MHz rate on the ESP32-S3 you have to read about "dedicated GPIO".

#include <Arduino.h>

#include <hal/gpio_ll.h>            // gpio_ll_ functions
#include <soc/gpio_struct.h>   // GPIO global variable 

#define LED 2

void setup() {
  Serial.begin(115200);
  pinMode(LED,OUTPUT);
}

void loop() {
  
  while( 1 ) {
    gpio_ll_set_level(&GPIO, LED, 0);  // digitalWrite(LED, LOW)
    gpio_ll_set_level(&GPIO, LED, 1);  // digitalWrite(LED, HIGH)
  }

}

There are faster versions of digitalRead() as well.

Are we talking ESP32 (Tensilica Xtensa CPU) or ESP32C (RISC-V CPU)?

On ESP32:

            hw->out_w1tc = (1 << gpio_num);
400d1674:	0020c0        	memw
400d1677:	3899      	s32i.n	a9, a8, 12
            hw->out_w1ts = (1 << gpio_num);
400d1679:	0020c0        	memw
400d167c:	2899      	s32i.n	a9, a8, 8
400d167e:	fffc86        	j	400d1674 <loop()+0x8>

On ESP32C:

        hw->out_w1tc.out_w1tc = (1 << gpio_num);
42000062:	47d8                	lw	a4,12(a5)
42000064:	8f75                	and	a4,a4,a3
42000066:	00476713          	ori	a4,a4,4
4200006a:	c7d8                	sw	a4,12(a5)
        hw->out_w1ts.out_w1ts = (1 << gpio_num);
4200006c:	4798                	lw	a4,8(a5)
4200006e:	8f75                	and	a4,a4,a3
42000070:	00476713          	ori	a4,a4,4
42000074:	c798                	sw	a4,8(a5)
42000076:	b7f5                	j	42000062 <loop()+0xc>

(which is ... weird. The .h file apparently defines out_w1ts as a struct with "reserved" bits, which forces the compiler to do the whole read/modify/write sequence (rather defeating the purpose of having SET and CLEAR registers!)

    union {
        struct {
            uint32_t out_w1tc:  26;
            uint32_t reserved26: 6;
        };
        uint32_t val;
    } out_w1tc;

This sort of "non-existent pins are 'reserved' rather than just 'ignored'" is ... uncommon.)
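
To illustrate the read/modify/write point, here is a host-side model, not ESP32 code. MockReg is a made-up stand-in for a memory-mapped register that simply counts bus accesses. The mask 0xFC000000 mirrors the six reserved bits in the union above, and it is an assumption that this is what the a3 register holds in the ESP32C listing. That a write through the union's val member compiles to a single store on real hardware is likewise assumed, not verified here.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory-mapped register; counts accesses.
struct MockReg {
    std::uint32_t value = 0;
    int reads = 0;
    int writes = 0;
    std::uint32_t load() { ++reads; return value; }
    void store(std::uint32_t v) { ++writes; value = v; }
};

// Models the bit-field assignment as seen in the ESP32C listing:
// load, keep the 6 reserved bits, OR in the pin mask, store.
inline void write_via_bitfield(MockReg& reg, unsigned pin) {
    std::uint32_t v = reg.load();
    v &= 0xFC000000u;      // preserve the reserved bits 26..31
    v |= (1u << pin);      // set the bit for the requested pin
    reg.store(v);
}

// Models a plain 32-bit write (for example through `val`): one store, no load.
inline void write_via_val(MockReg& reg, unsigned pin) {
    reg.store(1u << pin);
}
```

In this model the bit-field path costs one load and one store per pin change, while the plain write costs a single store, which is the whole point of having separate SET and CLEAR registers.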

In both cases, it would probably get somewhat faster if the function is moved into RAM.

I have submitted an "issue" for Espressif to consider.
