Topic: ESP32 beats STM32F407 and is 48 times faster than the Mega 2560?

rtek1000

Code:

Mega 2560.... (16MHz):......1'326'856 us
UNO.......... (16MHz):........861'460 us
Arduino DUE.. (84MHz):........445'766 us
STM32F103C... (72MHz):........153'107 us
ESP8266..... (160MHz):........103'134 us
STM32F407V.. (168MHz):.........34'576 us
ESP32....... (240MHz):.........27'347 us
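
The "48 times faster" in the topic title can be checked from the table with a little arithmetic. The sketch below is only an illustration: the helper names (`us_per_call`, `speedup`, `print_summary`) are made up for this example, and it assumes the times are exactly the values printed above for 100000 iterations of two digitalWrite() calls each, with loop overhead included in the figure.

```cpp
#include <cassert>
#include <cstdio>

// One measured result from the table: total time for the whole loop.
struct Result {
    const char* board;
    double total_us;
};

constexpr int kIterations = 100000;     // loop count used in the sketch
constexpr int kCallsPerIteration = 2;   // one HIGH write plus one LOW write

// Average time per digitalWrite() call, loop overhead included.
double us_per_call(double total_us) {
    return total_us / (static_cast<double>(kIterations) * kCallsPerIteration);
}

// How many times faster the board with time fast_us is than the one with slow_us.
double speedup(double slow_us, double fast_us) {
    assert(fast_us > 0.0);
    return slow_us / fast_us;
}

// Print per-call time and speedup relative to the Mega 2560 for every board.
void print_summary() {
    const Result results[] = {
        {"Mega 2560", 1326856.0}, {"UNO", 861460.0},
        {"Arduino DUE", 445766.0}, {"STM32F103C", 153107.0},
        {"ESP8266", 103134.0},     {"STM32F407V", 34576.0},
        {"ESP32", 27347.0},
    };
    const double mega_us = results[0].total_us;
    for (const Result& r : results) {
        std::printf("%-12s %6.3f us/call  %5.1fx vs Mega 2560\n",
                    r.board, us_per_call(r.total_us),
                    speedup(mega_us, r.total_us));
    }
}
```

Calling `print_summary()` from a desktop `main()` prints one line per board; the Mega 2560 to ESP32 ratio comes out at about 48.5.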




sketch:

Code:
// Uncomment exactly one of these lines to select the LED pin for your board:
//#define led 13    // DUE/Mega/UNO
//#define led PE0   // STM32F407V generic
//#define led PC13  // STM32F103C generic
//#define led 2     // ESP32/ESP8266


uint32_t micros1 = 0;

// the setup function runs once when you press reset or power the board
void setup() {
  // initialize the selected LED pin as an output.
  pinMode(led, OUTPUT);
  Serial.begin(115200);
  Serial.println("Start");
}

long i = 0;

// the loop function runs over and over again forever
void loop() {

  micros1 = micros();

  for (i = 0; i < 100000; i++) {
    digitalWrite(led, HIGH);   // turn the LED on (HIGH is the voltage level)
    digitalWrite(led, LOW);    // turn the LED off by making the voltage LOW
  }

  Serial.println(micros() - micros1);

  delay(1000);                       // wait for a second
}



It's really unfortunate that the ESP32 has so few general-purpose pins.
Please avoid private messages, your question may be someone's answer in the future!

alkuentrus

I wouldn't even imagine such a big difference. Thanks :)

Koepel

Nice test !

SAMD21 @ 48MHz (Zero, MKR, M0): 339'355 us

That is faster than the Due, but the digitalWrite() is very short and optimized code for the SAMD21.


ard_newbie

digitalWrite() is far from optimized. I get 42'910 us on a DUE for your 100000 iterations if I toggle the LED pin with:

PIOB->PIO_ODSR ^= PIO_ODSR_P27; instead of a super slow digitalWrite()

and 23'638 us if I toggle the LED pin with:

PIOB->PIO_SODR = PIO_SODR_P27; // set pin
PIOB->PIO_CODR = PIO_CODR_P27; // clear pin

Cool bro !

westfw

Quote
the digitalWrite() is very short and optimized code for the SAMD21.
Not really.  (Somewhat disappointingly, it's about as much slower than "direct port IO" as the AVR code (~40x)  https://forums.adafruit.com/viewtopic.php?f=57&t=133497)
Adafruit Metro M4 (SAMD51 @ 120MHz): 91795
The Mega2560 time (slowness) surprises me, and the ESP32 is surprisingly fast.

3Dgeo

Wow, this benchmark is wrong on so many levels...

You are comparing potatoes to bananas: if you want real speed, use direct port manipulation on ALL boards, and you still will not get "real" data, because some boards are better optimized by the IDE/language than others.

Watch from 2 min mark:
Port manipulation


Koepel

Well, why don't you make a good Arduino benchmark then ;)
Without direct port manipulation, but with balanced calculations of 32-bit float, calculations with 8, 16, 32 and 64-bit integers, and  with common arduino functions. It should run on Arduino Uno.

For a quick test, the BigNumber for Pi would be nice :P

3Dgeo

Quote
Well, why don't you make a good Arduino benchmark then ;)
Without direct port manipulation, but with balanced calculations of 32-bit float, calculations with 8, 16, 32 and 64-bit integers, and  with common arduino functions. It should run on Arduino Uno.

For a quick test, the BigNumber for Pi would be nice :P
I think you missed the point: the original intention was to compare I/O speed, not calculation speed. There are plenty of calculation benchmarks already. Making a proper I/O benchmark would require digging through all these boards' datasheets and finding the correct registers, and that is time I would rather spend on something else. My goal was to show that this benchmark is far from the actual capabilities of these boards.

Still, here is STM32 port manipulation code; feel free to do a proper benchmark:
Code:

GPIOB_BASE->BSRR = 0b00000000000000000000001000000000;   // PB9 HIGH
GPIOB_BASE->BSRR = 0b00000010000000000000000000000000;   // PB9 LOW


Running the code above, but changed to port manipulation, I get 38911 us on the STM32F103C, almost 4 times faster. I hope this proves my point.
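
For anyone puzzling over the two binary literals above: on STM32 GPIO ports, writing a 1 to bit n (0..15) of BSRR sets pin n, and writing a 1 to bit n+16 resets it, so PB9 uses bit 9 and bit 25. The sketch below is a host-side model of that behaviour, not code for the board itself; the helper names (`bsrr_set_mask`, `bsrr_reset_mask`, `apply_bsrr`) are hypothetical and are not part of any STM32 Arduino core.

```cpp
#include <cassert>
#include <cstdint>

// Mask that sets pin `pin` when written to BSRR (lower half-word).
inline std::uint32_t bsrr_set_mask(unsigned pin) {
    assert(pin < 16u);
    return std::uint32_t{1} << pin;
}

// Mask that resets pin `pin` when written to BSRR (upper half-word).
inline std::uint32_t bsrr_reset_mask(unsigned pin) {
    assert(pin < 16u);
    return std::uint32_t{1} << (pin + 16u);
}

// Model of the effect a BSRR write has on the 16-bit output data register.
// If the same pin is both set and reset in one write, the set wins.
inline std::uint16_t apply_bsrr(std::uint16_t odr, std::uint32_t bsrr) {
    const std::uint16_t set_bits = static_cast<std::uint16_t>(bsrr & 0xFFFFu);
    const std::uint16_t reset_bits = static_cast<std::uint16_t>(bsrr >> 16);
    return static_cast<std::uint16_t>((odr & ~reset_bits) | set_bits);
}
```

`bsrr_set_mask(9)` and `bsrr_reset_mask(9)` give 0x00000200 and 0x02000000, the same values as the two literals. Because each BSRR write changes only the selected pin, no read-modify-write of the port is needed.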

Riva

Quote
You are comparing potatoes to bananas: if you want real speed, use direct port manipulation on ALL boards, and you still will not get "real" data, because some boards are better optimized by the IDE/language than others.
Direct port manipulation will always be the fastest but reduces code portability. For most users, the Arduino built-in I/O commands are all they need.

I have not yet tried the tests posted here on the ESP8266 or ESP32.
Don't PM me for help as I will ignore it.

westfw

The original benchmark is fine.   It compares the speed of the stock digitalWrite() across multiple boards.
That does make it a "library benchmark" rather than CPU benchmark, but it's still valid (more valid than just comparing clock rates, for example.)

It'd be nice if digitalWrite() were more consistent internally: on some boards it turns off PWM if a prior analogWrite() has been done; on other boards it doesn't, and I'm not sure what happens.

Note that the ESP32 boards seem to locate digitalWrite() in RAM, which is typically faster than running it from flash.

PS: do not try to disassemble ESP32 code with the ESP8266 version of objdump!
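
One way to separate the library cost from the clock rate is to convert each time into CPU cycles per loop iteration: microseconds times MHz gives cycles. The sketch below is only an illustration; the function name `cycles_per_iteration` is made up, and it assumes each board really ran at the nominal clock listed in the first post.

```cpp
#include <cassert>

// CPU cycles spent per loop iteration (two digitalWrite() calls plus loop
// overhead), given the total time in microseconds and the clock in MHz.
double cycles_per_iteration(double total_us, double clock_mhz,
                            long iterations = 100000) {
    assert(iterations > 0);
    return total_us * clock_mhz / static_cast<double>(iterations);
}
```

With the numbers from the first post this gives roughly 212 cycles per iteration for the Mega 2560 at 16 MHz, 66 for the ESP32 at 240 MHz and 58 for the STM32F407V at 168 MHz.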

dsyleixa

I once published a benchmark featuring low- and high-level tests, plus GPIO r/w for an AVR, both via registers and via digitalRead/Write:

https://forum.arduino.cc/index.php?topic=431169.msg4144804#msg4144804

test design:
  0   int_Add     50,000,000 int +,- plus counter
  1   int_Mult    10,000,000 int *,/  plus counter
  2   fp32_ops    2,500,000 fp32 mult, transc.  plus counter
  3   fp64_ops    2,500,000 fp64 mult, transc.  plus counter (if N/A: 32bit)
  4   randomize   2,500,000 Mersenne PRNG (+ * & ^ << >>)
  5   matrx_algb  150,000 2D Matrix algebra (mult, det)
  6   arr_sort    1500 shellsort of random array[500]
  7   GPIO toggle 6,000,000 toggle GPIO r/w  plus counter
  8   Graphics    10*8 textlines + 10*8 shapes + 20 clrscr



Benchmarks:

Arduino MEGA + ILI9225 + Karlson UTFT + Arduino GPIO-r/w
  0     90244  int_Add
  1    237402  int_Mult
  2    163613  fp32_ops(float)
  3    163613  fp32_ops(float=double)
  4    158567  randomize
  5     46085  matrx_algb
  6     23052  arr_sort
  7     41569  GPIO toggle
  8     62109  Graphics   
runtime total: 986254
benchmark:     51




Arduino MEGA + ILI9225 + Karlson UTFT + Register bitRead/Write
  0     90238  int_Add
  1    237387  int_Mult
  2    163602  fp32_ops (float)
  3    163602  fp32_ops (float=double)
  4    158557  randomize
  5     45396  matrx_algb
  6     23051  arr_sort
  7      4528  GPIO_toggle bit r/w
  8     62106  Graphics   
runtime total: 948467
benchmark:     53


Arduino MEGA + adafruit_ILI9341 Hardware-SPI  Arduino GPIO r/w
  0     90244  int_Add
  1    237401  int_Mult
  2    163612  fp32_ops (float)
  3    163612  fp32_ops (float=double)
  4    158725  randomize
  5     46079  matrx_algb
  6     23051  arr_sort
  7     41947  GPIO toggle
  8      6915  Graphics   
runtime total: 931586
benchmark:     54


Arduino MEGA + adafruit_ILI9341 Hardware-SPI  GPIO register r/w
  0     90244  int_Add
  1    237401  int_Mult
  2    163612  fp32_ops (float)
  3    163612  fp32_ops (float=double)
  4    158725  randomize
  5     46079  matrx_algb
  6     23051  arr_sort
  7      4528  GPIO toggle register r/w
  8      6915  Graphics   
runtime total: 894167
benchmark:     56



Arduino/Adafruit M0 + adafruit_ILI9341 Hardware-SPI
  0      7746  int_Add
  1     15795  int_Mult
  2     89054  fp32_ops
  3    199888  fp64_ops(double)
  4     17675  randomize
  5     18650  matrx_algb
  6      6328  arr_sort
  7      9944  GPIO_toggle
  8      6752  Graphics
runtime total: 371832
benchmark:     134



Arduino DUE + adafruit_ILI9341 Hardware-SPI
  0      4111  int_Add
  1      1389  int_Mult
  2     29124  fp32_ops(float)
  3     57225  fp64_ops(double)
  4      3853  randomize
  5      4669  matrx_algb
  6      2832  arr_sort
  7     11859  GPIO_toggle
  8      6142  Graphics   
runtime total: 121204
benchmark:     413   



Arduino/Adafruit M4 120MHz + adafruit_HX8357 Hardware-SPI
  0      2253  int_Add
  1       872  int_Mult
  2      2773  fp32_ops (float)
  3     24455  fp64_ops (double)
  4      1680  randomize
  5      1962  matrx_algb
  6      1553  arr_sort
  7      2395  GPIO_toggle
  8      4600  Graphics   
runtime total: 39864
benchmark:     1254   



Arduino/Adafruit ESP32 + adafruit_HX8357 Hardware-SPI
  0      2308  int_Add
  1       592  int_Mult
  2      1318  fp32_ops
  3     14528  fp64_ops
  4       825  randomize
  5      1101  matrx_algb
  6       687  arr_sort
  7       972  GPIO_toggle
  8      3053  Graphics   
runtime total: 25384
benchmark:     1969


Raspberry Pi:

Raspi 2 (v1): 4x 900MHz,  GPU 400MHz, no CPU overclock, full-HD, openVG:
  0     384  int_Add
  1     439  int_Mult
  2     346  fp32_ops(float)
  3     441  fp64_ops(double)
  4     399  randomize
  5     173  matrx_algb
  6     508  arr_sort
  7     823  GPIO_toggle
  8    2632  graphics
runtime total: 6145
benchmark: 8137   

dsyleixa

As for GPIO r/w, on a Mega2560 it's:
41569 ms   (Arduino digitalRead/Write)
 4528 ms   (register bit r/w)

M0 SAMD21:           9944 ms   (digitalRead/Write)
Arduino Due:        11859 ms   (digitalRead/Write)
Adafruit M4 120MHz:  2395 ms   (digitalRead/Write)
Adafruit ESP32:       972 ms   (digitalRead/Write)


rtek1000

Very well guys!

Congratulations on the additional information and explanations!

Information is always welcome, because datasheets are usually just summaries and not everyone can understand them easily.

Thank you!
