How fast does an Arduino UNO execute a loop? [solved]

Hi all,
I hope I'm in the right section of the forum.
I'm trying to understand how fast my Arduino can execute a loop.
The results and the code are in the attached pictures.
I tried two versions of the code: one with a 10 ms delay function and one without it.
The scope probe is connected to Arduino pin 4.
In the delayed loop I got 50 Hz. I'm a beginner in this field, so I thought it should have been something like 100 Hz.
In the loop without the delay I got 129 kHz. Where does this value come from? Isn't this value pretty far from the 16 MHz clock?
It should be 16 million instructions per second, so do I have to conclude that more than 100 instructions are executed under the hood in my simple loop?
Cheers

[Attachment: code.png]

"In the delayed loop I got 50 Hz. I'm a beginner in this field, so I thought it should have been something like 100 Hz."

Your loop delays 10 ms, sets the output LOW, delays 10 ms, then sets the output HIGH. One full cycle therefore takes 20 ms, which is 50 Hz.
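As a quick check of that arithmetic, here is a small sketch in plain C++ (meant for a PC compiler, not the Arduino; the function name squareWaveHz is made up for illustration). It assumes the loop consists of two equal delays with one pin write after each, and it ignores the time taken by the writes themselves.

```cpp
// Frequency of a square wave whose HIGH and LOW phases each last halfPeriodMs.
// Ignores the few microseconds spent in digitalWrite() itself.
constexpr unsigned long squareWaveHz(unsigned long halfPeriodMs) {
    return 1000UL / (2UL * halfPeriodMs);  // 1000 ms per second / full period in ms
}

// 10 ms LOW + 10 ms HIGH = 20 ms per cycle.
static_assert(squareWaveHz(10) == 50, "two 10 ms delays give 50 Hz");
```

By the same formula, getting 100 Hz on the scope would need delay(5) in each half of the cycle.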

"In the loop without the delay I got 129 kHz. Where does this value come from? Isn't this value pretty far from the 16 MHz clock?
It should be 16 million instructions per second, so do I have to conclude that more than 100 instructions are executed under the hood in my simple loop?"

There is some unseen code that runs between calls to loop() and checks the serial port, but most of the time is spent in digitalWrite(), which can be quite slow. There is also the periodic timer interrupt that updates the millis() counter about once per millisecond (every 1.024 ms on a 16 MHz Uno).
If you need more speed you can write directly to the output ports; see this page: Port Manipulation. What that reference page does not mention is an even faster way to toggle an output: writing a 1 to the corresponding bit in the port's input register (PINx).

Thanks

Oh, yeah, definitely >100 instructions!
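To put a rough number on that, the figures from the first post can be turned into a cycle budget. This is a minimal sketch in plain C++ for a PC compiler (the function name is hypothetical). It assumes the no-delay loop does one digitalWrite(LOW) and one digitalWrite(HIGH) per output period, like the delayed version; the actual code was only posted as an image.

```cpp
// CPU clock cycles that elapse during one full period of the output signal.
constexpr unsigned long cyclesPerPeriod(unsigned long cpuHz, unsigned long signalHz) {
    return cpuHz / signalHz;  // integer division, rounds down
}

// 16 MHz clock, 129 kHz measured on the scope.
static_assert(cyclesPerPeriod(16000000UL, 129000UL) == 124,
              "about 124 clock cycles per output period");
```

That is about 124 cycles per period, or roughly 62 cycles per digitalWrite() call including the loop overhead. Most AVR instructions take one or two cycles, so something on the order of a hundred instructions per period is the right ballpark.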

All the functions that take Arduino pin numbers are slow, because the lookup that converts an Arduino pin number to a port register and a bit within that port is kinda awkward. Plus, they can't know whether an interrupt might also try to write the same port, so they have to briefly disable interrupts for the read-modify-write cycle, and there go a few more cycles... And there is some stupid garbage for the "serial event" abomination that they added not all that long ago (and which nobody uses - ffs, just poll serial!)

If you define your own main() in the sketch, it replaces the one from the core. Doing that with the version below ditches serialEvent and gains a bit of speed, but not all that much:

    int main() {
        init();   // sets up the timers for millis() and PWM, among other things
        setup();
        while (1) {
            loop();
        }
    }

It's mostly just how slow digitalWrite() is. Assuming an Uno where pin 14 (A0) is already set as an output (I picked that pin because I happen to remember off the top of my head which port it's on):

    while (1) {
        PORTC |= 1;    // A0 HIGH
        PORTC &= ~1;   // A0 LOW
    }
is much faster.
    while (1) {
        PINC = 1;      // toggles A0 on every pass
    }
is faster still!
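The snippets above use A0, which is bit 0 of PORTC. For other pins, the Uno (ATmega328P) mapping is: pins 0-7 on PORTD bits 0-7, pins 8-13 on PORTB bits 0-5, and A0-A5 (14-19) on PORTC bits 0-5. The following plain C++ sketch (host-side, with hypothetical names PortId, portOf and maskOf; it is not how the Arduino core stores the mapping) works out the register and bit mask for a pin number, for instance pin 4 from the original question.

```cpp
#include <cstdint>

enum class PortId { B, C, D };

// Which 8-bit port an Uno (ATmega328P) pin number belongs to.
constexpr PortId portOf(uint8_t pin) {
    return pin < 8 ? PortId::D : (pin < 14 ? PortId::B : PortId::C);
}

// Bit mask of the pin within its port.
constexpr uint8_t maskOf(uint8_t pin) {
    return static_cast<uint8_t>(1u << (pin < 8 ? pin : (pin < 14 ? pin - 8 : pin - 14)));
}

static_assert(portOf(4) == PortId::D && maskOf(4) == 0x10, "pin 4 is PORTD bit 4");
static_assert(portOf(14) == PortId::C && maskOf(14) == 0x01, "A0 is PORTC bit 0");
```

So for pin 4 the equivalent of the toggle loop above should be PIND = 0x10; inside the while, again assuming the pin is already configured as an output.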

I've seen an implementation of digitalWrite() that actually optimizes for compile-time-known cases (i.e., when you say digitalWrite(5, HIGH), everything is known at compile time, so it could be simplified to just an interrupt-safe port write - but the core doesn't do that). I was going to take it for my cores, but the code is... ugly...

Oh, and I almost forgot, if the pin has PWM, it also has to check what timer the pin is on and make sure it's not set to output PWM...

There is extensive discussion of the whole "maximum pin toggle speed" topic here: Maximum pin toggle speed - Frequently-Asked Questions - Arduino Forum. (OK, it might be a little dated by now. In particular, it doesn't talk at all about the whole "we need to emulate the ATmega8 pull-up enable feature on any new chip that would normally do it differently!" issue. On the ATmega8 the pull-up is enabled by writing a 1 to the output register while the pin is in input mode; on newer chips (SAM, even the ATmega4809) it is almost always a separate configuration. digitalWrite() keeps getting slower, cycles-wise.)

Every I/O pin is part of an 8-bit port. Every port has 3 registers:

PORT register that you write to in order to set or clear pins internally.
PIN register that you read to see the actual pin states.
DDR register that sets the data direction (INPUT/OUTPUT) for each pin.

A bonus is that writing a 1 to any bit in the PIN register toggles the same bit of the PORT register (bits written as 0 are left alone).
There is no need to read PINx, make a mask and write PORTx just to flip a bit.

Where you get the real savings is when you have multiple I/Os on the same port: a single register write can change all of them at once.
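A minimal sketch of that saving, written as plain C++ so it can be checked on a PC: the port is modelled as an ordinary byte, and the helper names writeMasked and afterPinWrite are made up for illustration. On the real chip you would apply the same expressions to PORTx and PINx.

```cpp
#include <cstdint>

// New PORTx value after updating only the bits selected by mask;
// all other pins on the port keep their current state.
constexpr uint8_t writeMasked(uint8_t port, uint8_t mask, uint8_t value) {
    return static_cast<uint8_t>((port & ~mask) | (value & mask));
}

// Effect on PORTx of writing `ones` to PINx: every bit written as 1 toggles.
constexpr uint8_t afterPinWrite(uint8_t port, uint8_t ones) {
    return static_cast<uint8_t>(port ^ ones);
}

// Set bits 4..7 of a port to 1010 in one write, leaving bits 0..3 alone.
static_assert(writeMasked(0x0F, 0xF0, 0xA0) == 0xAF, "upper nibble replaced");
// Toggle bits 0 and 2 together with a single PINx write.
static_assert(afterPinWrite(0x05, 0x05) == 0x00, "both bits flipped");
```

On an Uno, four digitalWrite() calls for pins 4-7 could in principle become the single statement PORTD = (PORTD & ~0xF0) | 0xA0; with interrupts disabled around it if an interrupt routine also writes PORTD.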