Improving Performance of 'void loop()'

Hello,

I'm in the middle of porting one of my projects which I started in 2003 over to the Arduino platform. Namely, I'm modernizing the device using newer technologies. While writing native code was nice and the throughput was great, I also spent far too much time writing my own drivers while the Arduino platform offers a canned solution for just about everything I need.

Part of this porting activity requires that I benchmark the native calls as much as possible so I can squeeze every bit of performance out of this processor. Much of what I'm looking for is low-hanging fruit that is easy to improve.

The first issue I identified was the woefully slow implementation of digitalWrite(), which I eventually found multiple references to throughout this forum. I may write a new abstraction for that, but I see very little need since it's very easy to perform bitwise operations on the PORT* registers.
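
For anyone who wants that without a full abstraction, here is a minimal sketch of the bitwise approach. The helper names set_bit, clear_bit and toggle_bit are made up for illustration and are not part of the Arduino core. They take the register by reference, so the same code compiles on a PC against a plain variable; on an AVR I would expect PORTB from <avr/io.h> to be usable the same way, though I have not measured what the compiler emits for it.

```cpp
#include <stdint.h>

// Hypothetical helpers: each does a read-modify-write on one bit of an
// 8-bit register passed by reference (bit must be in the range 0..7).
inline void set_bit(volatile uint8_t &reg, uint8_t bit) {
    reg = static_cast<uint8_t>(reg | (1u << bit));   // force the bit to 1
}

inline void clear_bit(volatile uint8_t &reg, uint8_t bit) {
    reg = static_cast<uint8_t>(reg & ~(1u << bit));  // force the bit to 0
}

inline void toggle_bit(volatile uint8_t &reg, uint8_t bit) {
    reg = static_cast<uint8_t>(reg ^ (1u << bit));   // invert the bit
}
```

On the board this would be called as something like toggle_bit(PORTB, 5). Keep in mind that a read-modify-write like this may not be atomic, so an interrupt that touches the same port can interfere with it.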

The second low hanging fruit absolutely surprised me and I've found no reference to it here, so I thought I'd share.

In short, the function "void loop()" in which our main code resides is REALLY SLOW.

Case 1 - Running code within the loop() function:

void loop()
{

  PORTB ^= 0xFF; 

}

Case 2 - Running code nested within a for ( ;; ) loop within the loop() function:

void loop()
{

  for (;;) {
    PORTB ^= 0xFF; 
  }

}

Case 1 toggles PORTB at 533 kHz, while Case 2 toggles it at 1.6 MHz. That is roughly a threefold improvement.
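
To put those numbers in terms of clock cycles, here is a rough calculation. It assumes the scope readings are square-wave frequencies, so one period contains two toggles, and that the CPU runs at 16 MHz. The function names are just for illustration.

```cpp
// Estimate CPU cycles spent per port toggle from a measured square-wave
// frequency. Assumption: one full period of the wave contains two toggles.
double cycles_per_toggle(double cpu_hz, double wave_hz) {
    return cpu_hz / (2.0 * wave_hz);
}

// Extra cycles per pass in the slow case compared with the fast case,
// i.e. the cost attributed to leaving and re-entering loop().
double loop_overhead_cycles(double cpu_hz, double slow_hz, double fast_hz) {
    return cycles_per_toggle(cpu_hz, slow_hz) - cycles_per_toggle(cpu_hz, fast_hz);
}
```

With cpu_hz = 16e6, 1.6 MHz works out to 5 cycles per toggle and 533 kHz to about 15, so under these assumptions each trip back through loop() costs about 10 cycles.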

Compared with everything else our code will be doing, this overhead may not amount to very much, but it does count for something and is very easy to work around at the expense of a for loop. Surprisingly, the compiled size of both cases was 4274 bytes. This tells me that the optimizer handled both versions equally well and that the workaround costs nothing in flash storage.

Attached are two screenshots from my scope: "loop" shows Case 1 and "loop for" shows Case 2.

For reference, this is running on an ATmega32U4 at 16 MHz.

Hope this helps someone!

Cheers!

  • Jm

loop for.png

loop.png

Welcome to the Arduino platform.

I started with an ATtiny and the www.avrfreaks.net website. Since I moved to Arduino, I have had enough possibilities to optimize timing without leaving the Arduino platform, although I sometimes miss the extreme global optimizations that could reduce the code size by 20 or 30%.
Arduino uses TIMER0 for timing, and it generates an interrupt. You have to keep that in mind, the interrupt can disturb specific timing critical code.
The code in the Arduino can be a mix of direct register manipulation in 'C' and also 'C++'. You can use code from the Projects section of avrfreaks.net, as long as it doesn't conflict with something else.

I never heard someone complain about the timing of the loop() function :~
When the loop() is finished, it checks for a "serial event" before returning to a new loop().

Do you use the new Arduino IDE 1.5.7 BETA? It uses a newer avr-gcc version and creates more optimized code.

Hi, what does your program do that makes it sensitive to loop time?

Tom... :)

@Peter_n - Timer0 functions as it does with the rest of the AVR line. I've seen some differences with the SAMD21, but that is a whole different architecture. There may be a way to abate this by running the majority of sensitive code within interrupts and defining interrupt priorities, but I have not yet had to do that. That method is well documented in Atmel's uC datasheets.

Not a complaint, just a report of an observation and a free, viable workaround.

I'm using Arduino IDE 1.5.7 BETA integrated with Atmel Studio and Visual Micro.

@Tom_George - I've had a few projects that run out of cycles on my uC and/or CPU and found the best way to make sure this doesn't happen is to spend a day or two to understand the errata of the platform. With that information, I (or you) can build a set of personal best practices to do more with what little we have to work with.

Cheers!

Don't bother with loop()

just put the while statement into the bottom of setup();

void setup()
{
  // actual setup code here

  while (true)
  {
    PORTB ^= 0xFF;
  }
}

If you look at the core code that calls loop(), you'll see that every time loop() returns, it checks the serial port and calls serialEvent() if any characters are waiting.
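
As a rough model of that behaviour (not the actual core source, which lives in main.cpp of your install), the control flow looks something like this. All names ending in _model are made up so the example compiles and terminates on a PC; the real core loops forever and calls the real setup() and loop().

```cpp
#include <cstdint>

// Counters so the model's behaviour can be observed on a host PC.
static std::uint32_t setup_calls = 0;
static std::uint32_t loop_calls = 0;
static std::uint32_t serial_checks = 0;

// Hypothetical stand-ins for the user's setup()/loop() and the core's
// serial-event check.
void setup_model() { ++setup_calls; }
void loop_model() { ++loop_calls; }
void serial_event_check_model() { ++serial_checks; }

// Shape of the core's main routine: setup once, then call loop() and the
// serial-event check on every pass. Bounded by `passes` so it terminates.
void run_core_model(std::uint32_t passes) {
    setup_model();
    for (std::uint32_t i = 0; i < passes; ++i) {
        loop_model();                // the user's code runs once per pass
        serial_event_check_model();  // extra work after every return
    }
}
```

Calling run_core_model(3) runs the loop body three times and the serial check three times. That per-pass call, return and check is the overhead an inner for (;;) or while (true) avoids.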

Personally I'd just prefer to use void main() like normal C code, but I can understand why the original developers of the Arduino platform separated setup and loop to help non-programmers.

PS. serialEvent() is also a complete waste of time, as it's not called from the serial interrupt at all.

rogerClark:
Don't bother with loop()

just put the while statement into the bottom of setup();

Interesting Thread.

If you do as you suggest and leave loop() empty does the compiler/linker optimize it (loop) away or does it keep working and using up CPU cycles?

I agree 100% about serialEvent().

...R

@Robin2

No. Sorry.

I meant leave loop() in place, just don't ever return from setup().

I just did some tests, and loop() is needed as it's called from the core.

Also, the compiler doesn't seem to be smart enough to notice in this test that loop() is never going to get called:

int i;

void setup() {
  // put your setup code here, to run once:
  while (true)
  {
    Serial.println(i, DEC);
  }
}

void loop() {
  // put your main code here, to run repeatedly:
  while (true)
  {
    Serial.println(i, DEC);
  }
}

I was really just trying to point out that you can write any program without ever needing to have any code in loop().

The first issue I identified was the woefully slow implementation of "DigitalWrite", which I eventually found multiple references to throughout this forum. I may write a new abstraction for that, but I see very little need since it's very easy to perform bit wise operations on the PORT* registers.

Bit manipulation is indeed simple, but not immediately portable.
That's one of the reasons that "digitalRead"/"digitalWrite" exist.

The first issue I identified was the woefully slow implementation of "DigitalWrite", which I eventually found multiple references to throughout this forum. I may write a new abstraction for that,

http://code.google.com/p/digitalwritefast/

Thanks for highlighting the speed difference between "for" and loop().

While I'm able to learn a lot about Arduino from reading the references, learning about speed is mostly through reading the forum.

I had found out about DigitalReadFast (and the non-portable direct register manipulation) earlier, but hadn't seen any mention of the speed of loop().