Fundamentals - Code size/speed: Arduino, Native-C, Assembly

falcon74 · December 30, 2011, 5:22pm

Having used Arduino for a while now, and only lately moving to bare-metal AVRs, I've come across several comments implying that the same logic (something less trivial than blinking an LED or switching a small set of pins, a bit more real and involved), when implemented in Arduino, in native C (dealing directly with ports/registers etc.), and in AVR 8-bit assembly language, will yield progressively more compact, efficient code (in that order).

Now my question: while I can understand the differences between a C and an assembly program to an extent, my understanding was that modern compilers with aggressive optimization can quite often bring the size/speed difference between C and assembly code close, although assembly in the hands of a clever code craftsman would invariably win. Is this assumption correct? What about the avr-gcc used in Arduinos? Can we say that it is indeed such a modern compiler, one that does a good job of optimization?

Another question: apart from the 'Arduino Bootloader', which only makes "reprogramming" quick-n-easy (& nothing else), does the "Arduino" style of writing software add extra 'flab' in any way, compared to native C? I believe that Arduino code gets converted in an intermediate step to C (or C++) code and then compiled. Is this conversion suboptimal compared to hand-written C? Or is the culprit the linked libraries, which might be the extra baggage (due to unused functionality in the libraries)?

Finally, if Arduino code gets converted to C/C++ as an intermediate step, is it possible to see the result of this step, halt there, manually edit it, and then compile? One may ask the reason / benefit of this approach. I'd presume it is a quick-n-easy way to jump-start on native C without knowing too much of the AVR intricacies. So far, all this is just a thought... haven't tried anything as yet.

dc42 · December 30, 2011, 5:38pm

The code you write in the Arduino environment is almost legal C++ already. The main difference is that the Arduino environment generates the header file for you, saving you from writing it (or alternatively, from forward-declaring any functions or global variables that you forward reference).
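To illustrate the forward-declaration point, here is a minimal sketch in plain C++. The function names `scaleReading` and `readScaled` are hypothetical, made up for this example. The first declaration is the kind of line the Arduino environment would generate for you; outside it, you write that line yourself.

```cpp
// Forward declaration: the kind of line the Arduino build step generates for you.
// In plain C++ you write it yourself (or define the function before its first use).
int scaleReading(int raw);

// Uses scaleReading() before its definition appears in the file.
int readScaled() {
  return scaleReading(512);
}

// Hypothetical helper: maps a 10-bit value (0..1023) onto 8 bits (0..255).
int scaleReading(int raw) {
  return raw / 4;
}
```

Without the declaration at the top, a plain C++ compiler would reject the call inside `readScaled()` because `scaleReading` has not been seen yet.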

The main flab that the Arduino environment adds is library functions such as digitalRead and digitalWrite. These functions are not very large, however they take a significant amount of time to execute. Direct port access is much faster, and should be used where speed is critical.
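As a rough sketch of what direct port access looks like, assuming an ATmega328P-based board (such as the Uno) where digital pin 13 is bit 5 of port B: the helper names `ledInit`, `ledOn` and `ledOff` are hypothetical, and the `#else` branch only provides host stand-ins for the registers so the snippet also compiles off-target.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>  // real DDRB / PORTB register definitions on AVR
#else
// Host stand-ins (assumption, for illustration only): plain variables
// in place of the memory-mapped registers.
static volatile uint8_t DDRB = 0;
static volatile uint8_t PORTB = 0;
#endif

// Assumption: bit 5 of port B is digital pin 13 (ATmega328P boards).
const uint8_t LED_BIT = 5;

// Roughly what pinMode(13, OUTPUT) achieves.
void ledInit() { DDRB |= (1 << LED_BIT); }

// Roughly what digitalWrite(13, HIGH) achieves.
void ledOn() { PORTB |= (1 << LED_BIT); }

// Roughly what digitalWrite(13, LOW) achieves.
void ledOff() { PORTB &= ~(1 << LED_BIT); }
```

Because the port and bit are compile-time constants here, avr-gcc can typically reduce each helper to a single bit set/clear instruction, whereas digitalWrite has to look up the port and bit for its pin number at run time. The price is portability: the pin-to-port mapping differs between boards.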

In Arduino projects, typically you run out of RAM before you run out of program space. So code flab is not much of a problem, but data flab is. There isn't much data flab in the Arduino library apart from the transmit/receive buffers for the serial port, which you need anyway if you are using it (it's worse on the Mega because there are more serial ports). The main ways to conserve RAM are to avoid using the String class and to store text strings and read-only data tables in PROGMEM.
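A minimal sketch of the PROGMEM technique, using avr-libc's `PROGMEM` attribute and `pgm_read_byte`. The table `kSquares` and the accessor `squareOf` are hypothetical names for illustration, and the `#else` branch is a host stand-in (an assumption) so the snippet also compiles on a PC, where there is no separate flash address space.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>  // PROGMEM and pgm_read_byte on AVR
#else
// Host stand-ins (assumption, for illustration only).
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

// Read-only lookup table kept in flash on AVR instead of being copied to RAM.
const uint8_t kSquares[] PROGMEM = {0, 1, 4, 9, 16, 25, 36, 49};

// Fetch one entry; on AVR this reads from program memory.
// The caller must keep i within 0..7.
uint8_t squareOf(uint8_t i) {
  return pgm_read_byte(&kSquares[i]);
}
```

On AVR, a table declared this way must be read through the `pgm_read_*` accessors; indexing it directly would read from the wrong address space.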

If you are developing code for native AVR processors, you can save 512 bytes of code space by not using the bootloader - just upload your program via ICSP instead.

Almost nobody programs microprocessors in assembler these days, except for very short pieces of assembler in an otherwise C or C++ program. gcc is quite good at optimization.

udoklein · December 30, 2011, 7:04pm

If you are not satisfied with C you should consider Forth. Forth is not mainstream, but it usually gives more compact code than C does. Sometimes Forth programs will outperform C programs because Forth leads to different abstractions. By the way, this is also the reason why you can sometimes outperform assembler programs with high-level languages. Although assembler will in theory always win, there is a limit to the complexity of what you can do with assembler.

If you really need to squeeze the maximum out of your CPU, the best approach is usually to write everything in C and to optimize the algorithms. If this is not sufficient, you should consider the cost of optimizing the hot spots by hand. Usually this leads to the insight that it is cheaper to buy a more powerful CPU. Unless, of course, you are designing for mass production. Then you might have to go for 4-bit CPUs and squeeze absolutely everything out of this hardware.

falcon74 · December 31, 2011, 6:40am

dc42:
The code you write in the Arduino environment is almost legal C++ already. The main difference is that the Arduino environment generates the header file for you, saving you from writing it (or alternatively, from forward-declaring any functions or global variables that you forward reference).

The main flab that the Arduino environment adds is library functions such as digitalRead and digitalWrite. These functions are not very large, however they take a significant amount of time to execute. Direct port access is much faster, and should be used where speed is critical.

Thanks @dc42. I recently used the FastDigitalWrite/Read library, which has functions like digitalReadFast() / digitalWriteFast(), in a project where I could clearly see the speed difference due to the switch. However, I guess using that library "adds" to the code size, since it doesn't replace the digitalRead()/digitalWrite() already part of the library, right? Is the Arduino library one big chunk, and not a bunch of smaller libraries?

dc42:
In Arduino projects, typically you run out of RAM before you run out of program space. So code flab is not much of a problem, but data flab is. There isn't much data flab in the Arduino library apart from the transmit/receive buffers for the serial port, which you need anyway if you are using it (it's worse on the Mega because there are more serial ports). The main ways to conserve RAM are to avoid using the String class and to store text strings and read-only data tables in PROGMEM.

If you are developing code for native AVR processors, you can save 512 bytes of code space by not using the bootloader - just upload your program via ICSP instead.

Almost nobody programs microprocessors in assembler these days, except for very short pieces of assembler in an otherwise C or C++ program. gcc is quite good at optimization.

Great bunch of tips there. These should make their way into some kind of 'Arduino on steroids' tutorial, but I'm keeping them in my own notebook.

system · December 31, 2011, 7:20am

If you never make any calls to digitalRead, then it will not get linked into your final executable. If you are really interested in this stuff, the thing to do is to roll your own command line build system and (most importantly), generate a MAP file.

Here's an example of a map file from the Blink sketch: Blink.ino linker map · GitHub

You can see there's a bit of excess crud from the Arduino environment, but it's not bad. Given that we have 32k of program space, it's a small price to pay for convenience. The biggest violators BY FAR are the serial buffers. I haven't even used them, but they show up in RAM which is the most constrained resource, and they're relatively BIG.

Anyway, I've written fairly large codebases in Assembly "back in the day", and I can safely say, thank the gods those days are past. No point in writing anything but a few key pieces of performance critical code in assembly.

As for "seeing the intermediate files", you can set an option in preferences to see the detailed output. In that spew, toward the beginning, will be the real CPP file that's being fed to the compiler. You can check that out to see what happened to your PDE/INO. There's not much difference. Now, if you want to see the assembler, and of course you have your own build system, pass "-S" to the compiler instead of "-c" and you'll see exactly what the compiler wants to hand off to the assembler. This is a useful forensic technique to figure out exactly what happened to your beautiful prose.

system · December 31, 2011, 4:24pm

Oops... Need to revise this. The map listed above is wrong, because the libraries were compiled in directly and not put into a library first. This map corrects that: Blink.ino linker map (updated) · GitHub

This has even less overhead. Really the only overhead in this case is the pin mapping tables taking up program memory even though we're only using one pin.

falcon74 · January 1, 2012, 10:14am

Thanks @maniacbug.

Your point about assembly language programming is duly noted. About 20 years back I did some pretty painful machine language programming on a Commodore 64, feeding in each instruction using BASIC's POKE. Argh, dreadful days. No debugger, no breakpoints, no nothing.

Regarding generating map files and rolling a command-line build system: is it documented somewhere under the Arduino tutorials ('coz I didn't find it), or only under the avr-gcc documentation? Not sure if I am looking in all the right places.

Wish you and all fellow members here a happy new year.

udoklein · January 1, 2012, 10:24am

Google finds:

http://www.arduino.cc/playground/Learning/CommandLine

However I would prefer scons:

http://code.google.com/p/arscons/

system · January 1, 2012, 3:23pm

Search the forums for 'makefile'. Here's one recent post: http://arduino.cc/forum/index.php/topic,83725.0.html

Personally, I use Jam. I find it both simpler and more powerful than make. No one else seems to use it, though, so I may switch to scons too in the future.
