"optimizing compiler"

I'm in the home stretch on a large project involving radio remote control and speech among other things, and found that my sketch won't fit in 32k anymore, so I set about seeing what I could do to optimize it. Either I'm doing something very wrong, or the compiler is. I'm hoping that someone with more experience with large projects can shed some light on my experiences.

First off, I use serial to debug, a lot. This eats up sram of course, and will bring an ardweeny to its knees pretty fast. So I do my development on a mega, and then turn off the debug code, and it will then run fine on a smaller proc. I'm turning the debugs on and off with #define switches, to exclude use of Serial. This has the double effect of keeping the Serial library off the flash as well as shedding a ton of string literals. I've run into the ironic situation that turning OFF the debug is yielding a larger build. As in, ALL references to Serial are removed via a main #define being commented out.
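For anyone who hasn't set one up, here is a minimal sketch of the kind of #define switch described above. Only the name debugging_serial comes from this thread (it shows up later in the Config module); the DEBUG_* macros, NoteEvent and the host-only FakeSerial stand-in are made up for illustration and are not taken from the project.

```cpp
#include <stdint.h>

// Comment this line out to drop every debug print from the build.
#define debugging_serial

#ifndef ARDUINO
// Host-only stand-in for Serial so the sketch also compiles on a PC.
// Assumption for illustration; on a real board the Arduino core provides Serial.
#include <string>
struct FakeSerial {
  std::string log;
  void begin(long) {}
  void print(const char *s) { log += s; }
  void println(const char *s) { log += s; log += '\n'; }
};
static FakeSerial Serial;
#endif

#ifdef debugging_serial
#define DEBUG_BEGIN(baud) Serial.begin(baud)
#define DEBUG_PRINT(x) Serial.print(x)
#define DEBUG_PRINTLN(x) Serial.println(x)
#else
// Expand to nothing, so neither the call nor the string literal is compiled in.
#define DEBUG_BEGIN(baud) ((void)0)
#define DEBUG_PRINT(x) ((void)0)
#define DEBUG_PRINTLN(x) ((void)0)
#endif

static uint8_t event_count = 0;  // hypothetical state, just to give the function a job

uint8_t NoteEvent() {
  event_count++;
  DEBUG_PRINTLN(":NoteEvent/");
  return event_count;
}

void setup() { DEBUG_BEGIN(9600); }
void loop() { NoteEvent(); }
```

With the switch commented out, the expectation is that nothing references Serial any more and the build shrinks, which is exactly what is not happening in the project described here.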

Your knee-jerk reaction may be 'post the code', but most of you probably don't actually want that. My main sketch is nothing but 15 includes, and some of those are 20 pages long. It compiles to over 46k. In my reductions for testing, here's a "compile size sensitive" function:

// add a word to the message. provide duration in tens of ms
void AddWord(uint8_t got_word_id) {
  message_word_id[message_wordcount] = got_word_id;
  message_wordcount = min(message_wordcount+1,message_max);
  //booya
  // Serial.print(":AddWord/"); // 4.5 KB gets ADDED to the sketch if this line is commented!
}

I spent two hours with a machete, hacking things down to try to produce a small reproducible issue, but the flood gradually drops down to a trickle as I cut things away, so I stopped trimming off the top and went into the bowels to find the offender in a commonly-called area. The above function gets called A LOT. The commenting of those three lines (Serial, and the two message_) produces such counter-intuitive results that it's difficult to believe.

2292: all three commented
3162: serial and one message commented
5082: two messages commented
5108: one message commented (small difference which one)
9842: serial commented

So strangely, the worst configuration is with the serial command commented OUT! (and that is THE only serial command left in the entire sketch) And, look at the incredible sketch savings for commenting out ONE of the message commands. In that respect, it's as though the compiler isn't calling the function, but is instead replacing each of the function calls with the copies of the function's code. (add one line to the function, compiler adds 200 copies of the line to your sketch! yikes!)

So I'm somewhat at a loss as to how to "beat the compiler at its own game" of code optimizing. Doing things that seem to make sense (consolidating frequently used code into a called function, and removing unnecessary code and libraries like Serial) has the exact opposite of the expected effect. Does anyone understand the "method to the madness"?
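On the inlining suspicion: the compiler is allowed to copy a function's body into its callers, and GCC's noinline function attribute is one way to test whether that is what's happening. A minimal sketch, assuming a capacity of 16 words and an array one element larger than that (both numbers are made up for illustration; the real declarations aren't shown in this thread). The bounds check is written out by hand instead of using min() so the block also compiles off-board.

```cpp
#include <stdint.h>

const uint8_t message_max = 16;            // hypothetical capacity
uint8_t message_word_id[message_max + 1];  // hypothetical size; index may reach message_max
uint8_t message_wordcount = 0;

// noinline asks GCC to keep a single copy of the body and emit a real call
// at every call site instead of pasting the body in.
__attribute__((noinline)) void AddWord(uint8_t got_word_id) {
  message_word_id[message_wordcount] = got_word_id;
  if (message_wordcount < message_max) {
    message_wordcount++;  // same effect as min(message_wordcount + 1, message_max)
  }
}
```

If the build size stops scaling with the number of AddWord calls once the attribute is in place, inlining was at least part of the story; if nothing changes, the cause is elsewhere. This has not been tried against the actual project.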

The serial overhead for each function is added once, then each repeated use is just a function call.
If you do not use an array/variable, or there is no side effect of using it, then it will be completely dropped. Are the messages you commented out used/repeated anywhere else (not some random code somewhere, but code that is actually used in the app you're compiling)?

Of course we can only speculate without your code to test. Some of your claims seem a bit far fetched and there is probably something else causing a problem.

For fun, create a small test app. Add serial to it then comment it out. Notice how the code does not grow.

pYro_65:
The serial overhead for each function is added once, then each repeated use is just a function call.

That's what I would expect. My observations appear to support the opposite however.

If you do not use an array/variable, or there is no side effect of using it, then it will be completely dropped. Are the messages you commented out used/repeated anywhere else (not some random code somewhere, but code that is actually used in the app you're compiling)?

It's possible that's the last of the accessors to those arrays, and it may be squeezing it out a bit. But removing EITHER of them causes a reduction in size. Removing the other causes another large drop in code usage. I would expect removing BOTH could lead to a sudden drop in used space, but not individually.

Of course we can only speculate without your code to test. Some of your claims seem a bit far fetched and there is probably something else causing a problem.

That's actually useful to know. I was considering the possibility this was a known issue. But if it's something I'm doing, it means it's fixable.

For fun, create a small test app. Add serial to it then comment it out. Notice how the code does not grow.

I already did that, and it behaved as I would expect it to. Unfortunately, the margin of (size with serial commented) vs (size with serial in) drops constantly as I remove code from my project. It's like there's a multiplier in there somewhere that gets flipped on by that serial command, and says "ok he commented out the serial, let's increase project size by 15%". And something like that is impossible to chase down by reduction, because the problem withers away into nothingness by the time you get your project reduced.

Since you've expressed a willingness for abuse :wink: I will see about gathering together all the bits that hopefully you can compile and see for yourself. You'll almost certainly run into a dependency or two I've overlooked on the first attempt, but I'll try to get it as complete as possible on the first try.

pYro_65:
Of course we can only speculate without your code to test. Some of your claims seem a bit far fetched and there is probably something else causing a problem.

http://vftp.net/n0zyc/CleverFox/CleverFox_Project.zip

It compiles right now to about 30k. Open the Config module and comment out "#define debugging_serial" and compile again. Compile size here jumps to 35k. Which is really annoying because it won't fit on an Uno then. And all those Serial.print are killing its limited sram unless I turn off debugging.

99% of debugging_serial's effect is to remove code from the build. There are a handful of lines of code in one place that are substituted, but nothing that should bump compile size (I tried moving them out of the #else and it didn't really matter) Just look for #else if you want to see them.

You aren't using the Arduino IDE, are you? Or are you using Mac-specific directory commands? Because your code needs tremendous mods to even work in the IDE, too much so that I've lost motivation.

For example what is this:

../Debouncer/Debouncer.h

That was the entire file???

Maybe an include:

#include "../Debouncer/Debouncer.h"

even still, why not just include the library like you do elsewhere:

#include <Debouncer.h>

Sorry.

pYro_65:
You aren't using the Arduino IDE are you?

The problem I've run into (and attempted to get a solution for in the forum) is that I have made libraries that include other libraries that I've also made. From the few definitive answers I've found, that's simply not directly supported by the IDE/compiler, so I was forced to find a solution on my own. If you can figure out a way to get a library to use another library, I'd LOVE to know how you do it. It's not as simple as adding an #include <> somewhere, not that I've figured out anyway. I may not be placing my libraries in the correct place. The only place they "work normally" is if I stuff them deep in the bowels of the IDE's java folder, where the core libraries are, and I'd really like to avoid doing that. I don't want to expect users to put a library of mine inside the IDE rather than in the library folder.

Because your code needs tremendous mods to even work in the IDE, too much so that I've lost motivation.

I thought you said it was tweaking a couple includes? ("even still, why not just include the library like you do elsewhere:") or is it something more involved?

fwiw, I think it's possible to just put the libraries (CPP and H) in the same folder as the sketch and then include them in <> normally, but I'm using these libraries in several sketches, and so I'm not about to risk forking my libraries by accident because they exist in multiple places.

I have a few FAQ's that might help you with the library situation.

Check the third bullet point: http://arduino.land/FAQ/content/1/3/en/what-does-the-ide-change-in-my-sketch.html

And down the bottom is some library specific links.

pYro_65:
I have a few FAQ's that might help you with the library situation.

Yep that's the one. "This has been a source of error for many new Arduino users attemping to write their own libraries. As confusing as it sounds, any library included into other libraries must also be included in the sketch; simply so it can be copied to the temporary location."

One man's "confusing" is another man's "broken". If it's confusing a significant number of people, it clearly needs to be addressed, rather than listed as a "known issue" and closed. The fact that includes work in core libraries suggests the problem isn't insurmountable. (the "copy-in" process simply needs to be recursive) It works for A, users expect it to work for B. And I don't think that's unreasonable, or something that should be ignored.

I think I will just surrender and go that route, and expect the user to include the dependency libraries in the main sketch. Sort of how you have to include Wire in some sketches not because you need them, but because a library you're using needs them. Just doesn't seem like a very clean solution when a functional include IN the library would do the work for the user, and mask some of the unnecessary details. I see the job of a library to mask as much of the mechanics of what it does as possible, freeing the developer to concentrate solely on their idea and let me worry about the finer details in my library.

Anyway we're off topic. Back to the sketch. Still looking for ideas on how commenting OUT a line of code causes a 3-12kb INCREASE in compile size.

I ran across this recently --
Arduino compiler problem with #ifdefs solved. | Sub-Etha Software

I can't say if it applies to your situation, but it could be worth a look.

pantaz:
I ran across this recently --
Arduino compiler problem with #ifdefs solved. | Sub-Etha Software

I can't say if it applies to your situation, but it could be worth a look.

Thanks for the pointer, I'll file that one away in my bag of tricks, I'll be victimized by that bug at some point in the future.

Unfortunately it's not a lack of definitions that's getting me. It appears to be defining something it should not. But I don't know how best to debug from here. The more I try to simplify the sketch, the smaller the wasted space is. Until I have very little left of my sketch and close to 0 bytes wasted.

still looking for ideas if anyone has them. Or cares to look at my sketch and see if they can confirm or workaround it. My sketch uses too much sram for the uno, so if I compile with debug on, it won't even boot up the sketch. The sram just gets gridlocked. According to my debugger, debugging consumes an additional 3.7k of sram :stuck_out_tongue_winking_eye:

pYro_65:
Maybe an include:

#include "../Debouncer/Debouncer.h"

even still, why not just include the library like you do elsewhere:

#include <Debouncer.h>

fwiw I removed the include for semaphore in the debouncer library, and then added an include into the main sketch first off with:
#include <Semaphore.h>

And now when I compile, I get:

In file included from /Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp:41:
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.h:189: error: 'Semaphore' does not name a type
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp: In member function 'int Debouncer::PopValue()':
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp:217: error: 'Debouncer_Semaphore' was not declared in this scope
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp: In member function 'bool Debouncer::PopState()':
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp:253: error: 'Debouncer_Semaphore' was not declared in this scope
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp: In member function 'void Debouncer::UpdateValue(int)':
/Users/virtual1/Applications/Hardware/Arduino/Projects/libraries/DTMFdecoder/Debouncer.cpp:290: error: 'Debouncer_Semaphore' was not declared in this scope

so including Semaphore in the sketch (at the top) doesn't seem to allow my libraries to work. Could be an issue of

Yee-hah, found it. I had to include the sub-libraries at ALL levels. I have my sketch, using DTMFDecoder, which uses Debouncer, which uses Semaphore. And without includes for Semaphore in my sketch, DTMFdecoder, AND Debouncer, Debouncer.h would error out not finding Semaphore.

Annoying, but work-around-able. In which case I don't really care anymore.

Now would you care to tackle the space usage?

I haven't got much time lately, but post it and I'll check it out.

pYro_65:
I haven't got much time lately, but post it and I'll check it out.

I appreciate the help. FWIW, a small group of us have just organized this morning to meet twice a week for an evening of arduino hacking. :slight_smile: I need to find more people in my area that use arduino. I suspect I'm near the high end, but everyone has something to learn, and new ideas/challenges to present.

The sram just gets gridlocked.

Are you still using precious SRAM for debug messages?
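Some background on that question: on AVR boards a plain string literal handed to Serial.print gets copied into SRAM at startup, while wrapping it in the Arduino F() macro (available since Arduino 1.0) leaves it in flash and reads it from there. A minimal sketch of the difference; ReportReady and the message text are made up for illustration, and the FakeSerial and F() stand-ins exist only so the block compiles off-board.

```cpp
#ifndef ARDUINO
// Host-only stand-ins (assumptions for illustration). On a board, the
// Arduino core supplies both Serial and F().
#include <string>
struct FakeSerial {
  std::string log;
  void println(const char *s) { log += s; log += '\n'; }
};
static FakeSerial Serial;
#define F(s) (s)
#endif

// Hypothetical debug helper.
void ReportReady() {
  // Without F(), this literal would occupy SRAM for the whole run on an AVR.
  Serial.println(F("DTMF decoder ready"));
}
```

This trades SRAM for nothing more than a slightly slower print, so it may let the debug build boot on an Uno. It does not explain or fix the flash growth being chased in this thread.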

AWOL:

The sram just gets gridlocked.

Are you still using precious SRAM for debug messages?

I'm damned if I do, and damned if I don't. If I ifdef off the debug commands, it balloons by over 6kb and won't fit on the uno. If I turn the switch to include the debugs, it uploads, and then gridlocks.

and that is why I am so interested in knowing why IFDEF'ing out my serial causes such an increase in upload size.

This is actually pretty familiar territory for me. Years ago I got onboard with crossbasic/realbasic very early, and over the next year got to know the compiler VERY well, in all its bizarre idiosyncrasies. People would come to me to find out why the compiler was doing completely crazy things, and I could help because I'd already seen it and fought the battle and won. I find myself strangely on the other side of the fence with this new IDE, now I'm the one looking for a wizard that knows the method to the madness :wink:

update… I decided to just do a global replace for "Serial." -> "//Serial." using bbedit. After unreplacing the one include for the serial module itself, I did another compile to compare. This was while leaving the debug define alone and enabled.

Before commenting out Serial. everywhere: 32k
AFTER commenting out Serial. everywhere: 38k

so, commenting out 244 serial commands led to an INCREASE of 6k in sketch size.

does anyone have a clue why this is happening?

So is there some way to understand what is going on? Yes, there are tools for this. Not in the Arduino library but in the AVR toolbox. You can check what goes into the build, understand the difference, and thus find the source of the problem. Below is a Linux shell script that I use to generate a list and symbol file (to reduce the Cosa footprint; I do not want to end up with this type of problem :wink:).

avr-size -C /tmp/build*.tmp/$1.cpp.elf > $1.lst
avr-objdump -d /tmp/build*.tmp/$1.cpp.elf >> $1.lst
avr-nm -CS /tmp/build*.tmp/$1.cpp.elf > $1.sym

The script generates size information and an assembly listing but also a symbol file.

"Know your numbers",

Cheers!

kowalski:

avr-size -C /tmp/build*.tmp/$1.cpp.elf > $1.lst
avr-objdump -d /tmp/build*.tmp/$1.cpp.elf >> $1.lst
avr-nm -CS /tmp/build*.tmp/$1.cpp.elf > $1.sym

I haven't done that sort of thing before. avr-size isn't in PATH on my mac (10.8). Where do I enter that? And the .elf files, are those generated somewhere by the compiler, and when? Pretend I don't know anything about the avr commands, because I don't :wink: I AM very proficient with bash etc however.

Ok. I do not use a mac so this is hand-waving.

First you have to find the Arduino installation and locate the AVR tools bin directory. Add that to your PATH or simply execute the script from there. On my machine it is /opt/arduino-1.5.6-r2/hardware/tools/avr/bin.

Second check where the Arduino build puts the compiled files and replace the /tmp/build*/ etc with that path. Hint: Enable compiler printout in the Arduino IDE and look for the path when compiling.

When you get this working I will be waiting for the next question :wink:

Cheers!

kowalski:
First you have to find the Arduino installation and locate the AVR tools bin directory. Add that to your PATH or simply execute the script from there. On my machine it is /opt/arduino-1.5.6-r2/hardware/tools/avr/bin.

Here that's wherever you put the app, dug in thusly:

Arduino 1.0.5.app/Contents/Resources/Java/hardware/tools/avr/avr/bin

and in that folder we have:

ar
as
c++
cpp
g++
gcc
gccbug
gcov
ld
nm
objcopy
objdump
ranlib
strip

Second check where the Arduino build puts the compiled files and replace the /tmp/build*/ etc with that path. Hint: Enable compiler printout in the Arduino IDE and look for the path when compiling.

That one appears to be a pseudorandom temp location. In this specific case of the sketch I have loaded and run right now, it's

/var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/

So I whipped up:

#!/bin/bash
if [ -n "$1" ] ; then
  p=$1
else
  echo "provide path to build, something like /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat"
  read -p "path: " p
fi
cd "${0%/*}"
./as -C p.cpp.elf > $p.lst
./objdump -d $p.cpp.elf >> $p.lst
./nm -CS $p.cpp.elf > $p.sym
echo
echo "done"
echo
echo "List file: $p.lst"
echo "Symbol file: $p.sym"
echo
open "${p%/*}"

but got:

path: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat
./as: unrecognized option `-C'

done

List file: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat.lst
Symbol file: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat.sym

So I assume "as" is not what you're looking for. Here's as's --help:

Usage: ./as [option...] [asmfile...]
Options:
  -a[sub-option...]	  turn on listings
                      	  Sub-options [default hls]:
                      	  c      omit false conditionals
                      	  d      omit debugging directives
                      	  g      include general info
                      	  h      include high-level source
                      	  l      include assembly
                      	  m      include macro expansions
                      	  n      omit forms processing
                      	  s      include symbols
                      	  =FILE  list to FILE (must be last sub-option)
  --alternate             initially turn on alternate macro syntax
  -D                      produce assembler debugging messages
  --debug-prefix-map OLD=NEW  Map OLD to NEW in debug information
  --defsym SYM=VAL        define symbol SYM to given value
  --execstack             require executable stack for this object
  --noexecstack           don't require executable stack for this object
  -f                      skip whitespace and comment preprocessing
  -g --gen-debug          generate debugging information
  --gstabs                generate STABS debugging information
  --gstabs+               generate STABS debug info with GNU extensions
  --gdwarf-2              generate DWARF2 debugging information
  --hash-size=<value>     set the hash table size close to <value>
  --help                  show this message and exit
  --target-help           show target specific options
  -I DIR                  add DIR to search list for .include directives
  -J                      don't warn about signed overflow
  -K                      warn when differences altered for long displacements
  -L,--keep-locals        keep local symbols (e.g. starting with `L')
  -M,--mri                assemble in MRI compatibility mode
  --MD FILE               write dependency information in FILE (default none)
  -nocpp                  ignored
  -o OBJFILE              name the object-file output OBJFILE (default a.out)
  -R                      fold data section into text section
  --reduce-memory-overheads 
                          prefer smaller memory use at the cost of longer
                          assembly times
  --statistics            print various measured statistics from execution
  --strip-local-absolute  strip local absolute symbols
  --traditional-format    Use same format as native assembler when possible
  --version               print assembler version number and exit
  -W  --no-warn           suppress warnings
  --warn                  don't suppress warnings
  --fatal-warnings        treat warnings as errors
  -w                      ignored
  -X                      ignored
  -Z                      generate object file even after errors
  --listing-lhs-width     set the width in words of the output data column of
                          the listing
  --listing-lhs-width2    set the width in words of the continuation lines
                          of the output data column; ignored if smaller than
                          the width of the first line
  --listing-rhs-width     set the max width in characters of the lines from
                          the source file
  --listing-cont-lines    set the maximum number of continuation lines used
                          for the output data column of the listing
  @FILE                   read options from FILE
AVR options:
  -mmcu=[avr-name] select microcontroller variant
                   [avr-name] can be:
                   avr1  - classic AVR core without data RAM
                   avr2  - classic AVR core with up to 8K program memory
                   avr25 - classic AVR core with up to 8K program memory
                           plus the MOVW instruction
                   avr3  - classic AVR core with up to 64K program memory
                   avr31 - classic AVR core with up to 128K program memory
                   avr35 - classic AVR core with up to 64K program memory
                           plus the MOVW instruction
                   avr4  - enhanced AVR core with up to 8K program memory
                   avr5  - enhanced AVR core with up to 64K program memory
                   avr51 - enhanced AVR core with up to 128K program memory
                   avr6  - enhanced AVR core with up to 256K program memory
                   avrxmega4 - XMEGA, > 64K, <= 128K FLASH, <= 64K RAM
                   avrxmega5 - XMEGA, > 64K, <= 128K FLASH, > 64K RAM
                   avrxmega6 - XMEGA, > 128K, <= 256K FLASH, <= 64K RAM
                   avrxmega7 - XMEGA, > 128K, <= 256K FLASH, > 64K RAM
                   or immediate microcontroller name.
  -mall-opcodes    accept all AVR opcodes, even if not supported by MCU
  -mno-skip-bug    disable warnings for skipping two-word instructions
                   (default for avr4, avr5)
  -mno-wrap        reject rjmp/rcall instructions with 8K wrap-around
                   (default for avr3, avr5)
Known MCU names:
  avr1 avr2 avr25 avr3 avr31 avr35 avr4 avr5 avr51 avr6 avrxmega1
  avrxmega2 avrxmega3 avrxmega4 avrxmega5 avrxmega6 avrxmega7 at90s1200
  attiny11 attiny12 attiny15 attiny28 at90s2313 at90s2323 at90s2333
  at90s2343 attiny22 attiny26 at90s4414 at90s4433 at90s4434 at90s8515
  at90c8534 at90s8535 attiny13 attiny13a attiny2313 attiny24 attiny44
  attiny84 attiny25 attiny45 attiny85 attiny261 attiny461 attiny861
  attiny43u attiny48 attiny88 at86rf401 at43usb355 at76c711 atmega103
  at43usb320 attiny167 at90usb82 at90usb162 atmega8 atmega48 atmega48p
  atmega88 atmega88p atmega8515 atmega8535 atmega8hva at90pwm1 at90pwm2
  at90pwm2b at90pwm3 at90pwm3b atmega16 atmega161 atmega162 atmega163
  atmega164p atmega165 atmega165p atmega168 atmega168p atmega169
  atmega169p atmega32 atmega323 atmega324p atmega325 atmega325p atmega3250
  atmega3250p atmega328p atmega329 atmega329p atmega3290 atmega3290p
  atmega406 atmega64 atmega640 atmega644 atmega644p atmega645 atmega649
  atmega6450 atmega6490 atmega16hva at90can32 at90can64 at90pwm216
  at90pwm316 atmega16u4 atmega32c1 atmega32m1 atmega32u4 atmega32u6
  at90usb646 at90usb647 at94k atmega128 atmega1280 atmega1281 atmega1284p
  at90can128 at90usb1286 at90usb1287 atmega2560 atmega2561 atxmega64a3
  atxmega64a1 atxmega128a3 atxmega256a3 atxmega256a3b atxmega128a1

Report bugs to <http://www.sourceware.org/bugzilla/>

There are lst and sym files, but I don't know if they have what you are looking for. Thanks for the help here, I definitely appreciate it!