
Topic: "optimizing compiler"

virtual1

update…  I decided to just do a global replace for "Serial." -> "//Serial." using bbedit.  After unreplacing the one include for the serial module itself, I did another compile to compare.  This was while leaving the debug define alone and enabled.

Before commenting out Serial. everywhere: 32k
AFTER commenting out Serial. everywhere: 38k

so, commenting out 244 serial commands led to an INCREASE of 6k in sketch size.

does anyone have a clue why this is happening?

kowalski

So is there some way to understand what is going on? Yes, there are tools for this. Not in the Arduino library but in the AVR toolbox. You can check what goes into the build, understand the difference, and thus find the source of the problem. Below is a Linux shell script that I use to generate a listing and symbol file (to reduce the Cosa footprint; I do not want to end up with this type of problem ;-)
Code: [Select]

avr-size -C /tmp/build*.tmp/$1.cpp.elf > $1.lst
avr-objdump -d /tmp/build*.tmp/$1.cpp.elf >> $1.lst
avr-nm -CS /tmp/build*.tmp/$1.cpp.elf > $1.sym

The script generates size information and an assembly listing but also a symbol file.

"Know your numbers",

Cheers!

virtual1


Quote
Code: [Select]

avr-size -C /tmp/build*.tmp/$1.cpp.elf > $1.lst
avr-objdump -d /tmp/build*.tmp/$1.cpp.elf >> $1.lst
avr-nm -CS /tmp/build*.tmp/$1.cpp.elf > $1.sym



I haven't done that sort of thing before.  avr-size isn't in PATH on my Mac (10.8).  Where do I enter that?  And the .elf files, are those generated somewhere by the compiler, and when?  Pretend I don't know anything about the avr commands, because I don't ;)    I AM very proficient with bash etc. however.

kowalski

Mar 05, 2014, 12:21 am
Ok. I do not use a mac so this is hand-waving.

First you have to find the Arduino installation and locate the AVR tools bin directory. Add that to your PATH or simply execute the script from there. On my machine it is /opt/arduino-1.5.6-r2/hardware/tools/avr/bin.

Second check where the Arduino build puts the compiled files and replace the /tmp/build*/ etc with that path. Hint: Enable compiler printout in the Arduino IDE and look for the path when compiling.
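The two steps above can be sketched as a pair of small bash helpers. This is only a sketch under assumptions: the function names are made up for illustration, it assumes the toolchain ships a binary called avr-objdump somewhere below the Arduino install, and it assumes the IDE names its build folders build*.tmp as in the paths quoted in this thread.

```shell
#!/bin/bash
# Hypothetical helpers; names and layout are assumptions, not part of the Arduino IDE.

# find_avr_bin ROOT: print the first directory under ROOT that contains avr-objdump.
find_avr_bin() {
  local hit
  hit=$(find "$1" -type f -name 'avr-objdump' 2>/dev/null | head -n 1)
  [ -n "$hit" ] && dirname "$hit"
}

# newest_build_dir TMPROOT: print the most recently modified build*.tmp directory.
newest_build_dir() {
  ls -dt "$1"/build*.tmp 2>/dev/null | head -n 1
}
```

Typical use would be something like `PATH="$(find_avr_bin "/path/to/your/Arduino install"):$PATH"` followed by `newest_build_dir /tmp` on Linux. On other systems the temp root differs, so the compiler printout in the IDE is still the reliable way to confirm the path.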

When you get this working I will be waiting for the next question ;-)

Cheers!

virtual1

Quote
First you have to find the Arduino installation and locate the AVR tools bin directory. Add that to your PATH or simply execute the script from there. On my machine it is /opt/arduino-1.5.6-r2/hardware/tools/avr/bin.


Here that's wherever you put the app, dug in thusly:

Arduino 1.0.5.app/Contents/Resources/Java/hardware/tools/avr/avr/bin

and in that folder we have:
Code: [Select]

ar
as
c++
cpp
g++
gcc
gccbug
gcov
ld
nm
objcopy
objdump
ranlib
strip


Quote
Second check where the Arduino build puts the compiled files and replace the /tmp/build*/ etc with that path. Hint: Enable compiler printout in the Arduino IDE and look for the path when compiling.


That one appears to be a pseudorandom temp location.  In this specific case of the sketch I have loaded and run right now, it's

/var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/

So I whipped up:
Code: [Select]

#!/bin/bash
if [ -n "$1" ] ; then
  p=$1
else
  echo "provide path to build, something like /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat"
  read -p "path: " p
fi
cd "${0%/*}"
./as -C p.cpp.elf > $p.lst
./objdump -d $p.cpp.elf >> $p.lst
./nm -CS $p.cpp.elf > $p.sym
echo
echo "done"
echo
echo "List file: $p.lst"
echo "Symbol file: $p.sym"
echo
open "${p%/*}"


but got:
Code: [Select]
path: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat
./as: unrecognized option `-C'

done

List file: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat.lst
Symbol file: /var/folders/mh/zvj3c6xd0ms3v62t6m6wc_d00000gp/T/build1756195043828663546.tmp/_Heartbeat.sym


So I assume "as" is not what you're looking for.  Here's as's --help:
Code: [Select]
Usage: ./as [option...] [asmfile...]
Options:
  -a[sub-option...]   turn on listings
                        Sub-options [default hls]:
                        c      omit false conditionals
                        d      omit debugging directives
                        g      include general info
                        h      include high-level source
                        l      include assembly
                        m      include macro expansions
                        n      omit forms processing
                        s      include symbols
                        =FILE  list to FILE (must be last sub-option)
  --alternate             initially turn on alternate macro syntax
  -D                      produce assembler debugging messages
  --debug-prefix-map OLD=NEW  Map OLD to NEW in debug information
  --defsym SYM=VAL        define symbol SYM to given value
  --execstack             require executable stack for this object
  --noexecstack           don't require executable stack for this object
  -f                      skip whitespace and comment preprocessing
  -g --gen-debug          generate debugging information
  --gstabs                generate STABS debugging information
  --gstabs+               generate STABS debug info with GNU extensions
  --gdwarf-2              generate DWARF2 debugging information
  --hash-size=<value>     set the hash table size close to <value>
  --help                  show this message and exit
  --target-help           show target specific options
  -I DIR                  add DIR to search list for .include directives
  -J                      don't warn about signed overflow
  -K                      warn when differences altered for long displacements
  -L,--keep-locals        keep local symbols (e.g. starting with `L')
  -M,--mri                assemble in MRI compatibility mode
  --MD FILE               write dependency information in FILE (default none)
  -nocpp                  ignored
  -o OBJFILE              name the object-file output OBJFILE (default a.out)
  -R                      fold data section into text section
  --reduce-memory-overheads
                          prefer smaller memory use at the cost of longer
                          assembly times
  --statistics            print various measured statistics from execution
  --strip-local-absolute  strip local absolute symbols
  --traditional-format    Use same format as native assembler when possible
  --version               print assembler version number and exit
  -W  --no-warn           suppress warnings
  --warn                  don't suppress warnings
  --fatal-warnings        treat warnings as errors
  -w                      ignored
  -X                      ignored
  -Z                      generate object file even after errors
  --listing-lhs-width     set the width in words of the output data column of
                          the listing
  --listing-lhs-width2    set the width in words of the continuation lines
                          of the output data column; ignored if smaller than
                          the width of the first line
  --listing-rhs-width     set the max width in characters of the lines from
                          the source file
  --listing-cont-lines    set the maximum number of continuation lines used
                          for the output data column of the listing
  @FILE                   read options from FILE
AVR options:
  -mmcu=[avr-name] select microcontroller variant
                   [avr-name] can be:
                   avr1  - classic AVR core without data RAM
                   avr2  - classic AVR core with up to 8K program memory
                   avr25 - classic AVR core with up to 8K program memory
                           plus the MOVW instruction
                   avr3  - classic AVR core with up to 64K program memory
                   avr31 - classic AVR core with up to 128K program memory
                   avr35 - classic AVR core with up to 64K program memory
                           plus the MOVW instruction
                   avr4  - enhanced AVR core with up to 8K program memory
                   avr5  - enhanced AVR core with up to 64K program memory
                   avr51 - enhanced AVR core with up to 128K program memory
                   avr6  - enhanced AVR core with up to 256K program memory
                   avrxmega4 - XMEGA, > 64K, <= 128K FLASH, <= 64K RAM
                   avrxmega5 - XMEGA, > 64K, <= 128K FLASH, > 64K RAM
                   avrxmega6 - XMEGA, > 128K, <= 256K FLASH, <= 64K RAM
                   avrxmega7 - XMEGA, > 128K, <= 256K FLASH, > 64K RAM
                   or immediate microcontroller name.
  -mall-opcodes    accept all AVR opcodes, even if not supported by MCU
  -mno-skip-bug    disable warnings for skipping two-word instructions
                   (default for avr4, avr5)
  -mno-wrap        reject rjmp/rcall instructions with 8K wrap-around
                   (default for avr3, avr5)
Known MCU names:
  avr1 avr2 avr25 avr3 avr31 avr35 avr4 avr5 avr51 avr6 avrxmega1
  avrxmega2 avrxmega3 avrxmega4 avrxmega5 avrxmega6 avrxmega7 at90s1200
  attiny11 attiny12 attiny15 attiny28 at90s2313 at90s2323 at90s2333
  at90s2343 attiny22 attiny26 at90s4414 at90s4433 at90s4434 at90s8515
  at90c8534 at90s8535 attiny13 attiny13a attiny2313 attiny24 attiny44
  attiny84 attiny25 attiny45 attiny85 attiny261 attiny461 attiny861
  attiny43u attiny48 attiny88 at86rf401 at43usb355 at76c711 atmega103
  at43usb320 attiny167 at90usb82 at90usb162 atmega8 atmega48 atmega48p
  atmega88 atmega88p atmega8515 atmega8535 atmega8hva at90pwm1 at90pwm2
  at90pwm2b at90pwm3 at90pwm3b atmega16 atmega161 atmega162 atmega163
  atmega164p atmega165 atmega165p atmega168 atmega168p atmega169
  atmega169p atmega32 atmega323 atmega324p atmega325 atmega325p atmega3250
  atmega3250p atmega328p atmega329 atmega329p atmega3290 atmega3290p
  atmega406 atmega64 atmega640 atmega644 atmega644p atmega645 atmega649
  atmega6450 atmega6490 atmega16hva at90can32 at90can64 at90pwm216
  at90pwm316 atmega16u4 atmega32c1 atmega32m1 atmega32u4 atmega32u6
  at90usb646 at90usb647 at94k atmega128 atmega1280 atmega1281 atmega1284p
  at90can128 at90usb1286 at90usb1287 atmega2560 atmega2561 atxmega64a3
  atxmega64a1 atxmega128a3 atxmega256a3 atxmega256a3b atxmega128a1

Report bugs to <http://www.sourceware.org/bugzilla/>


There are lst and sym files, but I don't know if they have what you are looking for.  Thanks for the help here, I definitely appreciate it!



kowalski

Replacing "avr-size" with "as" is not correct. Try removing that altogether. Skip the size information for now and generate the symbol and listing file first. Then you have something to work on.

To implement the "avr-size" information I think it's easiest just to look at what the IDE for 1.5.X does as it produces that information at the end of the build.
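A minimal sketch of what that reduced script could look like, with the size step dropped and the missing `$` on the path variable fixed. The function name make_listing and its two arguments are assumptions for illustration; the tool names objdump and nm are the unprefixed ones from the folder listing earlier in the thread, which in a cross toolchain are normally the AVR-targeted builds.

```shell
#!/bin/bash
# make_listing TOOLS ELF_BASE   (hypothetical helper)
#   TOOLS     directory holding objdump and nm
#   ELF_BASE  build path without the .cpp.elf suffix, e.g. .../build123.tmp/_Heartbeat
make_listing() {
  local tools=$1 p=$2
  "$tools/objdump" -d "$p.cpp.elf" > "$p.lst" || return 1   # assembly listing
  "$tools/nm" -CS "$p.cpp.elf" > "$p.sym" || return 1       # demangled symbols with sizes
  echo "List file: $p.lst"
  echo "Symbol file: $p.sym"
}
```

Quoting every expansion matters here because the application path on a Mac can contain spaces (as in "Arduino 1.0.5.app").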

Cheers!

virtual1


Quote
Replacing "avr-size" with "as" is not correct. Try removing that altogether. Skip the size information for now and generate the symbol and listing file first. Then you have something to work on.


I found that the FlashROM module was still generating serial commands, so I flipped its switch also.  Strangely, its switch behaved more as expected, reducing build size slightly when debugging was disabled.  Build size continues to increase in my main sketch however.  Here are the two files requested, for each build.  Commenting out the serial use in the main sketch produces an increase of 3k in build size (which is, oddly enough, a smaller increase now that I have disabled serial in FlashROM).  Although the build gets bigger, the output files here get smaller.   *shrug*

http://vftp.net/n0zyc/CleverFox/Cleverfox_commented_34k.zip
http://vftp.net/n0zyc/CleverFox/Cleverfox_uncommented_31k.zip

You'll have to show me how to interpret these files.

kowalski

The next thing to do is to sort the symbol files;

sort xxx.sym > xxx.sorted.sym

And then check where the commented function size is larger than uncommented. Then you check the listing to see the assembly code. And my guess is that you will find the answer --- compiler function inlining.
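The comparison can also be done mechanically. The sketch below is one possible way, not the method used in this thread: it assumes both files come from `nm -CS`, so each sized entry reads "address size type name" with fixed-width lowercase hex, and the function name grew is made up for illustration.

```shell
#!/bin/bash
# grew OLD.sym NEW.sym: list symbols whose size is larger in NEW than in OLD.
grew() {
  awk '
    length($2) > 1 {                    # entries without a size have a 1-char type here
      name = $4
      for (i = 5; i <= NF; i++) name = name " " $i   # demangled names may contain spaces
      size = "x" $2                     # prefix forces a string (not numeric) comparison
      if (NR == FNR) {
        old[name] = size                # first file: remember the old size
      } else if (name in old && size > old[name]) {
        print substr(old[name], 2), "->", $2, name
      }
    }' "$1" "$2"
}
```

Comparing the hex fields as strings only works because nm pads them to the same width; the "x" prefix keeps awk from reading a value such as 000007e6 as the floating-point number 7e6.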

I did all this and found that the menu functions became much larger in the commented version. This implies the compiler took the decision to inline some functions.

00002bb6 000007e6 T MenuGeneral()         commented
vs
00002266 000001e4 T MenuGeneral()          uncommented

Hum, you need to know something about the file format. It is simply address, size, attribute and symbol. Nothing strange. Try using google and reading manuals ;-)
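To read the two MenuGeneral() lines above in bytes, the hex size column can be converted with printf. The helper name hex2dec is just an illustration.

```shell
#!/bin/bash
# hex2dec HEX: print the decimal value of a hex size field from the symbol file.
hex2dec() {
  printf '%d\n' "0x$1"
}

hex2dec 000007e6   # commented build:   2022 bytes
hex2dec 000001e4   # uncommented build:  484 bytes
```

So MenuGeneral() alone grew by 1538 bytes between the two builds.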

And, yes, you can force the compiler to not inline (for instance AddWord) to reduce footprint but that is a different line of QA.

Cheers!

virtual1

Quote
And then check where the commented function size is larger than uncommented. Then you check the listing to see the assembly code. And my guess is that you will find the answer --- compiler function inlining.


Interesting, I was having a very similar conversation with another coder this morning, and I was pondering whether or not some functions were being "unrolled" into their callers.  I suspected this because adding a single simple command to AddWord was causing a 4KB increase in compile size, which appeared to be quite outrageous, until I considered how many different places call it. (I think over a hundred?)

So, yes, I'm interested in that switch.  How do I stop it from "inlining small functions"?  I suspect that will save a tremendous amount of space in the binary for this sketch.

Looking around, there are several people suggesting ways to block inlining of a function when declaring it… is there a global switch?  I would really prefer that to having to go in and modify all my functions and possibly libraries.

In my stumblings I have also run into using PROGMEM for strings like the ones I am using in debug; this may save quite a bit of my SRAM when debugging:
http://forum.arduino.cc/index.php/topic,37337.0.html

kowalski

I think we have answered your original question. To make it easier for others to find information, please post a new question instead of continuing on this post. Please make an attempt to search for the information and share instead of just asking questions. For instance, google gcc function attributes, read the manual, and ask for an interpretation and/or examples.

Cheers!

virtual1

Sorry for not researching further on this myself first but this is my first deep dive into arduino compiler options, I still don't really know what I'm looking for.

I was unable to get any reduction in binary size by trying several suggestions for making the AddWord function not go inline.  Either I'm not doing it right and need to know which of the many ways works in arduino, or the problem is in another function and I need to know how to figure out which one.  Once I have that sorted, that will solve my question about the compile size.

The inline instructions I found and tried were all covered in this post:
http://stackoverflow.com/questions/1474030/how-can-i-tell-gcc-not-to-inline-a-function

kowalski

So where did you put the function attribute and did you update the platform.txt file?

I checked your code and you are not using header files right. Please check how to partition code into header and source files. Below is the correct pattern for the "noinline" attribute.

Code: [Select]

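// GCC rejects an attribute placed after the declarator of a definition,
// so put it in front of the definition (or on a separate declaration).
__attribute__((noinline)) int func(int arg)
{
  return arg;
}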


As all code is included into the build through the "header" files, this actually signals to the compiler that the functions are inline. This is how C++ works. Member functions defined in header files are automatically inline (when possible). The compiler uses a few heuristics to determine whether there is actually a benefit from inlining and how much the function is allowed to grow.

If you really want to do something about your source code, start by cleaning up the "header" file mess. This will also clean up the footprint problem.

Cheers!

virtual1

Quote
So where did you put the function attribute and did you update the platform.txt file?


Sorry I have never seen a "platform.txt" file mentioned before.  It's not in the link I sent you nor anywhere in this thread that I can see?  Is there something I need to modify there to enable the following attribute?

Code: [Select]

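// GCC rejects an attribute placed after the declarator of a definition,
// so put it in front of the definition (or on a separate declaration).
__attribute__((noinline)) int func(int arg)
{
  return arg;
}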


Yep that was the most promising suggestion I tried. (and one of the few the compiler actually liked)  I wrapped that around AddWord and saw absolutely no change in binary size, which came as quite a surprise.

Quote
I checked your code and you are not using header files right. Please check how to partition code into header and source files.

As all code is included into the build through the "header" files, this actually signals to the compiler that the functions are inline. This is how C++ works. Member functions defined in header files are automatically inline (when possible). The compiler uses a few heuristics to determine whether there is actually a benefit from inlining and how much the function is allowed to grow.

If you really want to do something about your source code, start by cleaning up the "header" file mess. This will also clean up the footprint problem.


I thought I actually did pretty well; this is my first foray into writing libraries, and I've just been templating off other people's libraries.  Some things I don't understand the function of yet (like extern), but getting a library to compile and function correctly by itself seems to indicate you're doing it mostly right, because the compiler is very unforgiving.  I'm afraid I will need a little more hand-holding than "I think it's a mess, fix it".   All of my libraries follow a form that appears to be fairly identical to the rest of the libraries I've downloaded.  The only exception that I recently noticed is that the function declarations in the .h file I was starting with didn't name the actual parameters, just the types, and I see in other libraries now that they often include parameter names also, but that appears to be optional.  Maybe I need a pointer to a "correctly formatted" library that I can adapt to?

I apologize in advance if I'm trying your patience a little bit.  I've jumped into arduino with both feet, right into the deep end, without any formal training or structured learning.  So I still will have a mix of advanced, intermediate, and even the occasional novice question all factoring into the same problems, until I can get properly up to speed.  "I don't know what I don't know" sums it up nicely.

virtual1

I don't know why the first time I tried it I didn't see a difference, but it seems to be working now.

The problem was that disabling the serial debug caused a 4.5KB drop in SRAM usage, but also caused compile size to jump from 31KB to 36KB.

Code: [Select]
__attribute__((noinline)) void AddWord(uint8_t got_word_id)  {
// void AddWord(uint8_t got_word_id)  {


With serial debugging disabled, that one change dropped binary size from 36k to 21k.  IMPRESSIVE!  It appears that the compiler doesn't take into account the number of calls to a function when deciding whether or not to automatically inline it.  If it ends up taking less space to inline in any single instance, it inlines it everywhere.  Which in my case was a disaster, because the function was being called several hundred times. (which, admittedly, is probably rather unusual for a sketch)
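One way to confirm that the attribute took effect is to count the call instructions that target AddWord in the listing file: when the function is inlined there should be few or none, and with noinline there should be roughly one per call site. This is a rough sketch that assumes the usual avr-objdump -d layout, where a call line ends in a comment like `; 0x10a0 <symbol>`; the function name count_calls is made up for illustration.

```shell
#!/bin/bash
# count_calls LST NAME: count call/rcall lines in LST whose target symbol contains NAME.
# Targets with an offset (<symbol+0x10>) are jumps inside a function and are skipped.
count_calls() {
  grep -Ec "[[:space:]]r?call[[:space:]].*<[^>+]*$2[^>+]*>" "$1"
}
```

For example, `count_calls _Heartbeat.lst AddWord` run against the listing from each build should show the difference. Tail calls compiled to jmp/rjmp are not counted by this pattern, so treat the number as a lower bound.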

Thank you all for the assistance, I think we can mark this issue closed

kowalski

Great! Good luck with your project. Cheers!
