Free SRAM decreases when adding just code .. why?

After my first (flawed) attempt to ask for an answer to my question regarding increased use of SRAM after just adding "code" (see here) .. I want to try again ...

The code below contains 5 class definitions and a static factory class that creates an instance object of one of those classes, depending on a random integer.
The project can be compiled and run in 2 modes: with and without FULL_SWITCH defined.
When FULL_SWITCH is defined, any one of the four derived classes can be created; when FULL_SWITCH is not defined, only the Der1 class will be created.

What puzzles me is the amount of free SRAM that is reported after startup:

  • Without FULL_SWITCH: 7885
  • With FULL_SWITCH: 7845

So the extra code uses 40 bytes of SRAM. I can't figure out why.

As I understand it, right after the sketch starts SRAM contains the const and static vars. Why would free SRAM decrease when I only add code and don't change any variables defined at compile time?

The optimizer cannot know which classes will be used in either scenario, so I suspect that it cannot remove the class definitions of Der2 .. Der4 .. and even if it did, why would it change SRAM size?

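// Free SRAM = gap between the top of the heap and the current stack position.
int freeRam() {
  extern int __heap_start, *__brkval;  // linker / avr-libc symbols
  int v;                               // local variable: its address approximates the stack pointer
  // __brkval is 0 until the first allocation; until then the heap begins at __heap_start.
  return (int)&v - (__brkval == 0 ? (int)&__heap_start : (int)__brkval);
}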

#define FULL_SWITCH

class Base2 {
public:
  Base2() {};
  virtual ~Base2() {};

  int i = 100;
  virtual int getInt() { return 0; }
};

class Der1 : public Base2 {
public:
  const int ib[150] = {};
  virtual int getInt() { return i; }
};

class Der2 : public Base2 {
public:
  const int ic[175] = {};
  virtual int getInt() { return i * 2; }
};

class Der3 : public Base2 {
public:
  virtual int getInt() { return i*3; }
};

class Der4 : public Base2 {
public:
  virtual int getInt() { return i*4; }
};

class Factory {
public:
  static Base2* makeBase(int i) {
    if (i == 1) {
      return new Der1();
#ifdef FULL_SWITCH
    } else if (i == 2) {
      return new Der2();
    } else if (i == 3) {
      return new Der3();
    } else if (i == 4) {
      return new Der4();
#endif // FULL_SWITCH
    } else {
      return new Base2();
    }
  }
};

void setup()
{
#ifdef FULL_SWITCH
  int max = 4;
#else
  int max = 1;
#endif

  Serial.begin(115200);
  Serial.println(String(freeRam()));

  Base2* b = Factory::makeBase(rand() % max + 1);
  Serial.println(String(b->getInt()));
  Serial.println(String(freeRam()));
  delete b;
}

void loop()
{
  /* add main program code here */
}

Nobody?

Some time ago I read about a tool that could show the declarations that are in the .data and .bss parts of SRAM by analyzing the .HEX file, but I don't seem to be able to find that post anywhere...

Does anyone know where I can find this?

OK, I think I've found the tool ... avr-readelf.exe

Now learning to read the output of that :wink:

“avr-nm” can report on usage in the .elf file. By the time you get to the .hex file, all that symbol info is gone.

Anyway: vtables. You knew “vitual” functions had some overhead associated with them, right? here, you get to see how much…

BillW-MacOSX-2<10057> avr-nm -C *.elf | grep " [bBdD] "
00800131 b Serial
0080010a d vtable for HardwareSerial
0080011a d vtable for Der1
008001ce B __brkval
008001d2 B __bss_end
00800128 B __bss_start
00800128 D __data_end
00800100 D __data_start
008001d0 B __flp
00800100 D __malloc_heap_end
00800102 D __malloc_heap_start
00800104 D __malloc_margin
00800128 D _edata
00800106 d next
0080012c b timer0_fract
0080012d b timer0_millis
00800128 b timer0_overflow_count

BillW-MacOSX-2<10059> avr-nm -CS *.elf | grep " [bBdD] "
00800159 0000009d b Serial
0080010a 00000010 d vtable for HardwareSerial
0080011a 0000000a d vtable for Der1
00800124 0000000a d vtable for Der2
0080012e 0000000a d vtable for Der3
00800138 0000000a d vtable for Der4
00800142 0000000a d vtable for Base2
008001f6 00000002 B __brkval
008001fa B __bss_end
00800150 B __bss_start
00800150 D __data_end
00800100 D __data_start
008001f8 00000002 B __flp
00800100 00000002 D __malloc_heap_end
00800102 00000002 D __malloc_heap_start
00800104 00000002 D __malloc_margin
00800150 D _edata
00800106 00000004 d next
00800154 00000001 b timer0_fract
00800155 00000004 b timer0_millis
00800150 00000004 b timer0_overflow_count
BillW-MacOSX-2<10060>
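
The two listings can be cross-checked against the 40-byte difference that freeRam() reported. The snippet below is plain C++ that only redoes the arithmetic at compile time. It assumes the first listing is the build without FULL_SWITCH and the second the build with it, which the listings do not state explicitly. All constant and function names are made up for illustration.

```cpp
#include <cstdint>

// Values copied from the two avr-nm listings above.
// Assumption: listing 1 = built without FULL_SWITCH, listing 2 = built with it.
constexpr std::uint32_t kDataEndWithout = 0x00800128;  // __data_end, first listing
constexpr std::uint32_t kDataEndWith    = 0x00800150;  // __data_end, second listing

constexpr std::uint32_t kVtableBytes  = 0x0a;  // size of each Base2/DerN vtable (from nm -S)
constexpr std::uint32_t kExtraVtables = 4;     // Der2, Der3, Der4, Base2: only in listing 2

// Free SRAM values reported in the original question.
constexpr int kFreeWithout = 7885;
constexpr int kFreeWith    = 7845;

// Growth of the .data section between the two builds.
constexpr std::uint32_t dataGrowth() { return kDataEndWith - kDataEndWithout; }

// SRAM taken by the vtables that only the second build contains.
constexpr std::uint32_t extraVtableBytes() { return kExtraVtables * kVtableBytes; }

static_assert(dataGrowth() == 40, ".data grows by 0x28 = 40 bytes");
static_assert(extraVtableBytes() == dataGrowth(), "four extra vtables account for the growth");
static_assert(kFreeWithout - kFreeWith == 40, "same difference as reported by freeRam()");
```

Under that assumption, the four additional vtables explain the whole 40 bytes.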

You knew "vitual" functions had some overhead associated with them, right?

I did not know that. I knew that virtual functions had overhead. 8)

Explanation of virtual function overhead (and defense of their use, I think.)

It’s “interesting” that the link says “one pointer of additional memory”, while the code is showing 10 bytes (5 pointers worth!) for each vtable…
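
A possible reading of the 10 bytes, offered as an assumption rather than something the listing proves: GCC lays out vtables according to the Itanium C++ ABI, where each table starts with an offset-to-top slot and an RTTI pointer slot, followed by one entry per virtual function, and a virtual destructor occupies two entries. With 2-byte pointers on AVR, that would give five slots of 2 bytes each. The "one pointer" in the linked article would then refer to the hidden vtable pointer stored in every object, not to the table itself. The struct below is only a model of that layout; its name and field names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of a Base2-style vtable on AVR.
// Assumption: Itanium C++ ABI layout with 2-byte data and function pointers.
struct Avr16VtableModel {
  std::int16_t  offsetToTop;         // 0 here, since there is no multiple inheritance
  std::uint16_t typeInfo;            // RTTI pointer slot (null when RTTI is disabled)
  std::uint16_t completeDestructor;  // virtual ~Base2(), "complete object" variant
  std::uint16_t deletingDestructor;  // virtual ~Base2(), variant used by delete
  std::uint16_t getInt;              // the only other virtual function
};

static_assert(sizeof(Avr16VtableModel) == 10, "five 2-byte slots = 0x0a bytes");
```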

@Westfv Thanks for the explanation! That I had the inclination to include virtual methods in the example tells me I already had a subconscious gut feeling it had something to do with them :wink:

As I understand it, every class with virtual methods, or derived from a class with virtual methods, has a vtable (in SRAM) which contains a pointer for every virtual method. Or is it for all methods, including the non-virtual ones?

I do have a fairly elaborate class structure (being a windows Delphi and Java programmer in my day job .. not very used to limiting amount of memory ;-)).

Would you advise against using class hierarchy and virtual methods?

Or is it for all methods including the non-virtual ones?

Hmm. I don't know. I'm not much of a C++ person; I'm just happy to analyze object files regardless of original language. Also, it's really hard to judge how the compiler will optimize a full program based on how it optimizes a small example like this. (See http://forum.arduino.cc/index.php?topic=464785.msg3190458#msg3190458 for another recent "mysterious" example.)
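
On the quoted question: in the usual vtable implementations, GCC's included, only virtual functions get a table entry. Non-virtual member functions are resolved at compile time and cost neither a table slot nor per-object space. The check below is a small sketch with hypothetical classes (not the ones from the sketch above) that shows the effect on object size; it does not measure the table itself.

```cpp
// Hypothetical classes for illustration only.
struct Plain {
  int i = 100;
  int getInt() const { return i; }      // non-virtual
};

struct PlainMore {
  int i = 100;
  int getInt() const { return i; }
  int twice() const { return i * 2; }   // extra non-virtual methods...
  int thrice() const { return i * 3; }  // ...leave the object layout unchanged
};

struct WithVirtual {
  int i = 100;
  virtual int getInt() const { return i; }  // first virtual: object gains a hidden vtable pointer
  virtual ~WithVirtual() {}
};

struct WithMoreVirtuals {
  int i = 100;
  virtual int getInt() const { return i; }
  virtual int twice() const { return i * 2; }  // more virtuals grow the table, not the object
  virtual ~WithMoreVirtuals() {}
};

static_assert(sizeof(PlainMore) == sizeof(Plain), "non-virtual methods add nothing per object");
static_assert(sizeof(WithVirtual) > sizeof(Plain), "one vtable pointer per object");
static_assert(sizeof(WithMoreVirtuals) == sizeof(WithVirtual), "still just one pointer per object");
```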

Would you advise against using class hierarchy and virtual methods?

Couldn't say. What is it doing for you? How hard would it be to do it some other way? How short on RAM are you likely to be? Can you just switch to a different board with more RAM? How price-sensitive are you? "elaborate classes" are a general problem on small-memory microcontrollers; AFAIK, there is no way to put part of them in Flash ('read only') and part in RAM.
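
As one illustration of "some other way" (a sketch that was not proposed in the thread): the virtual call can be replaced by a type tag and a switch. This keeps vtables out of SRAM at the price of one tag byte per object and a central switch that has to be updated for every new kind. The names Kind, Thing, makeThing and getInt are hypothetical. A constexpr function containing a switch needs C++14, so on an older toolchain the constexpr on getInt would have to be dropped.

```cpp
#include <cstdint>

// Hypothetical tag-based replacement for the Base2 / Der1..Der4 hierarchy.
enum class Kind : std::uint8_t { Base, Der1, Der2, Der3, Der4 };

struct Thing {
  Kind kind;  // replaces the hidden vtable pointer
  int i;
};

// Mirrors Factory::makeBase(): 1..4 select Der1..Der4, anything else selects Base.
constexpr Thing makeThing(int n) {
  return Thing{(n >= 1 && n <= 4) ? static_cast<Kind>(n) : Kind::Base, 100};
}

// One switch replaces the five getInt() overrides.
constexpr int getInt(const Thing& t) {
  switch (t.kind) {
    case Kind::Base: return 0;
    case Kind::Der1: return t.i;
    case Kind::Der2: return t.i * 2;
    case Kind::Der3: return t.i * 3;
    case Kind::Der4: return t.i * 4;
  }
  return 0;  // unreachable for valid tags; keeps the compiler quiet
}

static_assert(getInt(makeThing(2)) == 200, "Der2 behaviour: i * 2");
static_assert(getInt(makeThing(9)) == 0, "fallback behaves like Base2");
```

Whether this is worth the loss of extensibility depends on the questions above: 10 bytes per class is small next to the 300-byte array that a single Der1 object carries.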

I would say: write your program in the way that makes sense to you. If it fits, and runs fast enough, great! If it doesn't, then you can consider optimizations, but at least you already have the basic logic of the program laid out.

"Premature optimization is the root of much evil."