Inconsistency: Baffled by #ifdef and #include

I am baffled by the following behaviour of the IDE.

With the set-up below, I would expect that the IDE does, or does not, include and thus compile and link the files within the #ifdef statement, depending if #define SDB is commented out or not:

// case 1: include the .h files
#define SDB
#ifdef SDB
#include "sdb_support.h"
#include "disp_ulcd_144.h"
#include "grid_display.h"
#include "PString.h"
#include "silencer.h"
#endif
// case 2: do not include the .h files
//#define SDB
#ifdef SDB
#include "sdb_support.h"
#include "disp_ulcd_144.h"
#include "grid_display.h"
#include "PString.h"
#include "silencer.h"
#endif

And actually yes, any subsequent code is correctly dependent on the inclusions. All fine from this point of view.

Now, I would expect the following case to be equivalent to case 2 above, i.e. commenting out the #include statements should have the same effect as excluding them by not defining SDB.

// case 3: do not include the .h files
//#define SDB
//#ifdef SDB
//#include "sdb_support.h"
//#include "disp_ulcd_144.h"
//#include "grid_display.h"
//#include "PString.h"
//#include "silencer.h"
//#endif

And again, all fine from a dependency point of view.

However, checking the size of the resulting code, as well as the RAM usage, I found that case 2 and case 3 are not equal. In fact, the verbose output of the compiler and linker shows that in case 2 all the modules mentioned in my sketch are compiled and linked, even though I have excluded them using #ifdef. Only commenting out the #includes ensures that the unwanted files are not included in the resulting code.

I find this rather inconsistent: files don't get lexically included in my source, but their code does surprisingly get included in my executable file (*).

Am I missing something?

(*) Note that to make this test possible I don't actually use any functions from the #included files to allow successful compilations in all cases, but I have included some static vars and instantiation in the included files to see the effect in code size and RAM usage.
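For illustration, a probe of that kind might look roughly like the sketch below. The names SizeProbe, sdb_probe and g_probe_constructions are made up for this example and are not taken from the actual files. The point it shows: an object defined at namespace scope is constructed before the program's entry point runs, and its buffer occupies RAM, whether or not any other code ever refers to it.

```cpp
#include <cstring>

// Counts constructor runs so the effect can be observed without a debugger.
int g_probe_constructions = 0;

// Hypothetical probe class: its only job is to occupy RAM and run a constructor.
class SizeProbe {
public:
    SizeProbe() {
        std::memset(buffer, 0xAA, sizeof(buffer));  // touch the buffer so it is really used
        ++g_probe_constructions;
    }
    unsigned char buffer[64];  // shows up in the RAM usage of the final image
};

// Defined at namespace scope, as a library .cpp file would do it.
// Nothing below refers to this object, yet it is still constructed at start-up.
SizeProbe sdb_probe;
```

If the object file containing such a definition is handed to the linker, the 64 bytes and the constructor call are part of the result even when the sketch never mentions sdb_probe.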

Hi

I have had a similar experience. It seems that the IDE scans the code and links all libraries which are referenced by their .h file from your .pde file. According to your results, the IDE is not clever enough to check for the #ifdef statements.

I did not find any documentation on this behavior.

Oliver

olikraus:
I have had a similar experience. It seems that the IDE scans the code and links all libraries which are referenced by their .h file from your .pde file. According to your results, the IDE is not clever enough to check for the #ifdef statements.

Which also leaves behind a nagging question about the standard libraries, which seem to get recompiled and linked in any case: what is their additional code size and RAM usage, even if not used? The linker seems smart enough not to link and include unused functions, as the code size varies depending on how many different functions we use out of a library. But what about globals? I didn’t do an in-depth analysis of all lib modules, but Serial comes to mind as obvious example. Sure, for serially downloading a program I need Serial, but, say, what about Serial1 to Serial3 on a Mega, do their data structures (mainly in-buffers) eat up my precious RAM even if never used?

Does anybody have more info and knowledge here?

sbr_: But what about globals? I didn't do an in-depth analysis of all lib modules, but Serial comes to mind as obvious example. Sure, for serially downloading a program I need Serial, but, say, what about Serial1 to Serial3 on a Mega, do their data structures (mainly in-buffers) eat up my precious RAM even if never used?

Adding some more info. Test code:

uint8_t* data_bottom = (uint8_t*)512;     // Arduino Mega
//uint8_t* data_bottom = (uint8_t*)256;  // Arduino Uno

uint8_t* heap_top;
uint8_t* stack_ptr;

void check_mem() {
  stack_ptr = (uint8_t *)malloc(4);
  heap_top = stack_ptr; 
  free(stack_ptr);
  stack_ptr =  (uint8_t *)(SP);
}

void setup() {
  check_mem();
  Serial.begin(19200);
  Serial.println((uint16_t)heap_top - (uint16_t)data_bottom);
  Serial.println((uint16_t)heap_top - (uint16_t)data_bottom, HEX);
  Serial.println((uint16_t)stack_ptr - (uint16_t)data_bottom);
  Serial.println((uint16_t)stack_ptr - (uint16_t)data_bottom, HEX);
}

void loop() {}

I get:

for Arduino Mega (8,192B RAM): heap_top: 711 (0x2C7), stack_ptr: 8178 (0x1FF2)
for Arduino Uno (2,048B RAM): heap_top: 210 (0xD2), stack_ptr: 2037 (0x7F5)
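To make those numbers easier to compare, here is a small arithmetic sketch that turns the printed offsets into "used" and "free" figures. RamSummary and summarize are names invented for this example. It assumes the stack starts at the last byte of RAM and grows downwards, and it treats heap_top as the end of static data plus heap, which overstates things slightly because of malloc's own bookkeeping.

```cpp
#include <cstdint>

struct RamSummary {
    std::uint16_t static_and_heap;  // bytes from the bottom of RAM up to the heap top
    std::uint16_t stack_in_use;     // bytes between the stack pointer and the last RAM byte
    std::uint16_t free_gap;         // bytes left between heap top and stack pointer
};

// All offsets are relative to data_bottom, exactly as printed by the test sketch.
constexpr RamSummary summarize(std::uint16_t ram_size,
                               std::uint16_t heap_top,
                               std::uint16_t stack_ptr) {
    return RamSummary{
        heap_top,
        static_cast<std::uint16_t>((ram_size - 1) - stack_ptr),
        static_cast<std::uint16_t>(stack_ptr - heap_top)
    };
}

// The two measurements reported above.
constexpr RamSummary mega = summarize(8192, 711, 8178);
constexpr RamSummary uno  = summarize(2048, 210, 2037);
```

With these inputs the gap between heap and stack works out to 7467 bytes on the Mega and 1827 bytes on the Uno, and the Mega build uses 501 bytes more at the bottom of RAM than the Uno build for the same sketch.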

sbr_: Sure, for serially downloading a program I need Serial, but, say, what about Serial1 to Serial3 on a Mega, do their data structures (mainly in-buffers) eat up my precious RAM even if never used?

As I recall, once the Serial library is linked in, on a Mega you get all 4 serial ports. This is because of these lines in HardwareSerial.cpp:

// Preinstantiate Objects //////////////////////////////////////////////////////

#if defined(UBRRH) && defined(UBRRL)
  HardwareSerial Serial(&rx_buffer, &UBRRH, &UBRRL, &UCSRA, &UCSRB, &UDR, RXEN, TXEN, RXCIE, UDRE, U2X);
#elif defined(UBRR0H) && defined(UBRR0L)
  HardwareSerial Serial(&rx_buffer, &UBRR0H, &UBRR0L, &UCSR0A, &UCSR0B, &UDR0, RXEN0, TXEN0, RXCIE0, UDRE0, U2X0);
#elif defined(USBCON)
  #warning no serial port defined  (port 0)
#else
  #error no serial port defined  (port 0)
#endif

#if defined(UBRR1H)
  HardwareSerial Serial1(&rx_buffer1, &UBRR1H, &UBRR1L, &UCSR1A, &UCSR1B, &UDR1, RXEN1, TXEN1, RXCIE1, UDRE1, U2X1);
#endif
#if defined(UBRR2H)
  HardwareSerial Serial2(&rx_buffer2, &UBRR2H, &UBRR2L, &UCSR2A, &UCSR2B, &UDR2, RXEN2, TXEN2, RXCIE2, UDRE2, U2X2);
#endif
#if defined(UBRR3H)
  HardwareSerial Serial3(&rx_buffer3, &UBRR3H, &UBRR3L, &UCSR3A, &UCSR3B, &UDR3, RXEN3, TXEN3, RXCIE3, UDRE3, U2X3);
#endif

So on a Mega, it pre-instantiates all 4 serial ports. Mind you, you have more RAM to play with. Personally I don't like the idea of pre-instantiating stuff like that. It would be better to do a "new" when required, but I think this is designed to make things easy for beginners.

You could try #undef UBRR1H, but somehow I doubt that will be honoured (that is, the sequence of processing include files will probably define the symbol, and use it, before you get a chance to undefine it).

There is nothing really stopping you making a modified serial library that doesn't pre-instantiate the objects, and then just declare them if you need them.
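As a rough idea of what "instantiate only when required" could look like, here is a minimal sketch. It is not the HardwareSerial code: SerialPort, port(), kPortCount and the 128-byte buffer size are all placeholders chosen for the example. Only an array of pointers is reserved up front, and each port object (with its buffer) is created on first use.

```cpp
#include <cstddef>

// Hypothetical stand-in for a buffered serial port.
class SerialPort {
public:
    explicit SerialPort(int id) : id_(id) { ++instances; }
    int id() const { return id_; }
    std::size_t buffer_size() const { return sizeof(rx_buffer_); }
    static int instances;  // how many ports have actually been constructed
private:
    int id_;
    unsigned char rx_buffer_[128] = {};  // the RAM cost we want to defer
};
int SerialPort::instances = 0;

constexpr std::size_t kPortCount = 4;         // assumed number of hardware ports
static SerialPort* g_ports[kPortCount] = {};  // only the pointers exist at start-up

// Return the port with the given index, creating it the first time it is asked for.
SerialPort* port(std::size_t index) {
    if (index >= kPortCount) return nullptr;
    if (g_ports[index] == nullptr) {
        g_ports[index] = new SerialPort(static_cast<int>(index));
    }
    return g_ports[index];
}
```

A sketch that only ever calls port(0) would then pay for one buffer instead of four. The trade-off is heap allocation on a small microcontroller, which is probably part of why the stock library does not work this way.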

Nick Gammon:
There is nothing really stopping you making a modified serial library that doesn't pre-instantiate the objects, and then just declare them if you need them.

Sure, but for the sake of future compatibility, I really prefer not to customize the Arduino libraries unless really needed. And as you say, on the Mega, there's quite some RAM available... Anyway, I just used the Serial lib as an obvious example, being aware that there are pre-instantiated objects, but my point was more generic: I think it's plain wrong to have code (and possibly RAM usage) thrown in through the backdoor by the linker that I could not even use, as the compiler didn't know about it in the first place, as explained in my original post. Library modules not #included by my code, or pulled in indirectly via another (standard) module, should not end up in my binary. Sure, knowing about this, I can handle it if needed, but it's a rather surprising behaviour, at least for me. I still love the Arduinos. :)

Things you don't use are not included.

Compare:

void setup () {
}

unsigned long counter;

void loop ()
{
}

Compiles to 450 bytes.


void setup () {
  Serial.begin (115200);
}

unsigned long counter;

void loop ()
{
}

Compiles to 1436 bytes.

So you aren't getting those extra objects unless you "pull in" the library. However once you refer to the library, you get the pre-instantiated objects.

The above code (with Serial.begin) for example compiles to 2274 bytes on the Mega, which is probably due to the extra objects (the code to do Serial.begin would be much the same).

Nick Gammon:
Things you don't use are not included.

So you aren't getting those extra objects unless you "pull in" the library. However once you refer to the library, you get the pre-instantiated objects.

Isn't that the result of the -fdata-sections and -ffunction-sections compiler flags?

They may help, but I think it is just a case of library code (and data) not being included (linked in) unless needed.

Please read my original post. I think I have shown that #includes that are inhibited using #ifdef, so that I cannot use them, actually do get included in the code and occupy RAM if there are pre-instantiated objects. You must manually comment unused libraries out to avoid this.

I read your post.

Let me explain it this way …

  • Say you refer to Serial in your sketch.
  • The IDE includes into the compiling / linking process HardwareSerial.cpp.
  • When HardwareSerial.cpp is compiled, it is a stand-alone module, compiled under the current compiler defines
  • The compiler defines (set up indirectly by your choice of board) include the board type (eg. Mega2560)
  • Any attempts to undefine things in your module, foo.cpp, are irrelevant to how HardwareSerial.cpp is compiled.
  • Thus, you can’t easily change the way the libraries compile.

I would expect that the IDE does, or does not, include and thus compile and link the files within the #ifdef statement, depending if #define SDB is commented out or not: …

I think the IDE is “smart” enough to scan the known libraries, and using the keywords.txt files, deduce which libraries your source is using, and add them to its list of libraries to be compiled and linked accordingly.

And even if that wasn’t the case, undefining things in your foo.cpp file is still not going to change the way HardwareSerial.cpp (and other relevant files) are compiled, because they are not “subsidiaries” of foo.cpp.

Nick Gammon:
I think the IDE is "smart" enough to scan the known libraries, and using the keywords.txt files, deduce which libraries your source is using, and add them to its list of libraries to be compiled and linked accordingly.

The aspect of HardwareSerial.cpp (or any other standard lib for that matter) was a second thought. My original post stated that (non-standard) library modules get compiled and linked even if you have excluded them using #ifdef/#endif, and that you must comment out the corresponding #include statements to avoid this. So no, the IDE is not smart, on the contrary: it compiles and links code that you have excluded explicitly (again, non std libs). The IDE even compiles modules that you don't #include -- if you just happen to keep them in the same directory as another module that you do #include, even if they are not in any way related to the code being compiled, as the verbose output shows.

And I really hope that the keywords.txt files are not being used to influence the compile and link process...

I put the word "smart" into quotes, in a mocking way. I don't necessarily agree with the way they are doing things, and in particular the way you sometimes have to juggle things around to get them to compile in a "normal" way.

However to make things clear about your point:

... it compiles and links code that you have excluded explicitly ...

#include "sdb_support.h"
#include "disp_ulcd_144.h"
#include "grid_display.h"
#include "PString.h"
#include "silencer.h"

Those includes do not, and would not even under a normal compilation setup, influence which modules are part of the compilation process.

All they are doing is making the variables/declarations/defines in them available to your current source file. That's all.

Say, for example this line:

#include "disp_ulcd_144.h"

That doesn't cause disp_ulcd_144.cpp to be compiled. How could it? And omitting it doesn't cause it to not be compiled.

I haven't attempted to browse how the IDE works, however by observation it appears that it copies some or all of the .cpp files in your current project into the /tmp directory (whatever its name is exactly) based on some sort of criteria. I think that if you add files to your "project" by adding tabs then they will be included. Possibly, and possibly not, other .cpp files in your project's directory. And based on I-don't-know-what rule, it also copies and compiles other files (eg. wiring.c).

But I honestly think that this notion here is incorrect, even not using the IDE:

I would expect that the IDE does, or does not, include and thus compile and link the files within the #ifdef statement, depending if #define SDB is commented out or not ...

Again, including files into a particular module can't really influence which other modules are presented to the compiler for compiling.

Nick Gammon:

I would expect that the IDE does, or does not, include and thus compile and link the files within the #ifdef statement, depending if #define SDB is commented out or not: ...

I think the IDE is "smart" enough to scan the known libraries, and using the keywords.txt files, deduce which libraries your source is using, and add them to its list of libraries to be compiled and linked accordingly.

And even if that wasn't the case, undefining things in your foo.cpp file is still not going to change the way HardwareSerial.cpp (and other relevant files) are compiled, because they are not "subsidiaries" of foo.cpp.

From the data presented in the original post, I think the IDE does something like:

  • Scan the source file for #include statements
  • For each of those #include statements, add the corresponding .cpp file to the set of files compiled
  • Link with the output of all compiled files

If this is the case, then because the object files for the modules you're using are passed to the linker directly, not as archives, the linker does not do "dead code elimination" on them. (This is the traditional behavior of the GNU linker and most UNIX linkers.)

Thus, if there are any static or global objects in modules that the IDE thinks you're using, you will end up with those objects in your output, even if you aren't ACTUALLY using them. The difference between #ifdef and //commented out includes seems to indicate that the IDE is not smart enough to tell when something is #ifdefed out, and instead compiles in anything it sees a #include for.
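To show how such a scan could produce exactly these symptoms, here is a guess at the kind of line-based matching involved. This is not the IDE's source code; scan_includes is a name made up for this example. It looks at each line on its own, accepts anything that starts with #include after optional whitespace, and never evaluates #if, #ifdef or #endif.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the header names from every line that starts with "#include".
// Preprocessor conditionals are deliberately ignored: that is the behaviour being guessed at.
std::vector<std::string> scan_includes(const std::string& source) {
    std::vector<std::string> headers;
    std::size_t pos = 0;
    while (pos < source.size()) {
        // Cut out the next line.
        std::size_t eol = source.find('\n', pos);
        if (eol == std::string::npos) eol = source.size();
        std::string line = source.substr(pos, eol - pos);
        pos = eol + 1;

        // Skip leading blanks, then require the directive itself.
        std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;
        if (line.compare(start, 8, "#include") != 0) continue;  // "//#include" is rejected here

        // Extract the name between "..." or <...>.
        std::size_t open = line.find_first_of("\"<", start + 8);
        if (open == std::string::npos) continue;
        char closer = (line[open] == '<') ? '>' : '"';
        std::size_t close = line.find(closer, open + 1);
        if (close == std::string::npos) continue;
        headers.push_back(line.substr(open + 1, close - open - 1));
    }
    return headers;
}
```

Fed a sketch where one #include sits inside an undefined #ifdef and another is commented out with //, a scanner like this reports the first and skips the second, which matches the difference between case 2 and case 3 in the original post.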

Well let’s work through it …

I created a sketch “sbr_forum_problem.cpp” as follows:

sbr_forum_problem.cpp

#include "bar.h"

void setup () {  bar (); }
void loop ()  { }

Also another file “bar.cpp”:

bar.cpp

// bar.cpp

volatile int x;

void bar ()  { x = 42; }

And an include file for it:

bar.h

// bar.h

volatile char barbuf  [100];

void bar ();

First test: compiling in the IDE, with bar.cpp and bar.h in the same directory as the sketch.

Results:

sbr_forum_problem.cpp:1:17: error: bar.h: No such file or directory
sbr_forum_problem.cpp: In function 'void setup()':
sbr_forum_problem:2: error: 'bar' was not declared in this scope

Conclusion: The IDE does not automatically know about, nor compile, a xxx.cpp file just because it is in the same directory as the sketch.


Second test: Added bar.h into the IDE by clicking on “new tab” and creating the file (same contents as before).

Results:

sbr_forum_problem.cpp.o: In function `setup':
sbr_forum_problem.cpp:6: undefined reference to `bar()'

Conclusion: The IDE now knows about the bar.h file, so the compile succeeded, but the link failed because it didn’t compile bar.cpp.


Third test: Added bar.cpp into the IDE in the same way.

Results:

Binary sketch size: 468 bytes (of a 32256 byte maximum)

Conclusion: The IDE only compiles the files you tell it about in the IDE (disregarding the #include directives).


Fourth test: Commented out the references to bar as follows:

//#include "bar.h"

void setup () { 
//  bar (); 
}

void loop ()  { }

Results:

bar.cpp still compiled:

/var/folders/1l/43x8v10s1v36trvjz3v92m900000gn/T/build5356224081509626413.tmp/sbr_forum_problem.cpp.elf /var/folders/1l/43x8v10s1v36trvjz3v92m900000gn/T/build5356224081509626413.tmp/bar.cpp.o

Sketch slightly smaller (18 bytes smaller):

Binary sketch size: 450 bytes (of a 32256 byte maximum)

These 18 bytes can be accounted for as follows. The bar function (14 bytes):

void bar ()  { x = 42; }
  a6:	8a e2       	ldi	r24, 0x2A	; 42
  a8:	90 e0       	ldi	r25, 0x00	; 0
  aa:	90 93 01 01 	sts	0x0101, r25
  ae:	80 93 00 01 	sts	0x0100, r24
  b2:	08 95       	ret

Calling bar() - 4 bytes:

 bar (); 
  b6:	0e 94 53 00 	call	0xa6	; 0xa6 <_Z3barv>

Conclusion: Even though bar.cpp was compiled, its code was not included since there was no reference to it. I would regard this as a linker decision.


Fifth test: Removed bar.h and bar.cpp from the project.

Results:

Binary sketch size: 450 bytes (of a 32256 byte maximum)

Conclusion: Same size as in previous test, so it would appear that the linker efficiently strips out code not required in the sketch.


Sixth test: Added bar.cpp and bar.h back into the project. Changed the main sketch to read:

#if 0
  #include "bar.h"
#endif

void setup () { 
#if 0
  bar (); 
#endif
}

void loop ()  { }

Results:

Binary sketch size: 450 bytes (of a 32256 byte maximum)

Conclusion: This is the same size as commenting out the #include line. So it would appear that there is no difference between commenting out an include and #ifdef’ing it out.

Note that the original post was about libraries, not files included in the sketch.
AFAICT, you can get the project to include a “library” of your own by pointing the IDE at some particular directory that contains directories for libraries.
You can actually include code of your own that way, without “adding” it to the sketch. You’ll have to use another editor, though.
For example, consider:

c:\code\sketchbooks
  c:\code\sketchbooks\libraries
    c:\code\sketchbooks\libraries\MyLib
      MyLib.h
      MyLib.cpp
  c:\code\sketchbooks\mysketch
    mysketch.cpp

Further, if you look at what I suggested, I suggested it has something to do with global or static data, rather than just unreferenced functions.

I put in the following files:

//MyLib.h

class CMyLib {
public:
  CMyLib();
  char data[80];
};
extern CMyLib MyLib;

// MyLib.cpp

#include <MyLib.h>
#include <string.h>

CMyLib MyLib;

CMyLib::CMyLib()
{
  memset(data, 1, sizeof(data));
}

// mysketch.pde

#include <MyLib.h>

void setup() {}
void loop() {}

Compile this, and get 524 bytes.

Now, #ifdef out the #include of MyLib.h in the project PDE file, and compile again.

// mysketch.pde
#if defined(FOO)
#include <MyLib.h>
#endif

void setup() {}
void loop() {}

524 bytes.

Now, comment out the #include line:

// mysketch.pde
#if defined(FOO)
//#include <MyLib.h>
#endif

void setup() {}
void loop() {}

Compile, and: 450 bytes!

I would say that the behavior I suggested as the cause of the symptoms the original poster asked about is pretty strongly indicated at this point.

Note that this separates the behavior of a “library” in the IDE from the behavior of a “library” such as managed by “ar” – in an “ar” library, no objects are included unless at least one symbol is actually referenced. But, because of the IDE “magic” to make it easier for people who are new to compiling and linking, my guess is that we see this behavior.

I can’t figure out how to attach files, but you can download this example from http://www.watte.net/sketchbooks.zip

Note: you have to set the sketchbooks directory in preferences to be “sketchbooks” from this archive and restart the IDE to get the automatic library inclusion behavior.

jwatte:
For example, consider:

c:\code\sketchbooks
  c:\code\sketchbooks\libraries
    c:\code\sketchbooks\libraries\MyLib
      MyLib.h
      MyLib.cpp
  c:\code\sketchbooks\mysketch
    mysketch.cpp

Add any other .h/.cpp combo to c:\code\sketchbooks\libraries\MyLib, like:

c:\code\sketchbooks
  c:\code\sketchbooks\libraries
    c:\code\sketchbooks\libraries\MyLib
      MyLib.h
      MyLib.cpp
      MyOtherLib.h
      MyOtherLib.cpp
  c:\code\sketchbooks\mysketch
    mysketch.cpp

…restart the IDE, and if you compile and link this sketch

// mysketch.pde

#include <MyLib.h>

void setup() {}
void loop() {}

…MyOtherLib.cpp will be compiled and even linked as well, if I can trust the verbose output of the compilation and linking process. I’d say that the gcc linker is smart enough to then not actually link (unused) symbols from MyOtherLib, but if there are globals in MyOtherLib, they will be included.

In fact, the IDE compiles the whole standard library every time we compile a sketch, plus of course all referenced local libs in the ‘libraries’ folder, plus the .cpp files in the same folder as the sketch (assuming you have restarted the IDE). The standard lib gets slapped into an archive core.a, while the non-std libs remain as .o files, and the compiled sketch is linked against all of these. By re-compiling the standard library every time, the IDE copes with the different Arduino boards and configurations, for example.

So my observation and conclusion is that the IDE actually does use the #includes to determine the compilation process (and even in an “extensive” way – better a few more compiles than missing one!), contrary to what you’d expect from the usual behavior of a C/C++ environment. Of course I see the intentions, and benefits for the target audience of the Arduino technology.

I just fell into this hole and had no idea why my display stopped working. I am using an ePaper display. A friend has one that is a different size so I put the #include for the display drivers inside of #ifdef/#else/#endif. And suddenly my display stopped working. I was doing many changes to bring it up to where it should be since this was a fork of my main package so I was unsure which change made the display stop.

It took a while to realize it since ePaper contents do not go away on a reboot. This is on version 1.8.5 so it is still a bug, in my view.

To all tempted to reply to the posts from 2011, this thread is nearly 7 years old.