Unused code included in build

Hi,

Not being a deeply experienced C coder, I stumbled across something that made me wonder:

When my sketch - using Ethernet (W5500) - got close to consuming 100% program space, I started analyzing it with avr-nm and found that the function "getHostByName" required quite a lot of space.

Since I wasn't using any DNS stuff but only IPs, I wondered why it was included at all and started commenting out all functions that use it (besides getHostByName itself, mainly EthernetClient::connect(const char *...) and the like). In the end my very same sketch still compiled and worked, it didn't complain about missing function declarations - and required 1.7KB less space! That was proof for me that those code parts are actually not required.

My experience so far was that only code/functions actually used are included in the build, and I still can't see the reason why all that "connect via hostname" stuff had been included.
I also wiped the build directory, restarted the Arduino IDE, etc., but unless I commented out those functions they always got included.

Perhaps someone can explain why that happens and whether there's a "neater" way of getting rid of unused code.

Thank you!

I just did some more tests and here's a simple sketch that includes that "expensive" getHostByName function:

#include <Ethernet.h>

EthernetClient ethClient;

void setup() {
}

void loop() {
}

which results in a sketch using 7408 bytes (for my Arduino Pro Mini). And avr-nm shows:

00002464 00000322 t _ZN13EthernetClass11socketBeginEhj.part.1
00003870 00000410 t _ZN13EthernetClass10socketRecvEhPhi
00004852 00001216 t _ZN9DNSClient13getHostByNameEPKcR9IPAddressj.constprop.12

What I then commented out was:

  • int getHostByName(const char* aHostname, IPAddress& aResult, uint16_t timeout=5000); in DNSClient
  • virtual int beginPacket(const char *host, uint16_t port); in EthernetUDP
  • virtual int connect(const char *host, uint16_t port); in EthernetClient
  • virtual int connect(const char *host, uint16_t port) =0; in Client (arduino cores)
  • virtual int beginPacket(const char *host, uint16_t port) =0; in Udp (arduino cores)

The very same sketch then compiles successfully to a binary of only 4152 bytes, which is 3256 bytes less than before.

The compiler can't determine whether the virtual functions connect and beginPacket are used, so it keeps them.

Try adding the final keyword to the EthernetClient and EthernetUDP classes:

class EthernetUDP final : public UDP {

I tested it: final does not help.
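
A minimal sketch of why the hostname overloads stay in the binary, written as plain C++ rather than against the real Ethernet library. All names here (DemoClient, DemoEthernetClient, resolveHost, connectViaBase) are made up for illustration. The point is that code which only knows the base class can still reach the derived hostname overload through the vtable, so the toolchain presumably has to keep that function and everything it calls, even if the sketch itself never calls it.

```cpp
#include <cstdint>

// Hypothetical stand-in for an expensive routine such as a DNS lookup.
int resolveHost(const char* host) {
    int sum = 0;
    for (const char* p = host; *p != '\0'; ++p) {
        sum += *p;
    }
    return sum;
}

// Hypothetical stand-in for the core's abstract Client interface.
struct DemoClient {
    virtual int connect(uint32_t ip, uint16_t port) = 0;
    virtual int connect(const char* host, uint16_t port) = 0;
    virtual ~DemoClient() {}
};

// Hypothetical stand-in for EthernetClient.
struct DemoEthernetClient : DemoClient {
    int connect(uint32_t ip, uint16_t port) override {
        (void)ip;
        (void)port;
        return 1;  // pretend the connection succeeded
    }
    // Both overrides get a slot in the vtable, so this one pulls in
    // resolveHost() as soon as any DemoEthernetClient object exists.
    int connect(const char* host, uint16_t port) override {
        return connect(static_cast<uint32_t>(resolveHost(host)), port);
    }
};

// A library that only sees the base class can still end up in the
// hostname overload, which is what keeps it from being discarded.
int connectViaBase(DemoClient& client, const char* host) {
    return client.connect(host, 80);
}
```

Calling connectViaBase with a DemoEthernetClient dispatches to the derived hostname overload at run time. Marking the derived class final does not remove the vtable entries, which fits the test result above.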

Hi Juraj,

Thank you for your quick response and your hint on the "virtual" keyword.

I did a little research and you're obviously right: those functions being virtual seems to be the reason for them to be included.

Inclusion of unused virtual functions seems to be a well-known issue. There have apparently been several attempts to fix it, but they seem to have been dropped due to complexity and bugginess (e.g. the compiler flag -fvtable-gc).

It also seems that there is no simple solution like your suggestion of making the class final. So, I think the only ways of saving this unused space are:

  • either comment out/remove the whole chain of those virtual functions (like I did above), which is the safest way but has to be done in all used classes derived from the base class (in my case "Client")
  • or - in order to not have to touch the "core" - empty them in the derived class(es) (in my case "EthernetClient" (derived from "Client") and "EthernetUDP" (derived from "UDP")). But then you have to ensure yourself that they're not called - also not by any other library you pass them to! (e.g. "PubSubClient", which makes use of that "Client" virtualization to support different Ethernet hardware)

Regards,
hubsif.

Some #define and #ifdef guards in the function bodies, or only in getHostByName, could be used.
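
A rough sketch of that idea in plain C++, not taken from the Ethernet library: the macro DEMO_ENABLE_DNS and the names DemoClient and demoResolve are hypothetical. The virtual function keeps its signature, so the class hierarchy stays intact, but its body shrinks to a stub unless the macro is defined.

```cpp
#include <cstdint>

// Hypothetical opt-in switch; leave it undefined to drop hostname support.
// #define DEMO_ENABLE_DNS

#ifdef DEMO_ENABLE_DNS
// Hypothetical stand-in for an expensive resolver such as getHostByName.
static uint32_t demoResolve(const char* host) {
    uint32_t sum = 0;
    for (const char* p = host; *p != '\0'; ++p) {
        sum += static_cast<uint8_t>(*p);
    }
    return sum;
}
#endif

class DemoClient {
public:
    virtual int connect(uint32_t ip, uint16_t port) {
        (void)ip;
        (void)port;
        return 1;  // pretend the connection succeeded
    }

    virtual int connect(const char* host, uint16_t port) {
#ifdef DEMO_ENABLE_DNS
        // Full version: resolve the name, then connect by address.
        return connect(demoResolve(host), port);
#else
        // Stub version: nothing references the resolver any more,
        // so it no longer has to be part of the build.
        (void)host;
        (void)port;
        return 0;  // report failure instead of resolving
#endif
    }

    virtual ~DemoClient() {}
};
```

With the macro undefined, connecting by hostname always fails, so any library that is handed the object must only connect by IP address.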