
Topic: Which Files/Functions/Tables are compiled along with Blank Sketch of UNO

GolamMostafa

Under IDE 1.8.0 with an Arduino UNO, we compiled a blank sketch (no code, no header files; only the empty setup() and loop()). The resulting hex file (excluding the bootloader) is 1265 bytes in size. We would like to know the names of the specific files/functions/tables that are compiled along with this blank sketch.
 

pert

That's quite strange. Compiling for Uno with 1.8.0 I get this:
Code: [Select]
Sketch uses 444 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.

That's with the bundled Arduino AVR Boards version 1.6.16. Which one are you using (found at Tools > Board > Boards Manager)? Have you modified your core?

During the sketch preprocessing step the Arduino IDE adds the following line to your sketch:
Code: [Select]
#include <Arduino.h>
which brings in the core libraries:
https://github.com/arduino/ArduinoCore-avr/tree/master/cores/arduino

In main.cpp you find this line:
Code: [Select]
init();
which does the hardware initialization. After commenting that line out I get:
Code: [Select]
Sketch uses 146 bytes (0%) of program storage space. Maximum is 32256 bytes.
Global variables use 0 bytes (0%) of dynamic memory, leaving 2048 bytes for local variables. Maximum is 2048 bytes.

but there's definitely some important stuff that happens in init() so it's not something to just throw out without thought. Lower hanging fruit is this line:
Code: [Select]
if (serialEventRun) serialEventRun();
with that commented I get:
Code: [Select]
Sketch uses 432 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.

A reduction of 12 bytes isn't much, but the only purpose of this line is to provide the serialEvent() feature, which practically nobody uses and which can easily be replicated in the sketch with a couple of lines of code, making things more transparent. You also incur a small bit of clock overhead for that function call on every single loop, even when you're not using the feature. If you don't want to modify the core you can define an empty serialEventRun() (which is weakly defined in the core) in your sketch:
Code: [Select]
void serialEventRun(){}
which results in:
Code: [Select]
Sketch uses 434 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.

Not quite as good (10 bytes saved instead of 12), but quite an easy way to save some flash when you're not using serialEvent().

You can also override the core's main() (which is weakly defined) by defining it in your sketch. The only drawback to this is that I haven't found any way to call initVariant() from the sketch but that function is only used by certain boards, and not the Uno.

With this sketch:
Code: [Select]
int main(void) {
  init();
#if defined(USBCON)
  USBDevice.attach();
#endif
  setup();
  for (;;) {
    loop();
  }
  return 0;
}
void setup() {}
void loop() {}

I'm back to:
Code: [Select]
Sketch uses 432 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.

With recent versions of Arduino AVR Boards (including 1.6.16) there's no overhead for the loop() and setup() calls because they're inlined by the compiler.

GolamMostafa

First of all, thanks with +1 for the information that we have been seeking for a long time. This information will now help us to remove undesired feature(s) and reduce flash consumption.

Quote
Sketch uses 444 bytes (1%) of program storage space. Maximum is 32256 bytes.
Global variables use 9 bytes (0%) of dynamic memory, leaving 2039 bytes for local variables. Maximum is 2048 bytes.
I have also received the same message in the IDE Window. Because my interest is in the hex file that is uploaded to the target microcontroller, I have taken the size of the hex file from the sketch folder. (I have not changed anything anywhere.)

==> Verify/Compile
==> Export compiled Binary
==> Show Sketch Folder
==> sketch_nov08a.ino.eightanaloginputs (Type of file: .hex; Size: 1265 bytes).

I hope that I will be back soon with more queries on this matter.

pert

Why is the size of the .hex file of interest to you? What's most important is the memory usage on the microcontroller. For that you can either use the message printed by the Arduino IDE after compilation or you can use the avr-size tool, which is what the Arduino IDE uses to get those numbers.

GolamMostafa

Quote
Why is the size of the .hex file of interest to you? What's most important is the memory usage on the microcontroller. For that you can either use the message printed by the Arduino IDE after compilation or you can use the avr-size tool, which is what the Arduino IDE uses to get those numbers.
But, the figures are differing!

My understanding, based on my ATmega32A assembly language programming, is that the hex file is the actual file that is programmed into the code memory (flash) of the MCU. Accordingly, I understood that the hex file created by the Arduino IDE is the output of my sketch, so it is programmed into the flash of the UNO's ATmega328P during the upload process. Therefore, the size of the hex file should tell me how much flash memory the sketch is going to occupy. I would like to take the hex file size from the file's properties list, since the figures given by the 'IDE Window' and the 'hex file size' differ.

RayLivingston

A HEX file is typically 3X the size of the code it represents.  The size in FLASH is the size the IDE displays at the end of the compile.

Regards,
Ray L.

pert

Try this:
Compile File > Examples > Ethernet > DhcpChatServer. Now tell me how you can fit a 41 kB .hex file on an Uno that only has 31.5 kB of flash free after the boot section is reserved.

CrossRoads

The .hex file has more than the 444 bytes that will get loaded into flash.

Open the file with Notepad++ and you will see the leading characters and the ending characters on each line that make the file larger.
https://notepad-plus-plus.org/download/v7.5.3.html

For example, here is the first line of blink.hex
Code: [Select]
:100000000C945C000C946E000C946E000C946E00CA
Only the 16 bytes in the middle are the code that goes into flash:
0C945C000C946E000C946E000C946E00

The rest is the ':' start character, 2 characters for the byte count, 4 characters for the starting address of the block of 16 bytes, 2 characters for the record type, and 2 characters for the checksum of the line:
:10000000 and CA on this line.
And even then, 8 bits are used to represent each character in the file, while in flash a byte holds 2 of those hex characters,
thus each line of 43 characters would use 43 bytes in the hex file, yet only 16 bytes in flash.

GolamMostafa

Let us do some simple calculations to work out the meaning of the various figures that we encounter in connection with the size of a compiled sketch.

Let us note: when it is said that the hex file is programmed into the code memory (flash) of the target MCU, it is implied that only the program bytes of the hex file are stored, excluding the transmission formatting of every Intel HEX frame: the first 9 characters (header information) and the last 2 characters (checksum). In addition, the last frame is not part of the program code.

The compilation of the following sketch (to blink L at 1-sec interval) has produced a hex file of size 2799 bytes. The IDE Window says 'Sketch uses 990 bytes (3%) of program storage space.' Let us see whether they finally agree on the actual amount of flash consumed.

Code: [Select]
void setup()
{
  pinMode(13, OUTPUT); //the same command is repeated for 5 times intentionally to make a Marker;
  pinMode(13, OUTPUT); //this Marker will help to locate the address of the flash from which the
  pinMode(13, OUTPUT); //application begins to be stored.   
  pinMode(13, OUTPUT);
  pinMode(13, OUTPUT);
}

void loop()
{
  digitalWrite(13, !digitalRead(13));
  delay(1000);
}

  
Partial listing of the hex file:
Code: [Select]
:100000000C9461000C9473000C9473000C947300B6
:100010000C9473000C9473000C9473000C94730094
-------------------------------------------------------------
-------------------------------------------------------------
:1003B00019F7C114D10409F498CF0E94000095CF19
:0403C000F894FFCFDF

:00000001FF  //last frame; it is not part of sketch; it indicates (01) that no more frame is to arrive

There are 62 frames in the hex file. The last frame is to be excluded. Each of the first 60 frames contains 16 bytes of program code. The 61st frame contains 4 bytes of program code.

So, the total program code is: 60 x 16 + 4 = 960 + 4 = 964 bytes. This figure is close to the figure announced by the IDE Window.

Qualitatively, both the hex file size and the IDE message are proportional to flash consumption, but the IDE message indicates almost the actual consumption of flash.

@pert: As a novice Embedded-C programmer (who has migrated from assembly programming), I have learnt that the IDE message is the one to rely on for the actual flash and RAM consumption.

@RayLivingston: Yes! The size of hex file is almost 3X of program codes.

@CrossRoads: In an Intel HEX frame, the first 9 characters are header information and the last 2 characters (1 byte) are the checksum.

westfw

Presumably, you want to know what is included in the final sketch, rather than what is "compiled."  The Arduino build process compiles a LOT of code that ends up being discarded by the linker because it is never referenced.

An empty Uno sketch compiles to a binary that includes:
Code: [Select]
BillW-MacOSX-2<10004> avr-nm -SC --size-sort *.elf
00800104 00000001 b timer0_fract
00800105 00000004 b timer0_millis
00800100 00000004 b timer0_overflow_count
00000074 00000010 T __do_clear_bss
00000090 00000094 T __vector_16
00000124 00000094 T main


main() in this case is going to have things like init() in-lined by link-time-optimization.
But all together: about 160 bytes of timer0 overflow ISR code (__vector_16), 160 bytes of init()/main()/loop()/setup()/exit() (main()), and 120 bytes of vector table (the stuff that comes before __do_clear_bss).
The size of the .hex file is only slightly related to the size of the actual code, as explained by CrossRoads...

You could look at https://www.element14.com/community/docs/DOC-29257/l/analysis-of-an-empty-arduino-sketch  - that's pretty old, and some of the exact details have changed because of LTO and other changes, but it's still approximately the same.


GolamMostafa

@westfw
Thank you for the hard work you did back in 2010 to bring this picture to the audience; triple thanks with +1.

@To all
Please review/correct/comment on the following flash map of the ATmega328P of the Arduino UNO (IDE 1.8.0), which I have prepared, inspired by the posts in this thread. I also have a few queries; more will come later as my understanding of this issue slowly progresses.



To prepare the above memory map, I have taken the following sketch (and its standard hex file) as example.
Code: [Select]
void setup()
{
  pinMode(13, OUTPUT); //same functions are intentionally put for 5 times to detect the beginning of sketch
  pinMode(13, OUTPUT);
  pinMode(13, OUTPUT);
  pinMode(13, OUTPUT);
  pinMode(13, OUTPUT);
}

void loop()
{
  digitalWrite(13, !digitalRead(13));
  delay(1000);
}



Listing of hex file without Bootloader
Code: [Select]
: 10 0000 00 0C946100 0C947300 0C947300 0C947300 B6  //RESET_vect goes to 0x0061 (00C2)
: 10 0010 00 0C947300 0C947300 0C947300 0C947300 94
: 10 0020 00 0C947300 0C947300 0C947300 0C947300 84
: 10 0030 00 0C947300 0C947300 0C947300 0C947300 74
: 10 0040 00 0C94DF00 0C947300 0C947300 0C947300 F8   //TIMER0_OVF_vect goes to 0x00DF (01BE)
: 10 0050 00 0C947300 0C947300 0C947300 0C947300 54
: 10 0060 00 0C947300 0C947300 0000000023002600 21
:10007000290000000008000201000003040700003E
:1000800000000000000000000000250028002B00F8
:1000900000000000240027002A00040404040404D3
:1000A0000404020202020202030303030303010227
:1000B00004081020408001020408102001020408F6
: 10 00C0 00 1020 11241FBECFEFD8E0DEBFCDBF21E0 4E   //ISR(RESET_vect)
:1000D000A0E0B1E001C01D92A930B207E1F70E9493
:1000E00029010C94ED010C940000EBEBF0E024915D
:1000F000E7EAF0E08491882399F090E0880F991F57
:10010000FC01E057FF4FA591B491FC01EA57FF4F66
:10011000859194918FB7F894EC91E22BEC938FBF7B
:100120000895833081F028F4813099F08230A1F075
:1001300008958730A9F08830B9F08430D1F48091E7
:1001400080008F7D03C0809180008F778093800036
:10015000089584B58F7702C084B58F7D84BD0895DE
:100160008091B0008F7703C08091B0008F7D809325
:10017000B00008953FB7F894809105019091060171
:10018000A0910701B091080126B5A89B05C02F3F9B
:1001900019F00196A11DB11D3FBFBA2FA92F982FAD
:1001A0008827820F911DA11DB11DBC01CD0142E028
: 10 01B0 00 660F771F881F991F4A95D1F70895 1F92 E0  //ISR(TIMER0_OVF_vect)
:1001C0000F920FB60F9211242F933F938F939F930B
:1001D000AF93BF938091010190910201A09103011F
:1001E000B09104013091000123E0230F2D3720F45A
:1001F0000196A11DB11D05C026E8230F0296A11D81
:10020000B11D209300018093010190930201A093FE
:100210000301B09304018091050190910601A09122
:100220000701B09108010196A11DB11D8093050140
:1002300090930601A0930701B0930801BF91AF917D
:100240009F918F913F912F910F900FBE0F901F9014
:100250001895789484B5826084BD84B5816084BD2E
:1002600085B5826085BD85B5816085BD80916E0054
:10027000816080936E001092810080918100826085
:1002800080938100809181008160809381008091C2
:1002900080008160809380008091B10084608093B1
:1002A000B1008091B00081608093B00080917A00AD
:1002B000846080937A0080917A00826080937A00D3
:1002C00080917A00816080937A0080917A008068C2
: 10 02D0 00 80937A00 1092C100 0E9475000E947500 00  //Sketch begins at: 0x016C (02D8)
:1002E0000E9475000E9475000E947500CFE7D0E063
:1002F0000BEB10E047EAE42E40E0F42E50E0C52E70
:1003000050E0D52EFE018491F801A490F701B4903D
:10031000BB20A1F081110E949100EB2DF0E0EE0FC7
:10032000FF1FE859FF4FA591B4918C91A82291E04D
:1003300080E009F490E0A92EB82E02C0A12CB12CC7
:10034000FE018491F8018490F70194909920B9F00E
:1003500081110E949100892D90E0880F991F8A5782
:100360009F4FFC01A591B4918FB7F8949C91AB2855
:1003700019F08094892201C0892A8C928FBF0E9433
:10038000BA002B013C0188EE882E83E0982EA12C28
:10039000B12C0E94BA00DC01CB0184199509A60991
:1003A000B709883E9340A105B10558F0F1E08F1AD6
:1003B0009108A108B10828EE420E23E0521E611CEC
:1003C000711C81149104A104B10419F7C114D10462
: 0E 03D0 00 09F498CF0E94000095CFF894FFCF 5B   //Sketch ends here
:00000001FF


Listing of the Boot Loader Code (partial and from the beginning);
Code: [Select]

:107E0000112484B714BE81FFF0D085E080938100F7
:107E100082E08093C00088E18093C10086E0809377
:107E2000C20080E18093C4008EE0C9D0259A86E02C
-------------------------------------------------------


My Queries:
1.  As far as I know, the fuse bytes for ATmega328P of Arduino UNO are:
0x05 : Extended Fuse Byte
0xDA : High Fuse Byte
0xFF : Low Fuse Byte

When the above bytes are expanded and checked against the data sheet, the Boot Section is seen to occupy word addresses 3C00 - 3FFF. But the Bootloader hex file shows: 3F00 - 3FFF (7E00 -  ), and accordingly the High Fuse Byte should be 0xDE and not 0xDA. Please clarify.

2. Is the beginning address of the sketch (user application program) fixed, and is it always 0x016C? (Please see the listing of the standard hex file.) This is the location to which the MCU jumps after doing the tasks prescribed in the Boot Loader. I tried a few sketches of different sizes, and I always found the location to be 0x016C (02D8).

3. For ATmega328P of Arduino UNO, ISR(RESET_vect: 0x0000) will never be executed. Am I correct?

4. ISRs for so many interrupts are linked to the same location (0x0073) which means that they are not initialized by Arduino (they will be initialized as needed by the application program). These are marked as ; 0xa2 <__bad_interrupt> in this post by @westfw.  

5. (About storage order of 16-bit operand in Word location of flash.) Is it the lower byte of the operand that goes into upper byte (Bit15-Bit8) of the Word location?

Jiggy-Ninja

Quote
But, the figures are differing!

My understanding, based on my ATmega32A assembly language programming, is that the hex file is the actual file that is programmed into the code memory (flash) of the MCU. Accordingly, I understood that the hex file created by the Arduino IDE is the output of my sketch, so it is programmed into the flash of the UNO's ATmega328P during the upload process. Therefore, the size of the hex file should tell me how much flash memory the sketch is going to occupy. I would like to take the hex file size from the file's properties list, since the figures given by the 'IDE Window' and the 'hex file size' differ.
Intel Hex format is not a raw binary dump. Values are encoded in ASCII hexadecimal notation so you can open the file in a text editor (like Notepad) to view and interpret it. This alone doubles the file size. On top of that, there is extra overhead, since each line also carries a checksum (to protect against transmission errors) and the memory address where that line is to be stored.

The encoding and additional overhead are why RayLivingston says that a Hex file is about 3 times larger than the binary information it represents.

westfw

Quote
1.  As far as I know, the fuse bytes for ATmega328P of Arduino UNO are:
0x05 : Extended Fuse Byte  0xDA : High Fuse Byte  0xFF : Low Fuse Byte
[but] the High Fuse Byte should be 0xDE and not 0xDA. Please clarify.
The hardware/arduino/avr/boards.txt file has:

Code: [Select]
uno.bootloader.low_fuses=0xFF
uno.bootloader.high_fuses=0xDE
uno.bootloader.extended_fuses=0xFD


(matching what you think it should be. Note that the popular Nano, and pre-Uno ATmega328-based boards, have an older bootloader (2k in length) and use the older fuse definitions.)


Quote
2. Is the beginning address of the sketch (user application program) fixed, and is it always 0x016C?
No.  I mean, it depends somewhat on what you consider the "beginning of the sketch" (setup()? main()? C startup code?), but a lot of things can push the start address around, including the use of global constructors or variables in pgmspace.  If you compile the "ASCIITable" example, main() ends up at 0x4c0, and moves to 0x4dc if you put an F() macro around the header string.

Quote
This is the location at which the MCU makes a jump after doing the tasks as prescribed in the Boot Loader.
I think you're mistaking the C startup code for part of the bootloader. The reset vector is the proper "start of sketch", and your memory dump shows it jumping to 0x61, which is the start of the actual sketch code (not including the vectors). This is also subject to moving around, though.


Quote
3. For ATmega328P of Arduino UNO, ISR(RESET_vect: 0x0000) will never be executed. Am I correct?
The bootloader starts the sketch by jumping to the reset vector at 0x0.   So yes, it IS executed.


Quote
4. ISRs for so many interrupts are linked to the same location (0x0073) which means that they are not initialized by Arduino
They're not USED (hopefully). They are initialized to point to the __bad_interrupt function (which is a "jmp 0" attempt to restart. Probably not very useful.)
Quote
they will be initialized as needed by the application program
No, they're in flash, so they can't be changed by the application.  If you don't set up your application with the proper ISR() functions, you can NEVER use that interrupt.


Quote
5. (About storage order of 16-bit operand in Word location of flash.) Is it the lower byte of the operand that goes into upper byte (Bit15-Bit8) of the Word location?
Hmm.  Hard to say.   The AVR is a little-endian architecture, which means that the low byte of a constant is stored in the low byte of memory.  But it's less clear how the 16bit instruction words are documented/listed, and how objdump handles things.   If you look at:
Code: [Select]
  0:   0c 94 43 00     jmp     0x86    ; 0x86 <__ctors_end>
The "43 00" part is pretty clearly low-byte first.  The jmp instruction is documented as being "1001 010k kkkk 110k / kkkk kkkk kkkk kkkk", so it looks like it is being stored low-byte first as well.
