Learning about the hex file

Hi folks!
I am trying to learn more about the hex file that is generated when I compile a sketch.

So down below, you can see a regular BLINK sketch and its hex file.

  1. Does every line in the hex file correspond to a certain command in the sketch (like void, pinMode, loop)?
  2. What is the hex file used for?
  3. The last line in the hex file is ":00000001FF", which is 511 in decimal. Is there something special about this line? It seems that almost all sketches end with this line.

Ty for the replies.

It's a complete image of all the code, PLUS the stuff you don't see, like the interrupt vector table and the stuff that goes on before setup even gets executed.
It's compiled code, so you will have difficulty recognising anything you wrote.

It is used to allow the bootloader to put the image into flash memory.

If you want to see what the code compiles to, check how to use avr-objdump

The quickest way to learn about it is to print the assembly code file generated by the C compiler. It will show ALL the instructions generated, and you can find the matching parts in the HEX file. I do not know how to locate the assembly file, but I am sure others can point out the link.
Paul

To see the "machine code"

  • go to the folder where the .hex file was located
  • you'll find a .elf file (mine was called Blink.ino.elf as I used the Blink example)
  • run the command line avr-objdump -D Blink.ino.elf

Note that you might have to provide the path to avr-objdump. You can find that path in the compilation output; on my Mac it's in
~/Library/Arduino15/packages/arduino/tools/avr-gcc/7.3.0-atmel3.6.1-arduino7/bin/

You'll see a looooong listing of all the assembly language commands the compiler built for you

Blink.ino.elf:     file format elf32-avr


Disassembly of section .text:

00000000 <__vectors>:
   0:	0c 94 5c 00 	jmp	0xb8	; 0xb8 <__ctors_end>
   4:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
   8:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
   c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  10:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  14:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  18:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  1c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  20:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  24:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  28:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  2c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  30:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  34:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  38:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  3c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  40:	0c 94 13 01 	jmp	0x226	; 0x226 <__vector_16>
  44:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  48:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  4c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  50:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  54:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  58:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  5c:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  60:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>
  64:	0c 94 6e 00 	jmp	0xdc	; 0xdc <__bad_interrupt>

00000068 <__trampolines_end>:
  68:	00 00       	nop
  6a:	00 00       	nop
  6c:	24 00       	.word	0x0024	; ????
  6e:	27 00       	.word	0x0027	; ????
  70:	2a 00       	.word	0x002a	; ????

00000072 <port_to_output_PGM>:
  72:	00 00       	nop
  74:	00 00       	nop
  76:	25 00       	.word	0x0025	; ????
  78:	28 00       	.word	0x0028	; ????
  7a:	2b 00       	.word	0x002b	; ????

0000007c <digital_pin_to_port_PGM>:
  7c:	04 04       	cpc	r0, r4
  7e:	04 04       	cpc	r0, r4
  80:	04 04       	cpc	r0, r4
  82:	04 04       	cpc	r0, r4
  84:	02 02       	muls	r16, r18
  86:	02 02       	muls	r16, r18
  88:	02 02       	muls	r16, r18
  8a:	03 03       	mulsu	r16, r19
  8c:	03 03       	mulsu	r16, r19
  8e:	03 03       	mulsu	r16, r19

00000090 <digital_pin_to_bit_mask_PGM>:
  90:	01 02       	muls	r16, r17
  92:	04 08       	sbc	r0, r4
  94:	10 20       	and	r1, r0
  96:	40 80       	ld	r4, Z
  98:	01 02       	muls	r16, r17
  9a:	04 08       	sbc	r0, r4
  9c:	10 20       	and	r1, r0
  9e:	01 02       	muls	r16, r17
  a0:	04 08       	sbc	r0, r4
  a2:	10 20       	and	r1, r0

000000a4 <digital_pin_to_timer_PGM>:
  a4:	00 00       	nop
  a6:	00 08       	sbc	r0, r0
  a8:	00 02       	muls	r16, r16
  aa:	01 00       	.word	0x0001	; ????
  ac:	00 03       	mulsu	r16, r16
  ae:	04 07       	cpc	r16, r20
	...

...

The first column is the address in memory, then come the bytes that will be written to flash, and finally the assembly language instructions those bytes correspond to. Tables such as digital_pin_to_port_PGM are data stored in flash, so the instructions shown next to them are meaningless.
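
As a rough illustration of how those bytes map to an instruction, here is a minimal Python sketch that decodes the 4-byte jmp entries of the vector table above. The function name decode_jmp is made up for this example and it handles only the jmp encoding; avr-objdump remains the tool to use for real work.

```python
def decode_jmp(raw: bytes) -> int:
    """Return the byte address targeted by a 4-byte AVR jmp instruction."""
    # AVR stores each 16-bit instruction word low byte first.
    w1 = int.from_bytes(raw[0:2], "little")
    w2 = int.from_bytes(raw[2:4], "little")
    # jmp is encoded as 1001 010k kkkk 110k, followed by 16 more k bits.
    if (w1 & 0xFE0E) != 0x940C:
        raise ValueError("not a jmp instruction")
    k_high = (((w1 >> 4) & 0x1F) << 1) | (w1 & 0x01)
    word_address = (k_high << 16) | w2
    # The target is a word address; objdump prints byte addresses.
    return word_address * 2


print(hex(decode_jmp(bytes.fromhex("0c945c00"))))  # 0xb8  (__ctors_end)
print(hex(decode_jmp(bytes.fromhex("0c941301"))))  # 0x226 (__vector_16)
```

The doubling at the end is why the bytes 5c 00 show up as 0xb8 in the listing.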

Now you have a whole new language (and programming concepts) to learn: AVR Assembly language :slight_smile:

Thank you for the post. I am sure that will help the OP with his study.
About 30 years ago I began a project to replace a mainframe computer/high speed check reader-sorter connection with a PC connection.
Bisync point-to-point transparent communication at 19,200 bps. I did not know assembly for the IBM PC and planned to use C to generate the assembly code, which I would modify. After looking at the first cut of the assembly code, I decided to stick with C. 18 months later, we had a PC product that replaced a mainframe.
Paul

Also, check this

Ty. How do I get to the command line? Through Windows PowerShell? Ty

I don't know much about PCs. I'm on a Mac, so I have Unix under the hood. Maybe Windows users can help.

Press and release the Windows key then type cmd and press return

I am, however, afraid that if you had to ask this then you will have little or no chance of understanding the disassembled file, assuming, of course, that you get as far as producing the disassembly.

To satisfy your curiosity about the use of the hex file (properly called an Intel-Hex formatted file), here is some information, with an example assembly program.

1. The following assembly program turns on an LED connected to the PB0 pin of a stand-alone ATmega328P MCU (not the one that is on the UNO board).

        .include "m328Pdef.inc"
		.cseg
		.org	$0000
RESET: 	ldi		r16, 0xFF         ; 
		out		ddrb, r16         ;all port lines of Port-B are output lines
		sbi		portb, pb0      ;Logic High is asserted on PB0-pin of Port-B
		.exit

2. The following are the binary codes for the ASM (assembly) codes of Step-1 (created using Atmel Studio 7.0).

                               		    .cseg
                                 		.org	$0000
000000 ef0f                      RESET:	ldi		r16, 0xFF
000001 b904                      		out		ddrb, r16
000002 9a28                      		sbi		portb, pb0

3. The following is the Intel-Hex formatted file (containing 3 frames) for the ASM program of Step-1 (created using Atmel Studio 7.0).

:020000020000FC
:060000000FEF04B9289A7D
:00000001FF

4. Use of Intel-Hex formatted file.
(1) In Step-2, there are 6 bytes of code/data for the program, which must be loaded into the flash memory of the MCU. The code/data are:

 ef0f
 b904
 9a28

(2) We can see that the code/data are packed within Frame-2 of Step-3. They are packed in this particular way so that they can be transmitted from the PC to the target receiver over a UART port, where each symbol/digit/character travels as its ASCII code. Thus, the symbols/digits of Frame-2 travel as:

3A 30 36 30 30 30 30 30 30 30 46 45 46 30 34 42 39 32 38 39 41 37 44 (spaces are shown for clarity)
(These are the ASCII codes of the Frame-2 symbols, starting with the colon: : 0 6 0 0 0 0 0 0 0 F E F 0 4 B 9 2 8 9 A 7 D)

The receiver does the reverse processing: it extracts the original code/data from the ASCII codes and puts them into the flash. (How? That is beyond the scope of this discussion.)
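
The ASCII encoding in (2) and the receiver's reverse step can be sketched in a few lines of Python. This only illustrates the idea; it is not how any particular bootloader or programmer implements it, and the variable names are made up.

```python
frame = ":060000000FEF04B9289A7D"

# Sender side: every character of the frame travels as its ASCII code.
wire_bytes = frame.encode("ascii")
print(wire_bytes.hex(" ").upper())

# Receiver side: drop the colon, then turn each pair of hex digits
# back into one byte.
record = bytes.fromhex(wire_bytes.decode("ascii")[1:])

# Skip the byte count, 2-byte address and frame type at the front,
# and the checksum at the end, to get the 6 code/data bytes.
payload = record[4:-1]
print(payload.hex(" ").upper())  # 0F EF 04 B9 28 9A
```

So 23 characters go over the wire to deliver 6 bytes of program; the rest is framing.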

5. Meanings of various fields of Intel-Hex formatted file. Here, we take Frame-2 as an example

:060000000FEF04B9289A7D
==> :    06    0000    00   0FEF04B9289A   7D
   (a)   (b)    (c)    (d)         (e)     (f)

Field-a: The colon tells the receiver that a new frame is starting.
Field-b: The number of code/data bytes in this frame (here 6); these bytes are contained in Field-e.

Field-c: The byte address in the flash at which storage of the code/data bytes begins.

Field-d: The record type. 00 marks a data frame, so more frames are still to arrive from the transmitter. 01 (see the last frame) marks the end of the file, so no more frames will arrive. Frame-1 has type 02, which sets an extended segment address rather than carrying code.

Field-e: The actual code/data bytes that have come from the transmitter (the lower byte of each 16-bit word comes first, so ef0f is sent as 0F EF).

Field-f: This is the checksum (CHKSUM). The transmitter sends it as the last byte of the frame so that the receiver can check the validity of the received frame. The CHKSUM is calculated in the following way:
(i) Add all the bytes from Field-b to Field-e. Thus we have: 0x06 + 0x00 + 0x00 + 0x00 + 0x0F + 0xEF + 0x04 + 0xB9 + 0x28 + 0x9A = 0x0283 (0x means a hex number)

(ii) Discard the carry. There remains: 0x83 (10000011).
(iii) Take the 2's complement of 0x83 and send it at the end of the frame.
10000011
==> 01111100 + 1 = 01111101 = 7D
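
The field layout and checksum rule above can be put together in a short Python sketch. The function name parse_record and the dictionary keys are made up for this example, and only the checks described here are performed; a real loader would do more.

```python
def parse_record(line: str) -> dict:
    """Split one Intel-Hex frame into its fields and verify the checksum."""
    if not line.startswith(":"):
        raise ValueError("frame must start with ':'")
    raw = bytes.fromhex(line[1:])
    count = raw[0]                              # Field-b
    address = int.from_bytes(raw[1:3], "big")   # Field-c
    rec_type = raw[3]                           # Field-d
    data = raw[4:4 + count]                     # Field-e
    checksum = raw[4 + count]                   # Field-f
    # 2's complement of the low byte of the sum of Field-b..Field-e.
    expected = (-sum(raw[:4 + count])) & 0xFF
    if checksum != expected:
        raise ValueError(f"bad checksum: got {checksum:02X}, want {expected:02X}")
    return {"count": count, "address": address, "type": rec_type, "data": data}


frames = [":020000020000FC", ":060000000FEF04B9289A7D", ":00000001FF"]
for frame in frames:
    print(parse_record(frame))

# Reassemble the 16-bit instruction words of Frame-2 (low byte first).
code = parse_record(frames[1])["data"]
words = [int.from_bytes(code[i:i + 2], "little") for i in range(0, len(code), 2)]
print([f"{w:04x}" for w in words])  # ['ef0f', 'b904', '9a28']
```

The last frame, :00000001FF, parses as zero data bytes, type 01 and checksum FF. That is why it looks the same at the end of every hex file.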

see Intel Hex

Isn't that what I already wrote, just delayed by a few hours?

you win.

Don't worry. What I was asking was whether he meant another shell within the IDE. Not every simple question means I don't know what's going on. Also, it is always good to clarify. Thanks for your input.