Understanding the .hex file

For example, the SparkFun website has a tutorial on getting started with loading programs onto the microcontroller. It supplies a .zip file containing a .c file and the compiled .hex file, shown here (I've added spaces to mark the fields described below):

:10 0000 00 0C9434000C944F00 0C944F000C944F00 4F
:10 0010 00 0C944F000C944F00 0C944F000C944F00 24
:10 0020 00 0C944F000C944F00 0C944F000C944F00 14
:10 0030 00 0C944F000C944F00 0C944F000C944F00 04
:10 0040 00 0C944F000C944F00 0C944F000C944F00 F4
:10 0050 00 0C944F000C944F00 0C944F000C944F00 E4
:10 0060 00 0C944F000C944F00 11241FBECFEFD4E0 2E
:10 0070 00 DEBFCDBF11E0A0E0 B1E0E8EFF0E002C0 EC
:10 0080 00 05900D92A030B107 D9F711E0A0E0B1E0 E2
:10 0090 00 01C01D92A030B107 E1F70C9467000C94 E9
:10 00A0 00 00008FEF84B987B9 8EEF8AB9089501C0 37
:10 00B0 00 0197009759F020E0 0000000000000000 C8
:10 00C0 00 000000002F5F2A35 99F3F6CF08958FEF D7
:10 00D0 00 84B987B98EEF8AB9 8FEF88B985B98BB9 A2
:10 00E0 00 84EF91E00E945700 18B815B81BB884EF 50
:08 00F0 00 91E00E945700F0CF DF
:00 0000 01 FF

The format of each record in the hex file is as follows (from the Intel HEX article on Wikipedia):

Start code, one character, an ASCII colon ':'.

Byte count, two hex digits, the number of bytes (hex digit pairs) in the data field. 16 (0x10) or 32 (0x20) bytes of data are the usual compromise values between line length and address overhead.

Address, four hex digits, the 16-bit address at which the record's data begins. This limits addressing to 64 kilobytes; the limit is worked around by specifying higher address bits via additional record types. The address is big endian.

Record type, two hex digits, 00 to 05, defining the type of the data field.

Data, a sequence of n bytes of the data themselves, represented by 2n hex digits.

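Checksum, two hex digits, the two's complement of the least significant byte of the sum of all preceding bytes in the record (byte count, address, record type and data).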

[:][Byte Count][Address][Record Type][Data][Checksum]

So, the following line of the hex file can be split like this:

[:][10] [0060] [00] [0C944F000C944F0011241FBECFEFD4E0] [2E]
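The split above can be reproduced with a short Python sketch. This is only an illustration of the field layout described here, not a full Intel HEX reader; the names `HexRecord` and `parse_record` are my own, and the spaces are stripped only because I added them to the listing.

```python
from typing import NamedTuple


class HexRecord(NamedTuple):
    byte_count: int
    address: int
    record_type: int
    data: bytes
    checksum: int


def parse_record(line: str) -> HexRecord:
    """Split one Intel HEX line into its fields and verify the checksum."""
    text = line.replace(" ", "").strip()  # tolerate the spaces added for readability
    if not text.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(text[1:])
    byte_count = raw[0]
    # 1 count byte + 2 address bytes + 1 type byte + data + 1 checksum byte
    if len(raw) != byte_count + 5:
        raise ValueError("byte count does not match record length")
    # A valid record sums to zero modulo 256 once the checksum byte is included.
    if sum(raw) % 256 != 0:
        raise ValueError("checksum mismatch")
    return HexRecord(
        byte_count=byte_count,
        address=int.from_bytes(raw[1:3], "big"),  # address field is big endian
        record_type=raw[3],
        data=raw[4:-1],
        checksum=raw[-1],
    )


record = parse_record(":10 0060 00 0C944F000C944F00 11241FBECFEFD4E0 2E")
print(f"{record.byte_count:#04x} {record.address:#06x} {record.record_type:#04x}")
print(record.data.hex().upper(), f"{record.checksum:02X}")
```

For this line the bytes before the checksum add up to 0x6D2, whose low byte is 0xD2, and the two's complement of 0xD2 is 0x2E, which matches the last field.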

AVR instructions are 16 bits wide, or 4 hex digits (a few, such as JMP and CALL, take 32 bits).
The RJMP opcode is described in the ATmega datasheet as: 1100 kkkk kkkk kkkk
0b1100 = 0xC
So an instruction whose first hex digit is 0xC should be an RJMP instruction.

The data from the .hex file can be parsed into the following instructions:
0C94
4F00
0C94
4F00
1124
1FBE
CFEF <--- Read this way, this appears to be an RJMP 0xFEF instruction. (0xFEF = decimal 4079)
D4E0
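Two assumptions in the reading above can be tested with a short sketch: that each pair of bytes forms a word in the order written, and that k is an unsigned byte count. As far as I know, AVR program words are stored low byte first in the hex file, and the AVR instruction set manual defines k for RJMP as a signed 12-bit offset counted in words. The helper names `split_words` and `rjmp_offset` are my own, and the sketch only recognises the RJMP bit pattern, nothing else.

```python
from typing import List, Optional


def split_words(data: bytes, byteorder: str) -> List[int]:
    """Group data bytes into 16-bit words using the given byte order."""
    return [int.from_bytes(data[i:i + 2], byteorder) for i in range(0, len(data), 2)]


def rjmp_offset(word: int) -> Optional[int]:
    """Return the signed word offset k if word matches 1100 kkkk kkkk kkkk, else None."""
    if word >> 12 != 0b1100:
        return None
    k = word & 0x0FFF
    return k - 0x1000 if k & 0x800 else k  # sign-extend the 12-bit field


# Data field of the record at address 0x0060.
data = bytes.fromhex("0C944F000C944F0011241FBECFEFD4E0")

for order in ("big", "little"):
    words = split_words(data, order)
    print(order, [f"{w:04X}" for w in words])
    hits = [(f"{w:04X}", rjmp_offset(w)) for w in words if rjmp_offset(w) is not None]
    print("  RJMP candidates:", hits)
```

Read in the order written, 0xCFEF is the only word with 0xC as its top digit, and its k field decodes to -17 words rather than +4079 bytes. Read low byte first, the same two bytes give 0xEFCF, and none of the eight words in this record matches the RJMP pattern.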

According to the addresses specified in the .hex file, the actual program is only 0x00F8 bytes long (the last data record starts at 0x00F0 and holds 8 bytes). I'm unsure why we would be jumping 4079 bytes when no part of our program exists there.
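The 0x00F8 figure can be checked by taking, over all data records, the highest value of address plus byte count. A minimal sketch, assuming only type 00 records carry program bytes and using a hypothetical helper `data_extent`; for brevity it is fed the first record, the last two data records and the end-of-file record from the listing.

```python
def data_extent(lines):
    """Return (lowest address, one past the highest address) over data records."""
    start, end = None, 0
    for line in lines:
        text = line.replace(" ", "").strip()  # drop the spaces added for readability
        raw = bytes.fromhex(text[1:])          # skip the leading ':'
        count = raw[0]
        address = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        if record_type != 0x00:  # only data records contribute bytes
            continue
        start = address if start is None else min(start, address)
        end = max(end, address + count)
    return start, end


records = [
    ":10 0000 00 0C9434000C944F00 0C944F000C944F00 4F",
    ":10 00E0 00 84EF91E00E945700 18B815B81BB884EF 50",
    ":08 00F0 00 91E00E945700F0CF DF",
    ":00 0000 01 FF",
]
start, end = data_extent(records)
print(f"data spans {start:#06x} to {end:#06x}")
```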
