Compiler simplification?

Hi All!

I’m writing a microcontroller simulator for my ATMega328p so that I can understand the instruction set and the hardware architecture better. I’ve made quite a bit of progress implementing the instructions from the AVR datasheet for the processor. I have GUI windows that show the program memory (highlighting the current instruction at the program counter) and the data memory after each instruction.

I compiled the following program and pulled the assembled .hex file from where the Arduino IDE compiles it.

int i=0;
int x=0;
int main() {
 for (int i=0;i<5;i++)
 {
   x=x+2;
 }
}

Resulting .hex file:

:100000000C9434000C9451000C9451000C94510049
:100010000C9451000C9451000C9451000C9451001C
:100020000C9451000C9451000C9451000C9451000C
:100030000C9451000C9451000C9451000C945100FC
:100040000C9451000C9451000C9451000C945100EC
:100050000C9451000C9451000C9451000C945100DC
:100060000C9451000C94510011241FBECFEFD8E026
:10007000DEBFCDBF11E0A0E0B1E0E2ECF0E002C0F5
:1000800005900D92A030B107D9F711E0A0E0B1E0E2
:1000900001C01D92A230B107E1F70E9453000C94F9
:1000A0005F000C94000080910001909101010A967C
:1000B000909301018093000180E090E00895F8940E
:0200C000FFCF70
:00000001FF
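Each line above is an Intel HEX record: a colon, a byte count, a 16-bit load address, a record type, the data bytes, and a checksum chosen so that all bytes of the record sum to zero modulo 256. As a minimal sketch of how a loader might check one record (HexRecord, hexNibble and parseHexRecord are names made up for illustration, and the sketch assumes the line has no trailing whitespace):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container for one decoded record.
struct HexRecord {
  std::uint8_t type = 0;            // 0 = data, 1 = end of file
  std::uint16_t address = 0;        // byte address where the data is loaded
  std::vector<std::uint8_t> data;   // payload bytes
  bool valid = false;               // true only if format and checksum are OK
};

// Convert one hex digit to its value, or -1 if it is not a hex digit.
static int hexNibble(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

HexRecord parseHexRecord(const std::string& line) {
  HexRecord rec;
  // Shortest legal record is ":00000001FF" (11 characters).
  if (line.size() < 11 || line[0] != ':' || (line.size() - 1) % 2 != 0) return rec;

  std::vector<std::uint8_t> bytes;
  for (std::size_t i = 1; i + 1 < line.size(); i += 2) {
    int hi = hexNibble(line[i]);
    int lo = hexNibble(line[i + 1]);
    if (hi < 0 || lo < 0) return rec;
    bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
  }

  // Layout: count, address high, address low, type, data..., checksum.
  if (bytes.size() != static_cast<std::size_t>(bytes[0]) + 5) return rec;

  // A correct record sums to zero modulo 256, checksum included.
  std::uint8_t sum = 0;
  for (std::uint8_t b : bytes) sum = static_cast<std::uint8_t>(sum + b);
  if (sum != 0) return rec;

  rec.address = static_cast<std::uint16_t>((bytes[1] << 8) | bytes[2]);
  rec.type = bytes[3];
  rec.data.assign(bytes.begin() + 4, bytes.end() - 1);
  rec.valid = true;
  return rec;
}
```

For example, parsing ":0200C000FFCF70" should give two data bytes (FF CF) to be loaded at byte address 0x00C0.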

My simulator loads the .hex file and executes the instructions just as a processor would. As I trace through the instructions being executed, it becomes clear that it’s not actually performing the for loop, but rather just adds the end value (0x0a) to x and stores the result in data memory (at address 0x0100) at the end. (See the instructions at address 0x00AE through address 0x00B4. It seems weird that the compiler created its own subroutine when I did not have any separate C++ methods.)

Is the compiler smart enough to remove the for loop and just insert the end value, or am I missing something?

Is there some other simple C++ code I could compile that would force the compiler to generate instructions that actually go through the for loop, so that I can see the index variables changing in memory?

Here are the instructions that I’ve manually gone through, with what I determined each one is doing:

address=word1 word2 instruction description mask binaryInstruction //interpretation
-------------------------------------------------------------------

0x0000=0c94 3400 JMP Jump 1001010kkkkk110k 1001010000001100  //Jump to address 0x0068
0x0004=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100  //Jump to address 0x00A2  (which then jumps back to address 0x0000, the reset vector.)
0x0008=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x000c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0010=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0014=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0018=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x001c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0020=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0024=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0028=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x002c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0030=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0034=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0038=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x003c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0040=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0044=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0048=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x004c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0050=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0054=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0058=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x005c=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0060=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0064=0c94 5100 JMP Jump 1001010kkkkk110k 1001010000001100
0x0068=1124      CLR Clear Register 001001dddddddddd 0010010000010001  //Clear R1
0x006a=1fbe      OUT Store Register to I/O Location 10111AArrrrrAAAA 1011111000011111  //Write R1 to OUT 0x3F (0x5F) Status Register.
0x006c=cfef      LDI Load Immediate 1110KKKKddddKKKK 1110111111001111  //Load 0xFF into R28
0x006e=d8e0      LDI Load Immediate 1110KKKKddddKKKK 1110000011011000  //Load 0x08 into R29  0x08FF = 2303 = RAM end
0x0070=debf      OUT Store Register to I/O Location 10111AArrrrrAAAA 1011111111011110  //Store R29 into address 0x5E  (Stack Pointer High)
0x0072=cdbf      OUT Store Register to I/O Location 10111AArrrrrAAAA 1011111111001101  //Store R28 into address 0x5D  (Stack Pointer Low)
0x0074=11e0      LDI Load Immediate 1110KKKKddddKKKK 1110000000010001  //Load 0x01 into R17
0x0076=a0e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010100000  //Load 0x00 into R26
0x0078=b1e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010110001  //Load 0x01 into R27
0x007a=e2ec      LDI Load Immediate 1110KKKKddddKKKK 1110110011100010  //Load 0xc2 into R30  (dddd=1110 selects R16+14)
0x007c=f0e0      LDI Load Immediate 1110KKKKddddKKKK 1110000011110000  //Load 0x00 into R31
0x007e=02c0      RJMP Relative Jump 1100kkkkkkkkkkkk 1100000000000010  //Jump + 0x04 to address 0x0084
0x0080=0590      LPM_3 Load Program Memory (3) 1001000ddddd0101 1001000000000101
0x0082=0d92      STX_2 Store Indirect From Register to Data Space using Index X 1001001rrrrr1101 1001001000001101
0x0084=a030      CPI Compare with Immediate 0011KKKKddddKKKK 0011000010100000  //Compare R26 with 0x00.  0x00-0x00=0.  ITHSVNZC=--000010  Set Zero flag in SREG.
0x0086=b107      CPC Compare with Carry 000001rdddddrrrr 0000011110110001  //Compare R27 with R17.  0x01 - 0x01 - 0x00 = 0x00.  ITHSVNZC=--0000-0.  No change to SREG.
0x0088=d9f7      BRNE Branch if Not Equal 111101kkkkkkk001 1111011111011001  //Branch if zero flag not set.  Continuing without branching.
0x008a=11e0      LDI Load Immediate 1110KKKKddddKKKK 1110000000010001  //Load 0x01 into R17
0x008c=a0e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010100000  //Load 0x00 into R26
0x008e=b1e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010110001  //Load 0x01 into R27
0x0090=01c0      RJMP Relative Jump 1100kkkkkkkkkkkk 1100000000000001  //Jump + 0x02 to address 0x0094
0x0092=1d92      STX_2 Store Indirect From Register to Data Space using Index X 1001001rrrrr1101 1001001000011101  // Store R1 into the data space pointed to by the X register, then post-increment X.  X points to data memory address 0x0100.  Storing 0x00 into address 0x0100.
0x0094=a230      CPI Compare with Immediate 0011KKKKddddKKKK 0011000010100010  //Compare R26 with 0x02.  0x00-0x02 = -2.  ITHSVNZC=--110101  (H is set because of the borrow from bit 3.)
0x0096=b107      CPC Compare with Carry 000001rdddddrrrr 0000011110110001  //Compare R27 with R17.  0x01 - 0x01 - 0x01 = -1.  ITHSVNZC=--110101
0x0098=e1f7      BRNE Branch if Not Equal 111101kkkkkkk001 1111011111100001  //Branch if zero flag is not on.  Branching -6 to address 0x0092.
0x009a=0e94 5300 CALL Long Call to a Subroutine 1001010kkkkk111k 1001010000001110  //Call subroutine at address 0x00A6.  Push the return address (0x009E) onto the stack.
0x009e=0c94 5f00 JMP Jump 1001010kkkkk110k 1001010000001100  //Jump to address 0x00be
0x00a2=0c94 0000 JMP Jump 1001010kkkkk110k 1001010000001100  //Jump to address 0x0000 (the reset vector)
0x00a6=8091 0001 LDS_1 Load Direct from Data Space (1) 1001000-----0000 1001000110000000  //This is the beginning of a subroutine.  //Load data in address 0x0100 into R24.
0x00aa=9091 0101 LDS_1 Load Direct from Data Space (1) 1001000-----0000 1001000110010000  //Load data in address 0x0101 into R25.
0x00ae=0a96      ADIW Add Immediate to Word 10010110-------- 1001011000001010  //Add 0x0a to register pair R24,R25.
0x00b0=9093 0101 STS_1 Store Direct to Data Space 1001001ddddd0000 1001001110010000  //Store contents of R25 into address 0x0101.
0x00b4=8093 0001 STS_1 Store Direct to Data Space 1001001ddddd0000 1001001110000000  //Store contents of R24 into address 0x0100.
0x00b8=80e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010000000  //Load 0x00 into R24
0x00ba=90e0      LDI Load Immediate 1110KKKKddddKKKK 1110000010010000  //Load 0x00 into R25
0x00bc=0895      RET Return from Subroutine 1001010100001000 1001010100001000  //Return from subroutine to the address which is on the stack (0x009E).
0x00be=f894      CLI Clear Global Interrupt Flag 1001010011111000 1001010011111000  //  Clear the interrupt flag in the Status reg.
0x00c0=ffcf      RJMP Relative Jump 1100kkkkkkkkkkkk 1100111111111111  // Jump -1.  End of main method.  Continuously jump to self.
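Two details in the table are easy to get wrong when decoding by hand: the .hex file stores each 16-bit opcode low byte first (so the bytes "e2 ec" form the word 0xECE2), and the 4-bit dddd field of LDI selects R16 through R31, not R0 through R15. A minimal sketch of both, plus the JMP target calculation (opcodeWord, decodeLdi and jmpByteAddress are hypothetical helper names, not part of any library):

```cpp
#include <cstdint>

// Combine two bytes from the .hex file into one opcode word (little-endian).
inline std::uint16_t opcodeWord(std::uint8_t low, std::uint8_t high) {
  return static_cast<std::uint16_t>((high << 8) | low);
}

struct Ldi {
  std::uint8_t reg;    // destination register number, 16..31
  std::uint8_t value;  // 8-bit immediate K
};

// LDI is encoded as 1110 KKKK dddd KKKK.
inline bool isLdi(std::uint16_t op) { return (op & 0xF000) == 0xE000; }

inline Ldi decodeLdi(std::uint16_t op) {
  Ldi out;
  out.reg = static_cast<std::uint8_t>(16 + ((op >> 4) & 0x0F));            // dddd + 16
  out.value = static_cast<std::uint8_t>(((op >> 4) & 0xF0) | (op & 0x0F)); // KKKK:KKKK
  return out;
}

// JMP is 1001 010k kkkk 110k followed by a second word holding the low 16
// bits of k. k is a word address, so the byte address is k * 2.
inline std::uint32_t jmpByteAddress(std::uint16_t op, std::uint16_t second) {
  std::uint32_t highBits = ((op >> 3) & 0x3E) | (op & 0x01);
  std::uint32_t wordAddress = (highBits << 16) | second;
  return wordAddress * 2;
}
```

With these helpers, the bytes "e2 ec" at 0x007a decode to an LDI of 0xC2 into R30 (the low byte of the Z pointer), and "0c94 3400" decodes to a jump to byte address 0x0068.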

Compilers are very smart.
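To make that concrete: judging from the disassembly in the question, the routine at 0x00a6 appears to be main() itself, and the five additions of 2 have been folded into a single addition of 10. A rough hand-written equivalent of what the optimizer produced might look like this (the function name folded_main is made up so the sketch can sit next to a real main()):

```cpp
int x = 0;

// Hand-written approximation of the optimized main() at 0x00a6..0x00bc.
int folded_main() {
  x = x + 10;  // LDS R24/R25 from 0x0100/0x0101, ADIW 0x0a, STS back
  return 0;    // LDI R24,0x00 ; LDI R25,0x00 ; RET
}
```

The local loop counter never reaches memory at all, because nothing outside the loop can observe it.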

For this code, if you provide an array and use x as an index to store its value, then the compiler cannot “unroll” the loop. Just make sure to declare the array outside of the for loop.

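byte arry[11], x = 0, i;

int main() {
  for (i = 0; i < 5; i++)
  {
    x = x + 2;    // x takes the values 2, 4, 6, 8, 10
    arry[x] = x;  // highest index used is 10, so arry[11] is large enough
  }
}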
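Another option, not mentioned elsewhere in this thread and offered only as a sketch: declaring x as volatile tells the compiler that every read and write of x is observable, so it should emit a real load, add and store for each pass through the loop instead of one folded addition. The function name run_loop is hypothetical; in a program built like the original, the loop would simply go inside main().

```cpp
// volatile forces each access to x to actually touch memory.
volatile int x = 0;

int run_loop() {
  for (int i = 0; i < 5; i++) {
    x = x + 2;  // one load and one store of x per iteration
  }
  return x;     // final read of x
}
```

The compiler is still free to keep the counter i in a register, so only x is expected to change in data memory on each iteration.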

Thanks, KeithRB.

That does help, I am seeing values increment in the data memory now. Still not doing exactly what I expect, but that's probably a separate issue.

Matthew

KeithRB:
Compilers are very smart.

Too smart in fact. I gather this compiler (gcc?) is the same one used for compiling the whole Linux operating system and has been known to surreptitiously optimise out absolutely critical security code required to clear memory after use.
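The usual shape of that problem is a buffer holding a key or password that is cleared just before it goes out of scope; since nothing reads the buffer afterwards, the optimizer may treat the clearing stores as dead and drop them. One commonly used workaround, shown here as a sketch rather than a guarantee (secureZero is a made-up name, and how strictly compilers honour this idiom is an assumption), is to write through a volatile pointer:

```cpp
#include <cstddef>

// Clear a buffer through a volatile pointer so that each store counts as
// an observable access that the optimizer is not supposed to remove.
void secureZero(void* buffer, std::size_t length) {
  volatile unsigned char* p = static_cast<volatile unsigned char*>(buffer);
  while (length--) {
    *p++ = 0;
  }
}
```

A call such as secureZero(key, sizeof key) would then replace a plain memset at the end of the function that used the key.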