Compiler optimization or bug? You have to know how gcc behaves

Hi There,

After spending some time troubleshooting a piece of code this weekend (...), I decided to share my experience on the forum about how our gcc compiler generates quite different code depending on its optimizations and the context.

This is all about a loop waiting for something, such as a value in a variable or a bit in a register.

Unless you are very experienced, you will probably be surprised by the example below, where the compiler generates 5 different kinds of assembly code depending on the context (volatile or not).

Below is the test code, followed by its complete disassembly.
The x variable is only set to make each case easy to recognize in the listing; treat it as a "scenario number":

arduino 1.02 source

char var;
char volatile var2;
byte volatile x;

void setup(){
x=0x11;asm volatile ("":::"memory");
while (var  & 1); 
x=0x22;asm volatile ("":::"memory");
while (var  & 2) asm ("nop");
x=0x33;asm volatile ("":::"memory");
while (var  & 4) asm volatile ("nop");
x=0x44;asm volatile ("":::"memory");
while (var  & 8) asm volatile ("nop":::"memory");
x=0x55;asm volatile ("nop":::"memory");
while (var2  & 1); 
x=0x66;asm volatile ("":::"memory");
while (var2  & 2) asm ("nop");
x=0x77;asm volatile ("nop":::"memory");
while (var2  & 4) asm volatile ("nop");
x=0x88;asm volatile ("":::"memory");
while (var2  & 8) asm volatile ("nop":::"memory");
}

assembly (avr-objdump listing)

000001b2 <setup>:

x=0x11;asm volatile ("":::"memory");
 1b2:	81 e1       	ldi	r24, 0x11	; 17
 1b4:	80 93 0a 01 	sts	0x010A, r24

while (var  & 1); 
 1b8:	80 91 08 01 	lds	r24, 0x0108
 1bc:	80 fd       	sbrc	r24, 0
 1be:	40 c0       	rjmp	.+128    	; 0x240 <setup+0x8e>

x=0x22;asm volatile ("":::"memory");
 1c0:	82 e2       	ldi	r24, 0x22	; 34
 1c2:	80 93 0a 01 	sts	0x010A, r24

while (var  & 2) asm ("nop");
 1c6:	80 91 08 01 	lds	r24, 0x0108
 1ca:	81 ff       	sbrs	r24, 1
 1cc:	02 c0       	rjmp	.+4      	; 0x1d2 <setup+0x20>
 1ce:	00 00       	nop
 1d0:	fe cf       	rjmp	.-4      	; 0x1ce <setup+0x1c>

x=0x33;asm volatile ("":::"memory");
 1d2:	83 e3       	ldi	r24, 0x33	; 51
 1d4:	80 93 0a 01 	sts	0x010A, r24

while (var  & 4) asm volatile ("nop");
 1d8:	80 91 08 01 	lds	r24, 0x0108
 1dc:	82 ff       	sbrs	r24, 2
 1de:	02 c0       	rjmp	.+4      	; 0x1e4 <setup+0x32>
 1e0:	00 00       	nop
 1e2:	fe cf       	rjmp	.-4      	; 0x1e0 <setup+0x2e>

x=0x44;asm volatile ("":::"memory");
 1e4:	84 e4       	ldi	r24, 0x44	; 68
 1e6:	80 93 0a 01 	sts	0x010A, r24

while (var  & 8) asm volatile ("nop":::"memory");
 1ea:	01 c0       	rjmp	.+2      	; 0x1ee <setup+0x3c>
 1ec:	00 00       	nop
 1ee:	80 91 08 01 	lds	r24, 0x0108
 1f2:	83 fd       	sbrc	r24, 3
 1f4:	fb cf       	rjmp	.-10     	; 0x1ec <setup+0x3a>

x=0x55;asm volatile ("nop":::"memory");
 1f6:	85 e5       	ldi	r24, 0x55	; 85
 1f8:	80 93 0a 01 	sts	0x010A, r24

while (var2  & 1); 
 1fc:	00 00       	nop
 1fe:	80 91 09 01 	lds	r24, 0x0109
 202:	80 fd       	sbrc	r24, 0
 204:	fc cf       	rjmp	.-8      	; 0x1fe <setup+0x4c>

x=0x66;asm volatile ("":::"memory");
 206:	86 e6       	ldi	r24, 0x66	; 102
 208:	80 93 0a 01 	sts	0x010A, r24

while (var2  & 2) asm ("nop");
 20c:	01 c0       	rjmp	.+2      	; 0x210 <setup+0x5e>
 20e:	00 00       	nop
 210:	80 91 09 01 	lds	r24, 0x0109
 214:	81 fd       	sbrc	r24, 1
 216:	fb cf       	rjmp	.-10     	; 0x20e <setup+0x5c>

x=0x77;asm volatile ("nop":::"memory");
 218:	87 e7       	ldi	r24, 0x77	; 119
 21a:	80 93 0a 01 	sts	0x010A, r24

while (var2  & 4) asm volatile ("nop");
 21e:	00 00       	nop
 220:	01 c0       	rjmp	.+2      	; 0x224 <setup+0x72>
 222:	00 00       	nop
 224:	80 91 09 01 	lds	r24, 0x0109
 228:	82 fd       	sbrc	r24, 2
 22a:	fb cf       	rjmp	.-10     	; 0x222 <setup+0x70>

x=0x88;asm volatile ("":::"memory");
 22c:	88 e8       	ldi	r24, 0x88	; 136
 22e:	80 93 0a 01 	sts	0x010A, r24

while (var2  & 8) asm volatile ("nop":::"memory");
 232:	01 c0       	rjmp	.+2      	; 0x236 <setup+0x84>
 234:	00 00       	nop
 236:	80 91 09 01 	lds	r24, 0x0109
 23a:	83 fd       	sbrc	r24, 3
 23c:	fb cf       	rjmp	.-10     	; 0x234 <setup+0x82>

 23e:	08 95       	ret			// end of Setup

 240:	ff cf       	rjmp	.-2      	; 0x240 <setup+0x8e>	// LOOP ITSELF !

So:
scenario 11 will loop forever (jump to 240...)
scenario 22 & 33 will loop forever if the first test is true
scenario 44 will do the exact job we expect in the source code

scenario 55 will not do exactly the job we expect
scenario 66 & 77 & 88 will do the exact job we expect in the source code

good
so if you want to succeed, use volatile or make sure the while loop contains an instruction (or a call to a function) that will force the compiler to reload the variable and repeat the test.
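Here is a minimal host-side sketch of those two options, assuming GCC or Clang (the empty asm with a "memory" clobber is a GNU extension, the same one used in the test code above). The function names and the max_polls cap are hypothetical; the cap is only there so the loops can be exercised on a PC, and a real waiting loop would not have it.

```c
#include <stdint.h>

/* Option 1: the flag is read through a volatile-qualified pointer, so the
 * compiler must perform a real load every time the condition is evaluated. */
unsigned wait_while_set_volatile(const volatile uint8_t *flag, uint8_t mask,
                                 unsigned max_polls)
{
    unsigned polls = 0;
    while ((*flag & mask) && polls < max_polls) {
        polls++;
    }
    return polls;
}

/* Option 2: the flag is a plain variable, but the loop body contains a
 * compiler barrier. The "memory" clobber tells the compiler that memory may
 * have changed, so it cannot reuse a value it loaded before the barrier. */
unsigned wait_while_set_barrier(const uint8_t *flag, uint8_t mask,
                                unsigned max_polls)
{
    unsigned polls = 0;
    while ((*flag & mask) && polls < max_polls) {
        __asm__ volatile ("" ::: "memory");
        polls++;
    }
    return polls;
}
```

Both functions return the number of times the loop body ran: 0 if the bit was already clear, max_polls if it never cleared.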

cheers !

fabriceo:
so if you want to succeed, use volatile or make sure the while loop contains an instruction (or a call to a function) that will force the compiler to reload the variable and repeat the test.

It's hard to make out from that mass of code and assembler which cases you expected to work. But if you are testing a variable in a loop and there is no code in the loop which would cause the variable to be changed, then it is hardly surprising that the compiler will optimise the comparison away unless you tell it not to.
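To make that concrete: for a non-volatile variable that nothing in the loop can modify, the compiler may read it once and treat `while (var & 1);` as `if (var & 1) for (;;);`, which matches what the scenario 11 listing shows. The sketch below is only a model of those two shapes, not compiler output. The names and the cap parameter are hypothetical, and the endless spin is represented by a result flag so the code can run on a PC.

```c
#include <stdbool.h>

/* Outcome of one modelled waiting loop. */
struct loop_result {
    unsigned loads;   /* how many times the variable was read */
    bool exited;      /* true if the loop ended on its own */
};

/* The loop as written in the source: the variable is re-read and re-tested
 * on every pass. `cap` bounds the model; real code would have no bound. */
struct loop_result loop_as_written(const volatile char *var, char mask,
                                   unsigned cap)
{
    struct loop_result r = {0, false};
    for (unsigned i = 0; i < cap; i++) {
        r.loads++;
        if (!(*var & mask)) {
            r.exited = true;
            break;
        }
    }
    return r;
}

/* The loop as the optimizer may emit it: a single load, then either fall
 * through or spin forever (the spin is modelled as exited == false). */
struct loop_result loop_as_optimized(const char *var, char mask)
{
    struct loop_result r = {1, false};
    char snapshot = *var;              /* the only load */
    if (!(snapshot & mask)) {
        r.exited = true;
    }
    return r;
}
```

With the bit clear, both shapes read the variable once and exit. With the bit set, the first keeps reloading while the second never looks at the variable again.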

Yep, that's why we cannot say the compiler is buggy here; maybe our mind is :slight_smile:

So if you don't have everything even down to compiler optimization exactly figured out ahead, it's a mental problem? LOL! :wink:

I can't tell what that is supposed to do. I hate code written by wise guys who want to obfuscate what it does.

fabriceo:
Yep, that's why we cannot say the compiler is buggy here; maybe our mind is :slight_smile:

Given that the code you posted does not achieve anything useful regardless of whether/how the compiler optimises it, I'd have to agree.

If you think that the compiler is doing something unexpected or wrong, you need to produce a test case which would be reasonably expected to produce one behaviour, which actually produces a different behaviour.

None of the code you posted does that, because the expected and actual behaviours are all nonsensical.

void setup(){

x=0x11;asm volatile ("":::"memory");
while (var  & 1);
x=0x22;asm volatile ("":::"memory");
while (var  & 2) asm ("nop");
x=0x33;asm volatile ("":::"memory");
while (var  & 4) asm volatile ("nop");

What on earth is the purpose of all this asm stuff? Who cares what the compiler generates?

scenario 55 will not do exactly the job we expect
scenario 66 & 77 & 88 will do the exact job we expect in the source code

I don't know what you expect, and therefore whether the compiler has done it.

Personally I don't see what scenario 55 has done wrong. Perhaps you can explain that?

hi there,
Nick, you are right, scenario 55 is all good. I made a mistake in my comment because of the "nop" appearing at 1fc, but that one comes from the asm statement on the line above. My mistake.

I do care about what the compiler generates; it always helps me understand where my bugs are.

A recent example: a friend's formula mixing float and long was debugged by looking at the generated asm code to see where the bug came from. Same thing when playing with *ptr++ and reinterpret_cast. It is also good to understand how code is generated, since this helps with writing efficient C code. But OK, it depends on how critical the code is. All in all, I have to say gcc does an amazingly good job of optimization. Very often I feel there is no need to write asm, as the output is already very well optimized.

My original problem was a waiting loop that did not end as I expected; that is why I posted these scenarios, not just to spam the forum :slight_smile:
Cheers again

You get that close to the metal and see how you feel when progress makes your learning time obsolete. I wouldn't bother unless I was writing the compilers or similar.

For me, most debugging requires at most some value prints and time thinking after losing my expectation-blindness.

GoForSmoke:
You get that close to the metal and see how you feel when progress makes your learning time obsolete. I wouldn't bother unless I was writing the compilers or similar.

For me, most debugging requires at most some value prints and time thinking after losing my expectation-blindness.

Me too. My bugs are almost always big-scope screw-ups rather than low-level things. Simple print statements, along with a lot of thinking about the symptom seen vs. possible causes, can go a long way when debugging one's sketches. Dropping down into the weeds would not be helpful for me.

Lefty

scenario 11 will loop forever (jump to 240...)

Which is correct.


scenario 22 & 33 will loop forever if the first test is true

Which is also correct.


so if you want to succeed, use volatile or make sure the while loop contains an instruction (or a call to a function) that will force the compiler to reload the variable and repeat the test.

You don't need to disassemble for that. The compiler optimizes, that is well-known. Variables shared between an ISR and "main" code should be declared volatile.

Attempts to make loops by doing something with non-volatile variables are frequently optimized away.
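As a rough illustration of the shared-variable case, here is a host-side analog in standard C, where a signal handler stands in for the ISR. This is an assumption for the sake of a runnable example: on an Arduino the flag would be a volatile global written inside a real interrupt handler. The names event_pending, on_signal, and demo_shared_flag are hypothetical.

```c
#include <signal.h>

/* Flag shared between the handler (the stand-in for an ISR) and the main
 * code. volatile forces the main code to re-read it from memory. */
static volatile sig_atomic_t event_pending = 0;

/* Plays the role of the ISR: it only records that the event happened. */
static void on_signal(int signo)
{
    (void)signo;
    event_pending = 1;
}

/* Installs the handler, triggers it once, and returns what the main code
 * then sees in the flag (1 on success, -1 if the handler was not installed). */
int demo_shared_flag(void)
{
    event_pending = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);                  /* the handler runs before raise() returns */
    int seen = (int)event_pending;  /* a real load, because of volatile */
    signal(SIGINT, SIG_DFL);        /* restore the default handler */
    return seen;
}
```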

... so if you want to succeed ...

Depends how you define "succeed". Adding the volatile keyword unnecessarily can slow your code down.
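One common way to limit that cost, sketched here on the assumption that using a single consistent reading is acceptable for the decision being made: copy the volatile value into a local once, then work on the copy. The function names and bit meanings are hypothetical.

```c
#include <stdint.h>

/* Each use of *reg is a separate access the compiler must keep:
 * up to three loads for one decision. */
uint8_t classify_repeated_loads(const volatile uint8_t *reg)
{
    if (*reg & 0x01) return 1;
    if (*reg & 0x02) return 2;
    if (*reg & 0x04) return 3;
    return 0;
}

/* One load into a plain local. The compiler is free to keep the copy in a
 * register, and all three tests see the same value. */
uint8_t classify_single_load(const volatile uint8_t *reg)
{
    uint8_t snapshot = *reg;
    if (snapshot & 0x01) return 1;
    if (snapshot & 0x02) return 2;
    if (snapshot & 0x04) return 3;
    return 0;
}
```

Both return the same result when the value does not change between reads; the second simply avoids the extra loads.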