complier bug with unary ++ post operator (e.g. i++)

Hello,

I am having a problem with the code that is generated by the Arduino compiler. I am using the Arduino IDE, version 1.6.5 (as reported in the Help / About display). The problem is with the post operation of the unary ++ operator. The following statements are useful for analysis.

i  = 0;
i = i++ % 50;

The last statement should first assign the modulo 50 value of i to i. 0 modulo 50 is 0, so 0 should be assigned to i. Then the unary post operation increment of i, from 0 to 1, should occur. Thus, the final value of i should be 1. However, this statement fails to increment the value of i.

The following code demonstrates the problem.

Serial.begin(9600);  
Serial.println("hello world\n");

int i = 0;
Serial.print("1: i=");
Serial.println(i);             /* verify initial value of i */
i = i++ % 50;                /* should post increment i to 1 */
Serial.print("2: i=”);       
Serial.println(i);             /* should report a value of 1 */
i = 49;                              
Serial.print("3: i=”);
Serial.println(i);             /* verify initial value of i */
i = i++ % 50;                /* should post increment i to 50 */
Serial.print("4: i=”);
Serial.println(i);
i = i++ % 50;                /* should set i to 0 (i.e 50 mod 50), then post incr to 1 */
Serial.print(“5: i=”);
Serial.println(i);

The Arduino program produced the following output on the serial port monitor (annotations added):

hello world
1: i=0 <== verify beginning value of i
2: i=0 <== modulo 50 with post increment (but failed to increment)
3: i=49 <== verify beginning vlaue of i
4: i=49 <== modulo 50 with post increment (but failed to increment)
5: i=49 <== again, modulo 50 (ok) with post increment (but failed to increment)

To provide further proof that this is an error in the Arduino compiler I compiled the same code using the Windows C++ compiler. The Windows program produced the following output to a DOS window (with added annotations):

d:\myprojects\test2\Debug>test2
hello world
1: i=0 <==== verify beginning value of i
2: i=1 <==== modulo 50 with post increment correctly applied
3: i=49 <==== verify beginning value of i
4: i=50 <==== modulo 50 with post increment correctly applied
5: i=1 <==== 50 modulo 50 (= 0) with post increment correctly applied

Has anyone else encountered this bug?
Thanks, cdj

No it shouldn’t you have entered the territory of undefined behaviour.

http://c-faq.com/expr/evalorder1.html

When I compile that example it produces this warning:

sketch_oct26a.ino:7:15: warning: operation on 'i' may be undefined [-Wsequence-point]

AFAIK That type of operation is undefined in the language spec and compiler writers are free to make the result whatever is convenient or desirable to them.

Pete

Not exactly, "unspecified" means they are free to make the result whatever is convenient. "Undefined" means they don't have to do anything - it is perfectly fine if the compiler crashes.

@cdj15: there are many ways to write a program, so write it to do what you want but avoid constructs like those that cause problems.

so write it to do what you want but avoid constructs like those that cause problems.

Great advice. Can you list all the "constructs like those that cause problems" so we can avoid them?

Can you list all the "constructs like those that cause problems" so we can avoid them?

Sorry, no.

I would avoid C/C++ if I could.

i = i++ % 50;

Write that all the way out to see why it makes no sense at all.

i = (i = i + 1) % 50

How should that be evaluated?

Why not use the much simpler:

i = (i+1) %50;

Why do you want to reassign i in the middle of your assignment?

This isn't a compiler bug, this is clearly a code bug.

Or wait, it's post-increment. So it should calculate i % 50, then increment i, and then assign i % 50 to i. So i % 50 overwrites the increment every time and the number doesn't move.

Or since i++ is a statement all it's own, should it be evaluate i % 50, assign that to i and then increment i.

See there's ambiguity. At least three different ways to evaluate that one expression. That doesn't make for good programs, so the C++ people have decided that you just shouldn't use ++ and = on the same variable in the same line.

Can you list all the "constructs like those that cause problems" so we can avoid them?

Here are a bunch of them described. Mostly WRT "sequence points", with some other pointers, I think.

http://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points

In addition to "i = i++", there are a set of things like: foo(i++, i++); And the 'well know' "avoid side effects" like "

i = i++ % 50

define max(a, b) (a > b? a:b)

b = max(12, ++b);

Most of them are "obvious problems" if you're working backward, but things that you don't see as you're moving forward.

Don't forget that C is almost always an compiled and optimized language, so there is usually very little reason to cram an expression into a single line, or a "shorter" source form. The ONLY way that a modern C compiler is likely to generate better code from an expression like "i = i++ % 50;" than from "i = (i%50) + 1;: is if it's wrong.

In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.

From the link in the post above (and I’m sure somewhere before that). I’m not sure which one is scarier.

That would be something to explain to some girls father though. “Well I used i = i++; and changed compilers and the next thing you know the rabbit dies…”

PaulS: Great advice. Can you list all the "constructs like those that cause problems" so we can avoid them?

Yes easily in the form of the rule: use only one operator per line'.

i = i++ % 50;

then becomes

i++; i = i % 50;

this way you never run into constructs like

i = 0; i += i++ + ++i;

what is i now? :)

robtillaart: i = 0; i += i++ + ++i;

what is i now? :)

robtillaart. And don't ever change

use only one operator per line'

So

if(digitalRead(onePin) == HIGH && digitalRead(twoPin) == HIGH)

is bad, since it uses three operators?

robtillaart: Yes easily in the form of the rule: use only one operator per line'.

i = i++ % 50;

then becomes

i++; i = i % 50;

this way you never run into constructs like

i = 0; i += i++ + ++i;

what is i now? :)

That is one solution, albeit a rather ridiculous one. FAR better to simply LEARN THE RULES of expression evaluation. And, if in doubt, USE PARENTHESES to force the evaluation order you want.

Regards, Ray L.

i = i++;

No amount of parenthesis is going to fix that. Or make it any less ambiguous.

Hi Ray,

Why do you think it is ridiculous? (interested in the arguments)

Imho the purpose of a programming language is to explain the compiler what you want to happen in the executable. This should be done preferably in a maintainable format. Simple statements help.

I expect that a sequence of simple statements can as easily be optimized by the compiler as complex expressions.

Understand me well, I agree one should learn evaluation precedence as part of the language.

robtillaart: Hi Ray,

Why do you think it is ridiculous? (interested in the arguments)

Imho the purpose of a programming language is to explain the compiler what you want to happen in the executable. This should be done preferably in a maintainable format. Simple statements help.

I expect that a sequence of simple statements can as easily be optimized by the compiler as complex expressions.

Understand me well, I agree one should learn evaluation precedence as part of the language.

An equation with 10 terms would be broken down into 10 lines of code, making the equation as a whole totally opaque to the reader. Writing the same equation on a single line using parentheses will make the code look just like the equation would be written on paper, making the intent crystal clear.

Doing much of anything where you assign to a variable where you are using pre- or post-increment on that same variable on the RHS of the statement is, in a word, stupid. If you MUST do something like that, simply use (i+1) instead, and it will work as expected every time. Breaking everything down into a gazillion lines of trivial code is a crutch, to avoid learning how the code is actually parsed, and how to do things right. In the long run, it'll cost you far more than simply learning to do it right from the start. If someone us not wiling to take the time to learn, and really understand, at least those parts of the language you intend to use, then perhaps they shouldn't be programming in the first place, because if they can't learn why something like i=i++; is bad, they're not going to get very far....

Regards, Ray L.

Regards, Ray L.

Writing the same equation on a single line using parentheses will make the code look just like the equation would be written on paper, making the intent crystal clear.

Good argument,

Thanks

cdj15: Hello,

I am having a problem with the code that is generated by the Arduino compiler. I am using the Arduino IDE, version 1.6.5 (as reported in the Help / About display). The problem is with the post operation of the unary ++ operator. The following statements are useful for analysis.

i  = 0;
i = i++ % 50;

That's absolutely terrible programming practice. Not only it it hard to read (as in "what is it supposed to do"), but it also depends on how the compiler interprets the source.

Put everything on it's own line...

i = 0;
i %= 50;
i++;

--or--

i = 0;
i++;
i %= 50;

in both cases, it's completely clear to any third party reader of your code what you are trying to accomplish, and better it doesn't depend on compiler quirks.