Would someone please try the sketch below? I must be missing something: the variable x is acting as though it were a signed 16-bit integer. I thought char was a signed 8-bit integer.
//Arduino 1.0.5, Arduino Uno.
void setup(void)
{
    Serial.begin(115200);
    char x = 0;
    // int8_t x = 0;   //gives same results as char
    // uint8_t x = 0;  //works as expected
    for (long i = 0; i < 32800; i++) {
        Serial.print(x, DEC);
        Serial.println();
        ++x;
    }
}
void loop(void)
{
}
Same result if I do (int)y before sending it to print.
I suspect a compiler optimization bug. BTW, I read the definitions of print. There is no print(char c, int base). The closest is print(unsigned char, int).
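To make the overload question concrete, here is a minimal sketch in plain C++ rather than Arduino code. The `which` functions are hypothetical stand-ins that only report which overload the compiler selected. The `int` and `long` overloads are an assumption about what else sits in the overload set next to the `unsigned char` one mentioned above. Under the standard overload rules, char to int is a promotion and ranks ahead of the char to unsigned char conversion.

```cpp
#include <string>

// Hypothetical stand-ins for print-style overloads. These are NOT the Arduino
// functions; they only report which overload the compiler picked.
std::string which(unsigned char, int) { return "unsigned char"; }
std::string which(int, int)           { return "int"; }
std::string which(long, int)          { return "long"; }

// Calls the overload set with a plain char argument, as the sketch does.
std::string overload_for_char() {
    char x = 0;
    return which(x, 10);  // char -> int is a promotion, so the int overload wins
}

// Same call with an unsigned char (what uint8_t usually is): exact match.
std::string overload_for_unsigned_char() {
    unsigned char x = 0;
    return which(x, 10);
}
```

If the real overload set looks like this, a char argument would be printed through the int version, so whatever value sits in the register is shown as a full int.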
@liudr, thanks for the test, at least it's not just me. Sounds like a good theory. I wonder if all char or int8_t variables are taking up twice the amount of storage that people think.
I tried it on a few different releases. On Arduino 0022, it works as expected:
In a variation of the code I ran, I printed the address of two char array elements and they differ by 1, as expected. But the result is the same as using two char variables. I don't have 0022 any more. You should report this as a bug.
Umm, technically, it is a program error and the program is undefined since the value overflows. The ISO standard defines the behavior for unsigned types (i.e. the expression is done in modulo arithmetic), but it is undefined if a signed type overflows.
As you can see, the value of x is kept in a register pair (R28/R29), and an adiw (add immediate to word) is done at address 0xEE. This seems to me to be the bug. Because x lives in a register pair, it bypasses the truncation it would get if it were actually stored back into an 8-bit field.
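The effect described here can be modelled in plain C++. This is only a sketch of the arithmetic, not the generated AVR code, and the function names are made up for illustration. One counter is treated as a 16-bit register pair that is never truncated, the other as a value stored back into 8 bits after every step.

```cpp
#include <cstdint>

// Models a counter held in a 16-bit register pair and never truncated:
// returns the signed 16-bit reading after `steps` increments from zero.
long as_16bit_register(long steps) {
    std::uint16_t reg = 0;
    for (long i = 0; i < steps; ++i) {
        reg = static_cast<std::uint16_t>(reg + 1u);  // unsigned: wraps modulo 65536
    }
    // Interpret the bit pattern as two's complement without a narrowing cast.
    return reg >= 0x8000u ? static_cast<long>(reg) - 65536L : static_cast<long>(reg);
}

// Models the same counter when only the low 8 bits are kept after each step.
long as_8bit_storage(long steps) {
    std::uint8_t byte = 0;
    for (long i = 0; i < steps; ++i) {
        byte = static_cast<std::uint8_t>(byte + 1u);  // wraps modulo 256
    }
    return byte >= 0x80u ? static_cast<long>(byte) - 256L : static_cast<long>(byte);
}
```

With these models, 128 steps give 128 in the 16-bit version and -128 in the 8-bit version, which matches the difference between what the sketch printed and what was expected.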
In the C programming language, signed integer overflow causes undefined behavior, while unsigned integer overflow causes the number to be reduced modulo a power of two, meaning that unsigned integers "wrap around" on overflow.
Do a search for "c++ signed overflow undefined".
Basically since it is undefined the compiler is entitled to generate whatever code it wants to.
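For anyone who wants the wrap-around without depending on what the compiler does with overflow, here is a minimal sketch in portable C++. The helper names are made up for illustration. The unsigned version relies on the modulo behaviour the standard guarantees, and the signed version writes the 127 to -128 step out explicitly.

```cpp
#include <cstdint>

// Unsigned arithmetic is defined to wrap: 255 + 1 becomes 0.
std::uint8_t next_unsigned(std::uint8_t x) {
    return static_cast<std::uint8_t>(x + 1u);
}

// Signed counterpart that spells out the wrap instead of relying on overflow.
std::int8_t next_signed_wrapping(std::int8_t x) {
    if (x == INT8_MAX) {
        return INT8_MIN;  // explicit step from 127 to -128
    }
    return static_cast<std::int8_t>(x + 1);  // result always fits here
}
```

Replacing `++x` in the loop with `x = next_signed_wrapping(x)` should give the 127 to -128 sequence regardless of optimization settings.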
Thanks everyone! Wow, very interesting. I found that declaring the variable as volatile makes it behave as I expected. In case you're wondering, I was just trying to demonstrate for a friend what happens when signed and unsigned integers overflow. The behaviour I expected for signed integers was for it to overflow from the maximum value (127) to the minimum value (-128). Kind of funny as I was ad-libbing at the time and of course got totally confused. I continue to be amazed at the optimization this compiler will do.
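A rough sketch of the volatile variant, written as plain C++ so it can be run off the board. The function name and the wrap counter are additions for illustration. It assumes the conversion from int back to int8_t wraps modulo 256, which GCC documents but which is implementation-defined before C++20.

```cpp
#include <cstdint>

// Counts how often an 8-bit signed counter steps from 127 to -128.
// `volatile` forces a real 8-bit store and reload on every pass.
// Assumption: the int -> int8_t conversion wraps modulo 256, as GCC documents
// (it is implementation-defined before C++20).
long count_wraps(long iterations) {
    volatile std::int8_t x = 0;
    long wraps = 0;
    for (long i = 0; i < iterations; ++i) {
        std::int8_t before = x;                    // load from memory
        x = static_cast<std::int8_t>(before + 1);  // store back as 8 bits
        if (before == INT8_MAX && x == INT8_MIN) {
            ++wraps;
        }
    }
    return wraps;
}
```

Under that assumption, the 32800 iterations from the original sketch pass through the 127 to -128 step once every 256 increments.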
While I certainly cannot argue that the observed behaviour does not fit the definition of "undefined", I never would have expected "We'll promote your variable from 8 bits to 16, and continue to increment it, but when the 16 bits overflow, then we'll just let it go from the maximum value to the minimum value." The joke is certainly on me, hahaha.
Thanks. I'd done some searching but didn't find that thread. I agree that from a purely theoretical viewpoint of the language, the behaviour is somewhat surprising. But given the specific implementation, and considering the compiler optimizations and hardware characteristics (instruction set), "undefined" in this case just turns out to be this really weird thing. I assumed that I knew what was going to happen, and I did not realize that I was treading into "undefined" territory. Couple lessons there for sure!
Thanks for showing the assembly. It's clear there was no truncation or anything like it done on the registers. Does the ATmega328 not have an "inc" or "dec" instruction for such simple and often-needed incrementing and decrementing by 1?
Could you also demonstrate how the volatile keyword makes different assembled code? That would be great!
Starting at address 40, ldd loads a byte from SRAM (forced there as a result of volatile), subi adds one (by subtracting -1, welcome to RISC!) and std puts the result back into SRAM.
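The "subtract -1" trick is just 8-bit modular arithmetic, which can be checked with a small model in C++. This sketch only models the arithmetic result of a subtract-immediate and ignores the status flags. The function names are made up for illustration.

```cpp
#include <cstdint>

// Models only the arithmetic result of an 8-bit subtract-immediate
// (status flags are ignored).
std::uint8_t subi_model(std::uint8_t reg, std::uint8_t imm) {
    return static_cast<std::uint8_t>(reg - imm);  // result reduced modulo 256
}

// Adding one, expressed as subtracting 0xFF, the 8-bit encoding of -1.
std::uint8_t add_one_via_subi(std::uint8_t reg) {
    return subi_model(reg, 0xFF);
}
```

Because the result is kept in 8 bits, subtracting 0xFF from 127 yields the bit pattern 0x80, which reads as -128 when interpreted as a signed byte.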
That is illuminating! Thanks Jack. I never learned assembly for any RISC system, just 386 (x86) assembly. When I wrote assembly, I tended to keep things in registers if I could. I read disassembled Turbo C code back in the '90s. It was moving between memory and registers so much that I couldn't stop laughing.
Yeah, same here. I've done a fair amount of assembler in the past on various CISC machines, but never on a RISC machine. So I'm just feeling my way along the walls in the dark here.
My signature used to be rep movsd; //do it
For anyone who programmed assembly on a 386 in real mode: this doubles the data transfer rate by using 32-bit operations. It works great if you are coding a 320*200, 256-color game and need to copy your buffer to the video card. When I was playing with the 32-bit stuff, I had no access to 32-bit assemblers, just 16-bit ones with real-mode debuggers. So once the CPU entered 32-bit "protected mode", I was running blind. I can't count how many times I had to restart my 486, like every minute.