1. In the sketch in Step 2, I included the following two lines of code expecting the compiler to report an error (out of range) for line (2); instead, both lines compile cleanly and the two variables end up holding identical binary values even though the literals differ. I would appreciate knowing why the compiler does not check the data type against the assigned value.
(1) int y1 = -5536; //range of int: -32768 to 32767
(2) int y2 = 60000; //range of int: -32768 to 32767
2. Test Sketch
void setup()
{
  Serial.begin(9600);
  int y1 = -5536;  //range of int: -32768 to 32767
  int y2 = 60000;  //range of int: -32768 to 32767
  Serial.println(y1, HEX);  //shows: EA60 (ignoring the sign extension bit)
  Serial.println(y2, HEX);  //shows: EA60 (ignoring the sign extension bit)
}

void loop() {}
Apparently not for 60000, which fits in an unsigned int, but it will warn you about 67000:
sketch_may26b.ino: In function 'void setup()':
/sketch_may26b.ino:5:12: warning: overflow in implicit constant conversion [-Woverflow]
int y2 = 67000; //range of int: -32768 to 32767
^~~~~
Strangely, it still warns on "67000ul" and still calls it an "implicit constant conversion".
For the sake of beginners, it would be nice if the compiler generated a warning every time the code might cause an integer overflow. Perhaps you can convince the compiler authors to implement a "ludicrous warnings" mode where it does that. For now, we have to live with what the compiler gives us. Programmers who store out-of-range values in variables will learn from their mistakes when they find their code produces unexpected results.
This question came to me from the field, and I had no convincing answer. You may help me by providing a reasonable answer rather than suggesting that I contact the Arduino Development Team.
The compiler has no objection to the declaration (int y2 = 60000;), yet the human eye clearly sees that the value is out of range. Do you have any explanation in the compiler's favor, that it has done the right thing?
The integer literal "60000" is of type 'long int': because it has no radix prefix (0, 0x, 0b) and no suffix (l, u, ul, ...), only signed types are considered, and it won't fit in an 'int' but will fit in a 'long int'.
If the destination type is signed, the value does not change if the source integer can be represented in the destination type. Otherwise the result is implementation-defined.
The value 60000l will not fit into an 'int' so the result is "implementation defined". That means that, by definition, the compiler has 'done the right thing' regardless of the results.
If the Arduino compiler ever gets updated to C++20, the result will be exactly what you are seeing: truncation to 16 bits.
I declared "int y2 = 60000;". How does the compiler turn that into 1110 1010 0110 0000? It simply puts the natural binary pattern for 60000 into the memory location, regardless of the declared data type (int). It is the print() method that cares about the data type (int), treats the content of that memory location as a 2's complement number, and shows: -5536.
When you store that in a signed int, the result is "implementation-defined" (the compiler can do anything it wants). In cases like this, most C++ compilers will truncate 0000 0000 0000 0000 1110 1010 0110 0000 to fit into your 16-bit signed 'int': 1110 1010 0110 0000. In version C++20 and beyond that will be the rule and the compiler will no longer have the choice to do whatever it wants.
A compiler does not have a free mind like a human being; it cannot do whatever it likes. It follows exactly what it has been told in the implementation algorithm of, for example, the print() method. I remain convinced that "a datum" acquires its sign and magnitude only when it is processed.
Sorry, I should have said:
"The compiler authors will no longer have a choice of what value to store in a signed variable when you store a value that can't be represented in the destination type."
What I wanted to say is that a value assigned to a variable/identifier becomes a meaningful entity, with respect to its sign and magnitude, only when that value is processed. For example:
byte y = 0b10011000;
The memory location holds: 10011000
How much is there?
Many answers:
(1) Considering natural binary, it is: 152 in decimal.
(2) Considering sign-magnitude (SM) format, it is: -24 in decimal.
(3) Considering 2's complement form, it is: -104 in decimal.
(4) Considering BCD, it is: 98 in decimal.
So, the bit pattern does not qualify the true nature of the content of the memory location; rather, it is the rule which is applied during processing.
AFAIK literals are of type int unless flagged differently (U/L...).
Thus the internal value is undoubtedly 0x98, i.e. 152. I could even imagine that a 32/64-bit compiler treats int as 32 bits internally and uses a shorter width only when code is generated for the target machine.
If this value is assigned to a variable of a different type, then the value may be transformed into BCD (if supported) or into float/double; otherwise, negative values may be sign-extended or large values truncated.
Decimal literals (they don't have a prefix of 0, 0x, or 0b) are 'int' if they fit in an 'int', else 'long int' if they fit in a long int, else 'long long int'.
Interesting sub-plot: The '-' in front of a decimal literal is considered a separate negation operator. You can store the value -32768 in an 'int' but the literal -32768 is a 'long int'. The 32768 part won't fit in an int so it gets interpreted as long int before being negated.
The memory representation of a number is irrelevant to this discussion. The literal 0b10011000 is an integer with the (mathematical) value of 10011000₂ = 152₁₀, nothing more, nothing less.