Weird behaviour when trying Bitwise Operators!!

Hi everyone,
I encountered a problem when trying to manipulate the bits in the High Word (Bits 31-16) of a uint32_t value (unsigned integer, 32 bits of length), using bitwise operators. I have an Arduino Mega2560.

uint8_t should be the same as unsigned byte (min value:0, max value: 255)

please try executing the following code and tell me what happens:

void setup()
{
Serial.begin(9600);
uint8_t byte0 = 0xFE; // least significant byte
uint8_t byte1 = 0xE9;
uint8_t byte2 = 0x9E;
uint8_t byte3 = 0x46;
uint32_t test, hwlb, hwhb; // High Word Low Byte, High Word High Byte
hwlb = byte2;
hwhb = byte3;
Serial.println( hwlb << 16, HEX); // outputs 0x009E0000, which is correct
Serial.println( hwhb << 24, HEX); // outputs 0x46000000, which is fine too
test |= byte0;
test |= byte1 << 8;
test |= hwlb << 16;
Serial.println(test, HEX); // outputs 0xFFFFE9FE, which is strange, 'cause only Byte0 and Byte1 are correct
test |= hwhb << 24;
Serial.println(test, HEX); // outputs 0xFFFFE9FE, like above
}

void loop()
{
}

(Why I created hwlb and hwhb to store the values of Byte2 and Byte3 ? Because I suspected that Arduino cannot bitshift a mere byte such a long way. Trying without these 2 variables got no better results. )

Does anyone have an idea what’s problematic about the operations test |= hwlb << 16; and test |= hwhb << 24; ???
Actually, the value of test should be 0x469EE9FE in the end, which is byte3 down to byte0 lined up next to each other.
Feel free to try other byte values as well and share your results with me :slight_smile:

Thanks in advance !!
M1cr0M4n

Just guessing… could priority of operations be causing this?

void setup()
{
   Serial.begin(19200);
   uint32_t   byte0 = 0xFE; // least significant byte 
   uint32_t   byte1 = 0xE9;
   uint32_t   byte2 = 0x9E;
   uint32_t   byte3 = 0x46;
   uint32_t test, hwlb, hwhb; // High Word Low Byte, High Word High Byte
   hwlb = byte2;
   hwhb = byte3; 
   Serial.println( hwlb << 16, HEX); // outputs 0x009E0000, which is correct 
   Serial.println( hwhb << 24, HEX); // outputs 0x46000000, which is fine too
   test |= byte0;
   test |= (byte1 << 8);
   test |= (hwlb << 16); 
   Serial.println(test, HEX); // outputs 0xFFFFE9FE, which is strange, 'cause only Byte0 and Byte1 are correct
   test |= (hwhb << 24); 
   Serial.println(test, HEX); // outputs 0xFFFFE9FE, like above 
}

void loop()
{
}

This seems to work for me. Changed the variables to 32 bit. (brute force solution, I know…)

void setup()
{
   Serial.begin(19200);
   uint8_t   byte0 = 0xFE; // least significant byte 
   uint32_t   byte1 = 0xE9;
   uint32_t   byte2 = 0x9E;
   uint32_t   byte3 = 0x46;
   uint32_t test=0, hwlb, hwhb; // High Word Low Byte, High Word High Byte
   hwlb = byte2 << 16;
   hwhb = byte3 << 24;
   Serial.println( hwlb, HEX); // outputs 0x009E0000, which is correct 
   Serial.println( hwhb, HEX); // outputs 0x46000000, which is fine too

   test |= byte3 <<24;
   test |= byte2 << 16; 
   test |= byte1 << 8;
   test |= byte0;
   Serial.println(test, HEX); // outputs 469EE9FE, the expected value
}

void loop()
{
}

This too.

uint8_t   byte1 = 0xE9;
...
test |= byte1 << 8;

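The operands on the right-hand side are promoted to int, not unsigned long. On the Mega2560 an int is 16 bits, so 0xE9 << 8 gives 0xE900, which has the high-order (sign) bit set. Converting that negative int to unsigned long sign-extends it, so all the upper bits are set in the result.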

Also (not that it matters in your example), local variable 'test' isn't initialized so it contains a random bit pattern. So it's best to mask everything but the lowest 8 bits out of the operand and set test= 0 to start with.

kind regards,

Jos
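
A host-side sketch of the effect Jos describes may help. It assumes a 16-bit int, as on the Mega2560, and models that int with int16_t so the code can be compiled on a PC. The function name is made up for illustration and is not part of any Arduino API.

```cpp
#include <stdint.h>

// Hypothetical helper: models what `value << 8` yields on a target whose int
// is 16 bits wide, once the result is stored in a uint32_t.
uint32_t shiftLeft8With16BitInt(uint8_t value) {
    // Step 1: the uint8_t operand is promoted to (signed) int before the shift.
    // int16_t stands in for that 16-bit int, so bit 15 acts as the sign bit.
    // (The narrowing cast wraps on mainstream compilers; before C++20 this is
    // implementation-defined, so treat it as an assumption of this model.)
    int16_t shifted = static_cast<int16_t>(static_cast<uint16_t>(value) << 8);

    // Step 2: converting a negative signed value to an unsigned 32-bit type
    // sign-extends it, which fills the upper 16 bits with ones.
    return static_cast<uint32_t>(shifted);
}
```

With 0xE9 the model returns 0xFFFFE900, which matches the FFFF prefix in the original output. With 0x46 it returns 0x00004600, because bit 7 of the byte is clear and the shifted value stays positive.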

Test is an uninitialised local variable so the contents are undefined. If you want it to start off with all bits zero, you need to assign zero to it.

(Most compilers would warn you if you are using the value of an uninitialised local variable, but the Arduino IDE designer decided we don't need to see any warnings.)

Test is initialized:

uint32_t test=0, hwlb, hwhb; // High Word Low Byte, High Word High Byte

Thanks to all of you, it now works :grin: It was necessary to initialize test AND to give the variables the following sizes:
byte0: at least uint8_t
byte1: at least uint16_t
byte2 and byte3: uint32_t

Without these minimum sizes, for some reason, Arduino fills in leading 1s if the MSB is 1, which leads to a wrong result.

Though it makes no sense to me why someone would want to fill in leading 1s in an UNSIGNED value, since in an UNSIGNED value, filling in leading 1s at will DOES change the positive integer decimal number represented. (As opposed to signed values, where 11111110bin is the same as 1110bin, both representing -2dec.)

regards, M1cr0M4n

Though it makes no sense to me why someone would want to fill in leading 1s in an UNSIGNED value, ...

Read this:

https://www.securecoding.cert.org/confluence/display/seccode/INT02-C.+Understand+integer+conversion+rules

Integer types smaller than int are promoted when an operation is performed on them. If all values of the original type can be represented as an int, the value of the smaller type is converted to an int; ...

The original values were smaller than an int, and the value could be held in an int, so they were promoted to int. And unfortunately, int is signed. The resulting value was then stored in unsigned long.
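
The promotion rule quoted above can be checked at compile time. The following is a minimal sketch intended for a desktop C++11 compiler (the `<type_traits>` header may not be available in the AVR toolchain), and both helper names are hypothetical.

```cpp
#include <stdint.h>
#include <type_traits>

// Hypothetical helper: true if shifting a uint8_t produces a plain signed int,
// which is what the integer promotion rule predicts.
constexpr bool byteShiftYieldsInt() {
    return std::is_same<decltype(uint8_t{0xE9} << 8), int>::value;
}

// Hypothetical helper: true if casting the byte to uint32_t first keeps the
// whole shift in uint32_t, so no signed intermediate value is involved.
constexpr bool castShiftYieldsUint32() {
    return std::is_same<decltype(static_cast<uint32_t>(uint8_t{0xE9}) << 8),
                        uint32_t>::value;
}

static_assert(byteShiftYieldsInt(), "uint8_t << n is computed as int");
static_assert(castShiftYieldsUint32(), "uint32_t << n stays uint32_t");
```

The type of a shift expression is the promoted type of its left operand, so the only way to get an unsigned 32-bit result is to make the left operand unsigned and 32 bits wide before shifting.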

If all values of the original type can be represented as an int

Hey Nick, “int” means a signed value with 32 Bits of length here, right? Just want to be sure.

The original values were smaller than an int, and the value could be held in an int, so they were promoted to int.

Nick, your explanation works for byte1. But it seems that it cannot explain what is going on in the following code:

void setup()
{
Serial.begin(9600);
uint8_t byte0 = 0xFE; // least significant byte
uint8_t byte1 = 0xE9;
uint8_t byte2 = 0x9E;
uint8_t byte3 = 0x46;
Serial.println(byte0 , HEX); // outputs FE, just as it should
Serial.println(byte1<<8 , HEX); // outputs FF FF E9 00, which is E9 correctly shifted, but the whole thing is filled up with leading 1s (undesired)
Serial.println(byte2<<16 , HEX); // outputs 0, which makes no sense to me at all
Serial.println(byte3<<24 , HEX); // outputs 0, which makes no sense to me either
Serial.println(byte0<<-2 , HEX); // outputs 3F, this is just to show that the << operator works with negative right-hand values also (I tried it with //other values too)
}

void loop()
{
}

I have no clue why this happens. Nick, you said that if the values smaller than an int (byte0-3) can be represented as an int, they are converted to an int. which is what happens to byte1:
before after
E9 → FF FF FF E9 (promotion to int)
FF FF FF E9 → FF FF E9 00 (bitshifting by dec8 to the left)
this is how I understand the “promotion to int” thing.

but then the same thing should happen to byte2 and byte3, like:
before after
9E → FF FF FF 9E (promotion to int)
FF FF FF 9E → FF 9E 00 00 (bitshifting by dec16 to the left)

and
before after
46 → 00 00 00 46 (promotion to int)
00 00 00 46 → 46 00 00 00 (bitshifting by dec24 to the left)

but that doesn’t happen, since the result is 0. Even though byte2 and byte3 are smaller than int and can be represented as an int.

Could it be possible that 16(dec) and 24(dec) are converted into something that is interpreted as a NEGATIVE value, so that byte2 and byte3 are shifted to the right instead of to the left, and therefore shifted into nothing, leaving just 0 ??

please share your ideas,
regards, M1cr0M4n

I had a similar problem and did it like this:

   test |= (unsigned long) byte3 << 24;
   test |= (unsigned long) byte2 << 16; 
   test |= (unsigned long) byte1 << 8;
   test |= byte0;

Type casting solved my problem without wasting space by making the bytes bigger.
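
The casts above can be wrapped in a small function. This is a minimal sketch; `packBytes` is a hypothetical name, not an Arduino library function.

```cpp
#include <stdint.h>

// Hypothetical helper: combines four bytes into one 32-bit value.
// b3 is the most significant byte, b0 the least significant.
uint32_t packBytes(uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0) {
    // Each byte is widened to uint32_t BEFORE the shift, so the shift is
    // never carried out in a (possibly 16-bit, signed) int.
    return (static_cast<uint32_t>(b3) << 24) |
           (static_cast<uint32_t>(b2) << 16) |
           (static_cast<uint32_t>(b1) << 8)  |
            static_cast<uint32_t>(b0);
}
```

With the bytes from the original post, `packBytes(0x46, 0x9E, 0xE9, 0xFE)` returns 0x469EE9FE. Because the function builds the result from scratch, it also avoids the uninitialised accumulator variable.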

There is a totally different solution to the problem of reading/writing individual bytes in larger values; unions. Here is a snippet:

union univ32_t {
  uint32_t as_long;
  uint8_t as_byte[4];
};

univ32_t d;
d.as_long = 0x12345678;
Serial.println(d.as_long, HEX);
for (uint8_t i = 0; i < 4; i++)
    Serial.println(d.as_byte[i], HEX);
d.as_byte[3] = 0x99;
Serial.println(d.as_long, HEX);

All the ORs and shifts are suddenly not needed. They are replaced by memory offsets instead. Please note that because the AVR is a little-endian processor, as_byte[0] is the LSB and as_byte[3] is the MSB.
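
Since the union's byte order depends on the processor's endianness, a shift-based accessor is one way to read a byte with the same index meaning on any target. This is a minimal sketch, and `byteOf` is a hypothetical name.

```cpp
#include <stdint.h>

// Hypothetical helper: returns byte `index` of `value`, where index 0 is the
// least significant byte and index 3 the most significant, regardless of how
// the target stores the value in memory.
uint8_t byteOf(uint32_t value, uint8_t index) {
    // The left operand is already uint32_t, so the shift is done in 32 bits.
    return static_cast<uint8_t>((value >> (8 * index)) & 0xFF);
}
```

For 0x12345678, `byteOf(value, 0)` returns 0x78 and `byteOf(value, 3)` returns 0x12, which is the same order the union gives on a little-endian AVR.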
Cheers!

"int" means a signed value with 32 Bits of length here, right?

if you're on a Due, correct, otherwise wrong.

M1cr0M4n:

If all values of the original type can be represented as an int

Hey Nick, "int" means a signed value with 32 Bits of length here, right? Just want to be sure.

The language rules are the same, but as AWOL points out, on the Arduino Mega2560 (and many other Arduinos) the int is a 16-bit value.

You can always print out "sizeof (int)" in a small sketch to be sure.
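
The width of int also explains the zeros seen for byte2<<16 and byte3<<24: in C++ a shift is only defined when the count is non-negative and smaller than the width of the promoted left operand. A minimal sketch of that check follows; both helper names are hypothetical.

```cpp
#include <limits.h>
#include <stdint.h>

// Hypothetical helper: width of int in bits on the current target.
constexpr unsigned intBits() {
    return sizeof(int) * CHAR_BIT;
}

// Hypothetical helper: true if `count` is a valid shift count for an operand
// that has been promoted to int (non-negative and less than the width of int).
constexpr bool shiftCountValidForInt(int count) {
    return count >= 0 && static_cast<unsigned>(count) < intBits();
}
```

On the Mega2560, sizeof(int) is 2, so intBits() is 16 and the counts 16 and 24 fall outside the valid range. The result of such a shift is undefined rather than a right shift, and a negative count such as -2 is undefined for the same reason, so neither should be relied on.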