Bitshift operation uint8_t to long

Hello everyone,

I receive 2 bytes which should be converted to a long. To achieve this I bitshift the first value and add it to the last.

#include <Arduino.h>

// #include <Logger.h>
//  bool DEBUGLOG = true;

void setup() {
  Serial.begin(9600);
  Serial.println("--- Binary Calculator ---");


  Serial.println("LONG");

  uint8_t v1 = 0x96;
  uint8_t v2 = 0x01;

  Serial.print("v1\t"); Serial.print(v1,DEC);Serial.print("\t");Serial.print(v1,HEX);Serial.print("\t");Serial.println(v1,BIN);
  Serial.print("v2\t"); Serial.print(v2,DEC);Serial.print("\t");Serial.print(v2,HEX);Serial.print("\t");Serial.println(v2,BIN);

  long v11 = 0 + (v1<<8);
  long v22 = v2 & 0xFF;

  Serial.print("v11\t"); Serial.print(v11,DEC);Serial.print("\t");Serial.print(v11,HEX);Serial.print("\t");Serial.println(v11,BIN);
  Serial.print("v22\t"); Serial.print(v22,DEC);Serial.print("\t");Serial.print(v22,HEX);Serial.print("\t");Serial.println(v22,BIN);

  Serial.println(" ");

  long v = ((v1<<8) + ((v2)&0xFF));
// expected 38401, 00009601, 10010110 00000001

  Serial.print("v\t"); Serial.print(v,DEC);Serial.print("\t");Serial.print(v,HEX);Serial.print("\t");Serial.println(v,BIN);

  Serial.println(" ");
  Serial.println(" ");
  Serial.println("INT");

  uint8_t v111 = 0x96;
  uint8_t v222 = 0x01;

  Serial.print("v111\t"); Serial.print(v111,DEC);Serial.print("\t");Serial.print(v111,HEX);Serial.print("\t");Serial.println(v111,BIN);
  Serial.print("v222\t"); Serial.print(v222,DEC);Serial.print("\t");Serial.print(v222,HEX);Serial.print("\t");Serial.println(v222,BIN);

  short v_int = ((v111<<8) + (v222&0xFF));
  
}

void loop() {
  delay(1000);
}

Since long has 4 bytes I would expect:

00000000 00000000 10010110 00000001

but the result of the monitor is

11111111111111111001011000000001

Even using just an int (2 bytes) should result in

10010110 00000001

but also gives

11111111 11111111 10010110 00000001
-27135 in decimal -> expected 38401

I am unable to find the error.

you are storing the result in a signed type, so the most significant bit is treated as the sign and gets propagated into the upper bytes

try

void setup() {
  Serial.begin(115200);
  uint8_t v1 = 0xBE;
  uint8_t v2 = 0xEF;

  uint16_t v11 =  0 + v1 << 8; // useless 0 +
  uint16_t v22 = v2 & 0xFF; // useless mask
  uint16_t v = ((v1 << 8) + ((v2) & 0xFF));

  Serial.println(v11, HEX);
  Serial.println(v22, HEX);
  Serial.println(v, HEX);
}

void loop() {}

you should see

BE00
EF
BEEF

typically we would write

void setup() {
  Serial.begin(115200);
  uint8_t v1 = 0xBE;
  uint8_t v2 = 0xEF;

  uint16_t v11 =  v1 << 8;
  uint16_t v22 = v2 ;
  uint16_t v = (v1 << 8) | v2;

  Serial.println(v11, HEX);
  Serial.println(v22, HEX);
  Serial.println(v, HEX);
}

void loop() {}

Once again I stepped in the trap of clumsy variable definition.

Thanks for the quick response.

happy if that helped

have fun!

Let me try to explain (referring to Fig-1, 2, and 3) why you are getting
-27135 (0xFFFF9601 = 11111111 11111111 10010110 00000001)

instead of
38401 (0x00009601 = 00000000 00000000 10010110 00000001)

1. Remember that on AVR-based Arduino boards such as the UNO, the compiler evaluates expressions on small integer types as int, which is 16-bit signed.

2. When you have executed the following code --

long v = ((v1<<8) + ((v2)&0xFF));

(1) The RHS has been evaluated to 10010110 00000001, where the MSbit is HIGH (1).

(2) When the 16-bit value from Step-(1) is copied into the 32-bit variable v, the MSbit (HIGH = 1) is copied into every upper bit position (b31-b16) of v. This is sign extension, which the compiler applies when it widens a signed value. As a result, v now holds:

11111111 11111111 10010110 00000001

(3) You have declared the variable as long v, so Serial.print() looks at the MSbit of v. If it is HIGH (1), the content is interpreted in 2's complement form. Accordingly, the value of v appears as:

v = -1 × 2^31 + 0x7FFF9601 (positive quantity)
==> v = -2,147,483,648 + 2,147,456,513
==> v = -27135

  uint8_t v1 = 0x96;
  uint8_t v2 = 0x01;
  long v = ((v1 << 8) + ((v2) & 0xFF));
  Serial.println(v, DEC); //shows: -27135
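
The arithmetic in Step-(3) can be checked with a small helper. This is only a sketch: the function name twosComplementValue is made up for illustration and is not an Arduino API. It weights bit 31 as -2^31 and adds the lower 31 bits, exactly as in the calculation above.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of Arduino): interprets a 32-bit pattern
// as a 2's complement number.
// value = -(bit 31) * 2^31 + (value of bits 30..0)
int64_t twosComplementValue(uint32_t bits) {
  // Bit 31 carries the weight -2^31 when it is set.
  int64_t signWeight = (bits & 0x80000000UL) ? -2147483648LL : 0;
  // The remaining 31 bits are an ordinary positive quantity.
  int64_t lowerBits = bits & 0x7FFFFFFFUL;
  return signWeight + lowerBits;
}
```

Feeding it 0xFFFF9601 gives -27135, while 0x00009601 gives 38401, which are the two values discussed in this post.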

3. To get the correct value of 38401, the following code is relevant: the operands are cast to the 32-bit long type, and the final mask clears the sign bits that were propagated into the upper 16 bits of the destination.

  uint8_t v1 = 0x96;
  uint8_t v2 = 0x01;
  long v = ((long)(v1 << 8) + (long)v2) & 0x0000FFFF;  //not to copy sign bit as number is positive
  Serial.println(v, DEC); //shows:38401
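
An alternative that avoids the mask altogether is to widen the high byte before shifting it, so the shift itself is done in 32 bits. A minimal sketch, assuming a helper name (bytesToLong) that is purely illustrative and not from the thread:

```cpp
#include <stdint.h>

// Hypothetical helper: combines a high and a low byte into a 32-bit value.
int32_t bytesToLong(uint8_t high, uint8_t low) {
  // Cast first, shift second: the shift is carried out on a 32-bit value,
  // so bit 15 of the result is just data and never acts as a sign bit.
  return ((int32_t)high << 8) | low;
}
```

With v1 = 0x96 and v2 = 0x01, bytesToLong(v1, v2) yields 38401 without any masking afterwards.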

a full explanation of my succinct answer

Thanks to @J-M-L for getting me on the right track.
Thanks to @GolamMostafa for the detailed explanation. I was unaware of the sequence the compiler follows when setting (copying) variables.

(for sake of clarity)

there is one step @GolamMostafa did not explain completely

it's important to know that an int is 2 bytes on the UNO, but the crux is that C++ does not do arithmetic on byte-sized operands: it first promotes them to int (a signed type, 2 or 4 bytes long depending on your platform)

when you do the bit shifting

  uint8_t v1 = 0x96;
  uint16_t v11 =  v1 << 8;

in the second line, v1 is first promoted to an int, so it becomes 0x0096 (converting a smaller unsigned value to a larger type preserves the value, so there is no sign propagation); then the shift happens and you get 0x9600

if the compiler did not do this hidden promotion for you, left shifting by 8 positions would have pushed every bit out of v1 before the result was stored in v11

the other point is that because the operation was carried out as a signed operation (although your byte was unsigned), storing the result in a larger container (uint32_t or int32_t) widens a signed value, and the sign gets propagated.

As @GolamMostafa explained, because the MSb was 1, the value was considered negative, and that's why the upper 16 bits were filled with FFFF instead of 0000
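
The promotion can be made visible at compile time. The following is a sketch under assumptions: SameType, shiftResultIsInt and castShiftIsUnsigned are illustrative names, and the trait is hand-rolled only because the AVR toolchain may not ship the standard type-traits header. decltype reports the type the compiler actually uses for the shift expression.

```cpp
#include <stdint.h>

// Minimal hand-written "are these two types identical?" trait (illustrative).
template <typename A, typename B>
struct SameType { static const bool value = false; };
template <typename A>
struct SameType<A, A> { static const bool value = true; };

// A uint8_t operand is promoted, so the shift expression has type int.
constexpr bool shiftResultIsInt() {
  return SameType<decltype(uint8_t(0x96) << 8), int>::value;
}

// Casting the operand to unsigned int first keeps the whole shift unsigned.
constexpr bool castShiftIsUnsigned() {
  return SameType<decltype((unsigned int)uint8_t(0x96) << 8), unsigned int>::value;
}

// Compilation stops here if either claim is wrong on the current platform.
static_assert(shiftResultIsInt(), "uint8_t << 8 is evaluated as int");
static_assert(castShiftIsUnsigned(), "(unsigned int)x << 8 stays unsigned int");
```

Both checks are about types only; whether int is 2 or 4 bytes wide still depends on the board.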

Math (shift) on unsigned variables (uint8_t v1) is promoted to signed ints? That seems wrong.

In any case, unsigned long v = (unsigned)((v1<<8) + ((v2)&0xFF)); would have been a sufficient fix.

That’s C++ :wink: (and C too)

The unsigned cast on the RHS operand stops the sign bit (MSbit) from being copied/propagated into the upper 16 bits of the 32-bit destination variable. If this is correct, is there any need to change the data type of the destination variable from long (signed) to unsigned long? I mean --

long v = (unsigned)((v1<<8) + ((v2)&0xFF)); //would have been a sufficient fix
Serial.println(v); //shows: 38401
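
The difference between the two widenings can be reproduced on any platform. A sketch with stated assumptions: int16_t and uint16_t stand in for the UNO's 16-bit int and unsigned, the function names are made up, and the conversion to int16_t relies on the usual 2's complement behaviour of common compilers.

```cpp
#include <stdint.h>

// Hypothetical helpers: widen the same 16-bit pattern to 32 bits in two ways.

// Signed source: bit 15 is copied into bits 31..16 (sign extension).
int32_t widenSigned(uint16_t bits) {
  return (int16_t)bits;
}

// Unsigned source: the value is preserved and bits 31..16 become 0.
int32_t widenUnsigned(uint16_t bits) {
  return bits;
}
```

For the pattern 0x9601, widenSigned returns -27135 and widenUnsigned returns 38401. The destination is a signed 32-bit type in both cases, so what matters is the type of the source, not the type of the destination.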

Well given OP is building a 16 bit value, a signed or unsigned 32 bits is just two bytes lost :wink:
