Bitwise operations and 32/64-bit integers doesn't work as expected

Hi all,

I'm trying to port a library from ESP8266 to ATmega328P (Arduino Pro Mini clone, 5V), and am running into problems working with large int - specifically bit-wise operations on 32 and 64 bit int.

My library (the offending function is below) works perfectly on the ESP8266 processor. Now trying to run it on the Arduino, I'm running into trouble. This is code to read the MS5837 pressure sensor, which requires manipulation of 32 and 64 bit integers. In many cases there are multiplications and divisions by large numbers, all powers of two, which I replaced by the equivalent direct bit operations (left and right shifts).

The first that didn't work:

uint32_t D2 = (Wire.read() << 16) | (Wire.read() << 8) | Wire.read();

I had to replace this by:

uint32_t D1 = 0;
D1 = Wire.read();
D1 = (D1 << 8) | Wire.read();
D1 = (D1 << 8) | Wire.read();

The question I have is not how to make this work again (I know how), the question is more academic: why doesn't it work on the ATmega328, while it works on the ESP8266? I know the ESP8266 is 32-bit while the ATmega328 is 16-bit but that shouldn't be the issue as in the end both can deal with 64-bit values just fine. The problem is in the bitwise shifts.

The complete function:

void MS5837::read() {

  // Check if sensor was read less than MIN_INTERVAL milliseconds ago. If so, don't read it again
  // at this time.  
  if ((millis() - lastReadTime) < MIN_INTERVAL) return;
  lastReadTime = millis();

  // Request D1 (pressure) conversion using OSR = 8192
  Wire.beginTransmission(MS5837_ADDR);
  Wire.write(MS5837_CONVERT_D1);
  Wire.endTransmission();
  delay(20);  // Wait for conversion to complete, maximum 17.2 ms according to the datasheet.
  
  // Request the ADC to provide us with the result.
  Wire.beginTransmission(MS5837_ADDR);
  Wire.write(MS5837_ADC_READ);
  Wire.endTransmission();
  
  // Request the 3 bytes of data that should be ready for us.
  Wire.requestFrom(MS5837_ADDR, 3);
//  uint32_t D1 = (Wire.read() << 16) | (Wire.read() << 8) | Wire.read();
  uint32_t D1 = 0;
  D1 = Wire.read();
  D1 = (D1 << 8) | Wire.read();
  D1 = (D1 << 8) | Wire.read();

  Serial.print("D1: ");
  Serial.println(D1);

  // Request D2 (temperature) conversion using OSR = 8192
  Wire.beginTransmission(MS5837_ADDR);
  Wire.write(MS5837_CONVERT_D2);
  Wire.endTransmission();
  delay(20);  // Wait for conversion to complete, maximum 17.2 ms according to the datasheet.
  
  // Request the ADC to provide us with the result.
  Wire.beginTransmission(MS5837_ADDR);
  Wire.write(MS5837_ADC_READ);
  Wire.endTransmission();
  
  // Request the 3 bytes of data that should be ready for us.
  Wire.requestFrom(MS5837_ADDR, 3);
//  uint32_t D2 = (Wire.read() << 16) | (Wire.read() << 8) | Wire.read();
  uint32_t D2 = 0;
	D2 = Wire.read();
	D2 = (D2 << 8) | Wire.read();
	D2 = (D2 << 8) | Wire.read();

  Serial.print("D2: ");
  Serial.println(D2);

  // Calculate temperature (first order compensation).
  int32_t dT = D2 - (C[5] << 8);
  temperature = 2000 + ((int64_t(dT) * C[6]) >> 23);
  
  // Calculate temperature compensated pressure (first order compensation).
  int64_t off = (int64_t(C[2]) << 17) + ((int64_t(C[4]) * dT) >> 6);    // Offset at actual temperature.
  int64_t sens = (int64_t(C[1]) << 16) + ((int64_t(C[3]) * dT) >> 7);   // Sensitivity at actual temperature.
  pressure = (((D1 * sens) >> 21) - off) >> 15;      // Temperature compensated pressure x100.

  // If the temperature is < 20 C, calculate the second order compensation.
  if (temperature < 2000) {
    uint64_t ti = (11 * dT * dT) >> 35;
    int64_t offi = (31 * (temperature - 2000) * (temperature - 2000)) >> 3;
    int64_t sensi = (63 * (temperature - 2000) * (temperature - 2000)) >> 5;
    off -= offi;
    sens -= sensi;
    temperature -= ti;
    pressure = (((D1 * sens) >> 21) - off) >> 15;
  }
}

Another library for the same sensor; this one does work on both the ESP8266 and ATmega328. It uses the large numbers for the divisions and multiplications.

void MS5837::read() {
	// Request D1 conversion
	Wire.beginTransmission(MS5837_ADDR);
	Wire.write(MS5837_CONVERT_D1_8192);
	Wire.endTransmission();

	delay(20); // Max conversion time per datasheet
	
	Wire.beginTransmission(MS5837_ADDR);
	Wire.write(MS5837_ADC_READ);
	Wire.endTransmission();

 	Wire.requestFrom(MS5837_ADDR,3);
	D1 = 0;
	D1 = Wire.read();
	D1 = (D1 << 8) | Wire.read();
	D1 = (D1 << 8) | Wire.read();
	
	// Request D2 conversion
	Wire.beginTransmission(MS5837_ADDR);
	Wire.write(MS5837_CONVERT_D2_8192);
	Wire.endTransmission();

	delay(20); // Max conversion time per datasheet
	
	Wire.beginTransmission(MS5837_ADDR);
	Wire.write(MS5837_ADC_READ);
	Wire.endTransmission();

	Wire.requestFrom(MS5837_ADDR,3);
	D2 = 0;
	D2 = Wire.read();
	D2 = (D2 << 8) | Wire.read();
	D2 = (D2 << 8) | Wire.read();

	calculate();
}

void MS5837::calculate() {
	// Given C1-C6 and D1, D2, calculated TEMP and P
	// Do conversion first and then second order temp compensation
	
	int32_t dT = 0;
	int64_t SENS = 0;
	int64_t OFF = 0;
	int32_t SENSi = 0;
	int32_t OFFi = 0;  
	int32_t Ti = 0;    
	int64_t OFF2 = 0;
	int64_t SENS2 = 0;
	
	// Terms called
	dT = D2-uint32_t(C[5])*256l;
	if ( _model == MS5837_02BA ) {
		SENS = int64_t(C[1])*65536l+(int64_t(C[3])*dT)/128l;
		OFF = int64_t(C[2])*131072l+(int64_t(C[4])*dT)/64l;
		P = (D1*SENS/(2097152l)-OFF)/(32768l);
	} else {
		SENS = int64_t(C[1])*32768l+(int64_t(C[3])*dT)/256l;
		OFF = int64_t(C[2])*65536l+(int64_t(C[4])*dT)/128l;
		P = (D1*SENS/(2097152l)-OFF)/(8192l);
	}
	
	// Temp conversion
	TEMP = 2000l+int64_t(dT)*C[6]/8388608LL;
	
	//Second order compensation
	if ( _model == MS5837_02BA ) {
		if((TEMP/100)<20){         //Low temp
			Serial.println("here");
			Ti = (11*int64_t(dT)*int64_t(dT))/(34359738368LL);
			OFFi = (31*(TEMP-2000)*(TEMP-2000))/8;
			SENSi = (63*(TEMP-2000)*(TEMP-2000))/32;
		}
	} else {
		if((TEMP/100)<20){         //Low temp
			Ti = (3*int64_t(dT)*int64_t(dT))/(8589934592LL);
			OFFi = (3*(TEMP-2000)*(TEMP-2000))/2;
			SENSi = (5*(TEMP-2000)*(TEMP-2000))/8;
			if((TEMP/100)<-15){    //Very low temp
				OFFi = OFFi+7*(TEMP+1500l)*(TEMP+1500l);
				SENSi = SENSi+4*(TEMP+1500l)*(TEMP+1500l);
			}
		}
		else if((TEMP/100)>=20){    //High temp
			Ti = 2*(dT*dT)/(137438953472LL);
			OFFi = (1*(TEMP-2000)*(TEMP-2000))/16;
			SENSi = 0;
		}
	}
	
	OFF2 = OFF-OFFi;           //Calculate pressure and temp second order
	SENS2 = SENS-SENSi;
	
	if ( _model == MS5837_02BA ) {
		TEMP = (TEMP-Ti);
		P = (((D1*SENS2)/2097152l-OFF2)/32768l)/100;
	} else {
		TEMP = (TEMP-Ti);
		P = (((D1*SENS2)/2097152l-OFF2)/8192l)/10;
	}
}

Hint: what happens when you shift a sixteen bit int left by sixteen bits?

That's why I'm using 32 and 64 bit types.

wvmarle:
That's why I'm using 32 and 64 bit types.

Wire.read probably returns an 8-bit byte, so shifting it ≥8 bits to the left will result in 0.
Cast them to 32-bit integers first:

uint32_t D2 = ((uint32_t)Wire.read() << 16) | ((uint32_t)Wire.read() << 8) | Wire.read();

On a 32-bit CPU, the registers and data bus are 32 bits wide, and a uint8_t is stored in a 32-bit variable for efficient access, so it won't overflow if you shift them to the right.

Pieter

Right... Starting to understand what's going on now. Never had to deal with this before :slight_smile: Actually I got some HUGE numbers out of D1 and D2, with the first 16 or so bits set. That's when I realised I had a problem there.

For that byte value, the ATmega being a 16-bit system, is that still stored as 8-bit value? Before I did realise that an int on ATmega is 16-bit while on ESP8266 it's 32-bit. Using more explicit types like int16_t and int32_t takes away that ambiguity.

Now I just have to understand why the later calculations go wrong, where I have to use shifts of up to 35 bits but that's to the right so as long as it fit in the original value it should be no problem. It's the left shifts that may drop out.

Turns out there was just a single problem in the calculation, after the Wire.read() issue.

I had to change

int32_t dT = D2 - (C[5] << 8);

to

int32_t dT = D2 - ((int32_t)C[5] << 8);

And everything else works :slight_smile:

Thanks to @PieterP as I now also understand the reason for this difference. C[5] is declared as an uint16_t, so in the ESP8266 that's stored in a 32-bit space, while the ATmega reserves only a 16-bit space for it, the space actually requested... I always assumed that the ESP8266 wouldn't be so wasteful with memory, even though it has a lot of it :slight_smile:

wvmarle:
That's why I'm using 32 and 64 bit types.

No, the point is you were not. The default for the Arduino Uno etc in expressions is 16 bit integers - unless
you use an explicit 32 or 64 values and casts you will typically see intermediate results be 16 bit.

a << 16 is 16 bit on Uno/Mega if a isn't larger than 16 bit.
a << 16L is 32 bit
((long)a) << 16 is 32 bit.

wvmarle:
I know the ESP8266 is 32-bit while the ATmega328 is 16-bit

Says here: http://www.microchip.com/wwwproducts/en/ATmega328P
That the ATmega328 is an 8-bit machine.

gfvalvo:
Says here: http://www.microchip.com/wwwproducts/en/ATmega328P
That the ATmega328 is an 8-bit machine.

I stand corrected, thanks.

The hardware register size is irrelevant to this topic - the C++ compiler default size for an int is what matters,
since the compiler hides the details of the registers/bare-metal from you.

The reason the compiler for the ATmega328 is 16 bit is due to the severe memory restrictions of only
a few k of SRAM. Anything to help keep code+data compact is welcome, even if its annoying at times.

Otherwise there would be no reason to expect an 8-bit machine to have less than 32 bit C integers.