8bit vs 32bit floating point calculations

I am planning to make a sensor board and want to include a Bosch BME280 sensor. I've already used this device on a Pi with Python, obtaining what I believe are accurate results.

Because the compensation for this sensor uses several series expansions, Bosch recommends at least a 32-bit processor to render the floating point calculations accurately.

My question is:

If I use a Sam32 or ESP32 and the Arduino IDE, will the compiler be able to make "accurate" floating point calculations using the appropriate variable type?

I say appropriate above because I'm not yet sure if a double float is a valid type.

As to why I want this level of computational accuracy... I think it's a character flaw :)

John

The 32 bit (6-7 decimal digit) floating point accuracy of the standard Arduino is more than adequate for the calculations involved with an inexpensive, consumer grade sensor like the BME280.
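To put a number on this, here is a minimal desktop C++ sketch that evaluates the temperature compensation once in float and once in double. The formula and the calibration values are assumptions, written from memory of the floating point example in Bosch's datasheet; kDigT1, kDigT2, kDigT3, kAdcT and the function names are illustrative only, so check them against the datasheet before relying on them.

```cpp
#include <cmath>
#include <cstdint>

// Assumed calibration words and raw reading (recalled from the datasheet's
// worked example); a real device stores its own values in its registers.
constexpr std::uint16_t kDigT1 = 27504;
constexpr std::int16_t  kDigT2 = 26435;
constexpr std::int16_t  kDigT3 = -1000;
constexpr std::int32_t  kAdcT  = 519888;

// Temperature compensation in degrees Celsius, evaluated in type Real.
template <typename Real>
Real compensate_temperature(std::int32_t adc_t) {
    const Real raw = static_cast<Real>(adc_t);
    const Real t1 = static_cast<Real>(kDigT1);
    const Real var1 = (raw / Real(16384) - t1 / Real(1024)) * Real(kDigT2);
    const Real delta = raw / Real(131072) - t1 / Real(8192);
    const Real var2 = delta * delta * Real(kDigT3);
    return (var1 + var2) / Real(5120);
}

// Absolute difference between the float and double evaluations, in degrees C.
inline double float_vs_double_error() {
    const double t32 = compensate_temperature<float>(kAdcT);
    const double t64 = compensate_temperature<double>(kAdcT);
    return std::fabs(t32 - t64);
}
```

With these example values, compensate_temperature<double>(kAdcT) comes out at about 25.08 degrees C, and float_vs_double_error() is well below a thousandth of a degree, which is smaller than anything the sensor can resolve.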

64 bit "double precision" floats are not implemented on the standard Arduino, whereas 32 bit integer and floating point calculations are the same on all processors, just slower on an 8 bit CPU.

attaining what I believe are accurate results.

If you really want to know, compare the results with those you obtain using a calibrated, laboratory grade instrument.

@jremington,

Thanks....just what I needed to know.

jremington:
64 bit "double precision" floats are not implemented on the standard Arduino, whereas 32 bit integer and floating point calculations are the same on all processors, just slower on an 8 bit CPU.

64-bit Floating Point Representation (binary64 Format/IEEE-754 Standard) is supported by Arduino DUE with the help of the double data type. This representation offers 15-digit accuracy.

GolamMostafa:
64-bit Floating Point Representation (binary64) is supported by Arduino DUE with the help of double data type. This representation offers 13-digit accuracy after the decimal point.

Not true. Check here: arduino double

arduino_new:
Not true.

The first sentence is. Which is, in my opinion, much more important.

The relevant quote from your link...

On the Arduino Due, doubles have 8-byte (64 bit) precision.

Sadly "8-byte (64 bit) precision" is an inaccurate description. But, it does tell us that double is eight bytes.

What is missing from @GolamMostafa's post is that the datatype is one of the IEEE-754 formats.

What is wrong in @GolamMostafa's post is the number of decimal digits: nearly 16 instead of 13.

The Due double is an IEEE-754 binary64 floating point datatype.
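The "nearly 16" figure follows from the 53-bit significand of binary64: 53 x log10(2) is about 15.95. A minimal sketch of that arithmetic, assuming a desktop C++ toolchain with the standard headers (the AVR toolchain may not ship them); decimal_digits and double_is_binary64 are hypothetical helper names:

```cpp
#include <cmath>
#include <limits>

// Equivalent decimal digits carried by a binary floating point type:
// significand bits (including the implicit leading bit) times log10(2).
template <typename Real>
double decimal_digits() {
    return std::numeric_limits<Real>::digits * std::log10(2.0);
}

// True when double is IEEE-754 binary64 (8 bytes, 53-bit significand), as on
// the Due. On the standard AVR core, double defaults to 4 bytes instead.
constexpr bool double_is_binary64() {
    return sizeof(double) == 8 &&
           std::numeric_limits<double>::digits == 53 &&
           std::numeric_limits<double>::is_iec559;
}
```

decimal_digits<float>() gives roughly 7.2 and decimal_digits<double>() roughly 15.95, which is where the usual "6-7 digits" and "15-16 digits" figures come from.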

An Arduino DUE has 15 decimal digits of precision with a double (= double float, 8 bytes):

uint64_t bignum0 = 0xFFFFFFFFFFFFFFllu;  // 14 hex digits: 2^56 - 1
uint64_t bignum1 = ~0llu;                // 2^64 - 1; (1llu<<64) is undefined (shift by the full width)
int64_t bignum2 = -(1ll<<62);            // -2^62
double bignum3 = -1.12345678987654321;
double bignum4 = -166666666666666666666e-20;

void setup() {
  Serial.begin(250000);
}

void loop() {
  printf(" bignum0 = 0x%llx\n", bignum0);
  printf(" bignum0 = 0x%llX\n", bignum0);
  printf(" bignum1 = %llu\n", bignum1);
  printf(" bignum2 = %lld\n", bignum2);
  Serial.print(" bignum3 = ");
  Serial.println(bignum3,15);
  Serial.print(" bignum4 = ");
  Serial.println(bignum4,16);
  delay(1000);
}

Without a floating point processor, all relevant math is done in software. As it is done in software, it can be done on any platform. The question should be whether the compiler supports it, and that only tells you whether there is a built-in library for 64, 80 or 128-bit floating point. The answer with respect to avr-gcc is no. This is also the answer when you come back and say "So it can't be done". If you spent as much time on research as you do on responding without due diligence, you would uncover little treasures such as this.

This is but one; there are several 64-bit math libraries. If, like me, you were born before the age of calculators, you were actually taught to do all these calculations with pencil and paper. We could get as much precision as we wanted so long as we kept doing the math. The same is true on any hardware platform. The real question is, how much do we really need and at what cost? Surely even 128-bit floating point operations are possible on the UNO, but do you really want to wait for the answer?
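The pencil-and-paper point can be sketched in a few lines: adding two decimal numbers digit by digit, with a carry, gives as many correct digits as you care to write down, on any CPU. This is only an illustration, not how the 64-bit math libraries work internally; add_decimal is a hypothetical helper that assumes non-negative inputs that each contain a decimal point.

```cpp
#include <cstddef>
#include <string>

// Schoolbook addition of two non-negative decimal strings such as "9.5".
// Assumes both inputs contain exactly one '.' and only digits otherwise.
std::string add_decimal(std::string a, std::string b) {
    // Pad the fractional parts with trailing zeros to equal length.
    const std::size_t fa = a.size() - a.find('.') - 1;
    const std::size_t fb = b.size() - b.find('.') - 1;
    a.append(fa < fb ? fb - fa : 0, '0');
    b.append(fb < fa ? fa - fb : 0, '0');
    // Pad the integer parts with leading zeros so the decimal points align.
    if (a.size() < b.size()) a.insert(0, b.size() - a.size(), '0');
    if (b.size() < a.size()) b.insert(0, a.size() - b.size(), '0');

    std::string sum(a.size(), '0');
    int carry = 0;
    // Work from the rightmost digit to the leftmost, carrying as on paper.
    for (std::size_t i = a.size(); i-- > 0;) {
        if (a[i] == '.') { sum[i] = '.'; continue; }
        const int digit = (a[i] - '0') + (b[i] - '0') + carry;
        sum[i] = static_cast<char>('0' + digit % 10);
        carry = digit / 10;
    }
    if (carry) sum.insert(0, 1, '1');
    return sum;
}
```

For example, add_decimal("2.958654321412345675", "3.987664321312312544") returns "6.946318642724658219", with every digit exact, at the cost of one loop iteration per digit.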

arduino_new:
Not true. Check here: arduino double

Please check the following experimental results, obtained on an Arduino DUE with both float (binary32 Format/IEEE-754 Standard) and double (binary64 Format/IEEE-754 Standard) data types on the same data. The float gives 7-digit accuracy and the double gives 15-digit accuracy.

void setup()
{
  Serial.begin(9600);
  /*
    float x1 = 2.958654321412345675;
    float x2 = 3.987664321312312544;
    // ----------------------------------
    //  float x3 = 6.9463186   42724658219; Manual Computation
    float x3 = x1 + x2;
    Serial.println(x3, 18); //prints: 6.9463186   26403808593 ; only 7-digit accuracy
  */
  double x1 = 2.958654321412345675;
  double x2 = 3.987664321312312544;
  // ----------------------------------
  //  double x3 = 6.946318642724658  219; Manual Computation

  double x3 = x1 + x2;
  Serial.println(x3, 18); //prints: 6.946318642724658   154; 15-digit accuracy = double of float

}

void loop()
{

}

32-bit Floating Point Representation: When we make the declaration float x = 12.35;, the compiler stores a 32-bit (4-byte) value in 4 consecutive memory locations as per the binary32/IEEE-754 format. This is known as the 32-bit floating point representation scheme; it carries about 6 to 7 significant decimal digits in total (not digits after the decimal point).

64-bit Floating Point Representation: When we make the declaration double x = 27.53;, the compiler stores a 64-bit (8-byte) value in 8 consecutive memory locations as per the binary64/IEEE-754 format. This is known as the 64-bit floating point representation scheme; it carries about 15 to 16 significant decimal digits in total (not digits after the decimal point).
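To see what "stores a 32-bit value as per binary32" means, the stored bytes can be copied into an integer and split into the sign, exponent and fraction fields. A minimal desktop C++ sketch, assuming float is 4-byte IEEE-754 binary32 on the target; Binary32Fields, float_bits and decode_binary32 are hypothetical names:

```cpp
#include <cstdint>
#include <cstring>

struct Binary32Fields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t fraction;  // 23 bits; the leading 1 is implicit, not stored
};

// Returns the raw 32-bit pattern of a float without type-punning tricks.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 4 bytes");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Splits the bit pattern into its three binary32 fields.
inline Binary32Fields decode_binary32(float value) {
    const std::uint32_t bits = float_bits(value);
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}
```

For 12.35f the pattern is 0x4145999A: sign 0, biased exponent 130 (that is, 2^3) and fraction 0x45999A. The 23 stored fraction bits plus the implicit leading bit are what limit a float to roughly 7 significant decimal digits.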

Sorry, you were right. I overlooked!

It seems to be an annoying fact of life that even if your CPU implements 32-bit floating point in hardware (like some of the Teensys and the new Adafruit M4F boards), 64-bit floating point calculations will still be done entirely in software.
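One practical consequence on such boards, as far as I know: a stray double literal silently promotes the whole expression to double, and with it to the software routines. A minimal sketch of keeping the math in single precision, with the promotion rules checked at compile time; hypot_single is a hypothetical helper, and on AVR boards where double is 4 bytes the distinction does not matter.

```cpp
#include <cmath>
#include <type_traits>

// Stays in single precision: float operands and the float overload of sqrt.
inline float hypot_single(float a, float b) {
    return std::sqrt(a * a + b * b);
}

// float * float literal stays float ...
static_assert(std::is_same<decltype(1.0f * 2.0f), float>::value,
              "float * float literal stays float");
// ... but float * double literal is promoted to double.
static_assert(std::is_same<decltype(1.0f * 2.0), double>::value,
              "float * double literal promotes to double");
// The float overload of std::sqrt returns float, not double.
static_assert(std::is_same<decltype(std::sqrt(1.0f)), float>::value,
              "std::sqrt(float) returns float");
```

Writing 2.0f rather than 2.0, and using the float overloads (or sqrtf, sinf and friends in C), is usually enough to keep a calculation on the single-precision FPU.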

westfw:
It seems to be an annoying fact of life that even if your CPU implements 32-bit floating point in hardware (like some of the Teensys and the new Adafruit M4F boards), 64-bit floating point calculations will still be done entirely in software.

None of the UNO, MEGA, NANO, and DUE contains an FPU; they perform floating point calculations entirely in software. Since the UNO can perform 32-bit floating point calculations in software, it could also be made to do 64-bit floating point calculations in software, but that feature is not yet available. Whenever a query about 64-bit floating point numbers comes up in the Forum, we have to switch from the UNO to the DUE; this is also an annoying fact of life.