# Floating point accuracy

Hi all,

I’ve got the book “Practical C Programming” by Steve Oualline and it has a small demo program to test floating point accuracy. I copied the code and modified it for the Arduino (i.e. replacing “printf” with “Serial.print”, etc.) and got this result:

8 digits accuracy in calculations
8 digits accuracy in storage

Running the same code compiled with GCC in Linux yields the same results (8 and 8).

So my question is, what does this MEAN?

I know what “accuracy in storage” means (8 bytes of storage for a floating point number, right?), but what about “accuracy in calculations”? Does it mean 8 decimal places?

FYI, this is the code I ran on the Arduino:

``````
// Floating point accuracy test - from the cow book page 271

void setup (void)
{
    Serial.begin(115200);

    char buffer[64];

    const char *mask1 = PSTR("%2d digits accuracy in calculations\r\n");
    const char *mask2 = PSTR("%2d digits accuracy in storage\r\n");

    int counter;
    float number1, number2, result;

    number1 = 1.0;
    number2 = 1.0;
    counter = 0;

    // Test 1: compare the sum directly, without storing it first
    while (number1 + number2 != number1) {
        counter++;
        number2 /= 10.0;
    }

    sprintf_P(buffer, mask1, counter);  // fill the buffer before printing it
    Serial.print(buffer);

    number2 = 1.0;
    counter = 0;

    // Test 2: store the sum in a float variable, then compare
    while (1) {
        result = number1 + number2;
        if (result == number1) {
            break;
        }
        counter++;
        number2 /= 10.0;
    }

    sprintf_P(buffer, mask2, counter);  // fill the buffer before printing it
    Serial.print(buffer);
}

void loop (void) { ; }
``````

Here’s the same thing for GCC (the code I tested the Arduino version against):

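``````
// Floating point accuracy test - from the cow book page 271

#include <stdio.h>

int main(void)
{
    char buffer[64];

    const char *mask1 = "%2d digits accuracy in calculations\r\n";
    const char *mask2 = "%2d digits accuracy in storage\r\n";

    int counter;
    float number1, number2, result;

    number1 = 1.0;
    number2 = 1.0;
    counter = 0;

    // Test 1: compare the sum directly, without storing it first
    while (number1 + number2 != number1) {
        counter++;
        number2 /= 10.0;
    }

    snprintf(buffer, sizeof buffer, mask1, counter);  // fill the buffer before printing it
    fprintf(stdout, "%s", buffer);

    number2 = 1.0;
    counter = 0;

    // Test 2: store the sum in a float variable, then compare
    while (1) {
        result = number1 + number2;

        if (result == number1) {
            break;
        }

        counter++;
        number2 /= 10.0;
    }

    snprintf(buffer, sizeof buffer, mask2, counter);  // fill the buffer before printing it
    fprintf(stdout, "%s", buffer);

    return 0;
}
``````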

If anyone can explain to me what this all means, I’d appreciate it.

Thanks!

– Roger

Krupski: 8 bytes of storage for a floating point number, right?

No, 4 bytes on the Arduino.

The first test appears to check how many decimal places down you can go when adding a small value, like this:

``````1.0 + 0.000001
``````

And still get a result that differs from 1.0.

So after 8 digits it is doing (say):

``````1.0 + 0.00000001
``````

And getting 1.0 back, so it isn't able to add that (small) digit on.

The second test appears to be storing the number back into "result" and then comparing, so it is seeing if storing it makes a difference.

Conceivably the internal register might operate at a higher resolution (eg. if there was a maths coprocessor involved) so without storing you might get 10 digits of resolution, but once you store and retrieve you might only get 8.
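One way to check whether a compiler evaluates expressions at a higher precision than the stored type is the standard C99 macro `FLT_EVAL_METHOD` from `<float.h>`. This is only a sketch for a hosted C compiler; the helper names `eval_method` and `print_eval_method` are made up for illustration, and I have not checked what the AVR toolchain reports.

```c
#include <float.h>
#include <stdio.h>

/* Returns the compiler's FLT_EVAL_METHOD (C99, <float.h>):
 *   0 = operations are evaluated in the precision of their own type
 *   1 = float and double operations are evaluated as double
 *   2 = all operations are evaluated as long double (e.g. classic x87 code)
 *  -1 = indeterminate
 * Falls back to -1 if the macro is not defined at all. */
int eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -1;
#endif
}

/* Hypothetical helper: prints the value so it can be read off a terminal. */
void print_eval_method(void)
{
    printf("FLT_EVAL_METHOD = %d\n", eval_method());
}
```

If this reports 0, the "calculations" and "storage" tests should give the same count, which matches the 8 and 8 result above.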

Your result of 8 appears to agree with:

http://en.wikipedia.org/wiki/Floating_point#IEEE_754:_floating_point_in_modern_computers

Single precision, called "float" in the C language family, and "real" or "real*4" in Fortran. This is a binary format that occupies 32 bits (4 bytes) and its significand has a precision of 24 bits (about 7 decimal digits).

Since 2^24 = 16777216, there are your 7 digits with a little left over, so the calculation is probably telling you that it kept getting a difference up to the 8th digit.
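To make the arithmetic concrete: with a 24-bit significand, the gap between 1.0 and the next float is 2^-23, about 1.19e-7 (`FLT_EPSILON` in `<float.h>`). Anything larger than half that gap still nudges the sum to the next float, so 1e-7 registers and 1e-8 does not. A small sketch, assuming IEEE 754 single precision; `changes_one` is a made-up helper name:

```c
#include <float.h>

/* Returns 1 if adding `small` to 1.0f gives a float different from 1.0f,
 * 0 otherwise. The volatile store forces the sum to be rounded to float
 * precision even if the compiler computes it in a wider register. */
int changes_one(float small)
{
    volatile float sum = 1.0f + small;
    return sum != 1.0f;
}
```

Under that assumption, `changes_one(1e-7f)` returns 1 and `changes_one(1e-8f)` returns 0, which is where the count of 8 comes from.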

Well that's interesting... I looked at the link you posted and noticed the other floating point types. I tried "double" and "long double" in the Arduino sketch - all of them returned "8 and 8". But in GCC and Linux, the type "double" gives me 16 and 16, and "long double" gives me 20 and 20.

I suppose this means there is no difference (for Arduino) between "float" and "double"... yes?

Krupski: I suppose this means there is no difference (for Arduino) between "float" and "double"... yes?

yes
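For anyone who wants to see this for themselves, here is a rough sketch that runs the book's "storage" test for each type and reports the size next to the digit count. It is written for a hosted C compiler (the names `DEFINE_COUNT_DIGITS`, `count_digits_*` and `print_report` are made up for illustration). As far as I know, the 8-bit AVR toolchain treats `double` as the same 4-byte type as `float`, which would explain 8 for every type there.

```c
#include <stdio.h>

/* Generates count_digits_<name>(): counts how many times 1.0 can be divided
 * by 10 before adding it to 1.0 no longer changes the stored result. */
#define DEFINE_COUNT_DIGITS(name, type)      \
    int count_digits_##name(void)            \
    {                                        \
        volatile type sum;                   \
        type one = 1, small = 1;             \
        int count = 0;                       \
        for (;;) {                           \
            sum = one + small;               \
            if (sum == one) break;           \
            count++;                         \
            small /= 10;                     \
        }                                    \
        return count;                        \
    }

DEFINE_COUNT_DIGITS(float, float)
DEFINE_COUNT_DIGITS(double, double)
DEFINE_COUNT_DIGITS(long_double, long double)

/* Prints size and measured digit count for each floating point type. */
void print_report(void)
{
    printf("float:       %u bytes, %d digits\n",
           (unsigned)sizeof(float), count_digits_float());
    printf("double:      %u bytes, %d digits\n",
           (unsigned)sizeof(double), count_digits_double());
    printf("long double: %u bytes, %d digits\n",
           (unsigned)sizeof(long double), count_digits_long_double());
}
```

Calling `print_report()` from `main()` on a desktop should show different sizes and counts per type; the `long double` line in particular varies between platforms.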

WizenedEE:

Krupski: I suppose this means there is no difference (for Arduino) between "float" and "double"... yes?

yes

I guess that's what RTFM means, huh? :)

If you stuck with integers, you wouldn’t have these problems and your code would run faster.

Need decimal places?
How do you handle, say, fractions of a meter?

If you work in millimeters then you have meters to 3 places. Micrometers give you 6 places.
By choosing your units, you can work with integers right down to any scale.

You can print the decimal point wherever you want, but internally the code doesn’t need it.

Arduino allows 64-bit integers. That’s about 19 decimal digits that never round off unless you choose to, and integer math runs faster than FP on a chip without an FPU.
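A minimal sketch of the choose-your-units idea: keep lengths as whole millimeters in an `int64_t` and only insert the decimal point when printing. The function name `format_mm_as_m` is made up for illustration, and the code is written for a hosted C compiler; the `%llu` format used here may not be available in the AVR C library, so on an Arduino the two halves would likely need to be printed separately.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats a length held in whole millimeters as meters with 3 decimals,
 * e.g. 12345 mm -> "12.345". Returns what snprintf returns (the number of
 * characters the full text needs, not counting the terminator). */
int format_mm_as_m(int64_t mm, char *out, size_t out_size)
{
    const char *sign = (mm < 0) ? "-" : "";
    /* Take the magnitude in unsigned arithmetic so INT64_MIN is safe too. */
    uint64_t mag = (mm < 0) ? (uint64_t)0 - (uint64_t)mm : (uint64_t)mm;

    return snprintf(out, out_size, "%s%llu.%03llu", sign,
                    (unsigned long long)(mag / 1000),   /* whole meters  */
                    (unsigned long long)(mag % 1000));  /* leftover mm   */
}
```

All the math stays exact: adding 1500 mm and 250 mm gives 1750 mm, and only the printed text shows it as 1.750.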

You can get even more flexibility for a small loss, but I forget the name of that non-fixed-point scheme and would rather go with the simpler choose-your-units method.

Yes, you can always use big numbers:

http://gammon.com.au/forum/?id=11519

Nick Gammon: Yes, you can always use big numbers:

I already have that in my collection! :)

Thanks!