I'm working on a project that needs signed 7 digit precision, with 5 on the right of the decimal. I've read Arduino's double/float limitation of 6-7 digits. I don't understand how it can be 6 or 7 digits.
I've searched and found nothing about when 6 or 7 would be used. Can someone help me understand this?
Floating point doesn't have a precise precision. Hence the name "floating" point.
If you require a fixed precision, then you want to be working with fixed point maths, which is like integer maths but with an implied decimal point.
In fixed point maths you effectively multiply your values by a precision level, and work with bigger numbers. For instance, if you require 5 decimal places of precision, then you multiply everything by 10^5.
The number 4 would be 4 × 10^5, or 400000.
The number 12.44928 would be 12.44928 × 10^5, or 1244928.
The decimal point doesn't actually exist, but it is implied that it is between the 5th and 6th digits. It's up to you to handle it.
Fixed point maths is also considerably faster than floating point.
It's IEEE floating point. Count on 6 places, and sometimes the 7th will be off: a 1 may come back as .9999999.
If you're good with numbers and remember the by-hand ways then you can achieve more accuracy using 32 and 64 bit integer fixed-point --- in general faster than using floats on AVR-based Arduinos.
Really, 9 places with type long and 19 places with type long long.
My suggestion: if you want, say, meters to 6 places, then use micrometers as your unit and only insert the decimal point when printing for human use.
There are ways to use integers to get greater range but since I don't need them I forget the details. Try looking up the Big Number library just for fun, number of places is arbitrary and may be as large as you have RAM and time.
In the IEEE binary interchange formats the leading 1 bit of a normalized significand is not actually stored in the computer datum. It is called the "hidden" or "implicit" bit. Because of this, single precision format actually has a significand with 24 bits of precision, double precision format has 53, and quad has 113.
They quote a single-precision float as having ~7.2 digits, which you can verify thus:
log (2^24) / log (10) = 7.22
So it is probably true to say you have slightly more than 7 digits of precision on the Arduino (using float) rather than 6 to 7.
2^24 = 16777216
Thus I expect to be able to store that number in a float (or maybe one less).
volatile float f = 16777216;

void setup ()
{
  Serial.begin (115200);
  Serial.println (f);
}  // end of setup

void loop () { }
The floating point number has 24 bits of precision. That is, for a particular exponent, it can represent 2 to the power of 24 different numbers. That is about 16.7 million different numbers, or about 7 decimal digits.
But that is only 8.3 million numbers each on the positive and negative sides.
Once you start adding, multiplying and dividing them, you get small roundoff errors at every stage, which reduces the precision.
If you have two actual numbers ( of unlimited precision ), which are one part in a million different, they will have a different floating point representation. But if they are only one part in ten million different, then they probably won't have a different floating point representation, and the distinction between them will be lost.
Any integer with absolute value less than 2^24 can be exactly represented in the single precision format, and any integer with absolute value less than 2^53 can be exactly represented in the double precision format.
I'll give you the "minus one" but it is saying absolute value less than 2^24.
So you should be able to store -16777215 to +16777215. Not half of that.
Once you start adding, multiplying and dividing them, you get small roundoff errors at every stage, which reduces the precision.
What was the old saying? Something like: using floating point numbers is like moving around piles of sand; every time you move one you lose a little sand and pick up a little dirt.
When I talk about places I only count the digits that can be 0 to 9. The above has 7 places. Yet the Reference doc says 6-7, and that gives me the feeling that there are values where it's true.
log10 2^32 = 9.63 a good argument for 9 places of accuracy or smaller bits of dirt to replace the lost sand.
log10 2^24 = 7.22 should be enough for government work...
According to the wiki page, if the exponent is non-zero then there is an implicit 1 before the binary point, followed by 23 bits of significand after it, so except for that 16xxxxxx range you can use all 24 bits. At other exponents the same pattern repeats at twice or half the value of the straight 16xxxxxx range.
I tried a little experimenting to see if I could figure out the 6 or 7 digit question. I incremented several boundary values that I would be working within and found strange results.
Incrementing 31.000001 by .000001 will increase its value. However, incrementing 32.000001 by the same amount does not. Incrementing by .000002 does work, though.
float i = 31.000001;
float j = 99.000001;
float k = 32.000001;
float m = 32.000001;
float n = 63.000001;

void setup ()
{
  Serial.begin (115200);
}  // end of setup

void loop () {
  i += .000001;
  j += .000001;
  k += .000001;
  m += .000002;
  n += .000001;
  Serial.print (i, 15);
  Serial.print (" ");
  Serial.print (j, 15);
  Serial.print (" ");
  Serial.print (k, 15);
  Serial.print (" ");
  Serial.print (n, 15);
  Serial.print (" ");
  Serial.println (m, 15);
}
This gives from 6 to 9 significant decimal digits precision (if a decimal string with at most 6 significant decimal is converted to IEEE 754 single precision and then converted back to the same number of significant decimal, then the final string should match the original; and if an IEEE 754 single precision is converted to a decimal string with at least 9 significant decimal and then converted back to single, then the final number must match the original).
That seems to be saying 6 digits only are guaranteed. However that appears to be contradicted by a page they link to:
When using a decimal floating point format the decimal representation will be preserved using:
The worst case precision of IEEE single floats is arguably one part in 2^23, not 2^24:
The mantissa is 23 bits and has an implicit '1' before the MSB. Thus at the lower end of the range the ratio between successive values is 1/2 : 1/2 + 2^-24, and at the higher end 1 : 1 + 2^-24 (the values represent 0.5 up to nearly 1.0). The lower end corresponds to 23 bits of accuracy (6.9 decimal digits) and the higher end to 24 bits of accuracy (7.2 decimal digits).
So for example the next value after 1.0 is 1.00000011921,
the value prior to 2.0 is 1.99999988079,
the next one after 2.0 is 2.00000023842
Picking the nearest representation gives a worst case error of 1/2 the spacing of representable values - so you can get +/- 1 part in 2^24 by choosing the closest value (7.2 significant figures) - but that +/- shouldn't be glossed over: it's still only 23 bits of precision; there is a factor of 2 from the range -1..+1.
All of this is the accuracy of the representation - once you do some floating point operations you can get errors of 1/2 LSB on every operation - so in practice the error of any useful calculation is several LSBs, i.e. +/- one part in 2^22 or 2^23 is more likely, which is around 6 to 6.5 significant figures.
Remember we can talk about a fraction of a significant figure because we are really talking about the base-10 logarithm of the error.
GoForSmoke:
So maybe when I want 3 place decimals I would choose to work with 6.
Which agrees with what they are saying on those pages, that to take an existing float, convert it to decimal, and then convert back, without loss of precision, you have to use 9 digits, not 6. So you throw in 3 extra ones to cope with rounding, etc.