1. Let us say that we have the following two decimal (floating-point) numbers:
n1 = 3.12345678901234567890899;
n2 = 3.12343478901234567890996;
Which one of the above two numbers is more precise? Which one is more accurate?
Both numbers have 23 digits of precision after the decimal point.
Taken on its own, each number is exactly as accurate as it is written.
2. The meaning of accuracy will, as I understand it, become clear from the following discussion:
(a) Let us add the above two numbers of Step-1 manually. We will get --
sum, n = 6.24689157802469135781895
3. Let us add the above two numbers of Step-1 using the Arduino UNO. We will get --
float n1 = 3.12345678901234567890899;
float n2 = 3.12343478901234567890996;
//By manual calculation, n1+n2 = 6.24689157802469135781895
Serial.println (n1 + n2, 23); //prints: 6.24689149856567382812500
4. Which is more accurate: the man-made result (6.24689157802469135781895 of Step-2) or the machine-made result (6.24689149856567382812500 of Step-3)? The answers are --
(a) Both results are equally accurate if we consider 6-digit accuracy after the decimal point.
(b) The man-made result (6.2468915) of Step-2 is more accurate than the machine-made result (6.2468914) of Step-3 if we consider 7-digit accuracy after the decimal point.
5. Where and how have we lost the accuracy?
To answer this question, we need to understand the bit-level representation of a floating-point number (float), i.e., a number with an integer part and a fractional part.
(a) When we declare/define a float number by float n1 = 3.12345678901234567890899;, a 32-bit wide bit pattern (0x4047E6B7) is saved into 4 consecutive memory locations of the MCU. The bit pattern is determined by the IEEE-754 standard's binary32 format (Fig-1), in which 23 bits are allocated to hold the fractional part (the mantissa) of the normalized number.
Figure-1: binary32 format for the representation of float number
6. How can we improve the accuracy of the machine-made result?
Fig-1 of Step-5 reveals that the accuracy could be improved by allocating more bits to the fractional part of the number. Accordingly, the IEEE-754 standard's binary64 format has come into play (Fig-2), in which 52 bits are allocated to hold the fractional part.
Figure-2: binary64 format for the representation of float number
(a) In binary64 format, a 64-bit wide bit pattern is produced (as per Fig-2) for a floating-point number, and it is saved into 8 consecutive memory locations. For example, for the definition double n1 = 3.12345678901234567890899;, the 64-bit pattern 0x4008FCD6E9BA37B3 is saved into memory.
(b) The 64-bit floating-point data type is supported by the Arduino DUE through the keyword double. (On the AVR-based UNO, double is only an alias for the 32-bit float.)
7. Let us check the improvement in accuracy by processing the floating-point numbers of Step-1 in binary64 format on the Arduino DUE.
double n1 = 3.12345678901234567890899;
double n2 = 3.12343478901234567890996;
Serial.println (n1 + n2, 53);
(a) Man-made result: 6.24689157802469135781895 (23-digit accuracy and 23-digit precision when compared to itself)
(b) binary32 format result: 6.24689149856567382812500 (6-digit accuracy and 23-digit precision when compared to (a))
(c) binary64 format result: 6.24689157802469097191533364821225404739379882812500000 (14-digit accuracy and 53-digit precision when compared to (a))
We observe that the accuracy has gone up to 14 digits after the decimal point when using the binary64 format.
The Arduino DUE can indeed deal with situations where higher accuracy and higher precision are demanded in the processing of floating-point numbers.