# float & 128, 64, 32 bit integers in Atmel Studio

This post is for curiosity's sake only, but it was prompted by a question earlier today in which someone tried to multiply a long integer by 1000 and did not get the expected result. This got me thinking that the limit of a long is 65,535, which makes perfect sense considering the design intent of the 328P.

I know how I'd implement unpacked BCD and signed/unsigned integers in assembly, to precisions limited only by memory (data space), but Studio 7 did not complain about this:

``````
int main (void) {
    float MC = 71073.73;

    MC *= 192.72;
}
``````

but in the debugger I was unable to find where it was stored and hovering over it yielded nothing.

Is my assumption correct that the 8 bit processors are incapable of floating point?

TightCoderEx: This post is for curiosity's sake only, but it was prompted by a question earlier today in which someone tried to multiply a long integer by 1000 and did not get the expected result. This got me thinking that the limit of a long is 65,535, which makes perfect sense considering the design intent of the 328P.

That is the limit for a 16 bit unsigned int. A long goes from -2,147,483,648 to +2,147,483,647 and an unsigned long goes from 0 up to 4,294,967,295.
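The wrap-around behind the original question can be reproduced on any machine. This is a minimal sketch, assuming the multiplication was carried out in 16 bits (on AVR, `int` is 16 bits wide). The function names are made up for illustration, and fixed-width types stand in for AVR's `int` and `long`:

```cpp
#include <cstdint>

// Hypothetical helper: multiply the way a 16-bit unsigned int would,
// keeping only the low 16 bits of the product.
uint16_t mul_in_16_bits(uint16_t a, uint16_t b) {
    // Widen first so the full product is computed,
    // then truncate to 16 bits (modulo 65536).
    return static_cast<uint16_t>(static_cast<uint32_t>(a) * b);
}

// Hypothetical helper: widen one operand to 32 bits before multiplying,
// so the whole multiplication happens in 32 bits.
int32_t mul_in_32_bits(int16_t a, int16_t b) {
    return static_cast<int32_t>(a) * b;
}
```

With 100 and 1000 as inputs, the first function returns 34464 (100000 modulo 65536) and the second returns 100000. On the AVR the usual fix is to make one operand a long, for example `100L * 1000`.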

TightCoderEx: Is my assumption correct that the 8 bit processors are incapable of floating point?

The 328 doesn't have hardware support for floating point. All the floating point math is taken care of with software. That's why a small program suddenly gets much bigger as soon as you add just one floating point math operation.

Delta_G: The 328 doesn't have hardware support for floating point. All the floating point math is taken care of with software. That's why a small program suddenly gets much bigger as soon as you add just one floating point math operation.

I suspected as much. This is why the one data logging project I'm working on will interface with an x86 system (USB or WiFi) for storage and data manipulation.

The ATmega328P is an 8-bit chip, but that doesn’t hold back the avr-gcc compiler.

The avr-gcc compiler supports:
Integers : 8, 16, 32 or even 64 bits. Signed and unsigned.
Float : only 32 bits in software. The keyword ‘double’ is translated into ‘float’.

The ‘float’ library for the avr-gcc is highly optimized. It is very fast.

I think it should be possible to add a 64-bit floating point library, but I don’t know if that has been done for 8-bit ATmega chips.

There are libraries to do more.
For example BigNumber, an arbitrary precision (big number) library port for Arduino (Gammon Forum : Electronics : Microprocessors).

Test if the avr-gcc compiler really supports 64-bit integers (spoiler: yes, it does)

``````
// Tested with Arduino IDE 1.6.10 and Arduino Micro (ATmega32U4).

void setup()
{
  Serial.begin( 9600);
  while(!Serial);       // wait for serial monitor for Leonardo/Micro

  Serial.println("64 bit integers");

  // 64 bits is 16 hex digits
  //                       0123456789012345
  int64_t            a = 0x7777777777777777LL;
  uint64_t           b = 0xAAAAAAAAAAAAAAAAULL;
  long long          c = 0x0123456789ABCDEFLL;
  unsigned long long d = 0x212320242F252E26ULL;

  print64BitsAsHex( a); Serial.println();
  print64BitsAsHex( b); Serial.println();
  print64BitsAsHex( c); Serial.println();
  print64BitsAsHex( d); Serial.println();

  uint64_t e = c * d;   // 64-bit multiplication
  print64BitsAsHex( e); Serial.println();

  uint64_t f = b + d;   // 64-bit addition
  print64BitsAsHex( f); Serial.println();
}

void loop()
{
}

// Print a 64-bit value as "0x" followed by 16 hex digits.
void print64BitsAsHex( uint64_t x)
{
  char buffer[16];
  for( int i=0; i<16; i++)
  {
    buffer[15-i] = x & 0x0F;    // get the lowest nibble
    x >>= 4;
  }

  Serial.print("0x");
  for( int i=0; i<16; i++)
  {
    byte b = buffer[i];
    if( b < 0x0A)
      b += '0';
    else
      b += 'A' - 0x0A;
    Serial.write(b);
  }
}
``````

Koepel: The avr-gcc compiler supports: Integers : 8, 16, 32 or even 64 bits. Signed and unsigned.

Also 24 bits: `__int24` and `__uint24`.

Oooh, didn't know 24 bits were supported, interesting.

I thought it was a joke, but it is not: https://gcc.gnu.org/wiki/avr-gcc. It is an extension, and only for avr-gcc.

Meanwhile, I have been searching for a double precision floating point library in C or C++ to be used on an ATmega, but I could not find one. Maybe MPFR http://www.mpfr.org/ but that was made for a special purpose. The Berkeley SoftFloat might be adaptable for an ATmega: http://www.jhauser.us/arithmetic/SoftFloat.html

UPDATE: If the avr-gcc compiler supported 64-bit floating point but shipped no floating point library at all, the Berkeley SoftFloat could be used. But since the avr-gcc compiler treats all floating point as 32-bit, including the constants, the SoftFloat library cannot be used in the normal way. Adding a 64-bit floating point constant seems impossible.

``````
int main (void) {
    float MC = 71073.73;

    MC *= 192.72;
}
``````

TightCoderEx: but in the debugger I was unable to find where it was stored and hovering over it yielded nothing.

Well, as you didn't do anything with the result, it probably got optimized to this:

``````
int main (void) {

}
``````
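One common way to keep such a variable visible in the debugger is to declare it `volatile`, which obliges the compiler to perform every read and write. A minimal sketch (the function name `scaled_mc` is made up for illustration; lowering the optimization level is another option):

```cpp
// Hypothetical example: 'volatile' forces the compiler to keep MC in memory
// and to perform the multiplication at run time.
float scaled_mc() {
    volatile float MC = 71073.73f;
    MC = MC * 192.72f;   // read, multiply, write back; none of it can be dropped
    return MC;
}
```

The result is roughly 13.7 million, which is also close to the limit of what a 32-bit float can represent to the nearest whole number.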

Koepel: I have been searching for a double precision floating point library in C or C++ to be used on an ATmega

Why not just use fixed-point?
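For anyone unfamiliar with the idea, here is a minimal sketch of fixed-point arithmetic, assuming a Q16.16 layout (16 integer bits, 16 fractional bits). The type and function names are made up for illustration and do not come from any particular library:

```cpp
#include <cstdint>

// Q16.16 fixed point: a 32-bit integer holding value * 65536.
typedef int32_t fix16;
const int32_t FIX_ONE = 65536;   // the value 1.0 in Q16.16

// Convert a whole number to Q16.16.
fix16 fix_from_int(int16_t v) {
    return static_cast<fix16>(v) * FIX_ONE;
}

// Multiply: use a 64-bit intermediate so the product cannot overflow
// before it is scaled back down.
fix16 fix_mul(fix16 a, fix16 b) {
    return static_cast<fix16>((static_cast<int64_t>(a) * b) / FIX_ONE);
}

// Divide: scale the dividend up first so the fractional bits survive.
fix16 fix_div(fix16 a, fix16 b) {
    return static_cast<fix16>((static_cast<int64_t>(a) * FIX_ONE) / b);
}
```

For example, 1.5 is stored as 98304 and 2.5 as 163840, and `fix_mul` of the two gives 245760, which is 3.75. Only integer operations are involved, and the precision is fixed at 1/65536.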

Actually, on my wish list would be speed-optimized library functions for either unpacked or packed BCD, preferably the latter. Why should we have to jump through hoops to get at individual digits of a number?
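To show what is meant, here is a rough and not speed-optimized sketch of packed BCD addition, where each byte holds two decimal digits. The function names are made up for illustration:

```cpp
#include <cstdint>

// Add two packed-BCD bytes (two decimal digits each) plus an incoming carry.
// 'carry' is updated with the decimal carry out.
uint8_t bcd_add(uint8_t a, uint8_t b, bool &carry) {
    uint8_t low  = static_cast<uint8_t>((a & 0x0F) + (b & 0x0F) + (carry ? 1 : 0));
    uint8_t high = static_cast<uint8_t>((a >> 4) + (b >> 4));
    if (low > 9) {            // decimal carry from the low digit
        low -= 10;
        ++high;
    }
    carry = high > 9;         // decimal carry out of the byte
    if (carry) {
        high -= 10;
    }
    return static_cast<uint8_t>((high << 4) | low);
}

// Add two 4-digit packed-BCD numbers by chaining the byte-wise add.
// A carry out of the top digit is discarded in this sketch.
uint16_t bcd_add16(uint16_t a, uint16_t b) {
    bool carry = false;
    uint8_t lo = bcd_add(static_cast<uint8_t>(a & 0xFF),
                         static_cast<uint8_t>(b & 0xFF), carry);
    uint8_t hi = bcd_add(static_cast<uint8_t>(a >> 8),
                         static_cast<uint8_t>(b >> 8), carry);
    return static_cast<uint16_t>((hi << 8) | lo);
}
```

Here `bcd_add16(0x1234, 0x0877)` gives `0x2111`, and each decimal digit of the result is simply one nibble, with no division needed to extract it.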

To do precise calculations and send the data to a computer in double float IEEE format.
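The transmission half of that can be done with 64-bit integers alone. The following is a rough sketch, assuming `float` is a 32-bit IEEE 754 single; the function name is made up, and subnormal inputs are flushed to zero to keep it short. It only changes the format and adds no precision to the calculation itself:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen an IEEE 754 single to the bit pattern of the
// equal IEEE 754 double, using only integer operations.
// Assumption: subnormal inputs are flushed to signed zero.
uint64_t float_to_double_bits(float f) {
    uint32_t in;
    std::memcpy(&in, &f, sizeof in);            // reinterpret the 32 bits

    uint64_t sign     = static_cast<uint64_t>(in >> 31) << 63;
    uint32_t exponent = (in >> 23) & 0xFF;      // biased by 127
    uint64_t mantissa = in & 0x7FFFFF;          // 23 fraction bits

    if (exponent == 0) {                        // zero or subnormal
        return sign;
    }
    if (exponent == 0xFF) {                     // infinity or NaN
        return sign | (0x7FFULL << 52) | (mantissa << 29);
    }

    // Rebias the exponent from 127 to 1023 and left-align the fraction
    // in the 52-bit field of the double.
    uint64_t wide_exp = static_cast<uint64_t>(exponent) - 127 + 1023;
    return sign | (wide_exp << 52) | (mantissa << 29);
}
```

The resulting eight bytes can be sent to the computer and read there as a `double`, provided both sides agree on the byte order.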

Koepel: To do precise calculations and send the data to a computer in double float IEEE format.

How many good digits do you need? What are you ultimately trying to do? http://xyproblem.info/

TightCoderEx started this topic with: "This post is for curiosity's sake only...". Since it has my interest, I tested whether 64-bit integers could actually be used (yes), and the next obvious question was whether double precision could be added (no, not in the normal way). I have worked with embedded systems and software floating point libraries before, and I have always wondered if an ATmega could do double precision, primarily for compatibility.

The gcc compiler defaults to double precision, but for the avr target that has been altered. I can understand that double precision for an ATmega8 is not useful, but it could be useful for an ATmega328P. That change in the avr-gcc compiler to default to single precision is perhaps not a good decision after all.

Hi Koepel

I am a beginner in microcontroller development. Thanks for your explanations on this topic.

I have developed a program for calculating astronomical events such as sunset and sunrise. I'm using an 8-bit ATxmega MCU, but the results of the calculations are not precise enough.

Here are my questions:
1. Did you eventually try Berkeley SoftFloat, and did you get it to work?
2. Would it be easier to get a 64-bit float library if I used a 32-bit MCU?

Thanks

64 bit floating point emulator C code for AVR can be downloaded here: https://www.mikrocontroller.net/topic/85256

I’ve also attached the zip file, as there are several corrections and the discussion is in German.

IEEE754_double.zip (1.03 MB)