
Topic: Floating Point Math? (Read 7826 times)


Hey, so I saw the Due came out. Too bad I can't get one for a while. Oh well.

I was wondering, though. I know that floating point math is murder on a regular Arduino, and that the lack of FP speed can cause problems if people want quickly updating things that do a bit of math with angles and whatnot.

It's nice enough that the Due is just over 5x faster. But with the 32-bit processor, does anyone know how much better it would fare doing large amounts of FP? (I've got a little GPS-based project kicking around in my brain. Lots of angle math to do, and not just on the latitude and longitude.)


Oct 23, 2012, 05:59 pm Last Edit: Oct 23, 2012, 06:08 pm by pYro_65 Reason: 1
The Due is 10x faster, isn't it?
EDIT: Whoops, yeah, I've got 8 MHz in my board.

The Arduino Uno has no FPU, so if the Due's processor has one, it'll beat it hands down. EDIT: It has none, but there must be some fancy new way of dealing with floats that the ATmega doesn't have.

It would be able to control multiple external FP chips easily, I'd imagine.


has none, but there must be some fancy new way of dealing with floats that the ATmega doesn't have.

Well, it is down to the compiler to use an FPU if the chip has one. There's not much you can do about it unless you want to hack the compiler.

The simple fact of having 32 bit registers is going to give you a minimum of a 4X time bonus, so add that to the 5X processor speed and you get a minimum of 20X. 

Also, ARM chips have a barrel shifter, which means that shifting something 10 places to the left takes one instruction, whereas on an 8-bit machine it takes at least 10 instructions.


Note: I don't believe the ARM SAM3X8E that the Due uses supports hardware floating point, so you are still using software emulation. As others have said, since the Due has a faster clock rate and a 32-bit processor, the floating point emulation should be faster. Also note that the AVR-based Arduinos use 32-bit values for float, double, and long double, while I believe the ARM boards use 64 bits for double. That means a double on those boards has larger mantissa and exponent ranges.
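
If you want to check the float/double sizes on whatever board you have rather than take my word for it, something like this should do. It is only a sketch: FloatInfo, floatInfo() and doubleInfo() are names I made up, and the values come from the compiler's own float.h, so you would still need to print them over Serial yourself.

```cpp
#include <float.h>
#include <stddef.h>

// Hypothetical helper type: size and range of one floating point type.
struct FloatInfo {
    size_t bytes;       // storage size in bytes
    int mantissaBits;   // significand bits, including the implicit leading bit
    int maxExponent;    // largest binary exponent (plus one) the type can hold
};

// Values are taken from the toolchain's float.h, so they reflect the
// target being compiled for, not a fixed assumption.
FloatInfo floatInfo() {
    FloatInfo info = { sizeof(float), FLT_MANT_DIG, FLT_MAX_EXP };
    return info;
}

FloatInfo doubleInfo() {
    FloatInfo info = { sizeof(double), DBL_MANT_DIG, DBL_MAX_EXP };
    return info;
}
```

If doubleInfo().bytes comes back the same as floatInfo().bytes, then double is just an alias for float on that toolchain and you gain no precision by using it.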

The recently released Teensy 3.0 uses a Cortex M4 chip that also does not support hardware floating point. There is a variant of the Cortex M4 (the M4F) that does have floating point, but the Teensy 3.0 doesn't use it.

The Raspberry Pi does support hardware floating point and runs at a much higher clock rate (700 MHz), so it should have much better floating point performance. The higher clock rate also means a higher current draw, so you will need to think about more batteries and recharging for long-running apps. However, at present it is set up more to run Linux than to act as an embedded device, though I'm sure people are working on it.

I don't recall offhand whether the BeagleBone or mbed support hardware floating point or not.


though I'm sure people are working on it.

It is called Bare Metal
Here is the forum for it:-


There are some Cortex M3 benchmarks in http://arduino.cc/forum/index.php/topic,121568.75.html
Try running them on an 8-bitter and compare clock for clock, and you will see. As a general rule, 64-bit FP calcs are about 2x slower than 32-bit FP calcs.


Fixed point may be what you really need.
Floating point is designed to handle both very large numbers and very small numbers where precision is not that important.
If what you need is hundredths (or thousandths) of a degree for GPS readings, then use integers and write the code understanding that the values are hundredths of a degree (or volt or foot or temperature degree). This makes any math hugely faster. Perhaps it's only necessary to think of the value in full units when it is displayed for human viewing.
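
To make that concrete, here is a rough sketch of angle math done entirely in integers. The millideg_t type and the headingDelta() name are my own inventions for illustration, and I'm assuming thousandths of a degree is enough resolution for your project.

```cpp
#include <stdint.h>

// Hypothetical convention: every angle is a whole number of
// thousandths of a degree, so 45.123 degrees is stored as 45123.
typedef int32_t millideg_t;

// Signed shortest turn from one heading to another,
// wrapped into the range [-180000, 180000).
millideg_t headingDelta(millideg_t from, millideg_t to) {
    millideg_t d = (to - from) % 360000;   // integer remainder, no floats
    if (d >= 180000) d -= 360000;
    if (d < -180000) d += 360000;
    return d;
}
```

For example, going from a heading of 350.000 to 10.000 gives headingDelta(350000, 10000) == 20000, i.e. a 20.000 degree turn. You only divide by 1000 when you print the value for a human.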


Fixed point is very handy for accumulating small bits to gain resolution.
An example using 16 bits, where the top 8 bits (H) are the integer component and the bottom 8 bits (L) are the fractional component.
Code: [Select]


It's quite easy if you define a new type using a union/struct...

Code: [Select]
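typedef union {
  struct {
    byte low;      // fractional part (low byte on a little-endian MCU)
    byte hi;       // integer part
  };
  uint16_t all;    // both bytes as one 16-bit value ('unsigned' is 32 bits on the Due)
} fixedPoint;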

fixedPoint Accumulator;

Adding to Accumulator.all rolls from low to hi.
Access to the fraction component is by Accumulator.low
Access to the integer component is by Accumulator.hi
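
Here is a small sketch of the accumulate-and-roll-over behaviour. Note that I've swapped the union for plain shifts and masks so it doesn't depend on byte order; Fixed8_8 and accumulate() are made-up names, not anything from the Arduino libraries.

```cpp
#include <stdint.h>

// 8.8 fixed point: high byte = integer part, low byte = fraction in 1/256 steps.
struct Fixed8_8 {
    uint16_t all;
    uint8_t hi() const  { return static_cast<uint8_t>(all >> 8); }
    uint8_t low() const { return static_cast<uint8_t>(all & 0xFF); }
};

// Add a small step 'count' times. Whenever the fraction overflows,
// it carries into the integer byte (and the whole value wraps at 16 bits).
Fixed8_8 accumulate(uint16_t step, unsigned count) {
    Fixed8_8 acc = { 0 };
    for (unsigned i = 0; i < count; ++i) {
        acc.all = static_cast<uint16_t>(acc.all + step);
    }
    return acc;
}
```

Adding a step of 64 (a quarter) six times gives hi() == 1 and low() == 128, i.e. one and a half.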



Looking through the data sheet I see the Due has instructions for:-
Single cycle 32 bit Multiply
Multiply and subtract
Multiply and add
Signed Divide
Signed Multiply (64 bit result)
Signed Multiply with accumulate (64 bit result)
Unsigned Divide
Unsigned Multiply with accumulate  (64 bit result)
Unsigned Multiply  (64 bit result)

So if the compiler uses those it will greatly speed things up.
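
As an illustration of where the 64-bit-result multiply helps, here is a sketch of a 16.16 fixed point multiply. The q16_16 type and function names are my own, and whether the compiler actually emits the signed long multiply for this is an assumption you would have to confirm by looking at the generated assembly.

```cpp
#include <stdint.h>

// Hypothetical 16.16 format: 16 integer bits, 16 fraction bits.
typedef int32_t q16_16;

q16_16 q16FromInt(int32_t n) {
    return n * 65536;   // scale by 2^16
}

q16_16 q16Mul(q16_16 a, q16_16 b) {
    // 32 x 32 -> 64-bit product, so no bits are lost before rescaling.
    int64_t wide = static_cast<int64_t>(a) * b;
    // Drop the extra 16 fraction bits. This relies on an arithmetic right
    // shift for negative values, which mainstream compilers provide.
    return static_cast<q16_16>(wide >> 16);
}
```

With this, 1.5 * 2.5 is q16Mul(98304, 163840), which returns 245760, i.e. 3.75 in 16.16 format.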


Code: [Select]
long t = millis();
float a = 1.2;
float b = -7.8;
float c;
float d = 0;
for(long i = 0; i < 1000000; i++)
{
    c = a + b;
    c = c - a;
    c = a*c;
    c = c/a;
    d += c;
}
t = millis()-t;
Serial.print("Time: ");
Serial.println(t);  // elapsed time in milliseconds

This runs in 10.1 seconds on a Melzi (RepRap Arduino controller with an ATmega1284P clocked at 16 MHz), and in 1.29 seconds on a Due.

Adrian Bowyer
RepRapPro Ltd
