I'm new to this forum, but I have been working with Arduinos for quite a while.
I have a question about the execution time of floating point operations. I found this old thread: http://arduino.cc/forum/index.php/topic,40901.0.html mentioning the speed of execution and was wondering where those numbers were coming from...
I get different (very strange and confusing) results...
First I used the function micros() to time my operations, but I read that the resolution of micros() is 4 µs?
For my second attempt, I'm using Timer1 with prescaler 1 on a Duemilanove, so I should have 16 ticks/µs.
Executing a sqrt() or sin() gives me a delta of 1, meaning 1/16th of a µs. This can't be right, but I can't figure out what I'm doing wrong...
Inserting a delayMicroseconds(10) call gives me (roughly) the correct delta of 156 (160 expected), so my timer seems to work correctly.
Using the function micros() instead of Timer1 gives me a delta of 4 µs...
Is an Arduino really that fast at executing floating point calculations?
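For reference, the tick arithmetic in the post above can be written out as a small helper. This is only a sketch in plain C++ (not an Arduino sketch); the name ticks_to_microseconds and its parameters are made up for illustration, and it assumes the timer counts at the CPU clock divided by the prescaler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a timer tick delta to microseconds.
// Assumes the timer runs at f_cpu_hz / prescaler (16 MHz and prescaler 1 in the post).
double ticks_to_microseconds(uint32_t ticks, uint32_t f_cpu_hz, uint32_t prescaler) {
    assert(prescaler != 0);
    double ticks_per_us = static_cast<double>(f_cpu_hz) / prescaler / 1e6;
    return ticks / ticks_per_us;
}
```

With 16 MHz and prescaler 1, a delta of 160 corresponds to 10 µs, the measured 156 to 9.75 µs, and a delta of 1 to 0.0625 µs, which is far too short for a real sqrt() call.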
A much better method of benchmarking operations like this is to run the calculation in a loop for, say, 10,000 iterations. You take a timestamp before, and a timestamp after the loop (using micros or even millis if the number of iterations is high enough) and some simple math gets you the average time per operation. It's much more reliable, evens out hiccups due to things like interrupts, can time operations that take less time than the timer resolution, and would be portable to boards with different clock rates. You also avoid most problems with the compiler optimizing code in ways you don't expect. And you don't have to do all that messing around with timers, either.
I don't know why you are getting those results, but try it this way and see if you get more reasonable results.
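A rough sketch of the loop approach described above, written as portable C++ with std::chrono so it can be tried on a PC. On an Arduino you would take the two timestamps with micros() instead. The helper name average_ns_per_call is an assumption for illustration, and the result includes the loop overhead, so treat it as an upper bound.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: run `op` `iterations` times and return the
// average wall-clock time per call in nanoseconds (loop overhead included).
template <typename Operation>
double average_ns_per_call(Operation op, uint32_t iterations) {
    assert(iterations > 0);
    auto start = std::chrono::steady_clock::now();   // timestamp before the loop
    for (uint32_t i = 0; i < iterations; ++i) {
        op();
    }
    auto stop = std::chrono::steady_clock::now();    // timestamp after the loop
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;             // average per operation
}
```

The operation passed in should have a side effect the compiler cannot drop, for example a store to a volatile variable, otherwise the loop body may be optimized away.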
Since you're calling a known function whose result only depends on its argument, and the argument is a compile-time constant, it's conceivable that the floating point calculation has been optimised out by the compiler. If this was happening then you might get a different execution time if you included a value which was not a compile-time constant.
DuaneB:
Can't understand why; you're printing it out later in the code, so it shouldn't be getting optimized out, but this suggests it is.
I don't think the calculation could have been eliminated completely - all I can think is that the compiler has reordered the code so that the calculation no longer occurs between the timing statements.
Thanks a lot for that link!! That's exactly what I was missing...
Using the volatile keyword to create the variable solved this problem too. I'm getting 20-40 clock pulses per floating point division now (calculating the average of 10000 divisions as suggested).
Maybe 1 last question... Is there a way to remove the -Os compile option and get rid of all the optimizations? Just for the sake of this kind of exercise?
volatile says to the compiler you may not optimize this statement.
All volatile says is it may not alter the stores or loads to the variable. In particular, the compiler is allowed to optimize the sqrt value and save it away in a temporary value or compute it in the compiler, and then store the saved value in the loop.
The code sequences that replace the original code have implicit conversions from integer to floating point as well as the sqrt operation. In addition, if you ever move the code to a different processor, like say a Due that uses the Arm chip, storing the result of sqrt into a float variable will cause an implicit double to single conversion.
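To illustrate the point about volatile only constraining loads and stores: one way to keep the sqrt inside the loop is to make the input volatile as well as the result, so the compiler has to reload the argument on every pass and cannot precompute or hoist the call. This is a sketch in plain C++, not the original poster's code; the function name repeated_sqrt is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical example: compute sqrt(value) `iterations` times in a way
// the optimizer should not be able to fold into a single precomputed store.
float repeated_sqrt(float value, uint32_t iterations) {
    assert(iterations > 0);
    volatile float input = value;   // must be re-read on every iteration
    volatile float result = 0.0f;   // every store must actually happen
    for (uint32_t i = 0; i < iterations; ++i) {
        result = std::sqrt(input);  // float overload, so no double conversion here
    }
    return result;
}
```

With only the result declared volatile and a constant argument, the compiler would be free to compute the square root once (or at compile time) and just repeat the store.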
All volatile says is it may not alter the stores or loads to the variable.
It tells the compiler not to make assumptions about the value in a variable, even if it appears that the variable has not been written to.
The issue in this code is that the resulting variable is not used, so the division was optimized away. You can force a use of that division by assigning it to a (volatile) variable.
zatalian:
Is there a way to remove the -Os compile option and get rid of all the optimizations? Just for the sake of this kind of exercise?
You are proposing to disable all optimisations in order to make it possible to measure the performance? Doesn't that render the performance measurements meaningless?
Which all goes to show what a nonsense measuring the performance and optimization of contrived sequences of code is.
If you have a real application - then it gets interesting.
Agree with you, unless your goal is to learn about optimizations and how they are done (there is always that other option).
Well, disabling optimizations can be useful to compare floating point operations versus integer operations. The whole point of this exercise was to know, before I have a complete project, if the Arduino will be fast enough and if I will be able to use floating point or if I will have to do all the calculations with integers.
But in the end... these measurements will indeed be estimates and real measurements can only be taken in real programs. I totally agree with that statement.
Thanks to everybody for this very informative discussion.