Hello there. I was wondering whether there's a way to write custom trigenomotry function (sin, cos, atan2) that work using integers only (e.g. in 1000th of degrees). I need to calculate directions out of two Latitudes/Longitudes as well as a compass reading and floats seem to slow down the program a lot.
Converting degrees to radians should be easy enough but I can't find anything on the internal logic of those trigenomotry function.
In general games programming is a good place to look for highly optimized algorithms, although they often trade large memory usage for speed and so might not be suitable for embedded systems.
// Tell the library to use a 32bit container
#define FIXED_POINT_BITS 32
// Set that 32bit fixed-point container to Q8.23
// so we can use numbers ranging [-256, 255]
#define FIXED_POINT_QN 23
// Include fixed-point library
#include <FixedPoint.h>
void setup() {
}
void loop() {
// Convert 90 into its fixed-point representation
fixed angle = FixedPoint::to(90);
// Convert that to radian
fixed radian = FixedPoint::rad(angle);
// Calculate sine of 90
fixed cosine;
fixed sine = FixedPoint::sinCos(radian, &cosine);
// Fixed-point addition/subtraction is easy
fixed sum = sine+cosine;
// Multiplication/devision requires a helper method (You can't use * and / operators)
fixed tangent = FixedPoint::devide(sine, cosine);
// ...
}
Again, thanks for pointing me in the right directions guys!
good question.. is there no micros() function anymore though? it's in the reference but the compiler throws an error.. anyway:
calculating 100,000,000 sin(), atan() and atan2()'s - first using fixed-point and then using floating-point i get the following result in milliseconds:
so.. well done.. no speed improvement whatsoever.. i wonder why the program slowed down so much last time i used floating points.. ah well.. i guess you get the higher range (but worse accuracy) using fixed-point..
Something is very wrong with your timing code. The ATMEGA168 tops out at 20Mhz. At the very best that's 20,000,000 ops a second which means that 100,000,000 single cycle ops should still take 5 seconds at least. So your results should show a milliseconds time of at least 5000 for each. Since it isn't one can assume that the timing code is totally broken.
I might add that 100,000,000 iterations of the code on a 20MHz controller is not a very good idea... If things were working right I'd assume it would take a long LONG time for either the floating or fixed point tests to complete. Neither is bound to run anywhere close to one cycle per iteration.
My Windows 0013 Arduino IDE generates a binary that is 988 bytes big. Now take the same sketch and add a call to the cos function:
void setup()
{}
void loop()
{
for (int i=0; i<100; ++i)
{
float f = cos(i);
}
}
This sketch is exactly the same size. Why? Because the compiler is smart enough to realize that cos() has no side effects and therefore can be safely optimized away!
I think you have to pretty smart to outsmart those compilers.
How dare the compiler take out any of my code :> I did another test that takes the sum of all calculated angles and outputs it after the time has been stopped. The result is a lot more promsing:
Why would you use cordics for fixed point multiplication and divisions? A fixed point multiply/divide is the same as an integer multiply/divide with a bit of "decimal point" adjustment before and after the actual multiply... (you may need to find a way to access the 64bit intermediate result of a 32x32 multiply, which could be easier said than done...)
well.. as mentioned on my blog i've tried both variants.. one using cordics and the other one using 64bit intermediate results which is simply:
inline static fixed multiply(fixed a, fixed b) {
//return ((int64_t)a*(int64_t)b)>>FIXED_POINT_QN;
fixed x = a, y = 0, z = b;
cordicLinear(&x, &y, &z, -1);
return y;
}
inline static fixed divide(fixed a, fixed b) {
//return ((int64_t)a<<FIXED_POINT_QN)/b;
fixed x = b, y = a, z = 0;
cordicLinear(&x, &y, &z, 0);
return z;
}
(i thought it might be quicker to use cordics rather than extending the container which in fact it is in some cases but overall the speed gain doesnt seem to be worth the hassle imo)