speeding up function: storing temporary value instead of multiple calculations?

halfdome · August 18, 2015, 5:04pm

I have just written a function to calculate the distance between two points using the WGS84 ellipsoid.

However, calculation takes quite a "long" time, somewhere around 1.5 ms.
I suspect that the (multiple and repeated) calculations of sin(x), cos(x) are the reason.

So I was wondering if it would make sense to replace this code

float calculate_WGS84_distance(float latt0, float long0, float latt1, float long1){
    do some math with sin(latt0), cos(latt0), ...
    do some more math with sin(latt0), cos(latt0), ...

    return some value
}

in favor of one, where the trig functions are calculated once and stored. (Maybe also some smaller foot print of the sketch?)

float calculate_WGS84_distance(float latt0, float long0, float latt1, float long1){
    float sin_latt0 = sin(latt0);
    float cos_latt0 = cos(latt0);

    do some math with sin_latt0, cos_latt0 , ...
    do some more math with sin_latt0, cos_latt0, ...

    return some value;
}

Or does the compiler take care of it by itself?
On a side note: Would it be useful to pass the latt / long variables by reference only?

econjack · August 18, 2015, 5:44pm

Obviously, give it a try and see if you can observe any difference. Depending on the complexity, the modified version that avoids multiple calls to the trig functions should help.

Peter_n · August 18, 2015, 5:56pm

Give a working sketch that calculates the time with micros()
Arduino sets the compiler to optimize for size. If you want to optimize for speed, then a compiler flag can be added with a file "platform.local.txt".

Rearranging the code might not work. Perhaps the compiler knows whats going on and already does re-use the calculations.

What is it for ? I hope you are not trying to do that in a interrupt ?

robtillaart · August 18, 2015, 6:21pm

doing math only once improves speed for the costs of some memory. And yes what you propose should speed up the math. As econjack said the proof is in the pudding test (compare the code and its timing).

time measurement can be done by calling micros() before and after the function call. It gives a good first order timing.

  uint32_t start = micros();
  x = calculate_WGS84_distance(latt0, long0, latt1, long1);
  uint32_t stop = micros();
  Serial.println(stop - start);

If you post the whole function (or sketch) we can help you to optimize it.

You might check - fastHaverSine an approximation of HaverSine for short distances - Libraries - Arduino Forum - if it is relevant.

holmes4 · August 18, 2015, 7:05pm

What application do you have for that calculation with a Arduino?

Mark

halfdome · August 19, 2015, 10:09am

Here is the code.
The function "distance_WGS" with four float's as parameter (lattitude and longitude of start point & target point in degree) and an optional float variable to store the calculated distance in.
The actual math is taken from here Distance Orthodrome (German).

Note:
I have used "double"s here instead of "float" but that should not make any differenceon an Arduino UNO.

main function:

double distance_WGS84(const double & latt_A, const double & long_A, const double & latt_B, const double & long_B, double *distance=0){ 
    //Start und Target coordinates in degree, returns distance in km
 
    const double f = 1/298.257223563;//Abplattung 
    const double a = 6378.137;//Erdradius in km
    double F = degree_to_radian(latt_A+latt_B)/2;
    double G = degree_to_radian(latt_A-latt_B)/2; 
    double l = degree_to_radian(long_A - long_B)/2;
    
    //calculating rough distance
    double S = pow(sin(G), 2) * pow(cos(l), 2) + pow(cos(F), 2)*pow(sin(l), 2);
    double C = pow(cos(G), 2) * pow(cos(l), 2) + pow(sin(F), 2)*pow(sin(l), 2);
    double w = atan(sqrt(S/C));
    double D = 2*w*a;    //grober Abstand
    //correcting rough distance
    double R = sqrt(S*C)/w;
    double H_1 = (3*R-1)/(2*C);
    double H_2 = (3*R+1)/(2*S);
    double s = D*(1+f*H_1* pow(sin(F), 2) * pow(cos(G), 2) - f*H_2 *pow(cos(F), 2) * pow(sin(G), 2));
 //s = s/1.852; //in nautical miles
    *distance = s;
 return s;
}

extra function

const double PI = 3.141592653;

inline double degree_to_radian(double degree){
	double radian = degree*PI/180;
	return radian;
}

Though I want to optimize the code, I do not want to sacrifice accuracy of the calculation.

system · August 19, 2015, 10:34am

    const double f = 1/298.257223563;//Abplattung

Why would you want to do this every time the function is called? Get your calculator out and do it once.

    double S = pow(sin(G), 2) * pow(cos(l), 2) + pow(cos(F), 2)*pow(sin(l), 2);

Pissing away time and memory using pow() to calculate x * x is ... What's the word? Starts with s. 6 letters. ends with tupid.

Why does the function take a reference to a place to store the result AND return the result? Do you wear a belt and suspenders, too?

halfdome · August 19, 2015, 11:01am

PaulS:
    const double f = 1/298.257223563;//Abplattung
Why would you want to do this every time the function is called? Get your calculator out and do it once.
    double S = pow(sin(G), 2) * pow(cos(l), 2) + pow(cos(F), 2)*pow(sin(l), 2);
Pissing away time and memory using pow() to calculate x * x is ...

I am not sure, you are right here. But I am open for correction.
I intended to do it the way you suggested but I have read elsewhere that the first constant calculation is done only once -that is during compilation- and also that x*x and pow(x,2) are handled exactly the same by the compiler / microprocessor. Wouldn't the compiler be really dumb to not optimize these two issues by itself?

Therefore I decided to go with the above posted code for better readability.

PaulS:
Why does the function take a reference to a place to store the result AND return the result?

You have a valid point there. Initially I thought it would be useful to have the option for either belt or suspender, but I agree with you. Instead of preparing for both and then discarding one (most of the time), I will stick with one and convert only for the occasion.

anon57585045 · August 19, 2015, 12:04pm

I rearranged the float calculations to eliminate redundant calculations. This should run faster:

double distance_WGS84(const double & latt_A, const double & long_A, const double & latt_B, const double & long_B, double *distance=0){
    //Start und Target coordinates in degree, returns distance in km
 
    const double f = 1/298.257223563;//Abplattung
    const double a = 6378.137;//Erdradius in km
    double F = degree_to_radian(latt_A+latt_B)/2;
    double G = degree_to_radian(latt_A-latt_B)/2;
    double l = degree_to_radian(long_A - long_B)/2;
   
   double sin_G = sin(G);
   double sin_G2 = sin_G*sin_G;
   
   double cos_G = cos(G);
   double cos_G2 = cos_G*cos_G;

   double sin_F = sin(F);
   double sin_F2 = sin_F*sin_F;
   
   double cos_F = cos(F);
   double cos_F2 = cos_F*cos_F;
   
   double sin_l = sin(l);
   double sin_l2 = sin_l*sin_l;
   
   double cos_l = cos(l);
   double cos_l2 = cos_l*cos_l;
   
   
    //calculating rough distance
    double S = sin_G2 * cos_l2 + cos_F2*sin_l2;
    double C = cos_G2 * cos_l2 + sin_F2*sin_l2;

    double w = atan(sqrt(S/C));
    double D = 2*w*a;    //grober Abstand
    //correcting rough distance
    double R = sqrt(S*C)/w;
    double H_1 = (3*R-1)/(2*C);
    double H_2 = (3*R+1)/(2*S);
    double s = D*(1+f*H_1* sin_F2 * cos_G2 - f*H_2 *cos_F2 * sin_G2);
 //s = s/1.852; //in nautical miles
    *distance = s;
 return s;
}

Peter_n · August 19, 2015, 12:13pm

The compiler might calculate the constant. When constant values are used for the parameters of the function, the compiler might even calculate the complete function.
We need a sketch to test, but the code won't compile and is wrong. How could you have measured the time if the code did not compile ?

This is a sketch with the original calculations. Talk is cheap, let's see who is the first to cut the time in half.

// Test sketch for Arduino forum.
// http://forum.arduino.cc/index.php?topic=342651.0
// With Arduino 1.6.5
// Using an Arduino Uno board (no double, only float)

// No need to declare PI, that is already somewhere in the headers.

// To predefine a parameter, this prototyping is needed.
// But I don't understand the pointer to the result anyway.
//    double distance_WGS84(double & latt_A, double & long_A, double & latt_B, double & long_B, double *distance = NULL);

void setup()
{
  Serial.begin(9600);
  Serial.println("\nDuration of floating point calculations");
}

void loop()
{
  double latA, longA, latB, longB;

  latA = random(0, 500);
  longA = random(0, 500);
  latB = random(0, 500);
  longB = random(0, 500);
  
  uint32_t startTime = micros();
  double result = distance_WGS84(latA, longA, latB, longB);
  uint32_t stopTime = micros();

  Serial.print("Result : ");
  Serial.print(result);
  Serial.print("     That took : ");
  Serial.print(stopTime - startTime);
  Serial.print(" us");
  Serial.println();
  delay(5000);
}


double distance_WGS84(const double & latt_A, const double & long_A, const double & latt_B, const double & long_B) 
{
  //Start und Target coordinates in degree, returns distance in km

  const double f = 1 / 298.257223563; //Abplattung
  const double a = 6378.137;  //Erdradius in km
  double F = degree_to_radian(latt_A + latt_B) / 2;
  double G = degree_to_radian(latt_A - latt_B) / 2;
  double l = degree_to_radian(long_A - long_B) / 2;

  //calculating rough distance
  double S = pow(sin(G), 2) * pow(cos(l), 2) + pow(cos(F), 2) * pow(sin(l), 2);
  double C = pow(cos(G), 2) * pow(cos(l), 2) + pow(sin(F), 2) * pow(sin(l), 2);
  double w = atan(sqrt(S / C));
  double D = 2 * w * a; //grober Abstand

  //correcting rough distance
  double R = sqrt(S * C) / w;
  double H_1 = (3 * R - 1) / (2 * C);
  double H_2 = (3 * R + 1) / (2 * S);
  double s = D * (1 + f * H_1 * pow(sin(F), 2) * pow(cos(G), 2) - f * H_2 * pow(cos(F), 2) * pow(sin(G), 2));
  
  //s = s/1.852; //in nautical miles

  // The extra parameter is no longer used.
  // When it was NULL, the contents should not be written.
  // if( distance != NULL)
  //  *distance = s;
    
  return s;
}


inline double degree_to_radian(double degree) 
{
  double radian = degree * PI / 180;
  return radian;
}

Original code by halfdome : 1368 to 1428 us.
New code by aarg : 1356 to 1420 us

anon57585045 · August 19, 2015, 12:18pm

The code compiled for me. I didn't optimize everything, I just looked for low hanging fruit.

Peter_n · August 19, 2015, 12:43pm

The code by aarg with compiler options : -Ofast -fira-loop-pressure -ffast-math -flto
Result : 1240 to 1300 us.

Well, I guess it does take more than 1ms to calculate it after all...

Let's return to the use of all this:
halfdome, how do you use this function and for what ?
I hope you are not trying to do this in an interrupt ?
Do you want the distance every millisecond ? why ?

system · August 19, 2015, 12:49pm

but I have read elsewhere that the first constant calculation is done only once -that is during compilation-

Possible. But, still, why? 1 / 298.257223563 means nothing more to me than 0.003352819664745.

and also that x*x and pow(x,2) are handled exactly the same by the compiler / microprocessor.

I don't believe that.

Wouldn't the compiler be really dumb to not optimize these two issues by itself?

Why would you call a compiler that lets you do stupid shit dumb?

robtillaart · August 19, 2015, 1:15pm

not cut into half but ~30% off the time 8) [ IDE 1.5.8, UNO ]

// Test sketch for Arduino forum.
// http://forum.arduino.cc/index.php?topic=342651.0
// With Arduino 1.6.5
// Using an Arduino Uno board (no double, only float)

// No need to declare PI, that is already somewhere in the headers.

// To predefine a parameter, this prototyping is needed.
// But I don't understand the pointer to the result anyway.
//    double distance_WGS84(double & latt_A, double & long_A, double & latt_B, double & long_B, double *distance = NULL);

void setup()
{
  Serial.begin(9600);
  Serial.println("\nDuration of floating point calculations");
}

void loop()
{
  double latA, longA, latB, longB;

  latA = random(0, 500);
  longA = random(0, 500);
  latB = random(0, 500);
  longB = random(0, 500);

  uint32_t startTime = micros();
  double result = distance_WGS84(latA, longA, latB, longB);
  uint32_t stopTime = micros();

  Serial.print("Result : ");
  Serial.print(result);
  Serial.print("     That took : ");
  Serial.print(stopTime - startTime);
  Serial.print(" us");
  Serial.println();

  startTime = micros();
  result = distance_WGS84_2(latA, longA, latB, longB);
  stopTime = micros();

  Serial.print("Result : ");
  Serial.print(result);
  Serial.print("     That took : ");
  Serial.print(stopTime - startTime);
  Serial.print(" us");
  Serial.println();
  delay(5000);
}


double distance_WGS84(const double & latt_A, const double & long_A, const double & latt_B, const double & long_B)
{
  //Start und Target coordinates in degree, returns distance in km

  const double f = 1 / 298.257223563; //Abplattung
  const double a = 6378.137;  //Erdradius in km
  double F = degree_to_radian(latt_A + latt_B) / 2;
  double G = degree_to_radian(latt_A - latt_B) / 2;
  double l = degree_to_radian(long_A - long_B) / 2;

  //calculating rough distance
  double S = pow(sin(G), 2) * pow(cos(l), 2) + pow(cos(F), 2) * pow(sin(l), 2);
  double C = pow(cos(G), 2) * pow(cos(l), 2) + pow(sin(F), 2) * pow(sin(l), 2);
  double w = atan(sqrt(S / C));
  double D = 2 * w * a; //grober Abstand

  //correcting rough distance
  double R = sqrt(S * C) / w;
  double H_1 = (3 * R - 1) / (2 * C);
  double H_2 = (3 * R + 1) / (2 * S);
  double s = D * (1 + f * H_1 * pow(sin(F), 2) * pow(cos(G), 2) - f * H_2 * pow(cos(F), 2) * pow(sin(G), 2));

  //s = s/1.852; //in nautical miles

  // The extra parameter is no longer used.
  // When it was NULL, the contents should not be written.
  // if( distance != NULL)
  //  *distance = s;

  return s;
}

double distance_WGS84_2(const double & latt_A, const double & long_A, const double & latt_B, const double & long_B) {
  //Start und Target coordinates in degree, returns distance in km

  const double f = 1 / 298.257223563; //Abplattung
  const double a = 6378.137;//Erdradius in km
  double F = (latt_A + latt_B) * 0.00872664625997164788461845384244; // PI /180 / 2
  double G = (latt_A - latt_B) * 0.00872664625997164788461845384244;
  double l = (long_A - long_B) * 0.00872664625997164788461845384244;

  double sin_G = sin(G);
  double sin_G2 = sin_G * sin_G;
  double cos_G2 = 1 - sin_G2; // cos_G * cos_G;
  double cos_G = sqrt(cos_G2); // cos(G);

  double sin_F = sin(F);
  double sin_F2 = sin_F * sin_F;
  double cos_F2 = 1 - sin_F2; // cos_F * cos_F;
  double cos_F = sqrt(cos_F2); // cos(F);

  double sin_l = sin(l);
  double sin_l2 = sin_l * sin_l;
  double cos_l2 = 1 - sin_l2;
  double cos_l = sqrt(cos_l2); // cos(l);


  //calculating rough distance
  double S = sin_G2 * cos_l2 + cos_F2 * sin_l2;
  double C = cos_G2 * cos_l2 + sin_F2 * sin_l2;

  double w = atan(sqrt(S / C));
  double D = 2 * w * a; //grober Abstand
  //correcting rough distance
  double R = sqrt(S * C) / w;
  double H_1 = (3 * R - 1) / (2 * C);
  double H_2 = (3 * R + 1) / (2 * S);
  double s = D * (1 + f * H_1 * sin_F2 * cos_G2 - f * H_2 * cos_F2 * sin_G2);
  //s = s/1.852; //in nautical miles
  // *distance = s;
  return s;
}




inline double degree_to_radian(double degree)
{
  double radian = degree * PI / 180;
  return radian;
}

output:
Duration of floating point calculations
Result : 15551.30 That took : 1428 us
Result : 15551.31 That took : 972 us
Result : 6044.35 That took : 1372 us
Result : 6044.36 That took : 932 us
Result : 2332.87 That took : 1368 us
Result : 2332.87 That took : 936 us
Result : 2786.04 That took : 1380 us
Result : 2786.03 That took : 944 us
Result : 11123.37 That took : 1408 us
Result : 11123.37 That took : 964 us
Result : 10408.16 That took : 1408 us
Result : 10408.16 That took : 960 us
Result : 2470.32 That took : 1372 us
Result : 2470.32 That took : 928 us
Result : 15646.64 That took : 1424 us
Result : 15646.64 That took : 984 us
Result : 14077.65 That took : 1412 us
Result : 14077.65 That took : 980 us
Result : 2026.55 That took : 1388 us
Result : 2026.55 That took : 952 us
Result : 4544.83 That took : 1372 us
Result : 4544.83 That took : 936 us
Result : 18062.88 That took : 1416 us
Result : 18062.88 That took : 992 us
Result : 3123.42 That took : 1388 us
Result : 3123.42 That took : 968 us
Result : 6644.40 That took : 1368 us
Result : 6644.40 That took : 928 us

econjack · August 19, 2015, 1:20pm

@robtillaart: What does the compiler do with something like:

0.00872664625997164788461845384244

If a float has 6-7 digits of precision, how is this constant factored? (Not being a smartass...I really don't know.)

RayLivingston · August 19, 2015, 1:49pm

PaulS:
    const double f = 1/298.257223563;//Abplattung
Why would you want to do this every time the function is called? Get your calculator out and do it once.

Get your facts straight before insulting someone. It will be done only one - at compile time.

Regards,
Ray L.

system · August 19, 2015, 1:52pm

If a float has 6-7 digits of precision, how is this constant factored?

There is a number of bits reserved for the exponent, and a number reserved for the mantissa. The value is stored in binary, which is why the number of significant digits is vague.

The value would be stored as (approximately) 0.8726646e-2.

robtillaart · August 19, 2015, 3:26pm

The value came from Calc.exe (PI/180/2).
The compiler will parse it into a float, something like 0.8726646E-2

just do a Serial.println(0.00872664625997164788461845384244, 20);

and you see where the float starts fantasizing numbers...

econjack · August 19, 2015, 3:42pm

...you see where the float starts fantasizing numbers...

I never understood that. When the function runs out of "good" numbers and then starts slinging BS, why go on? I'd rather see zeroes than guesses. Even worse, some people might think the number actually is accurate to 20 places.

RayLivingston · August 19, 2015, 4:31pm

econjack:
I never understood that. When the function runs out of "good" numbers and then starts slinging BS, why go on? I'd rather see zeroes than guesses. Even worse, some people might think the number actually is accurate to 20 places.

It's not "slinging BS". Any non-integer calculation will introduce errors. The compiler cannot protect you from that. YOU have to protect yourself, by understanding the limitations inherent in the use of floating point numbers, and write your code so that, despite those limitations, your final result is accurate to within the limits you require. Exceedingly few application REQUIRE such extreme accuracy, and there are many ways to achieve high accuracy, but ALL require understanding what you are doing, and coding accordingly. No compiler can do that for you.

Regards,
Ray L.

Topic		Replies	Views
Fastest way to do sin(), cos() atan2() Programming	59	41965	May 5, 2021
Festkunktmagie mehr als Abra Kadabra? Deutsch	24	3276	May 6, 2021
Konstanten im Flash Deutsch	32	5111	May 6, 2021
[Résolu] Problème avec la précision de sin,cos, etc... Français	16	5738	May 6, 2021
Maledette funzioni!! Software	64	19824	May 7, 2021

speeding up function: storing temporary value instead of multiple calculations?

Related topics