atan2 in library

I am writing a C++ library, and within the library I am doing some atan2 calculations.

If I do the same calculations in the ino sketch directly in the loop, the calculation is approximately 20% faster.

Can someone explain why this might be? Is the math.h used by the library different from the one used by the sketch? If so, how do I tell my C++ library to use the Arduino math.h?

sliccster: If I do the same calculations in the ino sketch directly in the loop, the calculation is approximately 20% faster.

I cannot believe that.

Most likely, the optimizer removes some code in the sketch because it is not needed. For example, if you are calculating atan2 from a constant (or a value that is already known at compile time), the compiler will recognise that the expression can be calculated at compile time and will then insert the result into the code of the sketch instead of the actual function call.
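
To illustrate the point about compile-time evaluation, here is a minimal sketch in plain C++ rather than code from the library in question. The names headingFromConstants, headingFromRuntimeValues, yInput and xInput are made up for the example. With optimization enabled, a compiler such as GCC is allowed to replace the first call with a precomputed constant, while the second must call atan2 at run time because its inputs are read from volatile variables.

```cpp
#include <cmath>

// Inputs are literals known at compile time: the optimizer is free to
// evaluate atan2 during compilation and embed the result as a constant.
float headingFromConstants() {
    return std::atan2(1.0f, 1.0f);
}

// Hypothetical inputs declared volatile: the compiler must load them at
// run time, so this atan2 call cannot be folded into a constant.
volatile float yInput = 1.0f;
volatile float xInput = 1.0f;

float headingFromRuntimeValues() {
    float y = yInput;
    float x = xInput;
    return std::atan2(y, x);
}
```

Both functions return the same value (pi/4 for these inputs), but only the second one is a fair candidate for timing atan2 itself.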

Please post a simple example library and a simple example sketch that demonstrate the difference, if you think that what you are seeing is not the compiler optimizing away expressions that can already be calculated at compile time.

No problem, here is the code in my library (for reading from a magnetometer over I2C):

float HMC5883L::GetHeading()
{
MagnetometerRaw scaled = ReadRawAxis(); //scaled values from compass.
float heading = atan2(scaled.YAxis, scaled.XAxis);

return heading;
}

MagnetometerRaw HMC5883L::ReadRawAxis()
{
uint8_t* buffer = Read(DataRegisterBegin, 6);
MagnetometerRaw raw = MagnetometerRaw();
raw.XAxis = (buffer[0] << 8) | buffer[1];
raw.ZAxis = (buffer[2] << 8) | buffer[3];
raw.YAxis = (buffer[4] << 8) | buffer[5];
return raw;
}

In the ino loop I have:

 //this call does approx 760 reads per second
 float heading = mag.GetHeading();

 //this approach does approx 900 reads per second
 MagnetometerRaw scaled = mag.ReadRawAxis(); //scaled values from compass.
 float heading = atan2(scaled.YAxis, scaled.XAxis);

Anything jumping out at you?

Thanks.

Moderator edit: [code] [/code] tags added.

sliccster:
Anything jumping out at you?

Create a test sketch that also uses the “ZAxis” in the loop of your ino sketch and try again!
Any differences?

If your ino test sketch uses only XAxis and YAxis and never ZAxis, then most likely this line will be removed by the optimizer:

raw.ZAxis = (buffer[2] << 8) | buffer[3];

As ZAxis is never accessed in your ino sketch, it is not needed for the program and the optimizer may remove all code from the sketch and the library that is related to ZAxis.

I don’t believe that atan2 causes different execution time.

So I reduced everything down, in the library I now have:

float HMC5883L::GetHeading()
{
  float heading = atan2(rand(), rand());

  return heading;
}

and in the loop I have:

  //this gives approx 8000 per sec
  float heading = atan2(rand(), rand());

  //this gives approx 3500 per sec
  float heading = mag.GetHeading();

sliccster: and in the loop I have:

  //this gives approx 8000 per sec
  float heading = atan2(rand(), rand());

  //this gives approx 3500 per sec
  float heading = mag.GetHeading();

Without looking at a complete source, I see:

You get 8000 per sec with three function calls: - rand() - rand() - atan2()

And you get 3500 per sec with four function calls: - rand() - rand() - atan2() - mag.GetHeading();

So the single function call "mag.GetHeading()" costs about as much time as rand(),rand(),atan2() together. I don't know why there is such a big difference. But perhaps the "heading" result is never ever used in the sketch, so the function calls of rand() and atan2() that are called from the ino-sketch can be eliminated from the executable.

For better comparison (and slower execution, less optimization possible) perhaps declare "heading" as a "volatile" variable, add up the results during timing and use the result.

  volatile float heading1 = 0, heading2 = 0; // make results "volatile" and start the sums at zero

  //use for timing this:
  heading1 += atan2(rand(), rand()); // add up during timing

  //and for the other timing that:
  heading2 += mag.GetHeading();  // add up during timing

How much difference will you see then?
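
A fuller version of that idea, as a rough sketch under some assumptions: it is plain desktop C++ using std::chrono for timing (on an Arduino you would take timestamps with millis() or micros() instead), getHeading is a hypothetical stand-in for the reduced HMC5883L::GetHeading(), and sink and callsPerSecond are names invented for the example. The point is only that every result is added to a volatile variable, so neither variant can be dropped as unused.

```cpp
#include <chrono>
#include <cmath>
#include <cstdlib>

// Hypothetical stand-in for the reduced HMC5883L::GetHeading().
float getHeading() {
    return static_cast<float>(std::atan2(std::rand(), std::rand()));
}

// Every result is accumulated into a volatile object, so the compiler
// has to perform each call and each store; the work is not dead code.
volatile float sink = 0.0f;

// Runs fn() `iterations` times and returns the measured calls per second.
template <typename Fn>
double callsPerSecond(Fn fn, long iterations) {
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        sink = sink + fn();  // read-modify-write on the volatile sink
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return iterations / elapsed.count();
}
```

It could be used like this, timing both variants with the same loop and comparing the two rates:

```cpp
double inlineRate = callsPerSecond(
    [] { return static_cast<float>(std::atan2(std::rand(), std::rand())); },
    100000);
double methodRate = callsPerSecond(getHeading, 100000);
```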

It looks like you are correct. I changed the code to print out the heading every 1000 readings, and now both methods give the same performance.

So as you say, the ino version was obviously being optimised out.

Thanks.