Mathematical performance optimization with chipKIT uC32

Hi there!

I am doing a project related to real-time feedback.

I need to acquire and analyze the data in a few tens of microseconds. However, using the built-in mathematical operations, it takes too long.

I am not going to post all my code because of its length, just the part that I need to optimize.

int ADC1After[] ;

float  A_Fourier;
float  B_Fourier;
float Amplitude;
float Phase;

    A_Fourier=2.0/3.0*(ADC1After[0]*cos(0.0*PI/3.0)+ADC1After[1]*cos(2.0*PI/3.0)+ADC1After[2]*cos(4.0*PI/3.0));

    B_Fourier=2.0/3.0*(ADC1After[0]*sin(0.0*PI/3.0)+ADC1After[1]*sin(2.0*PI/3.0)+ADC1After[2]*sin(4.0*PI/3.0));

    Amplitude=sqrt(pow(A_Fourier,2)+pow(B_Fourier,2))/1023*3.3;

    Phase=atan2(B_Fourier,A_Fourier) * 180 / PI;

The ADC1After vector is a data vector I acquired previously (in uint32_t datatype).

As you can see, it involves sums, multiplications, divisions, sqrt, pow, and trigonometric functions.

These 4 lines of code take 120 microseconds in total (I estimated it with an oscilloscope). Breaking it down, each line of code takes:
1st line (10 us), 2nd line (10 us), 3rd line (25 us), and 4th line (75 us).

As expected, the phase takes the longest, as it involves a division and a trigonometric function.

My objective is to reduce the total computation time to 25-30 us.

After some research I found that in general there are methods to replace division with multiplication and shifts, and trig functions with Taylor expansions. However, I don't think I can use Taylor, as my phase can take values from 0 to 2pi.

Any suggestions?

I have also read that programming in assembly language can improve performance. However, as it is not an easy task, I'll try to avoid it if possible.

Division is costly; precalculate all terms like 2.0/3.0 and store them.

cos(0.0*PI/3.0) is one and sin of the same term is zero, so neither needs to be computed at run time.

pow is expensive too; just multiply the number by itself instead.

Consider precalculating tables for the trig functions and storing the data in an array, in progmem if necessary. Create some accessor functions, e.g. myCos, to take care of the table lookup. It may be faster than the built-ins.

Forget assembler; the gcc compiler is likely doing just as well as you could manage by hand.

Finally, remember the most effective way to get better performance out of your code: buy faster hardware :wink: