Fast fixed-point asin(). Can I improve this further? Tips? Other approaches?

Hi,

I am creating a library of fast fixed-point trigonometric functions that I will use quite often later on in my project. My plan is to work with angles with 0.01º resolution using variables of type uint16.

My approach is to have a big look-up-table (9000 uint16_t elements, one for each possible angle between 0 and 90º). With this I successfully coded fast cos() and sin() that require just a table look-up.

I wanted to use this LUT for the asin() and acos() as well, to save memory. My idea is to do binary search on the LUT to get the closest index, which would be directly the desired angle. It would take ceil(log2(9000)) = 14 reads from the LUT to get the result.

The test code looks like this:

#include <avr/pgmspace.h>
#include "LUT_sin.h"

unsigned long t1, t2;
volatile float key = 0.5f, dummy = 0.0f, resultStd;
volatile uint16_t idxMin, idxMax, idxMid;
volatile uint16_t result;

void setup() {
  Serial.begin(115200);
}

void loop() {
  idxMin = 0;
  idxMax = LUT_SIN_TABLE_SIZE - 1;
  key = key + dummy; 
  uint16_t keyU = (uint16_t)(key * 65535); // Scale to match LUT
  
  // Fast asin() using binary search  
  t1 = micros();
 
  while( idxMin <= idxMax)  
  {
      idxMid = (idxMax + idxMin)/2;
      
      if(pgm_read_word_near(LUT_sin + idxMid) > keyU)
        idxMax = idxMid - 1;
      else
        idxMin = idxMid + 1;
      asm("");
  }  
  result = idxMid;
      
  t2 = micros();
  
  Serial.print("Time fast asin: "); Serial.println((float)(t2 - t1));

  t1 = micros();
    resultStd = asin(key);
  t2 = micros();
  Serial.print("Time std asin: "); Serial.println((float)(t2 - t1));
  
  delay(1000);

}

The results are:

Time fast asin: 48
Time std asin: 180

It’s quite good, but I was expecting it to be even smaller, since the pgm_read_word takes around 500 ns and everything else is just sums, comparisons and assignments of uint16_t.

Do you think this can be pushed further or did I reach the limit? I can’t go to uint8_t since that’s too poor resolution. Are there better approaches to computing the asin() function?

Thanks!

Could you show us the "LUT_sin.h" ? Your table has the sine for a certain angle. So you search the sine, and the index is the angle. Why not use a table that contains the angle for a certain sine ?

When you have many tables, it will no longer fit in an Arduino Uno. With the Arduino Mega 2560, you probably can't use the pgm...near functions, but you have to use pgm...far, if the PROGMEM size is over 64kbyte. That will make it a little slower.

Using fixed floating point and sine functions is a subject that has been extensively explored. Could you use an existing library ? https://en.wikibooks.org/wiki/Floating_Point/Fixed-Point_Numbers#Values_stored_in_sine_table http://www.mathworks.com/help/fixedpoint/examples/calculate-fixed-point-sine-and-cosine.html http://liballeg.org/stabledocs/en/alleg032.html

check my sinus investigation here - http://forum.arduino.cc/index.php?topic=69723.0 -

LUT_sin.h just contains the table in PROGMEM, where for each angle I compute the sin and multiply by 0xFFFF. Something like:

const uint16_t LUT_sin [9001] PROGMEM = {0, 11, 23, 34, 46, 57, 69, 80, 92, 103, 114, 126, 137, 149, 160, 172, 183, 194, 206,...};

The plan is to put it in an Atmega2560 so I have enough space for it, there shouldn't be a problem.

I did think about having another LUT doing the opposite: for each sine I compute the angle. However due to the non-linearities of asin() I would need many more points in the regions close to 90º to keep achieving an accuracy of 0.01º while still using linear indexing. Otherwise I would have to compute the index using some complex function and that would take more time. That's why I decided to do reverse look-up.

As said I'm quite happy with my cos() and sin() implementations, it takes practically no time. However the links about CORDIC and stuff are very useful!

I guess for asin() I will have to create multiple LUTs with different resolution so that it can accurately handle all inputs between 0 and 1. Hope it fits in memory!

Rob Tillaart and I found a way to get sin() to 9 (yes, nine) decimal places, and it seemed to run reasonably fast for the amount of precision. http://forum.arduino.cc/index.php?topic=339402.30 I know that you want asin() and not sin(), but you might still be able to get some useful ideas from what he and I made.