Go Down

Topic: A faster sin() lookup function. (Read 5 times) previous topic - next topic

robtillaart

Aug 18, 2011, 01:49 am Last Edit: Aug 20, 2011, 05:58 pm by robtillaart Reason: 1
Today I did some investigations in a lookup function for sin(x).  Sin(x) takes approx 120 micros and I wanted more speed, and I knew the price was precision.

As I worked only in whole degrees a lookup table was the way to go. A straightforward float[361]  (1444 bytes!) was not acceptable so folding was applied, and a mapping to bring the table back to uint8_t[91].

The code shows two tables: a 16bit and an 8bit table,
The function float isin(long x) is the core, it folds the angle, does a lookup and convert that to a float.   // isin() stands for integer_sinus()
There are straightforward implementations of icos(x) and itan(x).

The function float fsin(float x) is an interpolation function, that runs on top of isin(),  

Code: [Select]

// 91 x 2 bytes ==> 182 bytes
unsigned int isinTable16[] = {
 0, 1144, 2287, 3430, 4571, 5712, 6850, 7987, 9121, 10252, 11380,
 12505, 13625, 14742, 15854, 16962, 18064, 19161, 20251, 21336, 22414,
 23486, 24550, 25607, 26655, 27696, 28729, 29752, 30767, 31772, 32768,

 33753, 34728, 35693, 36647, 37589, 38521, 39440, 40347, 41243, 42125,
 42995, 43851, 44695, 45524, 46340, 47142, 47929, 48702, 49460, 50203,
 50930, 51642, 52339, 53019, 53683, 54331, 54962, 55577, 56174, 56755,

 57318, 57864, 58392, 58902, 59395, 59869, 60325, 60763, 61182, 61583,
 61965, 62327, 62671, 62996, 63302, 63588, 63855, 64103, 64331, 64539,
 64728, 64897, 65047, 65176, 65286, 65375, 65445, 65495, 65525, 65535,
 };

// 91 bytes
uint8_t isinTable8[] = {
 0, 4, 9, 13, 18, 22, 27, 31, 35, 40, 44,
 49, 53, 57, 62, 66, 70, 75, 79, 83, 87,
 91, 96, 100, 104, 108, 112, 116, 120, 124, 128,

 131, 135, 139, 143, 146, 150, 153, 157, 160, 164,
 167, 171, 174, 177, 180, 183, 186, 190, 192, 195,
 198, 201, 204, 206, 209, 211, 214, 216, 219, 221,

 223, 225, 227, 229, 231, 233, 235, 236, 238, 240,
 241, 243, 244, 245, 246, 247, 248, 249, 250, 251,
 252, 253, 253, 254, 254, 254, 255, 255, 255, 255,
 };


float isin(long x)
{
 boolean pos = true;  // positive - keeps an eye on the sign.
 if (x < 0)
 {
   x = -x;
   pos = !pos;  
 }  
 if (x >= 360) x %= 360;  
 if (x > 180)
 {
   x -= 180;
   pos = !pos;
 }
 if (x > 90) x = 180 - x;
//  if (pos) return isinTable8[x] * 0.003921568627; // = /255.0
//  return isinTable8[x] * -0.003921568627 ;
 if (pos) return isinTable16[x] * 0.0000152590219; // = /65535.0
 return isinTable16[x] * -0.0000152590219 ;
}

float icos(long x)
{
 return isin(x+90);
}

float itan(long x)
{
 return isin(x) / icos(x);
}

float fsin(float d)
{
 float a = isin(d);
 float b = isin(d+1);
 return a + (d-int(d)) * (b-a);
}

Did some first tests and isin(x) is approx 6x faster than the original sin(x) for both tables. The 16 bit table is more precise with a max error of 0.000008 compared to sin(x); The 8bit Table scores a max error of 0.0002. [whole angles only]

I think it can be done better still, so all remarks, improvements or ideas are - as allways - welcome.

Rob
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Anachrocomputer

Did you consider putting the big arrays into program memory and reading them back with 'pgm_read_word_near()'?  I've done this sort of thing with audio samples for a MIDI synthesiser on the Arduino.

robtillaart

Yes, considered that, but I keep it as a final optimization, as the RAM is just used elsewhere.

What I'm looking for is smart tricks that use only a few statements that will fold the table again, or increases the precision with the same amount of data.

I've considered storing only the mantisse of floats and add the exponent later. That would be 3 bytes per value, but that did not work as the exponents differ.

urrently I'm thinking what can be done with

sin(2x) = 2sin(x)cos(x)
If I have a table for 0..45, I could generate 46..90 (at least the even ones directly, and the odd by interpolation) but there is a recursion in it.

Next trick to be investigated is that the the steepness around x   ==>   sin(90-x)   (use radians for this)
So the table 0-45 holds enough values for the whole range.




Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

CrossRoads

heck, just approximate the whole thing with a triangle wave, so y is just a linear version of x ;)
Designing & building electrical circuits for over 25 years. Check out the ATMega1284P based Bobuino and other '328P & '1284P creations & offerings at  www.crossroadsfencing.com/BobuinoRev17.
Arduino for Teens available at Amazon.com.

robtillaart

afraid that the error margin is a bit too high, but performance will be flabbergasting ;)
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Go Up