Go Down

Topic: Fastest possible search in small arrays with different methods (Read 3 times) previous topic - next topic

M_Sander

Hello forum.

I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data
Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.
Unfortunately Array no#2 is not possible to calculate with the microprocessor, so I have to use arrays for the moment.
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

So by timer 1, I find a value that is equal or greater in the Array no#1, we save the position (index) of found value. Then apply the output data from Array #2 with the corresponding index found in Array no#1. Very easy… but there is a BUT, it has to go lightning fast. need the result within 2-3 uSec. Below I have made some tests, with different negative results

1. I have following result normal 'for loop'. Result to slow for values in the end of the array
Result: // 4uS beginning of array,  24uS middle, 44uS end of array
Code: [Select]
int array [60] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int index;
int x=600;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
  {
  Serial.begin (115200);
  unsigned long start = micros ();
  for (index = 0; x > array[index]; index++){}
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(array[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
  } 
void loop () {}


2. Binary search made with a while for loop, strange result, cannot really say if it is ok or not. It can be so that I have made fault in the code., Result: end of array = 88 uS, middle of array = 4 uS, beginning of array= 72 uS

Code: [Select]
int first, last, middle;
int array [] = {
  10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
  250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
  470,480,490,500,510,520,530,540,550,560,570,580,590,600};
int search=600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup(){
  Serial.begin (115200);
  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  middle = (first+last)/2;
  while( first <= last )
  {
    if ( array[middle] < search )
      first = middle + 1;   
    else if ( array[middle] == search )
    {
      break;
    }
    else
      last = middle - 1;
    middle = (first + last)/2;
  }
  unsigned long finish = micros ();
  Serial.print ("Time taken = ");
  Serial.print("Search Value :");
  Serial.println(search);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop (){
}


3. some kind of prediction, looking at previous input data value and then jump into the array at best possible location. It is expected that the sensor do not swing in value so much. This code is not written yet, I think this is the only solution. 

4. using STL Binary Search. Maybe better for big arrays but on problem with binary search only looking for exact match, and output is either TRUE or FALSE. So this method cannot be used, very long search result worthless because i do not really understand how to use it, and yes I have used the "google" to find some code example, no way for Arduino.
Code: [Select]
#include <iterator>
#include <algorithm>
const int NUMBERS = 60;

int array [] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int outdata [] = {
  11,21,31,41,51,61,71,81,91,111,111,121,131,141,151,161,171,181,191,211,211,221,231,241,
  251,261,271,281,291,311,311,321,331,341,351,361,371,381,391,411,411,421,431,441,451,461,
  471,481,491,511,511,521,531,541,551,561,571,581,591,611};

int index;
int x=150;
boolean z;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
{
  Serial.begin (115200);
      unsigned long start = micros ();

  for ( index=1 ; z == false ; index++ ){
    z = std::binary_search(array, &array[index], x);
  }
  index = index -2;
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(outdata[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop () {}


5
My humble question to you experts out there,
Is it practically possible to improve any of the above methods? Or have I reached the limits of Arduino C++. If limits reached, can I improve with AVR code, meaning combine C and assembly? But it looks sooooo darn difficult http://www.nongnu.org/avr-libc/user-manual/group__asmdemo.html. There is a function with the bsearch http://www.nongnu.org/avr-libc/user-manual/group__avr__stdlib.html looks interesting but need examples.

Best Regards
Mikael

robtillaart

#1
Dec 17, 2013, 09:40 pm Last Edit: Dec 17, 2013, 09:50 pm by robtillaart Reason: 1
make the search variables of the type uint8_t
Code: [Select]

Time taken = Search Value :600
Time taken = 12 uS

looking for some other optimizations

refactored the loop (no big gain)
Code: [Select]

  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  while( first <= last )
  {
    middle = (first + last)/2;
    if ( array[middle] == search ) break;
    if ( array[middle] < search )
      first = middle+1;   
    else
      last = middle;
  }
  unsigned long finish = micros ();
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

robtillaart

#2
Dec 17, 2013, 10:10 pm Last Edit: Dec 17, 2013, 10:33 pm by robtillaart Reason: 1
a better test as it test all 60 values.
Code: (snippet) [Select]
uint8_t first, last, middle;

int array [] = {
 10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
 250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
 470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int search = 600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup()
{
 Serial.begin (115200);
 unsigned long start = micros ();
 for (search = 10; search < 600; search +=10)
 {
   first = 0;
   last = 60 - 1;

   while( first <= last )
   {
     middle = (first + last)/2;
     if ( array[middle] == search ) break;
     if ( array[middle] < search )
       first = middle+1;    
     else
       last = middle;
   }
 }
 unsigned long finish = micros ();

 Serial.print ("Time taken = ");
 Serial.print("Search Value :");
 Serial.println(search);
 longest = max (longest, finish - start);
 Serial.print ("Time taken = ");
 Serial.print (longest);
 Serial.println (" uS");
}
void loop (){
}


output
Code: [Select]

Time taken = Search Value :600
Time taken = 560 uS

560 / 60 values = 9.33 uSec on average
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

robtillaart

Code: [Select]
  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {
    first = 0;
    last = 60 - 1;

    while( first <= last )
    {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
        first = middle+1;   
      else
        last = middle-1;
    }
  }
  unsigned long finish = micros ();


tiny bit faster
Code: [Select]
Time taken = Search Value :600
Time taken = 548 uS

9.14 usec on average
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

KeithRB

Sedgwick "Algorithms in C" is the the bible here. Binary search should only take log(N) + 1 searches.

You might try a tree approach with left and right branches, with each node a struct with the value of the second array. That way, when you find your data, you have what you need.

holmes4

Quote
need the result within 2-3 uSec. (sic)
With a 16MHz processor that gives you at best 48 instructions and that's if there all instructions that can be executed in on instruction. A tall order and one you will never do with a search.

However you may be able to reformat your data. Your problem is to find a match for an input in array 1 and then lookup in array 2 the output you need. IS this correct?

If so and if the input has a limited set of values (say less than 1k) then create an array with one entry for each possible input. In that entry is the correct output now just index the array with your input and the jobs done.  
You trade space for speed not as you have done speed for space.

The array can be place in code space so as not to use up your ram.

Mark

PS If you wanted to look up the phone number of some one whose name started with F by going to the middle of the directory or would you start about a 1/4 of the way in.

M

robtillaart

for completeness linear search  (smallest footprint)
Code: [Select]

 unsigned long start = micros ();
 for (search = 10; search < 600; search +=10)
 {  
   for (uint8_t i=0; i< 60; i++)
   {
     if (array[i] == search) break;
   }
 }
 unsigned long finish = micros ();


Time taken = Search Value :600
Time taken = 1572 uS  ====> 16.2 uSec on average
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

robtillaart

last idea
Code: [Select]

    do {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
      {
        first = middle+1;   
        continue;
      }
      else
        last = middle-1;
    } while( first <= last );


bit smaller by doing the loop different (one compare less per search)
Code: [Select]
Time taken = Search Value :600
Time taken = 536 uS  ===> 8.93 uSec per number on average

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

wildbill

Quote
Unfortunately Array no#2 is not possible to calculate with the microprocessor


Why not? If that outdata array contains your actual values it doesn't look particularly hard.

jasmine2501


I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data
Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.


As mentioned by Holmes4, that is almost a textbook description of a lookup table. I do not know if there is an accepted name for that design pattern, but the hash table is very similar. That's going to be the fastest search, but it does require all possible inputs to be available, and have some sort of mathematical relationship to the array indices, so you can calculate array indices directly from the input.

Quote
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.

http://en.wikipedia.org/wiki/Hash_table

KeithRB

wildbill:
From the OP:
Quote
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.


You might try a large switch/Case. If it implements as a jump table it could be pretty fast.

You might try precalculating the indexes for the binary search and use an if/else chain to figure out the starting point for binary search.  (Or use the top 3 bits to index into a table).

robtillaart

Quote
PS If you wanted to look up the phone number of some one whose name started with F by going to the middle of the directory or would you start about a 1/4 of the way in.


That is  a weighted binary search, and it uses knowledge about the values in the array,
or the frequency the searched numbers occur.
Calculating that the first step should be ~1/4 takes just 2 binary search steps.

As the array is linear, the value X is  around index = map(X, min,  max, first, last) - 1  ....   map(X, min,  max, first, last) +1;
problem this takes an expensive division..

Another strategy that might be fast is using "skiplist strategy", go linear through the list is steps of sqrt(size).
if bigger go back one step and start using stepsize = stepsize / 2;

Another strategy that might help is the "cache" the last value searched  and its index and use that to get a better upper/lower limit.
in worst case it is as fast as binary search as the cached value is the middle one.

finally check - http://playground.arduino.cc/Main/MultiMap - :)
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

holmes4

I'm really, really going to regret this but you could speeding things up be using -- pointer Arithmetic for example

Code: [Select]
addr=&array1[n] gets you the address of the n'th element of the array then instead of
Code: [Select]
n++;
array[n]
you use
Code: [Select]
addr=addr+ sizeof(the arrays type); to get the next element much faster!

Why because to get the n'th element in a one D  array we take the base address of the array and add n* sizeof(arraytype).
Use pointer arithmetic and you only do an addition to get the next element.

Mark

holmes4

Code: [Select]
As the array is linear, the value X is  around index = map(X, min,  max, first, last) - 1  ....   map(X, min,  max, first, last) +1;
problem this takes an expensive division..


No need for all the multplication and division of the map function. Just use (pseudo code)
Code: [Select]

if input < 0.5 of its max then start 1/4 of the way in else start 3/4 of the way in
and calc 0.5 of max input in setup().

But yes it does assume some understanding of the data.

Mark

wildbill


wildbill:
From the OP:
Quote
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.



Oops, missed that. I'd still like to see the real data though.

Go Up