Fastest possible search in small arrays with different methods

Hello forum.

I have a small project with two arrays. Array no. 1 holds ascending sorted values; this table is the input data.
Array no. 2 is the output data. It contains other values that are not linear but correspond to the index positions of Array no. 1.
Unfortunately Array no. 2 cannot be calculated on the microprocessor, so I have to use arrays for the moment.
The code example below only shows test data, not real values. Worth mentioning: the arrays only hold around 60-80 values each.

So from Timer 1 I get a value; I find the first value in Array no. 1 that is equal or greater and save its position (index). Then I apply the output data from Array no. 2 at the corresponding index. Very easy… but there is a BUT: it has to go lightning fast. I need the result within 2-3 uS. Below I have made some tests, with different negative results.

1. A normal 'for' loop. The result is too slow for values at the end of the array.
Result: 4 uS at the beginning of the array, 24 uS in the middle, 44 uS at the end

int array [60] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int index;
int x=600;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
  {
  Serial.begin (115200);
  unsigned long start = micros ();
  for (index = 0; x > array[index]; index++){}
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(array[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
  }  
void loop () {}

2. Binary search implemented with a while loop. Strange result; I cannot really say if it is OK or not, and it may be that I have made a fault in the code. Result: end of array = 88 uS, middle of array = 4 uS, beginning of array = 72 uS

int first, last, middle;
int array [] = {
  10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
  250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
  470,480,490,500,510,520,530,540,550,560,570,580,590,600};
int search=600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup(){
  Serial.begin (115200);
  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  middle = (first+last)/2;
  while( first <= last )
  {
    if ( array[middle] < search )
      first = middle + 1;    
    else if ( array[middle] == search ) 
    {
      break;
    }
    else
      last = middle - 1;
    middle = (first + last)/2;
  }
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(search);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop (){
}

3. Some kind of prediction: look at the previous input value and jump into the array at the best possible starting location. It is expected that the sensor does not swing much in value between readings. This code is not written yet; I think this may be the only solution.

4. Using the STL binary search. Maybe better for big arrays, but one problem: std::binary_search only looks for an exact match, and its output is just TRUE or FALSE, so this method cannot be used directly. My attempt below gives a very long search time and is worthless, because I do not really understand how to use it. And yes, I have used Google to find code examples, but nothing usable for Arduino.

#include <iterator>
#include <algorithm>
const int NUMBERS = 60;

int array [] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int outdata [] = {
  11,21,31,41,51,61,71,81,91,111,111,121,131,141,151,161,171,181,191,211,211,221,231,241,
  251,261,271,281,291,311,311,321,331,341,351,361,371,381,391,411,411,421,431,441,451,461,
  471,481,491,511,511,521,531,541,551,561,571,581,591,611};

int index;
int x=150;
boolean z;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
{
  Serial.begin (115200);
  unsigned long start = micros ();

  for ( index=1 ; z == false ; index++ ){
    z = std::binary_search(array, &array[index], x);
  }
  index = index -2;
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(outdata[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop () {}
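For what it is worth, the STL function that actually matches this problem is std::lower_bound rather than std::binary_search: it returns an iterator to the first element not less than the search value, which is exactly the "equal or greater" index wanted here. A minimal sketch in plain C++ with shortened test data (assumes an `<algorithm>` implementation is available on the target, which stock AVR Arduino cores may lack):

```cpp
#include <algorithm>

int values[] = {10, 20, 30, 40, 50, 60};           // shortened test data
const int NV = sizeof values / sizeof values[0];

// Index of the first element >= x, in one O(log N) pass -- no outer loop.
// If x is larger than the last entry this returns NV, so clamp the result
// before indexing the output array.
int lowerIndex(int x) {
    return std::lower_bound(values, values + NV, x) - values;
}
```

For example, lowerIndex(45) gives 4 (the slot holding 50), and outdata[4] would then be the wanted output.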

5.
My humble question to you experts out there,
Is it practically possible to improve any of the above methods, or have I reached the limits of Arduino C++? If the limits are reached, can I improve with AVR code, meaning combine C and assembly? But that looks sooo darn difficult (see the avr-libc documentation on combining C and assembly source files). There is also the bsearch() function in avr-libc's <stdlib.h> (General utilities) that looks interesting, but I need examples.
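Since examples of bsearch() were asked for, here is a minimal sketch of how it is called (shortened, hypothetical test data). Note that it shares the exact-match limitation of std::binary_search: it returns NULL when the key is absent, so by itself it cannot deliver the "equal or greater" index this project needs:

```cpp
#include <stdlib.h>

int keys[] = {10, 20, 30, 40, 50, 60};             // shortened test data

// Comparator returning negative/zero/positive, as bsearch() expects.
static int cmpInt(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);                      // avoids overflow of x - y
}

// Index of an exact match, or -1 when the key is not in the table.
int bsearchIndex(int key) {
    int *p = (int *)bsearch(&key, keys, 6, sizeof keys[0], cmpInt);
    return p ? (int)(p - keys) : -1;
}
```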

Best Regards
Mikael

make the search variables of the type uint8_t

Time taken = Search Value :600
Time taken = 12 uS

looking for some other optimizations

refactored the loop (no big gain)

  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  while( first <= last )
  {
    middle = (first + last)/2;
    if ( array[middle] == search ) break;
    if ( array[middle] < search )
      first = middle+1;    
    else
      last = middle;
  }
  unsigned long finish = micros ();

A better test, as it tests all 60 values:

uint8_t first, last, middle;

int array [] = {
  10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
  250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
  470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int search = 600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup()
{
  Serial.begin (115200);
  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {
    first = 0;
    last = 60 - 1;

    while( first <= last )
    {
      middle = (first + last)/2;
      if ( array[middle] == search ) break;
      if ( array[middle] < search )
        first = middle+1;    
      else
        last = middle;
    }
  }
  unsigned long finish = micros ();

  Serial.print ("Time taken = ");
  Serial.print("Search Value :");
  Serial.println(search);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop (){
}

output

Time taken = Search Value :600
Time taken = 560 uS

560 / 60 values = 9.33 uSec on average

  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {
    first = 0;
    last = 60 - 1;

    while( first <= last )
    {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
        first = middle+1;    
      else
        last = middle-1;
    }
  }
  unsigned long finish = micros ();

tiny bit faster

Time taken = Search Value :600
Time taken = 548 uS

9.14 usec on average

Sedgewick's "Algorithms in C" is the bible here. Binary search should only take about log2(N) + 1 comparisons.

You might try a tree approach with left and right branches, with each node a struct with the value of the second array. That way, when you find your data, you have what you need.
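A node for such a tree might look like the sketch below (names are illustrative). For a fixed sorted array a balanced tree needs the same ~log N comparisons as binary search, but the matching output travels with the key, so no second array lookup is needed:

```cpp
// Each node pairs a key from array no. 1 with its output from array no. 2.
struct Node {
    int key;                  // search value
    int out;                  // corresponding output value
    Node *left, *right;
};

// Standard BST walk: output for an exact key match, or fallback if absent.
int treeFind(const Node *n, int x, int fallback) {
    while (n) {
        if (x == n->key) return n->out;
        n = (x < n->key) ? n->left : n->right;
    }
    return fallback;
}

// Tiny example tree: 20 at the root, 10 and 30 as leaves.
Node leafL = {10, 11, nullptr, nullptr};
Node leafR = {30, 31, nullptr, nullptr};
Node root  = {20, 21, &leafL, &leafR};
```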

need the result within 2-3 uSec. (sic)

With a 16 MHz processor that gives you at best 48 instructions, and that is only if they are all single-cycle instructions. A tall order, and one you will never manage with a search.

However, you may be able to reformat your data. Your problem is to find a match for an input in array 1 and then look up in array 2 the output you need. Is this correct?

If so, and if the input has a limited set of values (say fewer than 1k), then create an array with one entry for each possible input, each entry holding the correct output. Now just index the array with your input and the job's done. You trade space for speed, instead of, as you have done, speed for space.

The array can be placed in code (flash) space so as not to use up your RAM.
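A sketch of that trade with the thread's shortened test data (the raw input range 0..60 is an assumption): run the slow search once per possible input at startup, and the run-time lookup collapses to a single indexed load.

```cpp
const int N = 6;                                   // shortened test data
int inArr[N]  = {10, 20, 30, 40, 50, 60};
int outArr[N] = {11, 21, 31, 41, 51, 61};

const int MAX_IN = 60;                             // assumed input range 0..60
int lut[MAX_IN + 1];                               // one slot per possible input

// Run the slow "equal or greater" search once per possible input, at startup.
void buildLut() {
    for (int v = 0; v <= MAX_IN; v++) {
        int i = 0;
        while (i < N - 1 && inArr[i] < v) i++;     // first entry >= v
        lut[v] = outArr[i];
    }
}

// At run time the whole search is one array access.
int lutLookup(int v) { return lut[v]; }
```

On an AVR a constant table like this would normally go in flash with PROGMEM, as suggested, at the cost of a pgm_read per lookup.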

Mark

PS: If you wanted to look up the phone number of someone whose name started with F, would you go to the middle of the directory, or would you start about a quarter of the way in?

M

For completeness, a linear search (smallest footprint):

  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {   
    for (uint8_t i=0; i< 60; i++)
    {
      if (array[i] == search) break;
    }
  }
  unsigned long finish = micros ();

Time taken = Search Value :600
Time taken = 1572 uS ====> 26.2 uSec on average

last idea

    do {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
      {
        first = middle+1;    
        continue;
      }
      else
        last = middle-1;
    } while( first <= last );

A bit smaller by doing the loop differently (one compare less per search):

Time taken = Search Value :600
Time taken = 536 uS  ===> 8.93 uSec per number on average

Unfortunately Array no#2 is not possible to calculate with the microprocessor

Why not? If that outdata array contains your actual values it doesn't look particularly hard.

M_Sander: I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.

As mentioned by Holmes4, that is almost a textbook description of a lookup table. I do not know if there is an accepted name for that design pattern, but the hash table is very similar. That is going to be the fastest search, but it does require all possible inputs to be available and to have some mathematical relationship to the array indices, so you can calculate the index directly from the input.

A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.

http://en.wikipedia.org/wiki/Hash_table

wildbill: From the OP:

The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

You might try a large switch/case. If it compiles to a jump table it could be pretty fast.

You might try precalculating the indexes for the binary search and using an if/else chain to figure out the starting point (or use the top 3 bits of the input to index into a table).

PS: If you wanted to look up the phone number of someone whose name started with F, would you go to the middle of the directory, or would you start about a quarter of the way in?

That is a weighted binary search; it uses knowledge about the distribution of values in the array, or about how frequently the searched numbers occur. Note that a plain binary search reaches the ~1/4 point in just two steps anyway.

As the array is linear, the value X lies around index = map(X, min, max, first, last) - 1 … map(X, min, max, first, last) + 1. Problem: map() takes an expensive division.

Another strategy that might be fast is a skip-list-like approach: step linearly through the list in strides of sqrt(size); when you overshoot, go back one stride and continue with stepsize = stepsize / 2.
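That skip strategy, sketched as a simple jump search (plain C++, shortened test data; a stride of 4 ≈ √12 here, and a power of two such as 8 would suit a 60-entry table without needing division):

```cpp
int sorted[] = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120};
const int NS = sizeof sorted / sizeof sorted[0];

// Stride through the sorted array in coarse steps, then finish with a short
// linear scan.  Returns the index of the first element >= x, clamped to the
// last index when x is beyond the end.
int jumpSearch(int x) {
    const int step = 4;                            // ~sqrt(NS), power of two
    int i = 0;
    while (i + step < NS && sorted[i + step] < x) i += step;
    while (i < NS - 1 && sorted[i] < x) i++;
    return i;
}
```

The halving refinement described above would shrink the stride on overshoot instead of scanning linearly; the sketch keeps the simple variant.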

Another strategy that might help is to cache the last value searched and its index, and use those to get tighter upper/lower limits. In the worst case it is as fast as binary search (when the cached value is the middle one).

finally check - http://playground.arduino.cc/Main/MultiMap - :)

I'm really, really going to regret this, but you could speed things up by using pointer arithmetic. For example:

addr = &array1[n] gets you the address of the n'th element of the array. Then instead of

n++;
array[n];

you use addr = addr + sizeof(the array's type); to get the next element much faster!

Why? Because to get the n'th element of a 1-D array, the compiler takes the base address of the array and adds n * sizeof(arraytype). With pointer arithmetic you only do an addition to get the next element.

Mark

As the array is linear, the value X lies around index = map(X, min, max, first, last) - 1 … map(X, min, max, first, last) + 1;
problem: this takes an expensive division.

No need for all the multiplication and division of the map() function. Just use (pseudo code):

if input < 0.5 of its max then start 1/4 of the way in else start 3/4 of the way in

and calculate 0.5 of the max input once in setup().

But yes it does assume some understanding of the data.

Mark

KeithRB: wildbill: From the OP:

The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

Oops, missed that. I'd still like to see the real data though.

M_Sander: I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position. Unfortunately Array no#2 is not possible to calculate with the microprocessor, so I have to use arrays for the moment. The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

What is the relative importance of speed versus accuracy? Are you prepared to compromise accuracy at all to improve speed, or vice versa?

Can you post your actual input and output data?

holmes4: If addr is the correct type, the sizeof() is not required and will in fact cause problems. The compiler knows that pointer arithmetic is scaled to the size of the element, so if you add 2 to a pointer it will point two objects further along, whether it is a byte *, long * or float *.
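To illustrate that correction: pointer arithmetic on a typed pointer is already scaled by the element size, so no manual sizeof() is needed (a minimal sketch):

```cpp
int nums[] = {10, 20, 30, 40};

// With int *p, p += 2 advances by two elements (2 * sizeof(int) bytes);
// adding sizeof() by hand would skip too far.
int thirdElement() {
    int *p = nums;   // &nums[0]
    p += 2;          // now &nums[2]
    return *p;
}
```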

This MUST be done with hash tables to achieve the fastest possible lookup.
Since you listed steps of 10 in array1, we assume the real values follow that pattern.

#include <math.h>
#include <hashmap.h>

... // some initialization going on here

int key, val;

// get the integer index, which is the closest. 0 will be mapped to 10
key = round(timer1/10) * 10 + 10 ;

if(hashMap.hasKey(key))
   val = hashMap.getValue(key);
else if (key > 600)   // cover the cases key > 600 and key < 0
   val = hashMap.getValue(600);
else
   val = hashMap.getValue(10);

That should work fastest.

smoes: This MUST be done with HashTables to achieve the fastest way possible. Since you listed steps of 10 in array1, we assume them to be like that.

That's pretty much what I thought too. There is probably a mathematical relationship that can be used here, but the OP is being kinda secretive about what the data really is.

If you could tell us what this data represents in the real world, and what real world problem you're solving, we could surely tell you the best way. We need to know more about the data, to see if there's some property we can take advantage of.

First of all, is the array search really in the critical path? Or are you doing optimization on what you think is the critical path?

If you have something that is rather time critical, have you considered a faster processor? AVRs are rather slow. It might be simpler in the end to upgrade to an embedded ARM processor than to keep trying to optimize the AVR. Above the embedded ARM processors, which typically run under 100 MHz, you have the Linux-based processors, which run even faster, but they consume a lot more power and are not as well suited to embedded work.

Hashes, splay trees, tries, etc. typically only pay off when you have thousands of entries. In addition, if the hash uses division, it will be even slower on an AVR.

For 60 or so entries in a sorted array, just do a binary search. For fewer entries it isn't worth the trouble to keep the array sorted; just use a linear search.