Pages: [1] 2 3 ... 5   Go Down
Author Topic: Fastest possible search in small arrays with different methods  (Read 3392 times)
0 Members and 1 Guest are viewing this topic.
Sweden
Offline Offline
Newbie
*
Karma: 0
Posts: 11
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Hello forum.
 
I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data
Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.
Unfortunately Array no#2 is not possible to calculate with the microprocessor, so I have to use arrays for the moment.
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.
 
So by timer 1, I find a value that is equal or greater in the Array no#1, we save the position (index) of found value. Then apply the output data from Array #2 with the corresponding index found in Array no#1. Very easy… but there is a BUT, it has to go lightning fast. need the result within 2-3 uSec. Below I have made some tests, with different negative results
 
1. I have following result normal ‘for loop’. Result to slow for values in the end of the array
Result: // 4uS beginning of array,  24uS middle, 44uS end of array
Code:
int array [60] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int index;
int x=600;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
  {
  Serial.begin (115200);
  unsigned long start = micros ();
  for (index = 0; x > array[index]; index++){}
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(array[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
  } 
void loop () {}

2. Binary search made with a while for loop, strange result, cannot really say if it is ok or not. It can be so that I have made fault in the code., Result: end of array = 88 uS, middle of array = 4 uS, beginning of array= 72 uS

Code:
int first, last, middle;
int array [] = {
  10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
  250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
  470,480,490,500,510,520,530,540,550,560,570,580,590,600};
int search=600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup(){
  Serial.begin (115200);
  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  middle = (first+last)/2;
  while( first <= last )
  {
    if ( array[middle] < search )
      first = middle + 1;   
    else if ( array[middle] == search )
    {
      break;
    }
    else
      last = middle - 1;
    middle = (first + last)/2;
  }
  unsigned long finish = micros ();
  Serial.print ("Time taken = ");
  Serial.print("Search Value :");
  Serial.println(search);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop (){
}

3. some kind of prediction, looking at previous input data value and then jump into the array at best possible location. It is expected that the sensor do not swing in value so much. This code is not written yet, I think this is the only solution. 
 
4. using STL Binary Search. Maybe better for big arrays but on problem with binary search only looking for exact match, and output is either TRUE or FALSE. So this method cannot be used, very long search result worthless because i do not really understand how to use it, and yes I have used the "google" to find some code example, no way for Arduino.
Code:
#include <iterator>
#include <algorithm>
const int NUMBERS = 60;

int array [] = {
10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int outdata [] = {
  11,21,31,41,51,61,71,81,91,111,111,121,131,141,151,161,171,181,191,211,211,221,231,241,
  251,261,271,281,291,311,311,321,331,341,351,361,371,381,391,411,411,421,431,441,451,461,
  471,481,491,511,511,521,531,541,551,561,571,581,591,611};

int index;
int x=150;
boolean z;
unsigned long start = 0;
unsigned long longest = 0;

void setup ()
{
  Serial.begin (115200);
      unsigned long start = micros ();

  for ( index=1 ; z == false ; index++ ){
    z = std::binary_search(array, &array[index], x);
  }
  index = index -2;
  unsigned long finish = micros ();
  Serial.print("Search Value :");
  Serial.println(x);
  Serial.print("Index Value :");
  Serial.println(index);
  Serial.print("Array Value Found :");
  Serial.println(outdata[index]);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop () {}

5
My humble question to you experts out there,
Is it practically possible to improve any of the above methods? Or have I reached the limits of Arduino C++. If limits reached, can I improve with AVR code, meaning combine C and assembly? But it looks sooooo darn difficult http://www.nongnu.org/avr-libc/user-manual/group__asmdemo.html. There is a function with the bsearch http://www.nongnu.org/avr-libc/user-manual/group__avr__stdlib.html looks interesting but need examples.
 
Best Regards
Mikael
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

make the search variables of the type uint8_t
Code:
Time taken = Search Value :600
Time taken = 12 uS
looking for some other optimizations

refactored the loop (no big gain)
Code:
  unsigned long start = micros ();
  first = 0;
  last = 60 - 1;
  while( first <= last )
  {
    middle = (first + last)/2;
    if ( array[middle] == search ) break;
    if ( array[middle] < search )
      first = middle+1;   
    else
      last = middle;
  }
  unsigned long finish = micros ();
« Last Edit: December 17, 2013, 03:50:41 pm by robtillaart » Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

a better test as it test all 60 values.
Code: (snippet)
uint8_t first, last, middle;

int array [] = {
  10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,
  250,260,270,280,290,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,
  470,480,490,500,510,520,530,540,550,560,570,580,590,600};

int search = 600;    // search value
unsigned long finish = 0;
unsigned long start = 0;
unsigned long longest = 0;

void setup()
{
  Serial.begin (115200);
  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {
    first = 0;
    last = 60 - 1;

    while( first <= last )
    {
      middle = (first + last)/2;
      if ( array[middle] == search ) break;
      if ( array[middle] < search )
        first = middle+1;    
      else
        last = middle;
    }
  }
  unsigned long finish = micros ();

  Serial.print ("Time taken = ");
  Serial.print("Search Value :");
  Serial.println(search);
  longest = max (longest, finish - start);
  Serial.print ("Time taken = ");
  Serial.print (longest);
  Serial.println (" uS");
}
void loop (){
}

output
Code:
Time taken = Search Value :600
Time taken = 560 uS
560 / 60 values = 9.33 uSec on average
« Last Edit: December 17, 2013, 04:33:23 pm by robtillaart » Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
  unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {
    first = 0;
    last = 60 - 1;

    while( first <= last )
    {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
        first = middle+1;   
      else
        last = middle-1;
    }
  }
  unsigned long finish = micros ();

tiny bit faster
Code:
Time taken = Search Value :600
Time taken = 548 uS
9.14 usec on average
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Offline Offline
Edison Member
*
Karma: 33
Posts: 1447
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Sedgwick "Algorithms in C" is the the bible here. Binary search should only take log(N) + 1 searches.

You might try a tree approach with left and right branches, with each node a struct with the value of the second array. That way, when you find your data, you have what you need.
Logged

Poole, Dorset, UK
Offline Offline
Edison Member
*
Karma: 52
Posts: 2318
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
need the result within 2-3 uSec. (sic)
With a 16MHz processor that gives you at best 48 instructions and that's if there all instructions that can be executed in on instruction. A tall order and one you will never do with a search.

However you may be able to reformat your data. Your problem is to find a match for an input in array 1 and then lookup in array 2 the output you need. IS this correct?

If so and if the input has a limited set of values (say less than 1k) then create an array with one entry for each possible input. In that entry is the correct output now just index the array with your input and the jobs done.  
You trade space for speed not as you have done speed for space.

The array can be place in code space so as not to use up your ram.

Mark

PS If you wanted to look up the phone number of some one whose name started with F by going to the middle of the directory or would you start about a 1/4 of the way in.

M
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

for completeness linear search  (smallest footprint)
Code:
 unsigned long start = micros ();
  for (search = 10; search < 600; search +=10)
  {  
    for (uint8_t i=0; i< 60; i++)
    {
      if (array[i] == search) break;
    }
  }
  unsigned long finish = micros ();

Time taken = Search Value :600
Time taken = 1572 uS  ====> 16.2 uSec on average
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

last idea
Code:
    do {
      middle = (first + last)/2;
      if ( array[middle] == search) break;
      if ( array[middle] < search )
      {
        first = middle+1;   
        continue;
      }
      else
        last = middle-1;
    } while( first <= last );

bit smaller by doing the loop different (one compare less per search)
Code:
Time taken = Search Value :600
Time taken = 536 uS  ===> 8.93 uSec per number on average
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

New Jersey
Offline Offline
Faraday Member
**
Karma: 67
Posts: 3702
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Unfortunately Array no#2 is not possible to calculate with the microprocessor

Why not? If that outdata array contains your actual values it doesn't look particularly hard.
Logged

Offline Offline
Sr. Member
****
Karma: 8
Posts: 260
Supernova Software - Quality Software Since 1985!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data
Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.

As mentioned by Holmes4, that is almost a textbook description of a lookup table. I do not know if there is an accepted name for that design pattern, but the hash table is very similar. That's going to be the fastest search, but it does require all possible inputs to be available, and have some sort of mathematical relationship to the array indices, so you can calculate array indices directly from the input.

Quote
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
http://en.wikipedia.org/wiki/Hash_table
Logged

Offline Offline
Edison Member
*
Karma: 33
Posts: 1447
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

wildbill:
From the OP:
Quote
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

You might try a large switch/Case. If it implements as a jump table it could be pretty fast.

You might try precalculating the indexes for the binary search and use an if/else chain to figure out the starting point for binary search.  (Or use the top 3 bits to index into a table).
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13742
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
PS If you wanted to look up the phone number of some one whose name started with F by going to the middle of the directory or would you start about a 1/4 of the way in.

That is  a weighted binary search, and it uses knowledge about the values in the array,
or the frequency the searched numbers occur.
Calculating that the first step should be ~1/4 takes just 2 binary search steps.

As the array is linear, the value X is  around index = map(X, min,  max, first, last) - 1  ....   map(X, min,  max, first, last) +1;
problem this takes an expensive division..

Another strategy that might be fast is using "skiplist strategy", go linear through the list is steps of sqrt(size).
if bigger go back one step and start using stepsize = stepsize / 2;

Another strategy that might help is the "cache" the last value searched  and its index and use that to get a better upper/lower limit.
in worst case it is as fast as binary search as the cached value is the middle one.

finally check - http://playground.arduino.cc/Main/MultiMap - smiley
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Poole, Dorset, UK
Offline Offline
Edison Member
*
Karma: 52
Posts: 2318
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

I'm really, really going to regret this but you could speeding things up be using -- pointer Arithmetic for example

Code:
addr=&array1[n]
gets you the address of the n'th element of the array then instead of
Code:
n++;
array[n]
you use
Code:
addr=addr+ sizeof(the arrays type);
to get the next element much faster!

Why because to get the n'th element in a one D  array we take the base address of the array and add n* sizeof(arraytype).
Use pointer arithmetic and you only do an addition to get the next element.

Mark
Logged

Poole, Dorset, UK
Offline Offline
Edison Member
*
Karma: 52
Posts: 2318
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
As the array is linear, the value X is  around index = map(X, min,  max, first, last) - 1  ....   map(X, min,  max, first, last) +1;
problem this takes an expensive division..

No need for all the multplication and division of the map function. Just use (pseudo code)
Code:
if input < 0.5 of its max then start 1/4 of the way in else start 3/4 of the way in
and calc 0.5 of max input in setup().

But yes it does assume some understanding of the data.

Mark
Logged

New Jersey
Offline Offline
Faraday Member
**
Karma: 67
Posts: 3702
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

wildbill:
From the OP:
Quote
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.

Oops, missed that. I'd still like to see the real data though.
Logged

Pages: [1] 2 3 ... 5   Go Up
Jump to: