Go Down

Topic: Fastest possible search in small arrays with different methods (Read 4032 times) previous topic - next topic

PeterH


I do have a small project, there is two arrays. Array no#1 holding ascending sorted values. This table is the input data
Array no#2 is the output data. It contains other values that are not linear but corresponds to Array no#1 index position.
Unfortunately Array no#2 is not possible to calculate with the microprocessor, so I have to use arrays for the moment.
The code example below is only showing test data and not real values. Worth to mention the arrays only hold around 60-80 values each.


What is the relative importance of speed versus accuracy? Are you prepared to compromise accuracy at all to improve speed, or vice versa?

Can you post your actual input and output data?

I only provide help via the forum - please do not contact me for private consultancy.

KeithRB

holmes4:
If addr is the correct type, the sizeof() is not required and will in fact cause problems. The compiler is smart enough to know that when using pointer arithmetic the jump is scaled to the size of the array, so if you add 2 to a pointer it will point to the second object whether a byte *, long * or float *.


smoes

#17
Dec 18, 2013, 09:34 am Last Edit: Dec 18, 2013, 09:39 am by smoes Reason: 1
This MUST be done with HashTables to achieve the fastest way possible.
Since you listed steps of 10 in array1, we assume them to be like that.

Code: [Select]

#include <math.h>
#include <hashmap.h>

... // some initialization going on here

int key, val;

// get the integer index, which is the closest. 0 will be mapped to 10
key = round(timer1/10) * 10 + 10 ;

if(hashMap.hasKey(key))
  val = hashMap.getValue(key);
else  // cover the cases key > 600 and key < 0
  if(key > 600)
     val = hashMap.getValue(600);
   else
     val = hashMap.getValue(10);



That should work fastest.

jasmine2501


This MUST be done with HashTables to achieve the fastest way possible.
Since you listed steps of 10 in array1, we assume them to be like that.



That's pretty much what I thought too. There is probably a mathematical relationship that can be used here, but the OP is being kinda secretive about what the data really is.

If you could tell us what this data represents in the real world, and what real world problem you're solving, we could surely tell you the best way. We need to know more about the data, to see if there's some property we can take advantage of.

MichaelMeissner

First of all, is the array search really in the critical path?  Or are you doing optimization on what you think is the critical path?

If you have something that is rather time critical, have you considered getting a faster processor?  AVR's are rather slow processors.  It might be simpler in the end to upgrade to an embedded Arm processor, than trying to optimize an AVR processor.  Above the embedded Arm processors that typically run under 100 Mhz, you have the Linux based processors, which run even faster, but those processors consume a lot more power and don't do embedded stuff as well.

Hashes, splay trees, tries, etc. typically would only be beneficial if you have thousands of entries.  In addition, if you use division to do the hash, it will make it even slower on the AVR.

For 60 or so entries if you have a sorted array, just do a binary tree search.  For fewer entries, it isn't worth the trouble to keep the array sorted, and just use a linear search.

smoes

#20
Dec 18, 2013, 06:39 pm Last Edit: Dec 18, 2013, 06:41 pm by smoes Reason: 1
In the given example you can directly calculate the index of an array by:

Code: [Select]

int index = round(timer1/10) -10 ;


... and adding a range check. The the array elements would be sorted as

Code: [Select]

a[0] = elementInOtherArrayFor10;
a[1] = elementInOtherArrayFor11;
a[2] = elementInOtherArrayFor12;
a[3] = elementInOtherArrayFor13;
...


Which then would be definitely the fastest way to do this.
Else, binary search would be my first choice, too.
If you would need to sort the array online, linear search should be prefered as mentioned by MichaelMeissner

MichaelMeissner


In the given example you can directly calculate the index of an array by:

Code: [Select]

int index = round(timer1/10) -10 ;


Round is a floating point function that takes a floating point argument.  Since the AVR and ARM embedded do not have native floating point, it will take hundreds of instructions to simulate this.  It is better to do arithmetic in integer, rather than using floating point.  I would expect this would do the same thing in integer mode (I haven't looked at the example, to see if the code is maps onto the example):

Code: [Select]

int index = ((timer + 9) / 10) - 10;

robtillaart

OK, lots of proposals, but the proof is in the pudding test ...
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

robtillaart

A pointer implementation of linear search
Code: [Select]

  unsigned long start = micros ();

  int pos = -1;
  for (search = 10; search < 600; search +=10)
  {
    int * p;
    p = &array[0];
    pos = -1;
    while (((*p) < search) ) p++;   // assumes search is <= last element.
    if ((*p) == search) pos = (p - &array[0]);
  }
  unsigned long finish = micros ();

  Serial.println(pos);


output:
Code: [Select]
58
Time taken = Search Value :600
Time taken = 1176 uS   ==>  ~20uSec per search,

so not faster than binary search on average...(or proof me wrong :)
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

CrossRoads

Why not make a simpler table?
array[] = {a,a1,b,b1,c,c1,d,d1,e,e1,f,f1...,};

then search the array in 2s:
Code: [Select]

for (x=0, x<upperLimit; x=x+2){
if (array[x] == searchTerm){output = array[x+1];}
}

Only have to go thru list once and you have the result.
Designing & building electrical circuits for over 25 years.  Screw Shield for Mega/Due/Uno,  Bobuino with ATMega1284P, & other '328P & '1284P creations & offerings at  my website.

robtillaart


Why not make a simpler table?
array[] = {a,a1,b,b1,c,c1,d,d1,e,e1,f,f1...,};

then search the array in 2s:
Code: [Select]

for (x=0, x<upperLimit; x=x+2){
if (array[x] == searchTerm){output = array[x+1];}
}

Only have to go thru list once and you have the result.

interesting idea, but
*  the amount of math to calculate the addresses is identical 
    (you go through n steps of search followed by 1 in the 'odd-er' array.
* it assumes that the elements of both arrays are the same type.
* maintaining the array contents is a bit more difficult
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

KeithRB

smoes:
you can ignore the sample data. It is just an example and does not relate to the real data, as the OP says.

It will work fine for a search algorithm, but not a hashing one like you provided.

westfw

What's the actual range of the input data?  I don't think you're going to get "a couple microseconds" doing anything more complex than a single stage lookup table, but that easily doable if your range is 0..1000 as per the example...

CrossRoads

What if you did the search where you compare to the midpoint, then 1/4 up or down, then 1/8 up or down, then 1/16 up or down, then 1/32 up or down, then 1/64 if there's still room to go.
I don't recall what that search is called.
Designing & building electrical circuits for over 25 years.  Screw Shield for Mega/Due/Uno,  Bobuino with ATMega1284P, & other '328P & '1284P creations & offerings at  my website.

MichaelMeissner

#29
Dec 19, 2013, 05:10 am Last Edit: Dec 19, 2013, 05:14 am by MichaelMeissner Reason: 1

What's the actual range of the input data?  I don't think you're going to get "a couple microseconds" doing anything more complex than a single stage lookup table, but that easily doable if your range is 0..1000 as per the example...

In the context of most of the microprocessors using the Arduino libraries, you likely won't have 1,000 entries until you step up to the higher end arm processors, since you don't have that much memory (unless you are reading the data from a SD file).


What if you did the search where you compare to the midpoint, then 1/4 up or down, then 1/8 up or down, then 1/16 up or down, then 1/32 up or down, then 1/64 if there's still room to go.
I don't recall what that search is called.

It is called a binary search.

Go Up