Finding a number in a large array

Hi,

I want to find the index of 2 values A and B in a large array, from an input X, with the following conditions:

  • the array will have values in an increasing order, and the values are all ints.
  • the spacing of the values in the array is unknown
  • X will be an input in the range of coverage of the array.
  • A will be the closest number to X such that A < X
  • B will be the closest number to X such that B > X
  • if X is exactly equal to a number in the array, the index of that number will be returned.

I sort of know how to code this in arduino, but I was wondering what is the most efficient way and less clock cycles as possible way to do this...

Thanks

Hash table would be most efficient, otherwise Binary Search.

How large is the array? It can't be very large if it fits into ATmega328P's SRAM... :slight_smile:

If I understand you correctly (the array is in numeric size order) then...
subdivide the array (take the midpoint of the array) and check if it value is higher or lower than your target.
if the array value is lower subdivide the upper half again
if the array value is higher subdivide the lower half again
and so on until you get a match or the array will divide no more.
There are variations on this

Thanks for the reply

I have the arduino mega 2560, and i will be storing the arrays in program memory...

Iterative method was what I was thinking as well. I just though there might be a faster way to do it...

Thanks

hi
may be...

  1. take first byte of desired int;
  2. threat array of ints as array of bytes (see c union keyword)
  3. iterate trought array of bytes
  4. if match , compare second byte etc..

Iterative method was what I was thinking as well. I just though there might be a faster way to do it..

If it is fixed in ROM, you may as well sort it first before compilation, then use the binary search.

Could you elaborate on this a little more...

Thanks

Could you elaborate on this a little more...

Certainly

All good advice so far but the quick- and binary- searches are designed to return the exact match for what you are looking for.

As I understand you want to find the next lowest and the next highest to the value you are looking for, so you will need to modify the algorithms presented.

Binary search is extremely efficient. I think it operates in log(n) where n is the number of entries.

And this is such a common problem that there is already a function in stdlib.h even for AVR.

See avr-libc: <stdlib.h>: General utilities
http://www.cplusplus.com/reference/cstdlib/bsearch/

And then just google bsearch for more example code.

BW: What is the original problem? My guess is that this is part of a larger problem and you have already done a design choice. As mentioned above a hashtab might be a better solution but it depends...

Cheers!

BTW, if you really want to learn about tis stuff, I recommend Sedgewick "Algorithms in C". You learn more than you thought possible about linked lists, searches, sorts and big Os.

And this is such a common problem that there is already a function in stdlib.h even for AVR.

Is there a "bsearch_P" ?

AWOL:

And this is such a common problem that there is already a function in stdlib.h even for AVR.

Is there a "bsearch_P" ?

The problem with the AVR bsearch is it will only return a match but the OP wants to return a match or closest value(s) so they will need to code a similar version but one that suits there needs.

How many elements are there in the array, how low are you trying to get the runtime cost of the search, and how much development/test time are you prepared to trade off to achieve that?

Binary search is extremely efficient. I think it operates in log(n) where n is the number of entries.

Bsearch works best when the list is sorted :wink:

The problem the OP describes can be solved in O(n) I think
just go through the array and remember the value closest to X it is like searching the min and max. The main difference is if the searrched value is larger (smaller) than any element in the array.
The you cannot define a range. To detect such a case one typical adds a sentinel to the array and do a test afterwards.

something like this

bool linearSearch(int arr[], int X, int & lower, int & index, int & upper)
{
  lower = upper = index = -1; // not found
  int lowerMin = MAXINT;
  int upperMIN = MAXINT;
  for (int i = 0; i< sizeOfArray; i++)  // pseudo code
  {
    if (arr[i] == X)
    {
      index = i;
      return true;  // if this is removed you can find all 3...
    }
    if (arr[i] < X && X - arr[i] < lowerMin)
    {
      lower = i;
      lowerMin = X - arr[i];
    }
    if (arr[i] > X && arr[i] -X < upperMin)
    {
      upper = i;
      upperMin = arr[i] - X;
    }
  }
  return false;
}

if this function returns true, only the value of index is relevant (I guess)
if this function returns false, lower and upper gives the index of the value closest to X from both directions unless
if lower == -1 => X < min(array)
if upper == -1 => X > max(array)

Thank you all for the replys

This project is like this:
I have 3 data tables, 2 of them are 1D and 1 is 2D.
The 1D tables have maximum of 50 rows...and the 2D tables have maximum of 50 rows and 20 columns.
These tables act as special lookup tables, such that for a given input within the range of the table I have to find the closest upper and lower points in that table. Then I have to do linear interpolation for the 1D tables and Bilinear interpolation for the 2D table.
So for 1D tables I have to find 1 upper and 1 lower values and perform 1 interpolation
for 2D table I have to find 2 upper and 2 lower values and perform 3 interpolation (a bilinear interpolation consists of 3 linear interpolation)

Now the problem is that I have to do all this as fast as possible... hence my question at the beginning...

Just for improving speed, I have redone my lookup tables so they have fixed increments in the input axis...
Now that the tables have fixed increments I dont think I have to do any type of search to find the upper and lower values... (correct me if I am wrong)
for example, in one of my tables I have input axis: 100 to 600 in increments of 50 (10 by 2 tables... this will increase to 50 by 2 in the future)
to get my indexes of upper and lower values I can do this:
lower = (input/50) -2
upper = (input/50) -1

Thank you all...
Your responses were some really good learning experience for me...

as special lookup tables

All lookup tables are special :wink:

Now the problem is that I have to do all this as fast as possible.

as fast as possible is no requirement.
How fast do you want it? 1 search per second 10000 searches per second? What is needed? Hard numbers are requirements.
What are the other things your application need to do?

fast as possible typically means keep the tables sorted and do (weigthed) binary search - read sedgewick algorithms book or one of the others)

Can the tables changes during runtime or are they fixed?

Are the searched values distributed evenly over the table or not?
If not how are they distributed?

(and there are far more questions to

FardinB:
Just for improving speed, I have redone my lookup tables so they have fixed increments in the input axis...

Unless you need the lookup to be reversible, this is an obvious first optimisation. Now you don't need to find anything. Does that resolve your question?