How to setup a associative data structure in Arduino

Hi there,

I'm just developing a small remote control which lets me control my IR controlled devices via internet. All is working but I still need to store the codes in a way that I can easily access them. So what I would like to have is a way in which I can have 2 variables:

  • A 'char rc_name[]' which holds the name of the remote control eg. TV, SAT, AMPLIFIER
  • A 'char rc_function[]' which holds the name of the function to execute eg POWER, VOLUME_UP, BUTTON_1

Given these two variable I then want to receive all the necessary data

  • 'int code_type_num'
  • 'char code_value[]' eg "400555AA"
  • 'int code_type_bits

This all seems simple in general, but I could not really find a way which would let me access the data by the string key rather than a integer inex.

In PHP I would use something like this:

$remotes = array(
   'tv' => array(
                    'POWER' => array(1, '400555AA', 32),
                    'MUTE'   => array(1, '400557AA', 32),
            ),
);

$rc_name = 'tv';
$rc_function = 'power';
$rc_data = $remotes[$rc_name][$rc_function];

What would be the best way to setup such a structure in Arduino?

Thanks for any help and ideas.

Thorsten

Normally you'd either just use 2 arrays with the two variables in each and iterate one to find the key number and use that to look up the second, or you'd use either an array of structs containing both values, or a linked list of structs.

All methods will require iteration.

If the items are all in order you can speed up the iteration by using a "divide and conquer" search.

A more efficient method for large data sets, though harder to program, is a binary tree. Top node is the "middle" value, and the two children are the middle values of the data subset either side of it - recurse.

Also, c++ has map and vector types, but if those work on the Arduino I have no idea.

Thanks for the quick answers. My current approach is also to iterate through it. I was hoping there is a more elegant way.
In case someone else needs it, here is how I do it for now:

typedef struct {
   char* remote_name;
   char* function_name;
   int code_type;
   int code_bits;
   char* code_value;
} IR_Code;

int num_codes = 58;
IR_Code ir_codes[58] = {
   { "amp", "power", 1, 32, "400501FE" }, 
   ...
   { "tv", "power", 1, 32, "20DF10EF" },
};


void send_defined_code(char* remote_name, char* function_name) {
  for( int i=0; i < num_codes; i++) {
     if(strcmp(remote_name, ir_codes[i].remote_name) == 0 && strcmp(function_name, ir_codes[i].function_name) == 0) {
       long unsigned int code = strtoul( ir_codes[i].code_value, 0, 16);
       Serial.print(F(" Type:"));
       Serial.print(ir_codes[i].code_type);
       Serial.print(F(" Value:0x"));
       Serial.print(code, HEX);
       Serial.print(F(" Bits:"));
       Serial.print(ir_codes[i].code_bits);
       return;
     }
  } 
}

Ensuring that your keys are in alphabetic (ASCII) order and doing a divide and conquer search might yield quicker results.

For those that don't know, a divide and conquer search involves looking at the center of a list and deciding which side it falls in, then repeating with just the data from that side. For example, say you want to find the letter "Q" in a list of all the alphabetic letters. You might search like this:

("Middle" is actual middle character, or left-hand character in middle pair)

  1. ABCDEFGHIJKLMNOPQRSTUVWXYZ

Middle = M. Q > M, so use N-Z

  1. NOPQRSTUVWXYZ

Middle = T. Q < T, so use N-S

  1. NOPQRS

Middle = P. Q > P, so use Q-S

  1. QRS

Middle is R. Q < R, so use Q-Q

  1. Q

Middle is Q. Found it.

That's just 5 iterations, whereas a sequential search would take 17 iterations. Of course, the iterations of a sequential search may be faster...

Another thing to consider is "prefiltering", where instead of iterating through the whole list checking each string, you first check only the first character of the string - that way you can do the more intensive full string check only on the ones that might match, and discard ones that definitely won't match. So, if you're looking for "20DF10EF", you only do the strcmp() if the first character of the string being tested is a 2.

58 devices...really!
Majenko probably has the most elegant solution with a binary tree. However, if the remote_name is unique, which it appears it is, then you could shorten the compare times using strncmp(). (Most compilers implement the function so it checks the string length first, before checking the characters in the string, for a match.) If that compare is logic true, then fetch the control string you're interested in. If you only need a specific control function (e.g., POWERON) once the control is found in the list, I'd make a series of enum's or #define's for all of the features, and then just pass in the function you're interested in.

//...
 int target_code;
 int target_string_length = strlen(remote_name);

 for( int i=0; i < num_codes; i++) {
     if(strncmp(remote_name, ir_codes[i].remote_name, target_string_length) ) {
       long unsigned int code = strtoul( ir_codes[i].code_value, 0, 16);
       target_code = findTarget(i, POWERON);                                 // Use i to index the device, and POWERON to return the requested feature. Return proper code or -1 on error
       if (target_code > 0)
           return target_code;
/*
       Serial.print(F(" Type:"));
       Serial.print(ir_codes[i].code_type);
       Serial.print(F(" Value:0x"));
       Serial.print(code, HEX);
       Serial.print(F(" Bits:"));
       Serial.print(ir_codes[i].code_bits);
       return;
*/
     }
  }

Part of the answer depends upon what you're trying to do, but this would be pretty simple to code.

Given that you won't be expecting to receive input commands at a huge rate, and the time taken to do a 'brute force' comparison against all possible commands for will be tiny (the number of commands that an Arduino could realistically support), I wouldn't worry about optimising it. I would recommend storing it in progmem, though, unless the number of commands you're supporting is tiny.

I also went with the approach I posted as in the application I'm using with a set of currently 58 rows, which may grow to something like 200 if I add the remotes from the whole house seems to be pretty small and the way you use a remote is not really very time sensitive.

So at this point I decided to go for readability and an easy maintenance for adding new code over speed.

Does any of you think with such a limited set of data, speed could become an issue when iterating through all the rows?

One other thing I would recommend is to replace the "tv", "amp", etc, and even the function types, with numeric constants...

#define DEV_TV 0
#define DEV_AMP 1
#define DEV_DVD 2
#define DEV_SKY 3
//...
#define FUNC_POWER 0
#define FUNC_VOLUP 1
#define FUNC_VOLDOWN 2
//...
typedef struct {
   char remote_name;
   char function_name;
   int code_type;
   int code_bits;
   char* code_value;
} IR_Code;

IR_Code ir_codes[58] = {
   { DEV_AMP, FUNC_POWER, 1, 32, "400501FE" }, 
   ...
   { DEV_TV, FUNC_POWER, 1, 32, "20DF10EF" },
};

etc. It'll use considerably less RAM (or flash if you used PROGMEM).

Majenko's last suggestion of using numeric constants would also allow you to get rid of the string compare call, which would likely save some code space and make a marginal improvement in speed.

Great suggestion. That's awesome and makes things much easier. Thanks everyone.

If you're using INT #defines, and passing them as the device / function pair to execute, could you do:

#define DEV_TV 0
#define DEV_AMP 1
#define DEV_DVD 2
#define DEV_SKY 3
//...
#define FUNC_POWER 0
#define FUNC_VOLUP 1
#define FUNC_VOLDOWN 2
//...

typedef struct {
// int device_id; is the index of the device_list array
// int function_id; is the index of the IR_Code function_list array
   int code_type;
   int code_bits;
   char* code_value;
} IR_Code;


typedef struct {
//   int device_id; -- not required
   IR_Code function_list[];
} Device;

Device device_list[] = 
{  {  // DEV_AMP,  -- not required
      { /*FUNC_POWER not required either ,*/ 1, 32, "400501FE" }, 
      { /* FUNC_VOLUP no required ,*/ 1, 32, "400501FF" },
      ...
   }
...
   {  // DEV_TV,  -- not required
      { /* FUNC_POWER not required either ,*/ 1, 32, "400501FE" }, 
      { /* FUNC_VOLUP, not required */1, 32, "400501FF" },
      ...
   }
}


...

when the call comes in, you just need

char *value_to_send = device_list[thisDevice].function_list[thisFunction].code_value;

sizeof(device_list[i].function_list)/sizeof(IR_code) would tell you how many functions a device had.

May not compile, but I think you get the general idea?

(I have essentially "normalised' your data structure - a db process I go through all the time).

This reduces your data overhead further, and makes the device+function == IR_Code process a simple array lookup with no searching required at all - assuming each list of functions defined for each device is a list of consecutive integers starting at 0.

That's neat.

I still need to get one of my remotes (Sky+ SRC-40) to work correctly and then will recode this according to your suggestion.

Thanks

Implementing something similar myself, and it would appear my C is still lagging severely. ie there are mistakes in the suggestion.

aarondc:
If you're using INT #defines, and passing them as the device / function pair to execute, could you do:

...

typedef struct {
//   int device_id; -- not required
  IR_Code function_list[];              // <---- this array needs to be supplied a size
} Device;
...

// this is wrong, given the array size is specified above
sizeof(device_list[i].function_list)/sizeof(IR_code) would tell you how many functions a device had.

Apologies for the misinformation. People with more experience / knowledge than me may be able to extend the suggestion or make it more gooder.