dpswt:
it’s a simple lookup: if the input matches a certain string on one of the arrays, do something with it
Then it sounds like putting those arrays in PROGMEM is the way to go; there is no need to have them in RAM. (Of course, they are already in PROGMEM: that’s where their initial values are copied from into RAM at startup, so you won’t be taking a significant PROGMEM hit by storing them there.)
The issue with PROGMEM (and why strings are generally copied to RAM in the first place) is that the AVR processors use a Harvard architecture where code space and data space are two distinctly different address domains, and are accessed differently by the processor. While the architecture provides some significant speed benefits, it makes it hard to access program space memory as data.
There are some functions (like print() and println()) that are overloaded and have special versions to handle strings that are in PROGMEM. These copy the data out of PROGMEM a byte at a time and then process it. For comparisons, avr-libc’s <avr/pgmspace.h> provides strcmp_P(), which compares a RAM string against a PROGMEM string, but it takes a plain const char* for the flash string rather than the const __FlashStringHelper* that the F() macro produces. If you want a version that works with F() strings directly, it shouldn’t be too bad to write one yourself: a function that takes a const char* parameter to indicate the RAM string, and a const __FlashStringHelper* parameter to indicate the PROGMEM string. The function would look something like this (I did not try to compile or test it, so it may still need some work):
boolean compare(const char *a, const __FlashStringHelper *b)
{
    PGM_P p = reinterpret_cast<PGM_P>(b);
    while (*a) // Repeat until end of string a
    {
        // Get the next character from PROGMEM
        unsigned char c = pgm_read_byte(p++);
        // Strings are different if the current character is different.
        // This also catches the end of string b: c is 0 there, while *a is non-zero inside the loop
        if (static_cast<unsigned char>(*a) != c)
            return false;
        a++;
    }
    // Hit end of string a. Strings are equal only if this is also the end of string b,
    // so read one more character from PROGMEM and check for the terminator
    return (pgm_read_byte(p) == 0);
}
I’ve not tested it, but hopefully it’s close. One case worth checking is an empty string a (a points to a NUL character on entry): the while loop never executes, so the only PROGMEM read is the one in the return statement, and the function returns true only if string b is empty as well. It’s certainly possible that string a is empty (if you’re comparing user input, they may have entered nothing), but you probably won’t be comparing against an empty PROGMEM string. After all, you already know what you put in PROGMEM; the only reason to do so would be to check whether string a is empty, and there are far more efficient ways of doing that.
The reason there is 2 different arrays is essentially because every string in it has a common keyword (e.g. one array has “cheese”, other has “milk” in every string), hence the possibility to rule out one of the arrays before I loop it (strstr function).
It sounds like you are concerned about speed, and broke things up into two lists so you can immediately eliminate many of the possibilities. Because each character has to be read from PROGMEM individually, this function won’t be as fast as a stock strncmp() function working on RAM. One way to improve efficiency is to put the most common options at the beginning of the list.
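As a rough sketch of the two-list idea described above: strstr() on the RAM input picks a list by keyword, then a linear scan compares against that list in PROGMEM. The list contents, the 16-byte row width, and the names equalsFlash, findIn, and lookup are all invented for illustration, and the #else branch is only a host-side stand-in so the snippet builds off-target.

```cpp
#include <stddef.h>
#include <string.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>
#else
// Host-side stand-ins (assumption) so the sketch also builds off-target
#define PROGMEM
#define pgm_read_byte(addr) (*reinterpret_cast<const unsigned char *>(addr))
#endif

// Hypothetical lists; every entry in a list shares one keyword.
// Fixed-width rows keep the whole table in flash without a separate pointer table.
const char kCheeseList[][16] PROGMEM = {"cheese on", "cheese off", "more cheese"};
const char kMilkList[][16] PROGMEM = {"milk on", "milk off"};

enum ListId { kNoList, kCheese, kMilk };

struct Match {
    ListId list; // which list matched, or kNoList
    int index;   // position inside that list, or -1
};

// True if the RAM string equals the PROGMEM string, read one byte at a time
static bool equalsFlash(const char *ram, const char *flash)
{
    for (;;) {
        unsigned char c = pgm_read_byte(flash++);
        if (static_cast<unsigned char>(*ram) != c)
            return false; // differing character, or one string ended early
        if (c == 0)
            return true;  // both strings ended together
        ram++;
    }
}

// Linear scan of one fixed-width PROGMEM table; put the most common entries first
template <size_t N, size_t W>
static int findIn(const char (&table)[N][W], const char *input)
{
    for (size_t i = 0; i < N; i++) {
        if (equalsFlash(input, table[i]))
            return static_cast<int>(i);
    }
    return -1;
}

// Rule out a whole list by keyword before looping over it
Match lookup(const char *input)
{
    Match m = {kNoList, -1};
    if (strstr(input, "cheese") != NULL) {
        int i = findIn(kCheeseList, input);
        if (i >= 0) { m.list = kCheese; m.index = i; }
    } else if (strstr(input, "milk") != NULL) {
        int i = findIn(kMilkList, input);
        if (i >= 0) { m.list = kMilk; m.index = i; }
    }
    return m;
}
```

The keyword literals passed to strstr() still live in RAM on AVR; with only two short keywords that cost is small, and strstr_P() from avr-libc is an option if it matters.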
Other ways to improve efficiency are to sort the list alphabetically and use a binary search to quickly get you to the right portion of the list, or ditch the array of strings completely and go with something like a tree data structure that is faster to search.
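For the sorted-list option, a binary search could look roughly like the sketch below. It assumes a fixed-width table that is sorted in plain byte order; the table contents and the names compareFlash and findSorted are invented for illustration, and the #else branch is only a host-side stand-in so the snippet builds off-target. On a real AVR build, strcmp_P() from avr-libc could replace the hand-written three-way compare.

```cpp
#include <stddef.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>
#else
// Host-side stand-ins (assumption) so the sketch also builds off-target
#define PROGMEM
#define pgm_read_byte(addr) (*reinterpret_cast<const unsigned char *>(addr))
#endif

// Hypothetical table; entries MUST be sorted in byte order for the search to work
const char kSorted[][8] PROGMEM = {"apple", "bread", "cheese", "milk", "tea"};
const int kSortedCount = sizeof(kSorted) / sizeof(kSorted[0]);

// Three-way compare of a RAM string against a PROGMEM string:
// negative if ram sorts first, zero if equal, positive if ram sorts last
static int compareFlash(const char *ram, const char *flash)
{
    for (;;) {
        unsigned char a = static_cast<unsigned char>(*ram++);
        unsigned char b = pgm_read_byte(flash++);
        if (a != b)
            return (a < b) ? -1 : 1;
        if (a == 0)
            return 0; // both terminators reached together
    }
}

// Returns the index of input in kSorted, or -1 if it is not present
int findSorted(const char *input)
{
    int lo = 0;
    int hi = kSortedCount - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int r = compareFlash(input, kSorted[mid]);
        if (r == 0)
            return mid;
        if (r < 0)
            hi = mid - 1; // input sorts before the middle entry
        else
            lo = mid + 1; // input sorts after the middle entry
    }
    return -1;
}
```

Each lookup touches about log2(N) entries instead of up to N, which matters more as the list grows; for a handful of strings the linear scan with common entries first may be just as quick.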