String to function: any best practices?

Hi forum,
I need to execute functions from strings, so "somestring" should call function somestring();
It seems as though this is a classical 'problem', and it has a boatload of solutions. Some of them are cumbersome and ugly.
The problem would be easy if arduino could do a String Switch() but it can't, so i'm in doubt about the following solutions:

  • Create a map of function pointers and Strings, and directly call the function mapped to a certain string.
    (this seems like the nicest solution. but how do i create a 'map' in arduino? would this just be an array, and then call the function like array[0](); ? )

  • Have a map of ENUM values and Strings. Create ENUMS for all functions. switch the ENUMS to execute a function with the same name.

  • Just use a huge if/else block. This sounds ridiculous but it would be the fastest solution would it not?

Did anyone else out there have the same problem, and how did you solve it?

This sounds ridiculous but it would be the fastest solution would it not?

Not to mention the most obvious and easiest to implement.

Of course, the statement:

I need to execute functions from strings, so "somestring" should call function somestring();

leaves one wondering why you need to do this. Having to know the name of the function to execute means that expanding the Arduino application can only happen in concert with changes to the application that is sending data to the Arduino.

A menu of options ("execute item #2") would require sending much less data, and make the mapping of data to function much simpler.

I usually use option 1 - create a map.

Here's a block of code from a digital emulator project I am working on:

struct opcode {
        unsigned short opcode;
        unsigned short mask;
        void (*function)(struct device *, unsigned short, char *);
};

struct opcode opcodes[] = {
        {0b000000000000, 0b111111111111, nop},
        {0b000001000000, 0b111111111111, clrw},
        {0b000000000100, 0b111111111111, clrwdt},
        {0b000000000010, 0b111111111111, option},
        {0b000000000011, 0b111111111111, _sleep},

        {0b000000000000, 0b111111111000, tris},
   ...
        {0,0,0}
};

The first two variables from each entry are what I use to select the entry. This could be your string (or char* as it is far far more efficient). The third is the function to call.

All these functions have to have the same prototype for it to work.

I then loop through looking for a match:

struct opcode *scan;

for(scan=opcodes; scan->function; scan++)
{
  if((instruction & scan->mask) == scan->opcode)
  {
    scan->function(dev,instruction,debugBuffer);
    break;
  }
}

So the whole thing consists of:

  1. Define a structure to contain the data.
  2. Populate a static array with the data and function pointers.
  3. Iterate the array looking for a match.
  4. If a match is found, call the function.
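
Applied to strings instead of opcodes, those four steps could look roughly like the sketch below. The handler names, the lastCalled variable and the dispatch() helper are made up for illustration and are not part of the emulator code above.

```cpp
#include <string.h>

// Hypothetical handlers; every entry must share this prototype.
static int lastCalled = 0;   // illustration only: records which handler ran
void getSpeed(const char *arg) { (void)arg; lastCalled = 1; }
void setMode(const char *arg)  { (void)arg; lastCalled = 2; }

// 1. A structure to contain the data.
struct Command {
  const char *name;                 // key to match against
  void (*function)(const char *);   // handler to call on a match
};

// 2. A static array of names and function pointers,
//    terminated by an all-zero entry.
static const Command commandTable[] = {
  {"GET_Speed", getSpeed},
  {"SET_Mode",  setMode},
  {0, 0}
};

// 3. and 4. Iterate the array and call the handler on a match.
// Returns true if a handler was found and called.
bool dispatch(const char *name, const char *arg) {
  for (const Command *scan = commandTable; scan->function; scan++) {
    if (strcmp(name, scan->name) == 0) {
      scan->function(arg);
      return true;
    }
  }
  return false;
}
```

Calling dispatch("SET_Mode", "bla") would run setMode("bla"); an unknown name just returns false, so the caller can report the error.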

but how do i create a 'map' in arduino? would this just be an array,

Two parallel arrays, one of strings and the other of function pointers, or a single array of structs each with these elements.

Just use a huge if/else block. This sounds ridiculous but it would be the fastest solution would it not?

No faster than the above approach, either way you have to compare strings and call functions. There may be a small difference either way but not enough IMO to justify a huge ugly if/else block.


Rob

Wow, thanks for all the superfast replies!

@PaulS Yes, this is completely true. however, i'm creating a proof of concept project, so all is in my own control (app-side and arduino-side coding).
Actually i'm trying to create a really badly written kind of RPC protocol (which contains functions i choose of course). Why you ask? well, because i cannot use a real RPC-like protocol: it's just plain text that gets sent from and received on the arduino, so i just have to work with Strings (or char*, which apparently is more efficient, will change that).
Oh bytheway - you reacted on a question i posted earlier about my project needing Wireless communication with an iPad app: My project now works with webSockets, using Per Ejeklint's arduino webSocketServer (he updated it to the latest spec and it now works in chrome, and hopefully also iPad, need to test that still. As webSockets just transfer plain text.. well, there you go :>)

@Majenko Yes! This is exactly what i'm looking for. I was hoping it would be less code, but this seems the least ugly solution!

@Graynomad Ah yes, that would be another way. I think one array with structs would be less error-prone and more clear. But i think i'll just copy paste Majenko's solution as my C++ experience is completely worthless. I understand the code when i see it, but i can't write it myself.

No faster than the above approach, either way you have to compare strings and call functions.

A linear search will (on average, for random data) always be longer than a binary search - sort those keys!
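
For what it's worth, a sorted table can be searched with the standard C bsearch() function. The sketch below is only an illustration: the table contents, the id field (a stand-in for a function pointer) and lookupId() are invented, and the keys must be kept in strcmp() order by hand.

```cpp
#include <stdlib.h>
#include <string.h>

struct Entry {
  const char *name;
  int id;            // stand-in for a function pointer
};

// Keys must be in strcmp() order for bsearch() to work.
static const Entry sortedTable[] = {
  {"GET_Mode",  1},
  {"GET_Speed", 2},
  {"SET_Mode",  3},
  {"SET_Speed", 4},
};

// bsearch() passes the search key first and a table element second.
static int compareEntry(const void *key, const void *elem) {
  return strcmp((const char *)key, ((const Entry *)elem)->name);
}

// Returns the id for a name, or -1 if the name is not in the table.
int lookupId(const char *name) {
  const Entry *hit = (const Entry *)bsearch(
      name, sortedTable,
      sizeof(sortedTable) / sizeof(sortedTable[0]),
      sizeof(sortedTable[0]), compareEntry);
  return hit ? hit->id : -1;
}
```

With only a handful of commands the difference from a linear scan is probably too small to measure, so this mostly pays off for long tables.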

AWOL:

No faster than the above approach, either way you have to compare strings and call functions.

A linear search will (on average, for random data) always be longer than a binary search - sort those keys!

Absolutely! Order matters. For the stuff I posted there it has to be in "mask" order to work properly. I would suggest you build your array in descending order of popularity of the commands sent - the most common ones at the top.

Okay, i now have this:

// command structure (function name, actual function)
struct command {
const char* functionName;
void (*function)(String, String);
};

// array of command structures
struct command commands[] = {
{"GET_Speed", GET_Speed},
{"SET_Mode", SET_Mode}
};

The rest now looks like this:

// extracting function and parameters (yeah ugly i know).
String data = String(dataString);
String command;
String param1;
String param2;

command = data.substring(0, data.indexOf(' '));
param1 = data.substring(data.indexOf(' '));
param2 = data.substring(data.lastIndexOf(' '));

struct command *scan;

for(scan=commands; scan->function; scan++)
{
if(command == scan->functionName)
{
scan->function(param1, param2);
break;
}
}

void GET_Speed(String param1, String param2) {
Serial.println("function GET_Speed executing with params: ");
Serial.println(param1);
Serial.println("\t and ");
Serial.println(param2);
Serial.println("\n\n");
}

void SET_Mode(String param1, String param2) {
Serial.println("function SET_Mode executing with params: ");
Serial.println(param1);
Serial.println("\t and ");
Serial.println(param2);
Serial.println("\n\n");
}

As you can see i still use String instead of char*, but that's because i don't know C++ and how to do subString(), indexOf() and lastIndexOf() on char (assuming that's even possible)

It compiles, let's see if it works. Thanks for all your help! :slight_smile:

It won't work :stuck_out_tongue:

First, you need to terminate your array. Have an entry at the very end of all 0, then you look for that 0 to know when you have reached the end of your array.

Secondly, if command is a C string (char*), == compares the string addresses, not the string contents.

For C strings you need to use

if(!strcmp(command, scan->functionName))
{
   ...
}

instead of if(command == scan->functionName)

Ah thanks, i've added the {0,0} struct.

As everything is now a String, i'm using the arduino String compareTo function:

if(command.compareTo(scan->functionName) == 0)

shit yeah, it even works:

http://s17.postimage.org/vfiukqj7j/Screen_Shot_2012_06_13_at_17_17_36_PM.png

Ugh... String... You may well end up with fragmented memory and an arduino that can no longer handle the incoming requests. Much better to use statically allocated char *.

Coming from higher level languages, i think a String object is way nicer to work with than a 'character array'.
But for lower level programming i can understand performance and memory saving means everything.
So, how would i approach a character array substring comparison? if you tell me, i can change everything
to char* :> anyway, it's all proof of concept.

You have a number of tools at your disposal.

To match an entire string, you use strcmp():

match = strcmp(str1,str2)

match is <0 or >0 for a mismatch (str1 < str2, or str1 > str2), or 0 for a match, so you can use

if(!strcmp(command,"run"))
{
  ...
}

If you know the length of the substring, then you have the strncmp() function.

match = strncmp(str1, str2, size);

match is <0 or >0 for a mismatch (str1 < str2, or str1 > str2), or 0 for a match.

If you don't know where the substring is, then there is the strstr() function.

pos = strstr(str1,str2);

which returns a pointer to the first occurrence of str2 within str1, or NULL if str2 is not found.

Then there is sscanf, which can be used to find parameters within a string:

int a,b,c;
if(sscanf(command,"set_coords %d %d %d",&a,&b,&c) == 3)
{
  ...
}

will take the string "set_coords 34 2 4983" and assign the numbers to the variables a b and c. sscanf returns the number of parameters matched.

And there are loads more.

Accessing individual characters in a string can be done with the [] suffix:

char onechar = mystring[4];

An offset within a string can be created with simple maths:

char mystring[] = "This is my string";
char *offset;

offset = mystring + 7;
// offset points to "my string"
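
As a rough char* replacement for the substring()/indexOf() parsing earlier in the thread, strchr() can find each space and the line can be split in place. This is only a sketch: splitCommand() is a made-up helper, and it assumes a writable, zero-terminated buffer with exactly one space between tokens.

```cpp
#include <string.h>

// Splits "command param1 param2" in place by overwriting the separating
// spaces with '\0'. The output pointers point into the original buffer.
// Returns the number of tokens found (0 to 3).
int splitCommand(char *line, char **command, char **param1, char **param2) {
  *command = *param1 = *param2 = 0;
  if (line == 0 || *line == '\0') return 0;

  *command = line;
  char *space = strchr(line, ' ');      // first space ends the command
  if (!space) return 1;
  *space = '\0';
  *param1 = space + 1;

  space = strchr(*param1, ' ');         // second space ends param1
  if (!space) return 2;
  *space = '\0';
  *param2 = space + 1;                  // everything after it is param2
  return 3;
}
```

Given the buffer "SET_Mode bla test", this leaves command pointing at "SET_Mode", param1 at "bla" and param2 at "test", with no heap allocation at all.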

supermaggel:
Coming from higher level languages, i think a String object is way nicer to work with than a 'character array'.
But for lower level programming i can understand performance and memory saving means everything.
So, how would i approach a character array substring comparison? if you tell me, i can change everything
to char* :> anyway, it's all proof of concept.

Coming from more ram than you need into ... well it depends on what AVR you use how tight the environment may be. Some have 16k or more on-chip and can address external ram while the UNO gives you 2k total. In high-level land you might use yacc and lex.

On the small-RAM chips, performance is not worth spending RAM on. Constants, including text arrays, can be stored in the more copious flash memory using PROGMEM and accessed directly or through buffer(s) for manipulation, at a speed penalty.

I don't know how much text your proof of concept involves but perhaps an object class that has function pointer, PROGMEM link and maybe the first 1 to 4 characters of the match string as data members (to speed up search) along with functions to find text matches would simplify your code? I would recommend making an array of such objects even if only to dispense with need of links that 'new' objects would require. In practice you lose a little flexibility but that's like saying you can't open an umbrella in a phone booth.

Really, so much depends on the number of words to match and act upon.

@GoForSmoke Haha, yeah that is true as well. It's just that i don't understand why Arduino's String implementation is so much more RAM intensive. Why don't they just use the char* functions but wrap them in a human-readable format, like string.compareTo instead of strcmpstrtoantrhstr() (to exaggerate what i'm trying to say here)? I'm no C++ programmer, so i don't have these methods 'baked' into my mind, but i have used PHP, which inherits a lot of the same function names, and there too i found they weren't really human-readable. But this creates a whole different discussion, which is not my point here.

Question regarding sscanf

I'm now using the sscanf function (which is awesome and saves a lot of roll-your-own-parser-time) this way:

void dataReceivedAction(WebSocket &socket, char* dataString, byte frameLength) {

// below debug print is a leftover from the example webSocketServer library.
#ifdef DEBUG
Serial.print("Got data: ");
Serial.write((unsigned char*)dataString, frameLength);
//Serial.println("\n");
#endif

// first, extract command and params if there are any
String command;
String param1;
String param2;

int numExtracted = sscanf(dataString, "%s %s %s", &command, &param1, &param2);

switch(numExtracted) {
case 0:
Serial.println("ERROR, ERROR, ERROR! NO VALID REQUEST FOUND.");
break;

case 1:
Serial.println("extraction yielded only a function call.");
Serial.println("command: ");
Serial.println(command);
Serial.println("\n\n");
break;

case 2:
Serial.println("extraction yielded a function call + 1 parameter.");
Serial.println("command: ");
Serial.println(command);
Serial.println("\n");
Serial.println("param1: ");
Serial.println(param1);
Serial.println("\n\n");
break;

case 3:
Serial.println("extraction yielded a function call + 2 parameters.");
Serial.println("command: ");
Serial.println(command);
Serial.println("\n");
Serial.println("param1: ");
Serial.println(param1);
Serial.println("\n");
Serial.println("param2: ");
Serial.println(param2);
Serial.println("\n\n");
break;
}

}

But, when i run this, i get a lot of gibberish AFTER the debug line (see screenshot)
Something is not right with the way i use sscanf, can anyone point out what?
Also, i know my switch-case debug could probably be written in a tenth of the lines, so proposals are very welcome!

[EDIT: i've now removed all the command and param Serial.println's.]
no weird gibberish and also almost real-time communication with my app, instead of a 3 second delay.

i've changed my switch statement to the following:

Serial.println("Extraction yielded a function call + " + String(numExtracted-1) + " parameter(s).");

which just only lets me know what the signature is of what is called, enough for now but i'd like to
show the name of the function and the parameters as well..

IP adress:
192.168.1.200
Got data: SET_Mode bla test

extraction yielded a function call + 2 parameters.
function GET_Speed called...

String command;
  String param1;
  String param2;
  
  int numExtracted = sscanf(dataString, "%s %s %s", &command, &param1, &param2);

sscanf parses strings (char*) not Strings.

Hmmmm, I just zoomed through the String class. Seems some benefits can be gained here.

As I can tell,

char *buffer;	        // the actual char array

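is the first data member, so the address of a String object is also the address of that pointer. Note that a plain ( char* ) cast of &param1 points sscanf at the String object itself rather than at its characters, so the pointer has to be read through the cast. Something like the following then compiles, but only as a hack: sscanf does no bounds checking and does not update the String's stored length, so each buffer must already be large enough:

int numExtracted = sscanf( dataString, "%s %s %s", *( char** ) &command, *( char** ) &param1, *( char** ) &param2);

but there are more appropriate versions of this. The best being a core update: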

class String{

  public:

    operator char*(){ return this->buffer; }  //Add this

    //Blah

  private:

    char *buffer;	        // the actual char array

    //Blah
};

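or without changing the core. operator= cannot be overloaded outside the class, so this has to be an ordinary helper function (rawBuffer is just a name picked for illustration):

inline char *rawBuffer( String &s_Str ){ return *( char** ) &s_Str; }

^^Have to cast due to buffer being private.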

When you change an Arduino String (at least when its size grows), the class may allocate a new, larger buffer, copy the text over and free the original, which leaves a hole in the heap. One of the 'conveniences' is that you don't 'see' it happen.
Maybe the next allocation will fit in the hole. Might even fill it. If it doesn't then it goes on the end and bumps the heap size up. At the same time, the stack grows and shrinks from the top of ram down. If heap and stack collide then what happens isn't something you can find through simple examination of your source logic.

Here are some signs and observations of the road ahead for beginners:

All the C string commands boil down to ideas simple enough that you can roll your own. You need to be aware of your own buffer lengths and do bounds checks even if you use the built-ins. Having rolled my own, I really think that making your own teaches enough to lose the mystery and fear of commands like strcmp (string compare), and to know that strncpy differs from strcpy in how each deals with the number of characters copied and the terminating zero. strncpy does not place a terminating zero after the copied section when the source is at least as long as the count you give it, which is exactly what you want when changing characters in the middle of a string. Used that way it is like the BASIC MID$ statement.
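
To make the strncpy point concrete, here is a small sketch of both uses. copyBounded() and overwriteMiddle() are names invented for this example, not library functions.

```cpp
#include <string.h>

// Copies up to dstSize-1 characters and always terminates the result,
// because strncpy() will not do so when src is too long.
void copyBounded(char *dst, size_t dstSize, const char *src) {
  if (dstSize == 0) return;
  strncpy(dst, src, dstSize - 1);
  dst[dstSize - 1] = '\0';
}

// Overwrites len characters in the middle of dst and leaves the rest
// of dst intact. The caller must make sure src has at least len
// characters (otherwise strncpy pads with zeros and cuts dst short)
// and that offset + len stays inside dst.
void overwriteMiddle(char *dst, size_t offset, const char *src, size_t len) {
  strncpy(dst + offset, src, len);
}
```

For example, overwriteMiddle(s, 8, "MY", 2) on "This is my string" gives "This is MY string", because no terminating zero is written after the two copied characters.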

Why prefer strcmp to stringCompareTo? Because I'm not a masochist. Because getting used to the abbreviated code takes a short while and it takes less thought-space too. That's why we shorten names and words, especially technical words, or have "big words" to replace whole sentences. It's the difference between in many cases English and German, two syllables vs many. OTOH many technical names in German tell just what a thing is or does in detail, like long variable names. But say them ten times fast, you can't. Long words are harder to think.

Setting up a flag and while loop to compare two arrays is drop dead simple once you've done it a time or two yet for those who have not there's some kind of dread. It takes more mental work to constantly come up with meaningful comments!

A pointer is a pointer, it's not the data, it only points to data. You add 1 to a byte pointer and it points to the next byte. You can walk through an array by setting a pointer to the start and incrementing the pointer as you go, it's like working with an index where you don't have to use the array name and brackets, it saves typing. There's even pointers to pointers, pointers to functions and pointer tables. Take a week or more to get the basics down and do the homework because pointers are one of the most versatile and powerful things in C.
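
As an illustration of walking two arrays with pointers, a roll-your-own equality check might look like this. sameText() is a made-up name; it only answers "same or not", unlike strcmp() which also reports the ordering.

```cpp
// Returns true if the two zero-terminated strings have identical contents.
bool sameText(const char *a, const char *b) {
  // Advance both pointers while the characters match
  // and the end of a has not been reached.
  while (*a != '\0' && *a == *b) {
    a++;
    b++;
  }
  // Equal only if both pointers stopped on the terminating zero.
  return *a == *b;
}
```

Each pointer starts at the first character and is incremented one byte at a time, so no index variable or array brackets are needed.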

The best part of high level for me means not having to know what CPU registers are being used. I use libraries when I need them but I know that behind every convenient function there are a couple to many lines of code and maybe (probably) some stack variables in addition to passed args.

If you use sscanf or atoi or any functions that can return an error (feed atoi "123foo") then check for error returned. It's easier to code for everything going right but that's wishful non-thinking unless you made sure in previous code. If you're that good then all of the above is old hat anyway.

feed atoi "123foo"

It converts it to an integer containing 123. atoi() converts the beginning of the string, so that is a perfectly valid string. "foo123" would throw it more, in that it will return 0 :wink:
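
Since atoi() gives no way to tell "123foo" from "123", or "foo123" from "0", one option is strtol() with its end pointer. The sketch below assumes you want to reject any trailing junk; parseWholeInt() is a made-up helper and it does not check for overflow.

```cpp
#include <stdlib.h>

// Parses a whole string as a base-10 integer.
// Returns true and stores the value only if at least one digit was read
// and nothing but the terminating zero follows the number.
bool parseWholeInt(const char *text, long *value) {
  char *end;
  long result = strtol(text, &end, 10);
  if (end == text || *end != '\0') return false;  // no digits, or junk after them
  *value = result;
  return true;
}
```

With this, "123" is accepted while "123foo", "foo123" and an empty string are all rejected, and the caller can send back an error instead of silently acting on a wrong number.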