match keywords at 9us/char on average with small RAM reqd. See reply 1.

GoForSmoke · January 26, 2019, 3:19am

note- speed test in next post

This post: User Command Line Interpreter Example

You feed the match function words separated by delimiters and it identifies which ones matched which list words. It's a parser that also id's known words and like the last one it runs on the fly up to some speed, it does not buffer then figure out what it's got because I does the figuring out as it reads. There is no buffer required, very little RAM, the word list and associated data are kept in flash, number of words is limited to flash space and uint16_t count limit of 65535. I want this to be able to take advantage of 1284P chips.

Note that the same list can be used for auto-complete typing at least on ANSI terminals and emulators.

Last time I did this it took one program to turn a word list into source code for the actual target sketch. There is no way that anyone would want to generate the matching link offset table by hand for more than a very few words and then only as some kind of punishment for curiosity. It runs great, keeps up with 250000 baud input but not so coder-friendly.

I figured try for slower but simpler. Not simple, simpl-er.

So here's a word list and the data required:

const int keys = 11; // put this on a 1284P, there could be > 256 words in the list

// the keys must be in alpha order, all start with lowercase alpha
const char key000[] PROGMEM  = "blinkOffMs";
const char key001[] PROGMEM  = "blinkOnMs";
const char key002[] PROGMEM  = "blinkPin";
const char key003[] PROGMEM  = "blinkStartMs";
const char key004[] PROGMEM  = "blinkState";
const char key005[] PROGMEM  = "help";
const char key006[] PROGMEM  = "print";
const char key007[] PROGMEM  = "reset";
const char key008[] PROGMEM  = "set";
const char key009[] PROGMEM  = "start";
const char key010[] PROGMEM  = "stop";

const byte charsMatchingPrevWord[ keys ] PROGMEM = { 0, 6, 5, 5, 8, 0, 0, 0, 0, 1, 2 };

const int a_z_starts[ 26 ] PROGMEM =
  // a, b,  c,  d,  e,  f,  g, h,  i,  j,  k,  l,  m,  n,  o, p,  q, r, s,  t,  u,  v,  w,  x,  y,  z
{   -1, 0, -1, -1, -1, -1, -1, 5, -1, -1, -1, -1, -1, -1, -1, 6, -1, 7, 8, -1, -1, -1, -1, -1, -1, -1 }; // a-z

PGM_P const key_list[ keys ] PROGMEM =
{ // these are pointers to the words in the list
  key000, key001, key002, key003, key004, key005, key006, key007,
  key008, key009, key010
};

a_z_starts array does make starting letters equal, a nice cheap optimization over start at zero and find 't'.

charsMatchingPrevWord array turned out to be a real goody. The function matches input to list word chars and changes words when an input does not match the current list character. So it looks at the next list word that may start with some of the same chars as the one before it, the number for that word in the charsMatchingPrevWord array.

// we have chars matched and how many chars of the current word the next word matches
// if that number is smaller than already matched chars then no-match
// if that number is bigger then try the next list word
// if that number is the same then try to match the next char to text,
// ---- if they match then new current word and charsMatched++
// ---- else try the next list word

The optimization only checks later in the list words that start the same and differ in the same spot. The list is alpha ordered just to make it easier to know when to quit looking.

The match routine might take 50 or 60 lines of actual code, 129 with whitespace and comments. It keeps up with 115200 fed through serial monitor. That should be good for user entries?

I have to make the run the user function part for this example as cut & dry as possible.

User_commands_beginner_v1.ino (14.4 KB)

GoForSmoke · February 12, 2019, 4:35am

So I got the word match going and made a working user command interpreter example and thought of how to see how fast it is.

I made a word list of 32 key words and for speed the sketch feeds the chars of each word in the list over and over making sure that the input matches properly. This isn't limited by serial speed.

So I run the whole list 1000 times in 1764 millis on an Uno. The list has 196 letters in it, every one had to be matched.

196000 / 1764 = 111.111111.... chars searched per milli, 9 micros per char matched on average.

At 115200 baud the chars arrive 86.8 micros apart; at 250000 baud, 40 micros. At either rate, the sketch has a little time to do something about any matches made but I'd prefer the wider window if I have any printing or much processing to do.

serial_word_match_speed_test.ino (9.34 KB)

Topic		Replies	Views
Speed vs Size, and speed won this time. Bar Sport	4	693	May 6, 2021
Decisions.... Bar Sport	1	734	May 6, 2021
How much use for a librry with serial word match? Bar Sport	2	577	May 6, 2021
One way to find words in text. Programming Questions	24	6143	May 5, 2021
Extra-simple example, free serial text match tool. Exhibition / Gallery	1	1100	May 6, 2021

match keywords at 9us/char on average with small RAM reqd. See reply 1.

Related Topics