Can't translate a regex pattern to match arduino's regexp syntax

Here is what I want to achieve in terms of regex matching :

If I have correctly translated the regex to the syntax of Nick Gammon's regexp library (v1.2), it would look like this: "&([%w%d%-]*)=([%w%d%.,%-]*)"

But this does not work somehow. It only matches the first occurrence of the pattern and the rest are ignored.

To debug what is matched here is my simple code:

	MatchState ms;
	ms.Target(topic);
	// Escape char for this library is %
	char result = ms.Match("&([%w%d%-_]*)=([%w%d%.,%-_]*)");
	if (result == REGEXP_MATCHED)
	{
		Serial.println("C2D topic parsing result :");
		char buf[100];
		for (int j = 0; j < ms.level; j++)
		{
			Serial.print("Capture number: ");
			Serial.println(j, DEC);
			Serial.print("Text: '");
			Serial.print(ms.GetCapture(buf, j));
			Serial.println("'");
		}
	}

For the test string included in regex101's link above, this is the output :

Capture number: 0
Text: 'p2'
Capture number: 1
Text: '3.3'

Why are the rest of the occurrences not matched?

Can you provide examples of the text you want to process and the keywords you want to match?

  • ampersand does not need to be escaped (it's not here, but it is at your link at regex101)
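  • "word" \w includes both digits and underscore, so a separate digit class next to it is redundant (this library's Lua-style %w covers letters and digits only, so add _ explicitly if you need it)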
  • in a character class [ ], hyphen indicates a range; to include it, just put it last (or first)
  • you probably want at least one character to make a word before the equal sign

So the simplified regex with % is a lot easier to read:

&([%w-]+)=([%w.,-]*)

With this library, to do multiple matches, follow the GlobalMatch example with a callback:

#include <Regexp.h>

const char *topic = "devices/VK_GATEWAY/messages/devicebound/%24.to=%2Fdevices%2FVK_GATEWAY%2Fmessages%2FdeviceBound&%24.ct=application%2Fjson&%24.ce=utf-8&p2=3.3&p1=12&p3=4.4&messageId=6c519ec3-6d60-463c-9a47-2159a72d91f9";

void match_callback  (const char * match,         // matching string (not null-terminated)
                      const unsigned int length,  // length of matching string
                      const MatchState & ms)      // MatchState in use (to get captures)
{
  char buf[100];   // must be large enough to hold captures
  for (int j = 0; j < ms.level; j++)
  {
    Serial.print("Capture number: ");
    Serial.println(j, DEC);
    Serial.print("Text: '");
    Serial.print(ms.GetCapture(buf, j));
    Serial.println("'");
  }
}

void setup() {
  Serial.begin(115200);
  MatchState ms;
  ms.Target(topic);
  // Escape char for this library is %
  unsigned count = ms.GlobalMatch("&([%w-]+)=([%w.,-]*)", match_callback);
  Serial.print(count);
  Serial.println(" matches");
}

void loop() {}

prints

Capture number: 0
Text: 'p2'
Capture number: 1
Text: '3.3'
Capture number: 0
Text: 'p1'
Capture number: 1
Text: '12'
Capture number: 0
Text: 'p3'
Capture number: 1
Text: '4.4'
Capture number: 0
Text: 'messageId'
Capture number: 1
Text: '6c519ec3-6d60-463c-9a47-2159a72d91f9'
4 matches

Alternatively, you could send the optional second index argument to the Match call to skip over the text that has been matched previously.
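
To illustrate the "resume after the previous match" idea, here is a minimal desktop sketch using std::regex as a stand-in; it is not the Regexp library API. The function name find_pairs is made up for this example, and the character classes are written out as [A-Za-z0-9-] on the assumption that %w means letters and digits only. With the library itself, the loop would presumably call ms.Match(pattern, index) and advance index to ms.MatchStart + ms.MatchLength after each hit, but that variant is untested here.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collect every key=value pair that follows an '&', resuming each search
// right after the end of the previous match.
std::vector<std::pair<std::string, std::string>> find_pairs(const std::string &text)
{
  // Desktop (ECMAScript) spelling of the library pattern "&([%w-]+)=([%w.,-]*)".
  // The classes are spelled out because \w would also accept '_'.
  static const std::regex pattern("&([A-Za-z0-9-]+)=([A-Za-z0-9.,-]*)");

  std::vector<std::pair<std::string, std::string>> pairs;
  std::smatch m;
  auto start = text.cbegin();   // plays the role of the index argument
  while (std::regex_search(start, text.cend(), m, pattern))
  {
    pairs.emplace_back(m[1].str(), m[2].str());
    start = m[0].second;        // skip the text that was just matched
  }
  return pairs;
}
```

Every match contains at least the '&' and the '=', so the start position always moves forward and the loop terminates. This can also be handy for checking on a PC what a pattern should capture before flashing the board.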

Thank you @kenb4, it is indeed the solution, I missed the callback.

All included with the regex101 link.
