ESP32 unable to extract string from string based on Regexp mask

I have done a lot of research but cannot find the proper formatting for a Regexp mask to extract a string from another string.

Suppose I have the following string: "The quick brown fox ABC3D97 jumps over the lazy wolf". I need to extract "ABC3D97" based on the mask /[A-Z]{3}\d{1}[A-Z]{1}\d{2}/, but I cannot find the proper syntax: the mask above and variations of it return no match.

My test code is as below:

#include <Regexp.h>

void setup ()    {
  Serial.begin (115200);

  // match state object
  MatchState ms;

  // what we are searching (the target)
  char buf [100] = "The quick brown fox ABC3D97 jumps over the lazy wolf";
  ms.Target (buf);  // set its address
  Serial.println (buf);

  char result = ms.Match ("d{3}");   // <-- returns no match
  
  if (result > 0)    {
    Serial.print ("Found match at: ");
    int matchStart = ms.MatchStart;
    int matchLength = ms.MatchLength;
    Serial.println (matchStart);        // 16 in this case     
    Serial.print ("Match length: ");
    Serial.println (matchLength);       // 3 in this case
    String text = String(buf);
    Serial.println(text.substring(matchStart,matchStart+matchLength));
    }
  else
    Serial.println ("No match.");
    
}  // end of setup  

void loop () {}

Assistance welcome.

You don't need a third-party library for regular expressions; they are built into the C++ standard library: https://godbolt.org/z/W6z9337EG

#include <regex>
#include <string>
#include <iostream>

int main() {
    std::string str = "The quick brown fox ABC3D97 jumps over the lazy wolf";
    std::regex rgx {R"([A-Z]{3}\d{1}[A-Z]{1}\d{2})"};
    std::smatch match;
    if(std::regex_search(str, match, rgx))
        for (size_t i = 0; i < match.size(); ++i) 
            std::cout << i << ": '" << match[i] << "'\n";
}

Output:

0: 'ABC3D97'

Either use another backslash to escape the backslashes in your pattern, or use a raw string literal.

std::regex rgx {"[A-Z]{3}\\d{1}[A-Z]{1}\\d{2}"};  // escape
std::regex rgx {R"([A-Z]{3}\d{1}[A-Z]{1}\d{2})"}; // raw string literal
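To make the difference concrete, here is a minimal sketch showing that the two spellings above are the same pattern once the compiler has processed the literal. The helper name `patternMatches` and the variables `escaped` and `raw` are made up for this illustration; they are not part of any library.

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` is found anywhere in `text`.
// (patternMatches is a hypothetical helper, used only for this illustration.)
bool patternMatches(const std::string &text, const std::string &pattern) {
    return std::regex_search(text, std::regex(pattern));
}

// In an ordinary literal, "\\d" is the two characters \ and d.
const std::string escaped = "[A-Z]{3}\\d{1}[A-Z]{1}\\d{2}";
// In a raw literal, backslashes are taken literally, so no doubling is needed.
const std::string raw     = R"([A-Z]{3}\d{1}[A-Z]{1}\d{2})";
```

Both strings hold identical characters, so `escaped == raw` is true and either one can be passed to `std::regex`.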

Hi, excuse my ignorance, but how do I get the output 'ABC3D97' into a variable?
This code is entirely new to me, but it works great. Thanks.

void setup() {
    std::string str = "The quick brown fox ABC3D97 jumps over the lazy wolf";
    std::regex rgx {R"([A-Z]{3}\d{1}[A-Z]{1}\d{2})"};
    std::smatch match;
    if (!std::regex_search(str, match, rgx))
        return; // fail, didn't match
    auto output = match[0];
    Serial.println(output);
}

It might be more efficient to start with simpler code. You should always look at the documentation and the examples:

Hi PieterP, you are right, I will have to dedicate some time to understanding it; it looks like a new language to me.
But for now it comes close to solving my problem.
I used your last code, but the compiler complains about this line:
Serial.println(output);

It says:
no matching function for call to 'String::String(std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<const char*, std::__cxx11::basic_string<char> > >&)'

Thanks for your assistance.

My bad, I forgot that Serial.print cannot print a std::sub_match directly. You can either use std::cout as in the first example, or convert the match to a C-style string like this:

Serial.println(output.str().c_str());

This is unfortunate because it creates a copy of the substring (necessary for null termination).
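If you want the match in a variable of its own rather than printing it straight away, one option is to copy it into a std::string. The following is only a sketch: `extractFirstMatch` is a hypothetical helper name, and returning an empty string on "no match" is an assumption you may want to change.

```cpp
#include <regex>
#include <string>

// Returns the first substring of `text` that matches `pattern`,
// or an empty string when nothing matches.
// (extractFirstMatch is a hypothetical helper, not part of any library.)
std::string extractFirstMatch(const std::string &text, const std::string &pattern) {
    const std::regex rgx(pattern);
    std::smatch match;
    if (!std::regex_search(text, match, rgx))
        return "";            // no match
    return match[0].str();    // copy the matched characters into an owning string
}
```

In a sketch you could then write `std::string plate = extractFirstMatch(str, R"([A-Z]{3}\d{1}[A-Z]{1}\d{2})");` and print it with `Serial.println(plate.c_str());`. The copy is still made, but the result stays valid after `match` and the original string go out of scope.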
