Web parsing for entering data

I am working with some old code that I did not write. I want to enhance the web page this ESP8266 project serves, but I don't know much about HTML. The web page is a set of entries that are written to a config file. The page itself is fine; it's the parsing that I feel could be done more cleanly.

Are there any libraries that make it easier to work out what the user entered? This is a small snippet of the many entries handled in this code. If someone knows of a project on GitHub that uses similar web code, that would also be a great example, so I have a better idea of how to redo this.

    httprsp += "<br>Postal code<br>";
    httprsp += "<form action='/postal/'>"
               "http://<input type='text' name='postal' value='"
               + String(c_vars[EV_POSTAL]) + "'>(Zip Code, Country Code)<br>"
                                             "<input type='submit' value='set Post'></form><br>";



    if (httprq.indexOf("GET /postal/") != -1) {
      pidx = httprq.indexOf("?postal=");
      int pidx2 = httprq.indexOf(" HTTP/", pidx);
      if (pidx2 > 0) {
        String location = urldecode(httprq.substring(pidx + 8, pidx2).c_str());
  
      }
    }

This is more about HTTP than HTML (same HT for both). HTTP "commands" start with a "verb", which is officially called a method. By far the most popular one is GET; it's the one used when you type an address in a browser or click a link. For example, the command to get this page is like

GET /t/web-parsing-for-entering-data/1253142 HTTP/1.1

Notice that the "website", forum.arduino.cc, is not included; the command has already gotten there. When you click a link on a page, the next command is like GET /about

When you submit an HTML form -- for example to sign up, log in, or pay -- the most common method is POST, which requires method="POST" in the form element; that attribute is not present here. Without it, the browser does another GET, with each of the input values as a query parameter after a question mark. YouTube notably does this:
www.youtube.com/watch?v=q5oXNkIZNVU instead of
www.youtube.com/watch/q5oXNkIZNVU (which also works)
For a form GET request, the path comprises

  1. The form's action location
  2. A question mark
  3. The input name
  4. An equal sign
  5. The input value submitted

For example, submitting 98765 produces

GET /postal/?postal=98765 HTTP/1.1

If the form had more than one input, the query parameters would be separated by & (ampersand). Each value is URL-encoded in case it contains a space or a character that "means something", like the aforementioned &. The code you posted calls a urldecode function, presumably defined elsewhere, to reverse that encoding.

In case no one has an actual library to suggest: at this point, you should be able to make sense of what the code is trying to do. With most other programming languages or platforms, all of this is readily available. But on Arduino, you are "closer to the metal", where the capabilities are more variable. Even using String is frowned upon in some cases. However, using String should make it easy to write a few general functions to which you pass the whole request (httprq) to extract the desired named value.
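As a rough idea of what such a general function could look like, here is a sketch that pulls one named value out of the request line. getQueryValue is a hypothetical name, not something from the project or from a library. I wrote it against plain C strings so it doesn't depend on String; with a String you could pass httprq.c_str(). It returns the value still URL-encoded, so you would decode it afterwards.

```cpp
#include <stddef.h>
#include <string.h>

// Copies the raw (still URL-encoded) value of query parameter `name` from the
// first line of an HTTP request into `out`.
// Returns false if the parameter is absent or does not fit in `out`.
bool getQueryValue(const char *request, const char *name, char *out, size_t outSize) {
  if (outSize == 0) return false;
  out[0] = '\0';

  // Only look at the request line, e.g. "GET /postal/?postal=98765 HTTP/1.1".
  size_t lineLen = strcspn(request, "\r\n");
  const char *q = static_cast<const char *>(memchr(request, '?', lineLen));
  if (q == nullptr) return false;

  const char *p = q + 1;
  const char *end = p + strcspn(p, " \r\n");  // query stops at the space before "HTTP/"
  size_t nameLen = strlen(name);

  while (p < end) {
    // Each pair runs up to the next '&' or to the end of the query.
    const char *amp = static_cast<const char *>(memchr(p, '&', end - p));
    const char *pairEnd = (amp != nullptr) ? amp : end;
    size_t pairLen = static_cast<size_t>(pairEnd - p);
    if (pairLen > nameLen && p[nameLen] == '=' && strncmp(p, name, nameLen) == 0) {
      size_t valLen = pairLen - nameLen - 1;
      if (valLen >= outSize) return false;  // would not fit with its terminator
      memcpy(out, p + nameLen + 1, valLen);
      out[valLen] = '\0';
      return true;
    }
    if (amp == nullptr) break;
    p = amp + 1;
  }
  return false;
}
```

With something like this, each handler would shrink to a call such as getQueryValue(httprq.c_str(), "postal", buf, sizeof(buf)), with no pidx arithmetic and no counting of characters in "?postal=".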

It's not that I don't understand what it's doing; it's more that I don't like the clumsy way it uses pidx and pidx2 to extract the data entered into the page. The pidx + 8 is there because "?postal=" is 8 characters, and the code has to move past that to get to the actual data. I was hoping for some more elegant function calls to get back the value that was entered, without all that manual coding.

If you've located the appropriate line containing the information, and there is only one value identified by the equal sign, why not just use

int val = atoi (1+strstr (str, "="));

OP said there are "many entries" -- how many values on how many pages total? postal= on /postal/ is one value on one page. And the rest may not be numbers like that.

Typically libraries will parse the whole URL-path and present the resulting query parameters in a map, which you could then check with something like params["postal"]. However, a dynamic memory structure like that may not be appropriate on more constrained devices, where the STL is not even available.

I imagine an Arduino-targeted library could

  • take a mutable (non-const) char buffer with the whole URL-path; either because you already have it in a mutable buffer, or you copied into it
  • that could URL-decode in place, since such decoding never makes the string longer
  • with an array of desired parameters names (const char *) as args
  • in a single pass of the buffer, replace the name pointers to point into the decoded buffer for each match
  • NUL-terminate each parameter value in the buffer, at the latest replacing the separating &
  • and replace non-matches in the array with nullptr

So for example with

const char *params[] = {"postal", "lat", "long"};
parseQueryString(buf, sizeof(buf), 3, params);

postal is hard-coded as param zero, so you then check params[0] to see if it is null; otherwise it is a pointer to a valid C-string, perhaps an empty one.
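To make that idea concrete, here is one possible sketch of such a function. It is only my guess at an implementation of the hypothetical parseQueryString described above, not an existing library. Assumptions: buf is NUL-terminated within bufSize and holds the URL-path (or the request line), at most 32 names are passed, every name pointer is non-null on entry, and the first occurrence of a name wins.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper: numeric value of one hex digit, or -1 if c is not hex.
static int hexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// On entry params[i] are the wanted names. On return params[i] points at the
// decoded, NUL-terminated value inside buf, or is nullptr if not present.
// Returns the number of names that were matched.
int parseQueryString(char *buf, size_t bufSize, int count, const char *params[]) {
  unsigned long found = 0;  // bit i set once params[i] has been replaced
  int matches = 0;
  char *in = nullptr;
  if (count <= 32 && memchr(buf, '\0', bufSize) != nullptr) {
    in = strchr(buf, '?');
  }
  if (in != nullptr) {
    ++in;
    char *out = in;  // write position, never ahead of the read position
    bool more = true;
    while (more) {
      char *name = out;
      char *value = nullptr;
      // Decode one "name=value" pair in place.
      while (*in != '\0' && *in != ' ' && *in != '&') {
        char c = *in++;
        if (c == '=' && value == nullptr) {
          *out++ = '\0';  // terminate the name
          value = out;
          continue;
        }
        if (c == '+') {
          c = ' ';
        } else if (c == '%') {
          int hi = hexValue(in[0]);
          int lo = (hi >= 0) ? hexValue(in[1]) : -1;
          if (lo >= 0) {
            c = static_cast<char>((hi << 4) | lo);
            in += 2;
          }
        }
        *out++ = c;
      }
      more = (*in == '&');  // read the separator before it can be overwritten
      if (more) ++in;
      *out = '\0';  // terminate the value, at the latest on top of the '&'
      const char *val = (value != nullptr) ? value : out;  // bare name: empty value
      for (int i = 0; i < count; ++i) {
        unsigned long bit = 1UL << i;
        if ((found & bit) == 0 && strcmp(params[i], name) == 0) {
          params[i] = val;
          found |= bit;
          ++matches;
          break;
        }
      }
      ++out;
    }
  }
  // Anything that was not matched becomes nullptr.
  for (int i = 0; i < count; ++i) {
    if (i >= 32 || (found & (1UL << i)) == 0) params[i] = nullptr;
  }
  return matches;
}
```

No memory is allocated; the only state besides the buffer is one bitmask, which is where the 32-name limit comes from. The trade-off is that buf is modified, so the original URL-path is gone afterwards.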

This is the current web code in the project. I need to add another 7-10 more config options. I just feel there has to be a better way to parse the options when a var is changed.

void web_server() {
  httpcli = httpsvr.available();
  if (httpcli) {
    int svf = 0, rst = 0;
    //Read what the browser has sent into a String class and print the request to the monitor
    String httprq = httpcli.readString();
    // Looking under the hood
    // Serial.println (httprq);
    int pidx = -1;
    //
    String httprsp = "HTTP/1.1 200 OK\r\n";
    httprsp += "Content-type: text/html\r\n\r\n";
    httprsp += "<!DOCTYPE HTML>\r\n<html>\r\n";

    if ((pidx = httprq.indexOf("GET /datetime/")) != -1) {
      int pidx2 = httprq.indexOf(" ", pidx + 14);
      if (pidx2 != -1) {
        String datetime = httprq.substring(pidx + 14, pidx2);
        //display.setBrightness (bri.toInt ());
        int yy = datetime.substring(0, 4).toInt();
        int MM = datetime.substring(4, 6).toInt();
        int dd = datetime.substring(6, 8).toInt();
        int hh = datetime.substring(8, 10).toInt();
        int mm = datetime.substring(10, 12).toInt();
        int ss = 0;
        if (datetime.length() == 14) {
          ss = datetime.substring(12, 14).toInt();
        }
        //void setTime(int hr,int min,int sec,int dy, int mnth, int yr)
        setTime(hh, mm, ss, dd, MM, yy);
        ntpsync = 1;
      }
    }

    else if (httprq.indexOf("GET /ota/") != -1) {
      //GET /ota/?otaloc=192.168.2.38%3A8000%2Fespweather.bin HTTP/1.1
      pidx = httprq.indexOf("?otaloc=");
      int pidx2 = httprq.indexOf(" HTTP/", pidx);
      if (pidx2 > 0) {
        strncpy(c_vars[EV_OTA], httprq.substring(pidx + 8, pidx2).c_str(), LVARS);
        //debug_print (">ota1:");
        //debug_println (c_vars[EV_OTA]);
        char *bc = c_vars[EV_OTA];
        int ck = 0;
        //debug_print (">ota2:");
        //debug_println (bc);
        //convert in place url %HH escaped chars
        while (*bc > 0 && ck < LVARS) {
          if (*bc == '%') {
            //convert URL chars to ascii
            c_vars[EV_OTA][ck] = hexchar2code(bc + 1) << 4 | hexchar2code(bc + 2);
            bc += 2;
          } else
            c_vars[EV_OTA][ck] = *bc;
          //next one
          //debug_println (c_vars[EV_OTA][ck]);
          bc++;
          ck++;
        }
        c_vars[EV_OTA][ck] = 0;
        svf = 1;
      }
    }
    //location
/*
        httprsp += "<br>Postal code<br>";
    httprsp += "<form action='/postal/'>"
               "http://<input type='text' name='postal' value='"
               + String(c_vars[EV_POSTAL]) + "'>(Zip Code, Country Code)<br>"
                                             "<input type='submit' value='set Post'></form><br>";

*/
    else if (httprq.indexOf("GET /postal/") != -1) {
      pidx = httprq.indexOf("?postal=");
      int pidx2 = httprq.indexOf(" HTTP/", pidx);
      if (pidx2 > 0) {
        String location = urldecode(httprq.substring(pidx + 8, pidx2).c_str());
       // location.trim();
        strncpy(c_vars[EV_POSTAL], location.c_str(), LVARS);
        getWeather();
        draw_weather_conditions();
        svf = 1;
      }
    }
    //API key
    else if (httprq.indexOf("GET /apikey/") != -1) {
      pidx = httprq.indexOf("?apikey=");
      int pidx2 = httprq.indexOf(" HTTP/", pidx);
      if (pidx2 > 0) {
        strncpy(c_vars[EV_APIKEY], httprq.substring(pidx + 8, pidx2).c_str(), LVARS);
        getWeather();
        draw_weather_conditions();
        svf = 1;
      }
      //
    } else if (httprq.indexOf("GET /wifi/") != -1) {
      //GET /wifi/?ssid=ssid&pass=pass HTTP/1.1
      pidx = httprq.indexOf("?ssid=");
      int pidx2 = httprq.indexOf("&pass=");
      String ssid = httprq.substring(pidx + 6, pidx2);
      pidx = httprq.indexOf(" HTTP/", pidx2);
      String pass = httprq.substring(pidx2 + 6, pidx);
      if (connect_wifi(ssid.c_str(), pass.c_str()) == 0) {
        strncpy(c_vars[EV_SSID], ssid.c_str(), LVARS);
        strncpy(c_vars[EV_PASS], pass.c_str(), LVARS);
        svf = 1;
        //   rst = 1;
      } else {
        Serial.println("Wifi Connect failed, will try prior SSID and Password");
        if (connect_wifi(c_vars[EV_SSID], c_vars[EV_PASS]) == 1)
          ESP.restart();  //Give up reboot
      }

    } else if (httprq.indexOf("GET /daylight/on ") != -1) {
      strcpy(c_vars[EV_DST], "true");
      NTP.begin(ntpsvr, String(c_vars[EV_TZ]).toInt(), toBool(String(c_vars[EV_DST])));
      httprsp += "<strong>daylight: on</strong><br>";
      svf = 1;
    } else if (httprq.indexOf("GET /daylight/off ") != -1) {
      strcpy(c_vars[EV_DST], "false");
      NTP.begin(ntpsvr, String(c_vars[EV_TZ]).toInt(), toBool(String(c_vars[EV_DST])));
      httprsp += "<strong>daylight: off</strong><br>";
      svf = 1;
    } else if (httprq.indexOf("GET /metric/on ") != -1) {
      strcpy(c_vars[EV_METRIC], "Y");
      httprsp += "<strong>metric: on</strong><br>";
      getWeather();
      draw_weather_conditions();
      svf = 1;
    } else if (httprq.indexOf("GET /metric/off ") != -1) {
      strcpy(c_vars[EV_METRIC], "N");
      httprsp += "<strong>metric: off</strong><br>";
      getWeather();
      draw_weather_conditions();
      svf = 1;
    } else if ((pidx = httprq.indexOf("GET /brightness/")) != -1) {
      int pidx2 = httprq.indexOf(" ", pidx + 16);
      if (pidx2 != -1) {
        String bri = httprq.substring(pidx + 16, pidx2);
        strcpy(c_vars[EV_BRIGHT], bri.c_str());
        display_updater();
        ntpsync = 1;  //force full redraw
        svf = 1;
      }
    } else if ((pidx = httprq.indexOf("GET /timezone/")) != -1) {
      int pidx2 = httprq.indexOf(" ", pidx + 14);
      if (pidx2 != -1) {
        String tz = httprq.substring(pidx + 14, pidx2);
        strcpy(c_vars[EV_TZ], tz.c_str());
        NTP.begin(ntpsvr, String(c_vars[EV_TZ]).toInt(), toBool(String(c_vars[EV_DST])));
        httprsp += "<strong>timezone:" + tz + "</strong><br>";
        svf = 1;
      } else {
        httprsp += "<strong>!invalid timezone!</strong><br>";
      }
    } else if (httprq.indexOf("GET /weather_animation/on ") != -1) {
      strcpy(c_vars[EV_WANI], "Y");
      httprsp += "<strong>Weather Animation: on</strong><br>";
      TFDrawText(&display, "        ", wtext_x, wtext_y, 0);
      getWeather();
      draw_weather_conditions();
      ntpsync = 1;
      svf = 1;
    } else if (httprq.indexOf("GET /weather_animation/off ") != -1) {
      strcpy(c_vars[EV_WANI], "N");
      httprsp += "<strong>Weather Animation: off</strong><br>";
      getWeather();
      draw_weather_conditions();
      ntpsync = 1;
      svf = 1;
    } else if (httprq.indexOf("GET /military/on ") != -1) {
      strcpy(c_vars[EV_24H], "Y");
      httprsp += "<strong>Military Time: on</strong><br>";
      prevhh = -1;
      svf = 1;
    } else if (httprq.indexOf("GET /military/off ") != -1) {
      strcpy(c_vars[EV_24H], "N");
      httprsp += "<strong>Military Time: off</strong><br>";
      prevhh = -1;
      svf = 1;
    }
    //Reset Config file
    else if (httprq.indexOf("GET /reset_config_file ") != -1) {
      init_config_vars();
      httprsp += "<strong>Config file resetted</strong><br>";
    } else if ((pidx = httprq.indexOf("GET /colorpalet/")) != -1) {
      int pidx2 = httprq.indexOf(" ", pidx + 16);
      if (pidx2 != -1) {
        String pal = httprq.substring(pidx + 16, pidx2);
        strcpy(c_vars[EV_PALET], pal.c_str());
        httprsp += "<strong>Color Palet:" + pal + "</strong><br>";
        svf = 1;
        rst = 1;
      }
    }

    //
    httprsp += "<br>MORPH CLOCK CONFIG<br>";
    httprsp += "<br>Use the following configuration links<br>";
    httprsp += "<a href='/daylight/on'>Daylight Savings on</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/daylight/off'>Daylight Savings off</a><br><br>";
    httprsp += "<a href='/military/on'>Military Time on</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/military/off'>Military Time off</a><br><br>";
    httprsp += "<a href='/metric/on'>Metric System</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/metric/off'>Imperial System</a><br><br>";
    httprsp += "<a href='/weather_animation/on'>Weather Animation on</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/weather_animation/off'>Weather Animation off</a><br><br>";

    httprsp += "<a href='/timezone/-5'>East Coast USA</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/timezone/-6'>Central USA</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/timezone/-7'>Mountain USA</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/timezone/-8'>Pacific USA</a><br>";
    httprsp += "use /timezone/x for timezone 'x'<br><br>";

    httprsp += "<a href='/colorpalet/1'>Clock Color Cyan</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/colorpalet/2'>Clock Color Red</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/colorpalet/3'>Clock Color Blue</a>&nbsp &nbsp &nbsp<br>";
    httprsp += "<a href='/colorpalet/4'>Clock Color Yellow</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/colorpalet/5'>Clock Color Bright Blue</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/colorpalet/6'>Clock Color Orange</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/colorpalet/7'>Clock Color Green</a>&nbsp &nbsp &nbsp<br><br>";

    httprsp += "<a href='/brightness/70'>Brightness 70</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/brightness/35'>Brightness 35</a>&nbsp &nbsp &nbsp";
    httprsp += "<a href='/brightness/0'>Turn off display</a><br>";
    httprsp += "Use /brightness/x for display brightness 'x'<br>";

    //Weather
    httprsp += "<br>Weather API key<br>";
    httprsp += "<form action='/apikey/'>"
               "http://<input type='text' size=\"35\" name='apikey' value='"
               + String(c_vars[EV_APIKEY]) + "'>(hex string)<br>"
                                           "<input type='submit' value='set API key'></form><br>";

    //Post/Zip code
    httprsp += "<br>Postal code<br>";
    httprsp += "<form action='/postal/'>"
               "http://<input type='text' name='postal' value='"
               + String(c_vars[EV_POSTAL]) + "'>(Zip Code, Country Code)<br>"
                                             "<input type='submit' value='set Post'></form><br>";


    //
    //OTA
       httprsp += "<br>OTA update configuration (every minute)<br>";
       httprsp += "<form action='/ota/'>" \
       "http://<input type='text' name='otaloc' value='" + String(c_vars[EV_OTA]) + "'>(ip address:port/filename)<br>" \
       "<input type='submit' value='set OTA location'></form><br>";

    httprsp += "<br>wifi configuration<br>";
    httprsp += "<form action='/wifi/'>"
               "ssid:<input type='text' name='ssid'>"
               + String(c_vars[EV_SSID]) + "<br>"
                                           "pass:<input type='text' name='pass'>"
               + String(c_vars[EV_PASS]) + "<br>"
                                           "<input type='submit' value='set wifi'></form><br>";

   
    //Reset config file  (You probably will never need to but it's really handy for debugging)
    httprsp += "<a href='/reset_config_file'>Reset Config file to defaults</a><br><br>";

    httprsp += "Current Configuration<br>";
    httprsp += "Daylight: " + String(c_vars[EV_DST]) + "<br>";
    httprsp += "Military: " + String(c_vars[EV_24H]) + "<br>";
    httprsp += "Metric: " + String(c_vars[EV_METRIC]) + "<br>";
    httprsp += "Timezone: " + String(c_vars[EV_TZ]) + "<br>";
    httprsp += "Weather Animation: " + String(c_vars[EV_WANI]) + "<br>";
    httprsp += "Color palette: " + String(c_vars[EV_PALET]) + "<br>";
    httprsp += "Brightness: " + String(c_vars[EV_BRIGHT]) + "<br>";
    httprsp += "<br><a href='/'>home</a><br>";
    httprsp += "<br>"
               "<script language='javascript'>"
               "var today = new Date();"
               "var hh = today.getHours();"
               "var mm = today.getMinutes();"
               "if(hh<10)hh='0'+hh;"
               "if(mm<59)mm=1+mm;"
               "if(mm<10)mm='0'+mm;"
               "var dd = today.getDate();"
               "var MM = today.getMonth()+1;"
               "if(dd<10)dd='0'+dd;"
               "if(MM<10)MM='0'+MM;"
               "var yyyy = today.getFullYear();"
               "document.write('set date and time to <a href=/datetime/'+yyyy+MM+dd+hh+mm+'>'+yyyy+'.'+MM+'.'+dd+' '+hh+':'+mm+':00</a><br>');"
               "document.write('using current date and time '+today);"
               "</script>";
    httprsp += "</html>\r\n";
    httpcli.flush();         // Clear previous info in the stream
    httpcli.print(httprsp);  // Send the response to the client
    delay(1);
    //save settings?
    if (svf) {
      if (vars_write() > 0)
        Serial.println("Variables stored");
      else
        Serial.println("Variables storing failed");
    }

    if (rst)
      resetclock();
  }
}

It's not in the code that's posted, but is httpsvr actually just a protocol-less WiFiServer? Instead of "manually performing HTTP" with that, why not use an actual web server like ESP8266WebServer? It automatically parses and decodes the query parameters. Here's a whittled down example I got running on ESP32

#include <WebServer.h>  // very similar to ESP8266WebServer, with easier-to-read source
WebServer websvr(80);

void setup() {
  // connect to WiFi, then
  websvr.on("/setap", []() {
    String ssid = websvr.arg("ssid");
    String pass = websvr.arg("pass");
    websvr.send(200, "text/html", "<h4>" + ssid + "</h4>" + millis());
  });
  websvr.begin();
}

void loop() {
  websvr.handleClient();
}

Having a separate handler for each path -- which can apparently support globs to support the /datetime/ for example -- is better than a series of else if; which BTW...

readString reads everything available: the first request-line, which is what you're interested in; plus any headers, which could easily be a hundred bytes or more; and any body present. Then, for each attempted match, because it's indexOf, it will scan those few hundred bytes looking for a match anywhere. Instead, it should use startsWith, since the HTTP request will start with the method, like GET.
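With String that would be a call like httprq.startsWith("GET /postal/"). If you want to go one step further and replace the else-if chain with a table, a rough sketch in plain C strings could look like this. matchRoute and the routes array are hypothetical names of mine, not part of the project.

```cpp
#include <stddef.h>
#include <string.h>

// Returns the index of the first route that the request begins with, or -1.
// Only the first strlen(route) bytes of the request are compared, so headers
// and body are never scanned.
int matchRoute(const char *request, const char *const routes[], int count) {
  for (int i = 0; i < count; ++i) {
    if (strncmp(request, routes[i], strlen(routes[i])) == 0) return i;
  }
  return -1;
}
```

You would then switch on the returned index instead of testing every path in turn; with a String request, pass httprq.c_str().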

Just a general hint. I may have a few more if you can't get ESP8266WebServer working.
