Help retrieving specific text from HTTP query string!?

Hiya,

First off, apologies if my title for this question is not correct or clear!

Second, my question! I am currently trying to set up a little project that will let me open my browser, browse to my Arduino via the Ethernet shield (192.168.0.20), and get an HTML page served by the Arduino. On that page I want to type some text (e.g. ExampleText) into a form's text input and have it sent to the Arduino in the URL's query string, so that the Arduino can take that text from the query string and print it to the attached LCD display!

OK so far I have all the server part working, the page loads, I can enter the text into the box and hit submit. This text is then sent to the Arduino where I can then see it if I print the following data to the serial monitor.

If the text entered is ExampleText the URL at the top is:

http://192.168.0.20/?box=ExampleText

And the entire text received by the Arduino is:

GET /?box=ExampleText HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36
Referer: http://192.168.0.20/
Accept-Encoding: gzip,deflate
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

Currently the second, larger block of text is saved one character at a time into a String variable called HTTP_req. However, the reason I am stuck is that I now want a way of taking whatever text is between "?box=" and "HTTP/1.1" in the above block of text, then saving it to a new String variable so I have that value isolated in a new variable ready to print to the LCD.

So any answers would be great, there surely must be an easy way to do that!?

Thanks!

So any answers would be great, there surely must be an easy way to do that!?

There is. But, it works this way. You post your code. We tell you how to fix it. We don't normally write your code for you.

Oh sorry, below is my current code. The reason I did not post it is that I haven't written anything yet for the part my question is about, which is why I need help. The ProcessPrintout() function at the bottom is just in place to test that substrings were working.

#include <LiquidCrystal.h>
#include <SPI.h>
#include <Ethernet.h>
#include <SD.h>

LiquidCrystal lcd(7, 6, 5, 8, 3, 2);
byte mac[] = { 0x90, 0xA2, 0xDA, 0x0E, 0xCE, 0xAB };
IPAddress ip(192, 168, 0, 20);
EthernetServer server(80);

File webFile;
String HTTP_req;

void setup() {
  Serial.begin(9600);
  lcd.begin(20, 4);
  Ethernet.begin(mac, ip);
  server.begin();
  lcd.setCursor(0, 0);
  lcd.print("Server Started");
  lcd.setCursor(0, 1);
  lcd.print("IP = ");
  lcd.print(Ethernet.localIP());
}

void loop() {  
  EthernetClient client = server.available();
  
  if (client) 
  {
    boolean currentLineIsBlank = true;
    while (client.connected())
    {
      if (client.available())
      {
        char c = client.read();
        HTTP_req += c;
        if (c == '\n' && currentLineIsBlank)
        {
          client.println("HTTP/1.1 200 OK");
          client.println("Content-Type: text/html");
          client.println("Connection: close");
          client.println();
          //Send web page
          client.println("<!DOCTYPE html>");
          client.println("<html>");
          client.println("<head></head>");
          client.println("<body>");
          client.println("<h1>Hello from the Arduino!</h1>");
          client.println("<p>A web page from the Arduino SD card server</p>");
          client.println("<form method='get'>");
          client.println("<input type='text'  name='box'>");
          client.println("<input type='submit'>");
          client.println("</form>");
          client.println("</body>");
          client.println("</html>");
          ProcessPrintout();
          HTTP_req = "";
          break; 
        }
        if (c == '\n')
        {
          currentLineIsBlank = true;
        }
        else if (c!= '\r')
        {
          currentLineIsBlank = false;
        }
      }
    }
    delay(1);
    client.stop();
  }
}

void ProcessPrintout()
{
  if (HTTP_req.indexOf("GET /?box=") > -1)
  {
    Serial.println(HTTP_req);
    if (HTTP_req.substring(10, 13) == "TTT")
   {
     Serial.println("Done it!");
   } 
   else
   {
     Serial.println("Nope!");
   }
  }
}

The ProcessPrintout() function at the bottom is just in place to test that substrings were working.

That's where you want to create a substring from HTTP_req. Save the value from HTTP_req.indexOf("GET /?box=") into a variable. Save the value from HTTP_req.indexOf(" HTTP/1.1") into a variable. Add 10 (the length of "GET /?box=") to the first variable. That will be the index of the first character of the substring. Subtract the first variable from the second. That will be the length of the substring. Create a new String that is a substring of HTTP_req.
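A minimal sketch of those steps, written for a desktop compiler rather than the Arduino. It assumes std::string as a stand-in for the Arduino String class, and extractBoxValue is a made-up name for illustration. std::string::substr takes a start position and a length, which is what the steps above describe; the Arduino String::substring() takes a start index and an end index instead, so the last call would differ there.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between "GET /?box=" and " HTTP/1.1",
// or an empty string if either marker is missing.
std::string extractBoxValue(const std::string& request) {
    const std::string startMarker = "GET /?box=";
    const std::string endMarker = " HTTP/1.1";

    std::size_t start = request.find(startMarker);
    if (start == std::string::npos) return "";
    start += startMarker.size();  // index of the first character of the value

    std::size_t end = request.find(endMarker);
    if (end == std::string::npos || end < start) return "";

    // std::string::substr takes (position, length), not (from, to).
    return request.substr(start, end - start);
}
```

Calling extractBoxValue() on the request shown earlier in the thread should give back just the text typed into the form.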

Hiya,

Thanks for the reply. I have been attempting to write the code the way you suggested, but I am getting some very strange and unpredictable results. The only change I have made is in the ProcessPrintout() function; the new code is shown below.

void ProcessPrintout()
{  
  if (HTTP_req.indexOf("GET /?box=") > -1)
  {
    int startPos = HTTP_req.indexOf("GET /?box=");
    int endPos = HTTP_req.indexOf(" HTTP/1.1");
    startPos += 10;
    int subStringLength = endPos - startPos;
    String toPrint = HTTP_req.substring(startPos, subStringLength);
    Serial.print(toPrint);    
  }
}

The errors are tricky to explain, but I will try my best. Firstly, the value of toPrint shown in the serial monitor is not correct. If I send through the text "test" it prints '/?box=', which is wrong. If I then type "testt" (two t's at the end) I get '?box=', which is very odd. And finally, if I send through another text value, toPrint is suddenly blank! So confused, am I being stupid here!?

Thanks!!
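One possible explanation for the '/?box=' and '?box=' results, sketched on a desktop compiler. It assumes that the Arduino String::substring(from, to) treats its second argument as an end index and swaps the two arguments when the first is larger than the second. The function arduinoStyleSubstring below is a made-up imitation of that behaviour using std::string; it is not a library function.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical imitation of substring(from, to) with index swapping.
std::string arduinoStyleSubstring(const std::string& s, std::size_t left, std::size_t right) {
    if (left > right) std::swap(left, right);  // arguments are reordered, not rejected
    if (left >= s.size()) return "";
    if (right > s.size()) right = s.size();
    return s.substr(left, right - left);       // characters left .. right-1
}
```

With "test" the computed length is 4, so the call is effectively substring(10, 4), which under this assumption becomes characters 4 to 9 of the request, i.e. '/?box='. With "testt" the length is 5, giving characters 5 to 9, i.e. '?box='.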

Print HTTP_req at the start of the function:

Serial.print("Request: [");
Serial.print(HTTP_req);
Serial.println("]");

Print the values of startPos and endPos. Are they reasonable?
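As a rough illustration of labelled, delimited debug output, here is a desktop sketch that builds the same "Name: [value]" lines as strings. The names labeled and showControlChars are made up for this example, std::string stands in for the Arduino String, and making \r and \n visible is an extra assumption that is not part of the suggestion above.

```cpp
#include <string>

// Replace carriage returns and line feeds with visible escapes.
std::string showControlChars(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '\r') out += "\\r";
        else if (c == '\n') out += "\\n";
        else out += c;
    }
    return out;
}

// Build "label: [value]" so that empty or padded values are obvious.
std::string labeled(const std::string& label, const std::string& value) {
    return label + ": [" + showControlChars(value) + "]";
}

// Same idea for integer values such as startPos and endPos.
std::string labeled(const std::string& label, int value) {
    return label + ": [" + std::to_string(value) + "]";
}
```

On the Arduino the equivalent would be three Serial.print() calls per value, as in the snippet above.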

However, the reason I am stuck is that I now want a way of taking whatever text is between "?box=" and "HTTP/1.1" in the above block of text, then saving it to a new String variable so I have that value isolated in a new variable ready to print to the LCD.

You probably could use the start/stop delimited code often used for serial communications to capture the desired data. Use the = as the capture start delimiter and the space character as the capture stop delimiter.
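A small sketch of that start/stop delimiter idea, for a desktop compiler. DelimitedCapture and captureFirstValue are made-up names, and std::string stands in for the Arduino String. It also assumes that only the first line of the request should be examined, because later header lines can contain '=' characters too (for example q=0.9).

```cpp
#include <string>

// Hypothetical capture helper: starts storing after '=' and stops at ' '.
class DelimitedCapture {
public:
    // Feed one received character; returns true once the stop delimiter is seen.
    bool feed(char c) {
        switch (state_) {
        case State::WaitingForStart:
            if (c == '=') state_ = State::Capturing;
            break;
        case State::Capturing:
            if (c == ' ') state_ = State::Done;
            else value_ += c;
            break;
        case State::Done:
            break;
        }
        return state_ == State::Done;
    }

    const std::string& value() const { return value_; }

private:
    enum class State { WaitingForStart, Capturing, Done };
    State state_ = State::WaitingForStart;
    std::string value_;
};

// Feed characters one at a time, as they would arrive from client.read().
std::string captureFirstValue(const std::string& request) {
    DelimitedCapture capture;
    for (char c : request) {
        if (c == '\n') break;           // only look at the request line
        if (capture.feed(c)) break;     // stop delimiter reached
    }
    return capture.value();
}
```

Because characters are handled as they arrive, this approach would not need the whole request stored in one large String.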

Thanks zoomkat for the reply. That does sound like a good possibility, and I will definitely do some further research into it if I cannot get this current method working correctly or efficiently!

As for the values stored in startPos and endPos: startPos is 10, which I believe is correct, but endPos is 0 for some strange reason!? You don't by any chance happen to have an Arduino and Ethernet shield set up that you can test this on? No need for the LCD just for testing, I guess.

Thanks again!

OK, correction: I just tested the startPos and endPos values again today, and with "test" as the text entered, startPos is now 10 and endPos is 14, as it should be, unlike yesterday! Very odd, as I have changed no other code. With that now correct, I began to wonder why it was storing the wrong substring in toPrint. So I changed the following line of code:

String toPrint = HTTP_req.substring(startPos, subStringLength);

To...

String toPrint = HTTP_req.substring(startPos, endPos);

And then removed this line altogether...

int subStringLength = endPos - startPos;

The new code now looks like this...

void ProcessPrintout()
{  
  if (HTTP_req.indexOf("GET /?box=") > -1)
  {
    int startPos = HTTP_req.indexOf("GET /?box=");
    int endPos = HTTP_req.indexOf(" HTTP/1.1");
    startPos += 10;
    String toPrint = HTTP_req.substring(startPos, endPos);
    Serial.println(HTTP_req);
    lcd.clear();
    lcd.setCursor(0, 0);
    lcd.print(toPrint);   
  }
}

OK so this new code now works just how I wanted it to: it takes the value entered in the browser text field and prints it perfectly to the LCD screen. However, that is when browsing to the Arduino from my laptop using Chrome. If I navigate to the exact same address (http://192.168.0.20/?box=test) in my iPhone browser, it all breaks. So I printed what was coming in via HTTP_req to the serial monitor and, weirdly, there is another section to it, as follows:

GET /favicon.ico HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 7_0_4 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) CriOS/31.0.1650.18 Mobile/11B554a Safari/8536.25
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

This is confusing me as the code to print the HTTP_req to the serial monitor is contained in the if statement, if (HTTP_req.indexOf("GET /?box=") > -1), which the second block of text does not have in it and so shouldn't even enter the statement in the first place! Please help as this is super weird! Am I missing anything!? Thanks!

  if (HTTP_req.indexOf("GET /?box=") > -1)
  {
    int startPos = HTTP_req.indexOf("GET /?box=");

Is there some reason that you need to call indexOf() twice for the same string?

OK so this new code now works just how I wanted it to

The substring function I use (from a completely different class) uses start position and length as input, not start position and the first position beyond the string that you want. Sorry about leading you astray. In my defense, you should RTFM. 8)

Am I missing anything!?

Here's a hint: "There is another section".

The HTTP_req String contains TWO GET statements and two HTTP/1.1 statements. The indexOf() statement can be used to find the HTTP/1.1 phrase that follows the GET/?box phrase. RTFM :)
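If the buffer ever did hold more than one request, searching for the end marker only from the start position onwards would avoid picking up an earlier " HTTP/1.1". A desktop sketch of that idea follows; valueAfterBox is a made-up name and std::string stands in for the Arduino String. On the Arduino side, the two-argument form indexOf(value, fromIndex) plays the role of find(value, pos) here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: finds the " HTTP/1.1" that follows "GET /?box=".
std::string valueAfterBox(const std::string& buffer) {
    const std::string startMarker = "GET /?box=";

    std::size_t start = buffer.find(startMarker);
    if (start == std::string::npos) return "";
    start += startMarker.size();

    // Begin the second search at 'start' so earlier matches are skipped.
    std::size_t end = buffer.find(" HTTP/1.1", start);
    if (end == std::string::npos) return "";

    return buffer.substr(start, end - start);
}
```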

The repeated code is in fact not required which I had already noticed but had not got around to changing as it is not important for now. Secondly when I ask a question and get what seems like a very confident answer from a very confident person, I don't feel like I should then have to go and RTFM!!

And anyway this part of your reply doesn't seem to make sense to me?

The HTTP_req String contains TWO GET statements and two HTTP/1.1 statements. The indexOf() statement can be used to find the HTTP/1.1 phrase that follows the GET/?box phrase.

Because when it prints the HTTP_req variable to the console, it prints the two blocks of text as two separate blocks. The first block, with GET /?box= in it, is saved to HTTP_req and used in the ProcessPrintout function, then HTTP_req is set to blank. After that, the second block, with the favicon stuff in it, is written to HTTP_req, but when ProcessPrintout is called again it allows it into the if statement even though it shouldn't as it doesn't contain GET /?box= and so the value shouldn't be greater than -1 and so doesn't make any sense!!!! :~

it allows it into the if statement even though it shouldn't as it doesn't contain GET /?box= and so the value shouldn't be greater than -1 and so doesn't make any sense!!!!

The print of HTTP_req should be in ProcessPrintout(), so that it is VERY clear what ProcessPrintout() is going to deal with.

The value returned by indexOf() should be printed, in ProcessPrintout(), and used in the if test.

OK think I know what you are saying, so I have changed my code to the following...

void ProcessPrintout()
{  
  int startPos = HTTP_req.indexOf("/?box=");  
  Serial.println(HTTP_req);
  Serial.println(startPos);
  if (startPos > -1)
  {    
    Serial.println("Entered IF Statement!");
    int endPos = HTTP_req.indexOf(" HTTP/1.1");
    startPos += 6;  // length of "/?box="
    String toPrint = HTTP_req.substring(startPos, endPos);
    lcd.clear();
    lcd.setCursor(0, 0);
    lcd.print(toPrint); 
  }
}

In my head, this should print both requests (the one with the GET info in it and the favicon one), each followed by the index of "/?box=". The one with "/?box=" should print an index of 4, meaning it will enter the IF statement, which is correct; the favicon one should print -1, meaning it shouldn't enter the IF statement! These are the results from entering the text "test"...

GET /?box=test HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36
Referer: http://192.168.0.20/
Accept-Encoding: gzip,deflate
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6


0
Entered IF Statement!
GET /favicon.ico HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36
Accept-Encoding: gzip,deflate
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6


0
Entered IF Statement!

Any ideas!?

Be aware that some "tutti-frutti" browsers make a second request for an icon, like the one below. You may need to ignore such requests.

GET /favicon.ico HTTP/1.1

That's fine, and I'm happy to just ignore it, but the question is how to do that. My code as it currently stands should be ignoring it, but as you can see from my post above, it's not?
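One way such a request could be recognised is by checking how the request line begins. The sketch below is for a desktop compiler, with std::string standing in for the Arduino String; isFaviconRequest is a made-up name. On the Arduino, String::startsWith() could do the same job.

```cpp
#include <string>

// Hypothetical check: true if the request line asks for /favicon.ico.
bool isFaviconRequest(const std::string& request) {
    const std::string prefix = "GET /favicon.ico ";
    // compare() returns 0 when the first prefix.size() characters match.
    return request.compare(0, prefix.size(), prefix) == 0;
}
```

A request that passes this check could simply be skipped before any query string processing is attempted.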

Below is some simple server test code which had the "on" in the query_string changed to "on1" to stop the "on" in "icon" from causing similar problems.

//zoomkat 8-17-13
//web LED code
//for use with IDE 1.0
//open serial monitor to see what the arduino receives
//use the \ slash to escape the " in the html (or use ') 
//address will look like http://192.168.1.102:84 when submitted
//for use with W5100 based ethernet shields
//turns pin 5 on/off

#include <SPI.h>
#include <Ethernet.h>

byte mac[] = { 
  0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED }; //physical mac address
byte ip[] = { 
  192, 168, 1, 102 }; // arduino server ip in lan
byte gateway[] = { 
  192, 168, 1, 1 }; // internet access via router gateway
byte subnet[] = { 
  255, 255, 255, 0 }; //subnet mask
EthernetServer server(84); //arduino server port

String readString; 

//////////////////////

void setup(){

  pinMode(5, OUTPUT); //pin selected to control
  //start Ethernet
  Ethernet.begin(mac, ip, gateway, subnet);
  server.begin();

  //enable serial data print 
  Serial.begin(9600); 
  Serial.println("servertest1"); // so I can keep track of what is loaded
}

void loop(){
  // Create a client connection
  EthernetClient client = server.available();
  if (client) {
    while (client.connected()) {
      if (client.available()) {
        char c = client.read();

        //read char by char HTTP request
        if (readString.length() < 100) {

          //store characters to string 
          readString += c; 
          //Serial.print(c); //print what server receives to serial monitor
        } 

        //if HTTP request has ended
        if (c == '\n') {

          ///////////////
          Serial.println(readString);

          //now output HTML data header

          //client.println("HTTP/1.1 200 OK");
          //client.println("Content-Type: text/html");
          //client.println();
          
          //client.print(F("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"));


          client.print(F("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"
            "<HTML><HEAD><TITLE>Arduino GET test page</TITLE>"
            "</HEAD><BODY><H1>Zoomkat's simple Arduino button</H1>"
            "<a href='/?on1'>ON</a>&nbsp;<a href='/?off'>OFF</a></BODY></HTML>"));

          delay(1);
          //stopping client
          client.stop();

          /////////////////////
          if(readString.indexOf("on1") >0)//checks for on
          {
            digitalWrite(5, HIGH);    // set pin 5 high
            Serial.println("Led On");
          }
          if(readString.indexOf("off") >0)//checks for off
          {
            digitalWrite(5, LOW);    // set pin 5 low
            Serial.println("Led Off");
          }
          //clearing string for next read
          readString="";

        }
      }
    }
  }
}
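To illustrate why "on" was changed to "on1" in the code above, here is a short desktop sketch of the substring search. containsToken is a made-up name and std::string stands in for the Arduino String.

```cpp
#include <string>

// Hypothetical helper: true if 'token' appears anywhere in the request line.
bool containsToken(const std::string& requestLine, const std::string& token) {
    return requestLine.find(token) != std::string::npos;
}
```

A plain search for "on" also matches the "on" inside "favicon.ico", so the favicon request would switch the LED on. Searching for "on1" avoids that false match.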

Both GET requests, the one with /?box and the one without, cause indexOf() to return 0. Clearly, that is wrong.

As for why, I don't want to venture a guess until you stop printing anonymous, undelimited data.

Each time I've suggested that you print something, I've included code that printed an identifier and, for Strings, starting and ending markers. Why do you strip that stuff out?

Thanks for the reply zoomkat, I will definitely look into that theory to see if there is any similar data in each text block which might be the same as or similar to each other.

PaulS...

Both GET requests, the one with /?box and the one without, cause indexOf() to return 0. Clearly, that is wrong.

I know this is incorrect which is why I posted the results. I am now trying to discover why it is incorrect.

As for why, I don't want to venture a guess until you stop printing anonymous, undelimited data.

By this statement, do you mean that when I print data to the serial monitor, I am not adding text in front to show what that data is? E.g. instead of printing 0, print StartPos: 0? If this is what you mean then my apologies, and I will begin to do this.

Each time I've suggested that you print something, I've included code that printed an identifier and, for Strings, starting and ending markers. Why do you strip that stuff out?

Finally you have confused me here. I have just re-read this forum post and cannot see where you have ever posted example code? I have seen a couple of hints towards code I should add or try but never

included code that printed an identifier and, for Strings, starting and ending markers

Please could you verify this?

Thanks

Please could you verify this?

Sure. Look at reply #5.

Ah my apologies, I had skipped past that, assuming for some reason that it was a quote of my code. OK, so I have now altered my code for a much smoother troubleshooting process, which I hope you will be satisfied with :)

Here is my new code...

void ProcessPrintout()
{  
  Serial.print("Request: [");
  Serial.print(HTTP_req);
  Serial.println("]");
  Serial.println("");
  int startPos = HTTP_req.indexOf(" /?box="); 
  int endPos = HTTP_req.indexOf(" HTTP/1.1");
  Serial.print("Start Pos: [");
  Serial.print(startPos);
  Serial.println("]");
  Serial.print("End Pos: [");
  Serial.print(endPos);
  Serial.println("]");
  if (startPos > -1)
  {    
    Serial.println("Entered IF Statement!");  
    Serial.println("");  
    startPos += 7;  // length of " /?box="
    String toPrint = HTTP_req.substring(startPos, endPos);
    lcd.clear();
    lcd.setCursor(0, 0);
    lcd.print(toPrint); 
  }
}

And here are my new results from the serial monitor when using the word "test" as a test...

Request: [GET /?box=test HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36
Referer: http://192.168.0.20/
Accept-Encoding: gzip,deflate
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

]

Start Pos: [0]
End Pos: [0]
Entered IF Statement!

Request: [GET /favicon.ico HTTP/1.1
Host: 192.168.0.20
Connection: keep-alive
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.76 Safari/537.36
Accept-Encoding: gzip,deflate
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

]

Start Pos: [0]
End Pos: [0]
Entered IF Statement!

Hope this is much clearer and helps to find what the heck is happening here! :~