WiFi shield/PHP/MySQL data transfer issue

Disclaimer - I am far from an expert in internet communication protocols - most of what I know was learned through this project; please don’t assume I know something if you don’t see it included below.

The setup: An Arduino Mega collects data, uses a WiFi shield to perform a GET request, and passes the data to a PHP script. The PHP script logs the data into a MySQL database and sends confirmation to the Arduino that the data has been received. It is important that data not be duplicated or skipped. I have since added datalogging code that records all the sent and received data to an SD card to help in debugging.

The problem: Everything runs fine for a period of time (usually between 10 minutes and 3 hours), but then the code will randomly hang while waiting for the PHP response and time out. Sometimes part of the response is received before timing out, other times nothing. Also, the last set of data usually makes it into the database, but not always. Because the write to the database comes after the response to the Arduino in the PHP code, I’m inclined to believe that the problem is somewhere in the return path from the server to the Arduino reading in the data.

Questions: Do you see anything in the code below that could be causing the response to time out? 97% of the responses are received by the Arduino in under 2 seconds, with a handful taking 5-7 seconds, so I feel like the 15 second timer should be sufficient. Is there a way that I can make the code more robust so that the Arduino can recover from this issue rather than just freezing? I had suspected that the WiFi might be losing the connection since it is in a fairly noisy environment, but when I poll WiFi.status() after the timeout, it still says that it is connected.

Here is what a normal response looks like on the logger:

GET /engineering/update.php?ID=00&E=010&PG=00000&P=00&S=0 HTTP/1.1 <-shows the outgoing request
HTTP/1.1 200 OK <-beginning of the response header
Content-Type: text/html
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Tue, 09 Jul 2013 15:59:26 GMT
Connection: close
Content-Length: 9 <-end of header

8:59:27 <-data returned from PHP script
!
1774 <-elapsed time (milliseconds)

This is the segment of the Arduino code that handles the WiFi:

void servePage(byte i) {
  Serial.println("connecting to server");
  char url[67];
  sprintf(url, "GET /test/update.php?ID=%02d&E=%03d&PG=%05d&P=%02d&S=%01d HTTP/1.1", machineID[i], byte(elapsed[i]), currentProg[i], parts[i], state[i]);
  Serial.print("URL:");
  char server[] = "myserver.com";
  Serial.println(url);
  boolean complete = false;
  byte q = 0;
  File dataFile = SD.open("serveLog.txt", FILE_WRITE);
  dataFile.println(url);
  while (!complete) {
    if (client.connect(server, 80)) {
      client.println(url);
      client.println("Host: www.myserver.com\r\nConnection: close\r\n");
      Serial.println("Update complete, waiting for response");
      complete = true;
    }
    else {
      Serial.println("Failed to reach server, retrying");
      dataFile.println("Failed to reach server, retrying");
    }
    q++;
    if (q > 10) {
      Serial.println("Server connect timeout");
      dataFile.println("Server connect timeout");
      dataFile.close();
      while (true) {
        delay(200);
        digitalWrite(status_led, !digitalRead(status_led));
      }
    }
  }
  long startTime = millis();
  long lagTime = startTime;
  complete = false;
  while (!complete) {  //wait to receive opening symbol from server (discard HTML header)
    if (client.available()) {
      lagTime = millis();
      complete = client.peek() == '#';
      dataFile.print(char(client.read()));
    }
    if (WiFi.status() != WL_CONNECTED) {
      dataFile.println("Connection Lost");
    }
    if (millis() > lagTime + 15000) {  //timeout if more than 15 seconds have elapsed since the last character arrived
      Serial.print("Response timeout");
      dataFile.println("Response timeout");
      dataFile.println(WiFi.status());  //check to see if the wifi connection is still good
      dataFile.close();
      while (true) {
        digitalWrite(status_led, HIGH);
        delay(200);
        digitalWrite(status_led, LOW);
        delay(1000);
      }
    }      
  }
  
  q = 0;
  dataFile.println();
  while (client.peek() != '!') {  //read in data until end symbol is found
    dataFile.print(char(client.read()));
    if (WiFi.status() != WL_CONNECTED) {
      dataFile.println("Connection Lost");
    }
    q++;
    if (q > 10) {
      Serial.println("Invalid server response");
      dataFile.println("Invalid server response");
      dataFile.close();
      while (true) {
        digitalWrite(status_led, HIGH);
        delay(200);
        digitalWrite(status_led, LOW);
        delay(200);
        digitalWrite(status_led, HIGH);
        delay(200);        
        digitalWrite(status_led, LOW);
        delay(1000);
      } 
    }     
  }
  dataFile.println(char(client.read()));  //clear end symbol
  dataFile.println(millis() - startTime);
  dataFile.close();
  Serial.println();
  Serial.println("Disconnecting");
  client.stop();
  Serial.println("Request complete");  
}

And update.php:

<?PHP
echo "#";
echo date("G:i:s");
echo "!";
$host=""; // Host name
$username=""; // Mysql username
$password=""; // Mysql password
$db_name=""; // Database name

mysql_connect("$host", "$username", "$password")or die("cannot connect");
mysql_select_db("$db_name")or die("cannot select DB");

$ID = $_GET['ID'];
$Status = $_GET['S'];
$Program_Num = $_GET['PG'];
$Parts = $_GET['P'];
$Elapsed = $_GET['E'];

$sql = "INSERT INTO Update_Log (`ID`, `Status`, `Program_Num`, `Parts`, `Elapsed`) VALUES ('$ID', '$Status', '$Program_Num', '$Parts', '$Elapsed')";
$result = mysql_query($sql);
?>

This is not what you want. If the connection times out, this stays in the while(true) loop forever, right?

    q++;
    if (q > 10) {
      Serial.println("Server connect timeout");
      dataFile.println("Server connect timeout");
      dataFile.close();
      // if the connection breaks, it will stay in this loop forever
      while (true) {
        delay(200);
        digitalWrite(status_led, !digitalRead(status_led));
      }
    }

I think you should close the connection at that point instead. I would definitely wait longer too.

    q++;
    if (q > 10) {
      Serial.println("Server connect timeout");
      dataFile.println("Server connect timeout");
      dataFile.close();
      // close your end
      client.stop();
      // you are complete now
      complete=true;
    }
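
To keep the board from freezing without silently dropping a reading, the send routine can report what happened and leave the decision to the caller. What follows is only a rough sketch under assumptions, not a drop-in replacement: SendResult, AttemptFn, CleanupFn and sendWithRetry are made-up names, and the transport is passed in as a function pointer so the logic can be tried on a PC. On the Arduino, the attempt function would wrap client.connect(), the GET and the wait for the reply, and the cleanup function would call client.stop().

```cpp
// Hypothetical result codes; none of these names come from the WiFi library.
enum class SendResult { Acked, ConnectFailed, ResponseTimeout };

// One attempt = connect, send the request, wait for the "#...!" reply.
typedef SendResult (*AttemptFn)();
// Cleanup = close our end of the connection (client.stop() on the Arduino).
typedef void (*CleanupFn)();

// Try to deliver one reading up to maxAttempts times.
// Returns Acked as soon as one attempt is confirmed; otherwise returns the
// outcome of the last failed attempt so the caller can keep the reading
// and try again later instead of halting.
SendResult sendWithRetry(AttemptFn attempt, CleanupFn cleanup, int maxAttempts) {
  SendResult last = SendResult::ConnectFailed;
  for (int i = 0; i < maxAttempts; ++i) {
    last = attempt();
    if (last == SendResult::Acked) {
      return last;   // confirmed by the server, safe to move on
    }
    cleanup();       // close the connection before the next try
  }
  return last;
}
```

The caller could blink the status LED on a failure and call sendWithRetry() again with the same reading on the next pass through loop(), so nothing is skipped and the board keeps running.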

Correct, it will stay in that loop because I don't want to proceed without knowing whether or not the data transferred successfully. I could just close it and try again, but then I would not know if the last set of data had successfully gone through - I could call a different php script to check the last entry, but that would likely have the same connection issue.

I'll try bumping up the timeout even more and see what happens.
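
A side note on the timeout test itself: millis() returns an unsigned long that wraps around after roughly 49.7 days, while the posted code stores it in a signed long and compares against lagTime + 15000. That is unlikely to explain a hang that shows up within a few hours, but the check can be written so it survives the wrap. A minimal sketch, assuming a 32-bit millisecond counter; timedOut is a made-up helper name, and it uses >= where the posted code uses >:

```cpp
#include <stdint.h>

// Returns true once at least timeoutMs milliseconds have passed since lastMs.
// now and lastMs are readings of a wrapping 32-bit millisecond counter
// (the assumption here is that millis() is 32 bits wide, as on the Mega).
bool timedOut(uint32_t now, uint32_t lastMs, uint32_t timeoutMs) {
  // Unsigned subtraction wraps modulo 2^32, so the difference stays correct
  // even if the counter rolled over between the two readings.
  return (uint32_t)(now - lastMs) >= timeoutMs;
}
```

In the response loop this would be used as timedOut(millis(), lagTime, 15000) with lagTime declared as unsigned long.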

If that is what you want,

"Correct, it will stay in that loop because I don’t want to proceed without knowing whether or not the data transferred successfully."

and that is what it is doing,

"The problem: Everything will run fine for a period of time (usually between 10 minutes and 3 hours), but then the code will randomly hang when it is waiting for the php response and time out."

then you are doing fine. You know it will hang if the connection fails, and when it does, you act surprised.

Sometimes connections fail. Sometimes it is caused by a power failure somewhere in Georgia due to lightning. Sometimes a flooded basement in Peoria. You can’t prevent that unless the server and client are on the same local network.

Sometimes the server gets the request and just doesn’t get a chance to send a response before the connection breaks. Sometimes the server doesn’t get the request.

You must be kinda prepared for all these “sometimes”.
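
One common way to be prepared for both cases (request lost, or reply lost) is to make a resend harmless: the client tags each reading with a sequence number and keeps resending the same number until it gets an acknowledgement, and the server stores a given number only once. The sketch below is a PC-runnable model under stated assumptions, not the thread's code: the sequence number is an extra request parameter that the posted URL does not have, Reading and UpdateLog are made-up names, and in the real setup this logic would live in update.php and the database.

```cpp
#include <stdint.h>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical reading as the server would see it after parsing the request.
struct Reading {
  uint8_t machineId;
  uint32_t seq;   // assumed per-machine counter, advanced only after an ack
  int parts;
};

// Hypothetical stand-in for the Update_Log table plus a duplicate check.
class UpdateLog {
 public:
  // Stores the reading unless it repeats the last sequence number stored
  // for that machine. Returns true if a new row was stored, false if the
  // reading was a resend. Either way the server should acknowledge it,
  // because in both cases the data is already safely logged.
  bool accept(const Reading& r) {
    std::map<uint8_t, uint32_t>::iterator it = lastSeq_.find(r.machineId);
    if (it != lastSeq_.end() && it->second == r.seq) {
      return false;   // resend of a reading that was already stored
    }
    rows_.push_back(r);
    lastSeq_[r.machineId] = r.seq;
    return true;
  }

  std::size_t rowCount() const { return rows_.size(); }

 private:
  std::map<uint8_t, uint32_t> lastSeq_;   // last stored seq per machine
  std::vector<Reading> rows_;             // stored readings
};
```

With something like this on the server, the Arduino can close the connection after a timeout and resend the same reading without risking a duplicate row. It also suggests sending the acknowledgement only after the insert has succeeded, whereas the posted update.php echoes its reply first.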

And sometimes your code doesn’t do what you think it will. Here is my HTTP client code for the Ethernet shield. Once the transport is established, all the rest should be the same.
http://playground.arduino.cc/Code/WebClient
If it works for you, then feel free to borrow the code sections you need.