WiFi shield server library corruption/crashes

The wifi shield server library appears to have a very serious bug. It corrupts file uploads and crashes if it receives too many requests in a row, especially for large files like images.

My wifi shield ran ok most of yesterday, but with no large downloads in quick succession. This morning when I checked it, it had crashed, and the last files uploaded were the waterfall image, twice in a row. I suspect this multiple image request is the culprit.

This has been discussed in a previous thread, but the subject of that thread is misleading. http://forum.arduino.cc/index.php?topic=166151.0

I have a wifi shield running server code exposed to the internet. You can experiment with it if you like. My only request is if you crash it, please post here what you did to make it crash. Normally, the culprit appears to be two images downloaded one immediately after the other. http://68.99.58.119

Here are links to the images. http://68.99.58.119/img/wolver.jpg http://68.99.58.119/img/waterfal.jpg

If you are having issues with your wifi shield server, I would like to hear from you.

BTW, the wifi shield client library seems to work great. Here is a thread on it. http://forum.arduino.cc/index.php?topic=200784.0

The UDP library also works well.

I would like to help with a firmware solution, however I don't have a shield. So if anyone has seen a cheap(er) version available please post where you saw it.

The server just crashed after two requests, one immediately after the other, but neither was for a large file. Both requests should have received an error message. The first error should have been "Bad request", and the second should have been "file not found".

edit: Here are the two requests that apparently crashed it:

Client request #31: GET //admin/phpmyadmin/scripts/setup.php HTTP/1.1
file = //ADMIN/PHPMYADMIN/SCRIPTS/SETUP.PHP
file type = PHP GET
method = GET
params =
protocol = HTTP/1.1
SD file
filename too long
disconnected
-1 80
-1 0
-1 0
-1 0

Client request #32: GET //admin/pma/scripts/setup.php HTTP/1.1
file = //ADMIN/PMA/SCRIPTS/SETUP.PHP
file type = PHP
method = GET
params =
protocol = HTTP/1.1
SD file
filename format ok
File not found
disconnected
-1 80
-1 0
-1 0
-1 0
SSID: OhSh
BSSID: 0:C:42:2B:B8:B4
signal strength (RSSI): 0
Encryption Type:0
Status:no shield

edit2: The client sent the same requests, and the server crashed again. :( @client: What web browser or software were you using to send the requests?

Nope. That long request did not crash my "THINGY". My "THINGY" is now an ethernet shield. Give this device your best shot.

edit: This is almost identical code running on an ethernet shield. All I asked was: If you can crash it, tell me how you did it. You couldn't even do that. :(

My "THINGY" is also not affected at all by the double image download. Kept right on going. Socket #0 for the first request. Socket #1 for the second request.

I am testing the possibility it is my code, so this is an ethernet shield running the same code. It has taken some real abuse over the last few weeks. It will be a tough nut to crack. :D

edit2: I forgot to post the ip again. http://68.99.58.119 See if you can crash that. Do your best (worst).

The code seems ok, so it is now the wifi shield. If you crash it, please let me know what you did, the OS (Windows, Mac, Linux), and the web browser you are using.

edit: If you get the incorrect images, let me know what OS and web browser if you please.

Just an update. I have the wifi library socket allocation problem worked out (I think), but the firmware still malfunctions, and fails (locks up) under repeated attack from the same ip. It requires multiple requests pending to malfunction and send the wrong file, and I am trying to determine the reason for the lockup.

I have a hacker that tries every night to either hack or crash my server sketch. Every night he/she has succeeded. But every night I get closer to finding the reason. I am hit by multiple requests (several per second) to bogus php pages, and after several, the Arduino quits responding to both wifi and serial input.

I want to thank the hacker for exposing that weakness. Keep it up! I'll find the cause eventually. http://68.99.58.119

edit: Does anybody know who did the wifi shield firmware? I have a few questions. edit2: Never mind. I found out. https://github.com/arduino/wifishield/tree/master/firmware

My quest is to determine when the socket will accept a new connection. It appears (?) the socket will accept a new connection after the sketch empties the rx buffer of the previous connection request. If that is the case, it explains a lot. If my sketch empties the rx buffer, and before my sketch can return a complete response, the server socket accepts the next connection request, that explains the corruption and wrong files.

SurferTim: My quest is to determine when the socket will accept a new connection. It appears (?) the socket will accept a new connection after the sketch empties the rx buffer of the previous connection request.

I see you are still struggling with this Tim. I would like to help but I fear you will not allow me to help.

In an attempt to get things moving in the right direction, can we agree on what a Socket is?

I propose;
+ A Socket is an instance of a state machine with standardised behaviour.
+ The purpose of a TCP Socket is to separate TCP endpoints into separate data streams at the application layer.

If we can agree on that, we might subsequently agree what the standardised behaviours should be and more quickly identify what is wrong with the WiFi shield firmware.
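To make the "state machine" wording concrete, here is a minimal host-side sketch in standard C++. It is not code from the shield firmware or the WiFi library. The three states and three events are a deliberately reduced, hypothetical set (real TCP has many more states), and the names `SockState`, `SockEvent` and `nextState` are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical, heavily reduced socket states (real TCP has more).
enum class SockState : std::uint8_t { Closed, Listening, Established };

// Events the socket can see in this simplified model.
enum class SockEvent : std::uint8_t { Listen, ConnectionAccepted, Close };

// Returns the next state. The state is a single enum value, so a socket
// can never be Listening and Established at the same time.
SockState nextState(SockState s, SockEvent e) {
    switch (s) {
        case SockState::Closed:
            return e == SockEvent::Listen ? SockState::Listening : s;
        case SockState::Listening:
            if (e == SockEvent::ConnectionAccepted) return SockState::Established;
            if (e == SockEvent::Close) return SockState::Closed;
            return s;
        case SockState::Established:
            // A further connection attempt does not change an established socket.
            return e == SockEvent::Close ? SockState::Closed : s;
    }
    return s;
}
```

Feeding it Listen, then ConnectionAccepted twice, leaves the socket Established after the first accept and unchanged by the second, which is the standardised behaviour being proposed.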

Your fear is not justified. I am looking for help. I was hoping you would respond. I would like your input on a socket function.

I have stopped the Arduino from crashing. That was Norton antivirus scanning the usb flash for viruses, and it did weird things to the Arduino. I moved it to a Linux box to monitor the serial port, and the Arduino crashes stopped.

I have not stopped the wifi shield firmware from crashing. If a client makes more than one request, it can cause the firmware to crash. I believe this is because the socket fails to stop listening for a new client and accepts the next connection before my code sends a response to the previous connection. Then you will get the wrong file (previous client request) on the current connection.

The wifi shield has 4 sockets. Each has its own section of memory to handle that socket. Each will handle one client connection at a time. I see the problem now as the one socket used by the server never stops listening for a new connection. It seems it accepts a new connection once my code empties that socket's rx buffer of the client request.

The ethernet shield does not have this problem because it will use all 4 sockets for server functions. When a client is detected on a socket, that socket quits listening, and a new server socket starts listening if one is available.
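The ethernet shield behaviour described above can be modelled on a PC with a few lines of standard C++. This is only a sketch of the description in this post, not the Ethernet library's actual code. `SocketPool`, `PoolState`, `acceptClient` and `stop` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

enum class PoolState { Closed, Listening, Established };

// Hypothetical model: 4 sockets, each serving at most one client.
struct SocketPool {
    static constexpr std::size_t kSockets = 4;
    std::array<PoolState, kSockets> state{};  // all Closed initially

    // Put the first free socket into Listening; returns its index or -1.
    int listen() {
        for (std::size_t i = 0; i < kSockets; ++i) {
            if (state[i] == PoolState::Closed) {
                state[i] = PoolState::Listening;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // A client connects: the listening socket quits listening and becomes
    // Established, and another free socket (if any) starts listening.
    // Returns the socket serving the client, or -1 if nothing is listening.
    int acceptClient() {
        for (std::size_t i = 0; i < kSockets; ++i) {
            if (state[i] == PoolState::Listening) {
                state[i] = PoolState::Established;
                listen();
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Model of client.stop(): free the socket, then make sure one listens.
    void stop(int sock) {
        state[static_cast<std::size_t>(sock)] = PoolState::Closed;
        bool listening = false;
        for (PoolState s : state) listening = listening || (s == PoolState::Listening);
        if (!listening) listen();
    }
};
```

In this model two back-to-back requests land on socket 0 and socket 1, so each client keeps its own rx buffer and its own response path.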

My question is: Should the server socket quit listening for a new connection until the server closes the connection with client.stop()? My tests have shown it never stops listening, and will accept another connection before the previous tcp connection on that socket is closed with client.stop().

SurferTim: Your fear is not justified. I am looking for help. I was hoping you would respond. I would like your input on a socket function.

Then I will ask you to do three things for me please;
+ Read my comments completely.
+ Try not to ignore what you might not understand.
+ Ask about what you might not understand.

I believe this is because the socket fails to stop listening for a new client and accepts the next connection before my code sends a response to the previous connection.

Fantastic, we have some common ground. As I said some time ago, Socket states are mutually exclusive. A Socket can not be in both the Established and Listening states at the same time. Your experiments are showing you what happens when that rule is broken.

If you understand the implications of the network being asynchronous, you might understand that the H&D chip and the AT32 MCU firmware on the shield handle the protocol independently of your sketch. I will post some test code which categorically proves that is what happens.

Then you will get the wrong file (previous client request) on the current connection.

Sort of.

You need to straighten out your understanding of the relationship between;
+ TCP - A transport layer protocol
+ Sockets - An application layer interface

A Socket can only behave like a Socket when it follows the rules of the protocol it is manipulating. The interface must send SYN, FIN, ACK, RST packets in the right order at the transport layer, to keep endpoints separated at the application layer. BTW, a TCP end point is the unique combination of source IP and source port number.
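As an illustration of "keeping endpoints separated", here is a small host-side C++ sketch that files incoming data under the sender's (source IP, source port) pair. It is an assumption-laden model, not anything taken from the shield firmware. `Endpoint` and `Demux` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// A remote TCP end point: source IP plus source port.
using Endpoint = std::pair<std::string, std::uint16_t>;

// Hypothetical demultiplexer: one receive stream per end point.
class Demux {
public:
    // Append a segment's payload to the stream of the end point that sent it.
    void onSegment(const Endpoint& from, const std::string& payload) {
        streams_[from] += payload;
    }

    // Everything received so far from one end point (empty if unknown).
    std::string stream(const Endpoint& from) const {
        auto it = streams_.find(from);
        return it == streams_.end() ? std::string() : it->second;
    }

    std::size_t endpoints() const { return streams_.size(); }

private:
    std::map<Endpoint, std::string> streams_;
};
```

Two clients on the same IP but different source ports end up in two separate streams, even if their segments arrive interleaved.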

The wifi shield has 4 sockets. Each has its own section of memory to handle that socket. Each will handle one client connection at a time.

Yes. For the sake of clarity, can we put aside how many Socket instances the Shield firmware may or may not provide. We only need to concern ourselves with the one instance which is providing the broken Server functionality, for the minute.

I see the problem now as the one socket used by the server never stops listening for a new connection.

Yes. A Socket can not be both Listening and Established. I keep repeating this because it is really, really important.

It seems it accepts a new connection once my code empties that socket's rx buffer of the client request.

NO. TCP connections are accepted by the H&D chip and data is placed into the Rx buffer by the AT32 firmware, independently of your sketch polling the SPI bus. See the test code below.

The ethernet shield does not have this problem because it will use all 4 sockets for server functions. When a client is detected on a socket, that socket quits listening, and a new server socket starts listening if one is available.

Yes. When a Socket in the Listening state is assigned to an end point, it changes to the Established state. A Socket can not be in two states at once. Note, I am capitalising Socket because I am talking about a standard Socket object, rather than what the Shield's firmware might refer to.

My question is: Should the server socket quit listening for a new connection until the server closes the connection with client.stop()?

Pretty much yes. There is more to it. You need to consider the relationship between a Socket and TCP. When we can talk about the SYN, SYN/ACK, SYN/RST and FIN packets, which are used to accept, reject, create and close sessions, I can tell you exactly what the Shield should do.

My tests have shown it never stops listening, and will accept another connection before the previous tcp connection on that socket is closed with client.stop().

It is actually worse than that but one thing at a time.

This is what the shield is doing;
+ Once the Socket is set to listen, the shield accepts up to 4 concurrent TCP session requests.
+ While the number of concurrent sessions is 4, further connection requests are ignored.
+ When the number of concurrent sessions falls below 4, the shield responds to any connection requests it has so far ignored. The client may have given up in the meantime.
+ The shield then accepts new connection requests.
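The four steps above can be sketched as a simple session counter. This is a host-side C++ model of the observed behaviour only. `SessionCounter` and its members are hypothetical, and answering the oldest ignored request first is an assumption of the sketch, not something the tests established.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical model of the admission behaviour listed above.
class SessionCounter {
public:
    static constexpr std::size_t kMaxSessions = 4;

    // A connection request (SYN) arrives from client `id`.
    // Returns true if it is accepted immediately.
    bool onSyn(int id) {
        if (active_ < kMaxSessions) {
            ++active_;
            accepted_.push_back(id);
            return true;
        }
        ignored_.push_back(id);  // no reply yet; the client may give up
        return false;
    }

    // One session ends. The oldest ignored request, if any, is answered late.
    // (Oldest-first ordering is an assumption of this sketch.)
    void onClose() {
        if (active_ == 0) return;
        --active_;
        if (!ignored_.empty()) {
            accepted_.push_back(ignored_.front());
            ignored_.pop_front();
            ++active_;
        }
    }

    std::size_t active() const { return active_; }
    std::size_t pending() const { return ignored_.size(); }
    const std::vector<int>& accepted() const { return accepted_; }

private:
    std::size_t active_ = 0;
    std::deque<int> ignored_;
    std::vector<int> accepted_;
};
```

Note that nothing in this model ties an accepted session to its own buffer; it only counts sessions.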

There are up to 4 concurrent sessions. There is only one input buffer. Connections are accepted and data is placed into the buffer, independently of the sketch polling the SPI bus. Do you see the problem?

You can't really call the Server functionality a Socket, because it does not behave as a Socket must behave. The firmware does not separate the TCP endpoints. What we have is a session counter and a buffer which may contain data from different end points, interleaved within it.
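A minimal model of "one buffer, several end points" shows why the sketch cannot tell the requests apart. This is standard C++ for a PC, written under the assumption that incoming data is simply appended in arrival order. `SharedRxBuffer` is a hypothetical name and the request strings are made up.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of one rx buffer shared by every accepted connection.
class SharedRxBuffer {
public:
    // Data from any end point is appended; the sender is not recorded.
    void onData(const std::string& payload) { buf_ += payload; }

    std::size_t available() const { return buf_.size(); }
    const std::string& contents() const { return buf_; }

private:
    std::string buf_;
};
```

If client A sends "GET /a", client B then sends "GET /b", and A finishes with " HTTP/1.1", the buffer holds "GET /aGET /b HTTP/1.1" and the byte count is just the sum of everything received.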

Code to follow

This is what the shield is doing;

  • Once the Socket is set to listen, the shield accepts up to 4 concurrent TCP session requests.

How can it be worse than this? When the sketch starts the wifi server, it will use only one socket for all connections. One socket, four connections, and no way to tell them apart in the library. You are sending the responses to one socket. How does it determine which ip/port to send the response to?

It should accept only one connection per socket. All other connection requests should be ignored until the socket is finished with the current client.

The ethernet shield does exactly what you describe, and it works because it is using all four sockets if they are available, with each client getting its own socket.

edit: Here is the function I use to check the socket status. It uses only one socket (socket 0). The ethernet shield has this same function, and it shows all 4 sockets being used as clients connect.

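#include <WiFi.h>
#include <utility/server_drv.h>

// Print one line per socket: the library's state entry and server port,
// then the firmware's server state (s), client state (c) and the number
// of rx bytes available (d).
void ShowSockStatus() {
  for (int x = 0; x < MAX_SOCK_NUM; x++) {
    Serial.print(WiFi._state[x]);
    Serial.print("  ");
    Serial.print(WiFi._server_port[x]);
    Serial.print("  s=");
    Serial.print(serverDrv.getServerState(x));
    Serial.print("  c=");
    Serial.print(serverDrv.getClientState(x));
    Serial.print("  d=");
    Serial.println(serverDrv.availData(x));
  }
}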

The test code.

All I do is start the server and print out the bytes available in the input buffer, every 10 seconds.
I never read the contents of the input buffer.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <avr/pgmspace.h>

#include <Streaming.h>
#include <WiFi.h>
#include <utility/server_drv.h>

const char SSID[] = "ssid";
const char PSK[] = "password";

void setup()
{
	Serial.begin(9600);

	do {
		Serial << F("Joining...\r\n");
		WiFi.begin((char*) SSID, (char*) PSK);
	} while (WiFi.status() != WL_CONNECTED);

	ServerDrv::startServer(80, 0, TCP_MODE);
	
	Serial << F("Exit Setup\r\n");
}

void loop()
{
	// Only query how many bytes are waiting on socket 0; never read them.
	uint16_t bytes = ServerDrv::availData(0);
	Serial << "bytesAvailable=" << bytes << "\r\n";
	delay(10000);	// poll every 10 seconds
}

Here is what Wireshark sees

No.  Time     Src Port  Dst Port  Protocol  Length  Info                      Data Len
  1  0.000                        0x0806    42      ARP
  2  0.091                        0x0806    60      ARP
  3  0.000    56798     80        TCP       78      56798 > 80 [SYN]          Len=0
  4  0.001    80        56798     TCP       60      80 > 56798 [SYN, ACK]     Len=0
  5  0.000    56798     80        TCP       54      56798 > 80 [ACK]          Len=0
  6  0.009    56798     80        TCP       425     56798 > 80 [PSH, ACK]     Len=371
  7  0.117    80        56798     TCP       60      80 > 56798 [ACK]          Len=0
  8  2.209    56799     80        TCP       78      56799 > 80 [SYN]          Len=0
  9  0.001    80        56799     TCP       60      80 > 56799 [SYN, ACK]     Len=0
 10  0.000    56799     80        TCP       54      56799 > 80 [ACK]          Len=0
 11  0.061    56799     80        TCP       425     56799 > 80 [PSH, ACK]     Len=371
 12  0.006    80        56799     TCP       60      80 > 56799 [ACK]          Len=0
 13  130.494                      0x0806    60      ARP

Packets 4 and 9 are the SYN/ACKs, sent from the Shield as it accepts client connection requests from two different end points.
The ~2 second gap between the connections is much less than the 10 second delay in the sketch.

Packets 6 and 11 show 371 bytes of data being sent from each of the end points.

Here is the Arduino output. 371 bytes x 2 clients = 742 bytes

Port open
Joining...
Exit Setup
bytesAvailable=0
bytesAvailable=0
bytesAvailable=742
bytesAvailable=742
bytesAvailable=742

This proves that the shield accepts connections and writes data into the Rx buffer, independently of the sketch polling the SPI bus.

Thanks, Matt! That was the info I was looking for. So my test sketch timing was too out of sync to see that happen. I always waited until I read the rx buffer to check the socket status.

Let's presume the wifi firmware uses only one socket, and it will accept connections to that one socket as fast as it can, overwriting or appending to the previous request as it does. If my sketch reads the request and, while it is opening the SD file, the socket accepts another request from a different ip/port, which client would get the response when my sketch starts sending it?

edit: My sketches only read the rx buffer until the first blank line. So if the socket is appending the connection requests, and the server gets two requests before I read the rx buffer, my sketches would send the response to the first client's request to the second client.
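Here is a small host-side C++ sketch of that "read until the first blank line" step, assuming the firmware really does append requests one after another in a single rx buffer (that part is still only a hypothesis). `readUntilBlankLine` is a hypothetical helper, not a function from my server sketch or the WiFi library.

```cpp
#include <cstddef>
#include <string>

// Removes and returns one HTTP request header (up to and including the
// first blank line) from `rx`. Returns an empty string and leaves `rx`
// untouched if no blank line has arrived yet.
std::string readUntilBlankLine(std::string& rx) {
    const std::string blank = "\r\n\r\n";
    const std::size_t end = rx.find(blank);
    if (end == std::string::npos) return std::string();
    std::string request = rx.substr(0, end + blank.size());
    rx.erase(0, end + blank.size());
    return request;
}
```

With two image requests appended in the buffer, the first call returns only the first request. The second request stays behind in the buffer while the response to the first is being sent, and nothing in the buffer says which client either request came from.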

SurferTim:

This is what the shield is doing; + Once the Socket is set to listen, the shield accepts up to 4 concurrent TCP session requests.

How can it be worse than this?

When I say worse, I guess I mean more broken, rather than the consequences of the brokenness being any worse than they already are.

You are sending the responses to one socket. How does it determine which ip/port to send the response to?

The short answer is, it doesn't! When I send your web server sketch a series of requests for different resources in short succession, I end up with the wrong data in the wrong files at my end. The right data appearing in the right files is, I think, a coincidence of timing.

It should accept only one connection per socket. All other connection requests should be ignored until the socket is finished with the current client.

If the shield has only one socket able to enter the Listening state, I would argue the shield should reject, rather than ignore, any connection request which arrives while the socket is in any other state.
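The difference between ignoring and rejecting can be written down as a tiny decision function. This is a sketch in standard C++ under the simplified three-state view used in this thread, not firmware code. `ListenerState`, `SynPolicy`, `SynReply` and `replyToSyn` are hypothetical names.

```cpp
// Simplified socket states, as discussed in this thread.
enum class ListenerState { Closed, Listening, Established };

// What to do with a SYN that arrives while the socket is not Listening.
enum class SynPolicy { Ignore, Reject };

// What goes back on the wire in response to the SYN.
enum class SynReply { SynAck, Rst, None };

// A SYN is accepted only while Listening. In any other state the Reject
// policy answers with RST so the client fails fast, while the Ignore
// policy sends nothing and leaves the client to retry or time out.
SynReply replyToSyn(ListenerState state, SynPolicy policy) {
    if (state == ListenerState::Listening) return SynReply::SynAck;
    return policy == SynPolicy::Reject ? SynReply::Rst : SynReply::None;
}
```

With the Reject policy a client hitting a busy socket learns immediately that the connection was refused, instead of waiting on a request that may be answered late or never.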

Frustratingly, the argument is academic until there is some commitment to update the firmware. The crux of what I have been saying all along is re-writing the firmware, to implement Sockets which behave as Sockets must behave, would be a lot quicker than working out what is wrong with the current implementation, then trying to work around it on the Arduino side of the SPI bus.

edit: My sketches only read the rx buffer until the first blank line. So if the socket is appending the connection requests, and the server gets two requests before I read the rx buffer, my sketches would send the response to the first client's request to the second client.

That may be an assumption too far. Given what we know, there are multiple failure modes which can produce the one result you are observing. Unfortunately, testing your theory is not straightforward, as there is no way to look ahead into the shield's Rx buffer beyond peeking at the first character. The socket Rx buffer is 2KB, which is more than my Uno can shadow. I might be able to find time to free up my Mega and look into what is occurring, but not today.