Possible Arduino 1.0.1 Ethernet Library bug?

I am having problems with GETs failing when more than one socket is used for server TCP connections. I thought there might be a problem with my sketch, so I set up a simple test hardware/sketch configuration to reproduce it.

Test config was

Hardware
328UNO and a standard W5100 Ethernet Shield

Sketch

The Arduino 1.0.1 sketch is the standard Ethernet library WebServer example sketch with a one-line modification to extend the time a socket remains connected. A simple delay(200) has been inserted at line 64 in the sketch.

        if (c == '\n' && currentLineIsBlank) {
          // send a standard http response header
          delay(200); // add some arbitrary delay
          client.println("HTTP/1.1 200 OK");
          client.println("Content-Type: text/html");

To test the GET handling of the sketch I set up an open source web tester, "WebInject 1.41", with a very simple test case that just sends a GET sequentially 1000 times. I set the "Minimal output" checkbox on the WebInject GUI to speed up sending the GETs.

The WebInject Test Case is

If the WebInject test case is run with no other GETs being sent to the test setup it runs perfectly with no errors. When there is only one socket connected at any time everything works as it should.

BUT when the WebInject test is run and the test setup is also accessed at HxxP://192.168.0.140:60 using Mozilla or Internet Explorer, there are intermittent GET failures seen during the WebInject test.

Sometimes I had to refresh the Firefox or IE page many times. But it is always possible to provoke WebInject test failures when there is more than one GET at the same time.

It looks like the Arduino 1.0.1 Ethernet library has a bug and does not handle more than one server socket connection at the same time. Anyone else seeing the same problem??

It looks like the Arduino 1.0.1 Ethernet library has a bug and does not handle more than one server socket connection at the same time. Anyone else seeing the same problem??

I do not believe that is a bug, but a limitation of the W5100. It does not process more than one connection at a time if you use the Ethernet library. It will accept four connections at a time, but processes only one. The other connections will have to wait their turn until it is finished with the current active connection. Then it will process the next connection.

The challenge is when there are already four connections, one active and three waiting. It will not accept another until it finishes and closes one of the current connections.

It does not process more than one connection at a time if you use the ethernet library. It will accept four connections at a time, but processes only one. The other connections will have to wait their turn until it is finished with the current active connection. Then it will process the next connection.

I agree the Ethernet library will only process one connection at a time.

BUT the other connections don't wait their turn. They just fail.

Z

Mine did ok when subjected to many requests, but I was using my server code. Post your whole test sketch.

This is similar to the code I used, but I did not use the form fields for the test.

edit: It took a bit for me to find the topic that had the info on this test. It is quite long, but page 16 is where it gets good.
http://arduino.cc/forum/index.php/topic,75324.0.html

Mine did ok when subjected to many requests, but I was using my server code. Post your whole test sketch.

I have no problems either when the GETs are consecutive (and there is only one socket connected at any one time). It's only when more than one socket is connected that the failures happen.

I can post the test code but there is not much point. As I said, I used the standard "Webserver" example sketch from the Arduino 1.0.1 IDE Ethernet Library with one line added (line 64) to include some delay exactly as I explained in my post. I used the standard Webserver sketch to avoid any possibility that I had coded the sketch incorrectly.

If you go to the Arduino IDE 1.0.1 File menu / Examples / Ethernet / WebServer you have the test sketch I used and can add the delay(200) statement at line 64.
Obviously you will need to set an IP address and port number in the sketch that works in your network.

I don't use that server code in the ethernet examples. It malfunctioned when I tried it.

Just as a test, try the "web server with forms" code I posted in the link above. Does it do any better?

edit: I'm not saying there is no possibility of a new bug, but I have not run into it yet. You may be the first.

Ok ST. I'll try your code and see if I can break it the same way.

Could take me a day to run the test.

cheers

Keep me posted on this. If it is a bug, my record has been pretty good at finding and patching them.

edit: Sometimes it does take a while to find it. Here is my "everything is ok":

...and a few posts later, here is my "This is not good!":

Didn't take as long as I thought :)

I added some delay in your code to hold the connections open a little longer

pch = strtok(NULL,"& ");
}
Serial.println("Sending response");
delay(200); // add some delay
client.write("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n

TEST

");

client.write("T:
");
client.write("R:
");

and used WebInject as I described in my first post to send consecutive GETs. I also left the Arduino serial window open to see what was happening and to slow the connection processing down a bit.

I can provoke intermittent GET failures in WebInject by accessing your query form from IE while the WebInject test is running. Note you will need to refresh the IE page frequently to provoke the failures.

In other words your code breaks as well when it sees more than one connection at the same time.

When you say "breaks", do you mean the server code froze? Or did your connect or download fail, and the server code keeps running?

edit: Here is the start of the download test with draythomp:

He hit that code pretty hard, but it would not fail. Even timed out the send/receive a few times.

By "breaks" I just meant I could see the intermittent GET failures. Sorry - just my slang.

No, the code doesn't freeze. What happens is that a GET which arrives while another GET is being processed is simply not handled, and there is no response.

I found this problem with another sketch which serves up fairly complex webpages with js files, images etc. It took some investigation to track down why some of the GETs from the html were never being processed.

I've looked at the Ethernet code and if this really is a bug, it will be a tricky one to fix.

No, the code doesn't freeze. What happens is that a GET which arrives while another GET is being processed is simply not handled, and there is no response.

Until the w5100 finishes the previous socket connection, there will be no response. If you are connected to a socket that is not the active socket, you will get the "connected" message on your browser, and the little "whirly wheel" or "blinky line" while it waits for a response. If all the sockets are currently connected to clients, you will get a "server not responding" message.

Which are you getting?

With the sketch serving up the complex webpages I get the "waiting for a response" in the browser - or whirly wheel as you call it. When GETs receive no response, random parts of the webpage, images etc are never rendered.

I've looked at what is going on with an HTTP debugger and could see that some GETs were never getting a response. I also noticed that when the GETs don't overlap, everything works fine. For example, Firefox is well behaved and usually sends its GETs consecutively. IE 9 sends GETs which often overlap in time.

My sketch never freezes or completely stops responding. It is just when more than one GET arrives at the same time, one of the GETs doesn't get processed and there is no response.

cheers

IE is probably opening multiple sockets.
One for the main page,
three for images...
...and fail on the next.
Does that sound about right? That would be 4 sockets. That is all you have. :(

I don't think the problem is related to the 4 socket limit.

My reason is that the WebInject test is only using one socket, but I can provoke a failure by using Firefox or IE to send just one more GET.

For example your forms sketch would only generate one GET from IE or Firefox. So the failures must be happening when only 2 sockets are being used.

EDIT : If the socket limit were being exceeded the message would be "Server not responding/busy"

I also checked the max connections per server registry key

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\MAIN\FeatureControl\FEATURE_MAXCONNECTIONSPERSERVER

and it is set to 2

EDIT : If the socket limit were being exceeded the message would be "Server not responding/busy"

Exactly! If that is not the case, then the challenge is elsewhere.

Then the WebInject program also shows errors when using the browser at the same time?

edit: You know that the web browser should be sending two GET requests, not just one. The first will be the webpage request, and the second will be for favicon.ico.

edit2: Have you tried to run two instances of WebInject?

I'll try to think of a way to test the sockets. If it is not using all 4, I would like to know why.

(moved this to post below)

Then the WebInject program also shows errors when using the browser at the same time?

Yes. The WebInject shows errors where it receives no response to its GETs.

And yes I'm aware of favicon.ico. Actually Firefox sends 2 GET requests for favicon.ico when it renders a page for some reason??

I thought of trying two or more instances of WebInject - just need to get my hands on another machine.

EDIT : Maybe we can look at this another way. Has anyone got a sketch which shows that 4 server sockets can be connected and work at the same time??

When I tested this last year with draythomp, we got some timeouts because we were hitting it so hard. It appeared to me to be using all 4 then. I had two localnet computers and draythomp hitting it. I would get the "whirly wheel", but always got the "connected" message and eventually a webpage, or the "server not responding" message.

And draythomp was "cheating". He was sending a request and not waiting for the server to respond before sending another request.

I'll try to think of a test.

edit: Keep in mind the public ip was posted here on the forum, so I am sure more than just draythomp and I were on that. For a good percentage of the time, my two computers would both show "connected" and the whirly wheel, and I showed draythomp as the active socket on the serial monitor. Eventually, the computers got a webpage.

I'll try to think of a way to test the sockets. If it is not using all 4, I would like to know why.

That might be part of the problem. You may have been attributing issues to something that you had no way of verifying. I'm also not sure whether your test methods actually test anything, or whether they create issues or lead to erroneous conclusions themselves. Some of the issues you report are normally expected when a server is busy with another client. Are you trying to serve up "images" from the Arduino?

@zoomkat: I agree to a point. Like I said, mine seemed to allow at least two other connections (no response) while the W5100 did respond to the current socket/connection. That I expect. I have not tested it in a while. I think you have a couple of posts in that topic where the test was done. It is 18 pages long.

edit: zoomkat, you might be the person to test this. As I recall, you have server code that sends a page that auto-reloads every second or so. Put some delays in the server code so it takes a few seconds to upload to a client, then start about 3 computers on it. That was my test.