Playground WebClient code, hardware failure modes

First a big THANK YOU to @SurferTim for posting and maintaining this code. Easier to understand and more robust than earlier examples, IMHO. I'm not an Ethernet genius so I appreciate this!

I think I understand the code OK, my question is regarding the comment

// connectLoop controls the hardware fail timeout

...what are the failure modes being guarded against here? I can think of a few, just wondering if I'm missing any major ones:

  • The server on the other end goes down or out to lunch for an extended period of time, so a complete response is not received
  • Network propagation delays cause a response to not be received in a timely manner (this may be indistinguishable from the first bullet)
  • The W5100 finally gets a little too hot and lets the magic smoke go
  • ???

The first two, plus network failures. Some assistant IT in Podunk reboots a router in the middle of your download. Your ISP's router down the street loses power. Your router loses power. You accidentally (or intentionally in my case, to test it) pull the CAT5 cable out of the Ethernet shield.

If anything fails between the server and your Arduino during your download, the server will not be able to send a connection close message. At that point, the "while(client.connected())" loop becomes an endless loop. To prevent that, connectLoop counts up while no data arrives; once it exceeds 10000, the client is stopped and what would be an endless loop exits.

I see it as a network timeout, not so closely related to hardware error. All it really does is ensure an end to a possibly endless loop when no data is received from a server.
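For anyone following along, here is a minimal sketch of the timeout pattern described above. It is not the Playground sketch itself: FakeClient is a hypothetical stand-in for EthernetClient so the loop can be compiled off-device, and readResponse and timeoutLoops are names made up for illustration. Only connected(), available(), read() and stop() mirror the real client calls.

```cpp
#include <cassert>

// Hypothetical stand-in for EthernetClient, only so the loop compiles off-device.
struct FakeClient {
    int  bytesPending;   // bytes the simulated server still has to deliver
    bool serverCloses;   // false simulates a dead link: no more data, no close message
    bool open;           // socket state as seen from our end
    int  stopCalls;      // how many times stop() was called

    FakeClient(int bytes, bool closes)
        : bytesPending(bytes), serverCloses(closes), open(true), stopCalls(0) {}

    bool connected() const { return open && (bytesPending > 0 || !serverCloses); }
    int  available() const { return open ? bytesPending : 0; }
    int  read()            { if (bytesPending > 0) { --bytesPending; } return 'x'; }
    void stop()            { ++stopCalls; open = false; }  // harmless if already closed
};

// Reads until the server closes or no data has arrived for timeoutLoops passes.
// Returns true on a normal close, false on timeout.
bool readResponse(FakeClient& client, int timeoutLoops) {
    int  connectLoop = 0;
    bool timedOut = false;

    while (client.connected()) {
        while (client.available() > 0) {
            client.read();
            connectLoop = 0;          // data arrived, so restart the timeout count
        }

        connectLoop++;
        if (connectLoop > timeoutLoops) {
            client.stop();            // close from our end; connected() turns false
            timedOut = true;
        }
        // On real hardware a delay(1) here makes each pass roughly one millisecond.
    }

    client.stop();                    // second stop() after a timeout does no harm
    return !timedOut;
}
```

With a healthy server the loop ends as soon as the last byte is read and the close arrives. With a dead link it ends only because the counter runs out, which is the whole point of connectLoop.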

I guess some other reasons could be:

  • DHCP lease has expired on the client.
  • Network congestion caused an overflow and data was lost.
  • Network congestion forced a longer route, which in turn expired the packet TTL before delivery.
  • And like you say, the packet simply taking longer than the timeout (which I guess is not an error, just unfortunate). TCP's 3-way handshake could easily take its time on a busy network.
  • A physical link in the network has gone missing (off, no cable).
  • And I guess you can't overlook the possibility of dodgy Arduino-side code corrupting the W5100 state.

Thanks, guys, guess I was thinking along the right lines.

I almost forgot the infamous "Backhoe Fade".

Good to hear users find that example sketch useful.

If you want the same client code for the wifi shield, I posted it here. http://forum.arduino.cc/index.php?topic=200784.0

Another question if I may. When the timeout occurs, the client is stopped here, but then a second client.stop() will be executed here.

Is this required, or just a situation where it doesn't hurt?

It just doesn't hurt. If you call client.stop(), and the client socket has already closed, the function returns immediately.

SurferTim:
It just doesn’t hurt. If you call client.stop(), and the client socket has already closed, the function returns immediately.

Super, thanks. I’m going to try rewriting the code as a state machine, so that saves a state.

I’m actually using a custom PC board with a WIZ811MJ. It’s essentially equivalent to the Ethernet Shield, but the MCU can reset the W5100. I’ve just been doing this in setup(), e.g.

    digitalWrite(WIZ811MJ_RESET, LOW);     // pull the W5100 reset line low
    delay(1);                              // hold it low for 1 ms
    digitalWrite(WIZ811MJ_RESET, HIGH);    // release reset

but I’m wondering if reliability could be improved by using this in the main code under certain conditions.

If absolutely necessary, you can reset the W5100 with your hardware. I prefer to find the cause and fix that. I'm digging into the wifi library code to find bugs in there now. They will not escape! ;)

Certainly. I was just wondering if there is a situation where the W5100 could get balled up even if the code is perfect (and I never assume mine is!) Maybe if a connection does not succeed after a certain number of attempts, a W5100 reset would be a reasonable thing to try?

Jack Christensen:
Certainly. I was just wondering if there is a situation where the W5100 could get balled up even if the code is perfect (and I never assume mine is!) Maybe if a connection does not succeed after a certain number of attempts, a W5100 reset would be a reasonable thing to try?

Yes, but it appears I am the only person who can do it. If a client connection is established, and before the client sends anything, the connection fails, the socket is lost. It requires almost a deliberate attempt to crash your server. In the many weeks of testing, I have never seen that situation occur "naturally".

SurferTim: Yes, but it appears I am the only person who can do it.

Ha, you da man! Test 'till it breaks!

If a client connection is established, and before the client sends anything, the connection fails, the socket is lost. It requires almost a deliberate attempt to crash your server. In the many weeks of testing, I have never seen that situation occur "naturally".

Since I have the ability to reset the W5100, I'll try to work that into the code. Sounds like it should be a rare occurrence, which is good.
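One way the "reset after a certain number of failed attempts" idea could look, as a rough sketch only. ResetPolicy, record() and maxFailures are hypothetical names invented here, and the threshold is an assumption to tune rather than a recommendation from anyone in this thread. It only decides when a reset is due; pulsing the pin is left to the caller.

```cpp
#include <cassert>

// Hypothetical helper: counts consecutive failed connection attempts and
// reports when a hardware reset of the W5100 is due.
struct ResetPolicy {
    int maxFailures;   // consecutive failures tolerated before a reset (assumed value)
    int failures;      // current run of consecutive failures

    explicit ResetPolicy(int maxFailures_)
        : maxFailures(maxFailures_), failures(0) {}

    // Call once after every connection attempt.
    // Returns true when the caller should pulse the reset line.
    bool record(bool connectSucceeded) {
        if (connectSucceeded) {
            failures = 0;             // any success clears the run
            return false;
        }
        if (++failures < maxFailures) {
            return false;             // not enough failures in a row yet
        }
        failures = 0;                 // start counting again after the reset
        return true;
    }
};
```

On the board, a true return would be the cue to repeat the LOW / delay / HIGH pulse on WIZ811MJ_RESET and then, presumably, re-initialise the Ethernet library, since a hardware reset clears the chip's configuration.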

Thanks for all your help, I really appreciate it. Have a great day!

Jack Christensen:
Ha, you da man! Test 'till it breaks!

That is what I do best! The electronics do not like being here. This place is like an eDungeon. Electronics go in, blown up pieces come out. I will bet the wifi shield PaulS loaned me can't wait to get back to Seattle. :)

LOL!

BTW, I’ve been testing the GroveStreams service and it looks really good. Quite reliable so far, which is not something I can say of other services I’ve tried. If you’re interested in such things, check out this dashboard I created. It’s basically your sample code with some instrumentation added to trap performance stats.