W5100 client (email and twitter) unreliable, but very reliable server

I've got a sketch which tweets and emails when it senses events, and also serves a webpage showing the up-to-date sensor status. The server is always solid: I can always get the page to load, even after hours of uptime. But when sending email or tweets, I frequently get errors that the connection can't be established, and I'm not sure why. The server continues to work throughout.

Sometimes when the Arduino comes up, the email/tweets are working OK, and the server is OK. Then eventually the email/tweets die but the server keeps on going. Other times, right when Arduino comes up, it can't email/tweet, but again, server is OK.

Any suggestions? Saw another thread about the shield being unreliable (this: http://forum.arduino.cc/index.php?topic=85342.0), but since my server is working, I'm not sure it applies.

Here's the code, but it's pretty big - sorry for not trimming it down. I'm looking for ideas before proceeding:
http://pastebin.com/3Ktx1wU9

Which Arduino are you using? Hopefully a Mega or better.

Add this function to your sketch and call it before or after a twitter or email attempt. If there is no socket with a status of 0x0, then the connect will fail. You apparently use a socket for the server and a socket for UDP, so you have only two sockets available for TCP client connections.

#include <utility/w5100.h>

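// Last status byte read from each socket, kept for later inspection
byte socketStat[MAX_SOCK_NUM];

// Print each socket's status, local port, and destination IP(port)
void ShowSockStatus()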
{
  for (int i = 0; i < MAX_SOCK_NUM; i++) {
    Serial.print(F("Socket#"));
    Serial.print(i);
    uint8_t s = W5100.readSnSR(i);
    socketStat[i] = s;
    Serial.print(F(":0x"));
    Serial.print(s,16);
    Serial.print(F(" "));
    Serial.print(W5100.readSnPORT(i));
    Serial.print(F(" D:"));
    uint8_t dip[4];
    W5100.readSnDIPR(i, dip);
    for (int j=0; j<4; j++) {
      Serial.print(dip[j],10);
      if (j<3) Serial.print(".");
    }
    Serial.print(F("("));
    Serial.print(W5100.readSnDPORT(i));
    Serial.println(F(")"));
  }
}

A socket status list:
0x0 = available
0x14 = waiting for a connection
0x17 = connected
0x1C = connected waiting for close
0x22 = UDP

BTW, a lot has changed since that thread you posted a link to above. My client and server code is working much better with some error checking and fault tolerance stuff added. You might want to check out the new code I use.
http://playground.arduino.cc/Code/WebClient
http://playground.arduino.cc/Code/WebServerST

Yes, this is a Mega. I'm curious, why do you ask? Is the code too long otherwise? Binary is ~30k.

Should I be switching to your new client/server libraries?

Thanks a ton for the diagnostic function. I added it and am including the outputs from Serial Monitor in this post.

I turn on the arduino, and right away, it tweets and emails no problem:

//EMAIL AND TWEET BOTH WORKED//

tweetBuffer (length 0): 
before concat length test... buffer:19:59:43 Sun 21  Dec 2014: Washer Turned on
Finished emailBuffer concat, which is: 19:59:43 Sun 21  Dec 2014: Washer Turned on
Finished tweetBuffer concat, which is: 19:59:43 Sun 21  Dec 2014: Washer Turned on
Washer Turned on
emailBuffer (length 43): 19:59:43 Sun 21  Dec 2014: Washer Turned on
Socket status before email:
Socket#0:0x14 80 D:0.0.0.0(0)
Socket#1:0x22 8888 D:138.236.128.112(123)
Socket#2:0x0 1037 D:192.168.10.1(53)
Socket#3:0x0 0 D:0.0.0.0(0)
sendEmail connecting to: mail.aplace.com:26
connected...
250 OK
235 OK
250 OK
250 OK
354 OK
250 OK
Email sent
Email Buffer Cleared
tweetBuffer (length 43): 19:59:43 Sun 21  Dec 2014: Washer Turned on
emailBuffer (length 0): 
tweetBuffer (length 43): 19:59:43 Sun 21  Dec 2014: Washer Turned on
connecting to twitter ...
19:59:43 Sun 21  Dec 2014: Washer Turned on
Socket status before tweet:
Socket#0:0x14 80 D:0.0.0.0(0)
Socket#1:0x22 8888 D:138.236.128.112(123)
Socket#2:0x0 1025 D:69.89.25.165(26)
Socket#3:0x0 0 D:0.0.0.0(0)
Trying to tweet: 19:59:43 Sun 21  Dec 2014: Washer Turned on
Tweeted!
emailBuffer (length 0):

Next I reset the Arduino. This time the email fails, but the tweet works immediately after (within 10 seconds of the failed email). Right before the email attempt I see two sockets with status 0x0, so I'd have thought a socket would have been available for that email to go out.

//EMAIL DID NOT WORK, BUT TWEET WORKED//

tweetBuffer (length 0): 
before concat length test... buffer:20:00:23 Sun 21  Dec 2014: Washer Turned on
Finished emailBuffer concat, which is: 20:00:23 Sun 21  Dec 2014: Washer Turned on
Finished tweetBuffer concat, which is: 20:00:23 Sun 21  Dec 2014: Washer Turned on
Washer Turned on
emailBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
Socket status before email:
Socket#0:0x14 80 D:0.0.0.0(0)
Socket#1:0x22 8888 D:24.23.190.188(123)
Socket#2:0x0 1037 D:192.168.10.1(53)
Socket#3:0x0 0 D:0.0.0.0(0)
sendEmail connecting to: mail.aplace.com:26
No Email connection
ERROR: email did not send... will try again later
tweetBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
emailBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
tweetBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
connecting to twitter ...
20:00:23 Sun 21  Dec 2014: Washer Turned on
Socket status before tweet:
Socket#0:0x14 80 D:0.0.0.0(0)
Socket#1:0x22 8888 D:24.23.190.188(123)
Socket#2:0x0 1025 D:69.89.25.165(26)
Socket#3:0x0 0 D:0.0.0.0(0)
Trying to tweet: 20:00:23 Sun 21  Dec 2014: Washer Turned on
Tweeted!
emailBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
tweetBuffer (length 0):

I have a retry after a minute. The email did go through once that minute elapsed, without restarting the Arduino.

//ONE MINUTE LATER, EMAIL WORKED (NO RESTART FROM LAST BLOCK)//

emailBuffer (length 43): 20:00:23 Sun 21  Dec 2014: Washer Turned on
Socket status before email:
Socket#0:0x14 80 D:0.0.0.0(0)
Socket#1:0x22 8888 D:24.23.190.188(123)
Socket#2:0x0 1026 D:74.125.28.141(80)
Socket#3:0x0 0 D:0.0.0.0(0)
sendEmail connecting to: mail.aplace.com:26
connected...
250 OK
235 OK
250 OK
250 OK
354 OK
250 OK
Email sent
Email Buffer Cleared
tweetBuffer (length 0):

The other post said I can't use the LED on pin 13 else I may interfere with the Ethernet shield. Is that true? I am using the LED at the moment.

UPDATE 1: I just ditched the pin 13 LED and it seems to be working a lot more reliably at power-on. I'll keep testing (both at restart and over longer periods) to see if the email/tweet has problems after the change.

Update 2: Bummer, the email and tweets are still unreliable after removing the PIN13 LED.

BTW, I'll be out for a few days, but back to it on Jan 1 :grin:

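You can't use the pin 13 LED on an Uno, because pin 13 is the SPI clock (SCK) line that the Ethernet shield uses. You can on a Mega, where SPI is on pins 50-53.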

It appears you are just not connecting to the email server. Maybe it is busy? Could be the same with the Twitter server. Or maybe your network is busy?

edit: The w5100 will try to connect 8 times (200ms wait per connection attempt) before returning fail. You can change both those parameters. Maybe extending the wait might help.

If you add this in setup after Ethernet.begin(), this will try 4 times at 400ms wait per attempt.

#include <utility/w5100.h>

W5100.setRetransmissionTime(400);
W5100.setRetransmissionCount(4);
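
A note on units, offered as something to verify rather than a correction: as far as I understand the W5100 datasheet, the retransmission time register counts in 100 microsecond steps, and its default of 2000 corresponds to the 200 ms mentioned above. If that is right, passing 400 sets roughly 40 ms, and 4000 would be the value for 400 ms. A minimal conversion sketch under that assumption (msToRtrUnits is a hypothetical helper name):

```cpp
#include <stdint.h>

// Assumption: the W5100 retransmission time register counts in units of
// 100 microseconds, so 1 ms equals 10 register units.
// msToRtrUnits is a hypothetical helper, not a library function.
uint16_t msToRtrUnits(uint16_t ms) {
  const uint16_t maxMs = 6553;  // 6553 * 10 = 65530 still fits in 16 bits
  if (ms > maxMs) ms = maxMs;   // clamp instead of overflowing
  return static_cast<uint16_t>(ms * 10u);
}
```

It would be used as W5100.setRetransmissionTime(msToRtrUnits(400)); in setup, after Ethernet.begin().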

So based on the outputs I posted, do you think the number of available sockets is sufficient?

Unsure if the network is busy - could I test that somehow? This is a home network with not a lot going on, and a high speed internet connection. I do know that I'm not reaching twitter when it fails because Twitter would reply with different error codes if I hit their server but it decided not to do the desired task.

So this is promising! I added the Ethernet retransmit count/time changes you suggested and tested the Arduino's reliability by restarting it 10 times. Email worked all 10 times, which seems better than before. Tweeting worked 8/10 times on the first try, and 10/10 on the second try. I'll do some longer-term reliability testing and report back.

One new issue cropped up: the DNS lookup for the NTP server IP address at the beginning used to be 100% reliable. Now, about half the time it takes two tries to resolve DNS (I mean tries from my calling it, not Ethernet-internal retransmits). Should I go back to 8/200 Ethernet retrans/wait for DNS, then use the 4/400 for everything else perhaps?

It appears you have 2 sockets available (#2 and #3) for your email and Twitter. Those are the sockets with the status of 0x0. Socket 0 was being used as the listening socket for the server (0x14) and socket 1 was being used as the UDP socket (0x22).
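
The same check can be done in code rather than by reading the log. A minimal sketch, assuming the status bytes have already been collected (for example in the socketStat array that ShowSockStatus() fills in); countFreeSockets is a hypothetical helper name:

```cpp
#include <stdint.h>

// Hypothetical helper: count sockets whose status byte is 0x0, which is
// the "available" state a client connection needs.
int countFreeSockets(const uint8_t* stat, int n) {
  int freeCount = 0;
  for (int i = 0; i < n; i++) {
    if (stat[i] == 0x00) freeCount++;
  }
  return freeCount;
}
```

With the statuses from the logs above (0x14, 0x22, 0x0, 0x0) it returns 2. In the sketch it could be called as countFreeSockets(socketStat, MAX_SOCK_NUM) right after ShowSockStatus(), skipping the connect attempt when the result is 0.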

I don't know about the DNS. It is normally UDP, so there is no real "connection" to establish. I don't know if DNS uses the retransmission time for a timeout.

Good news. After 10 days, Tweets, emails, NTP, and DNS are all apparently working as intended. I'll credit it to the adjusted ethernet retransmit timeout and number of retransmits SurferTim suggested. Thanks again! :grin:

Follow up. Intermittent issue sending email and tweets discussed above is still happening, though less regularly now. I am powering the Mega and ethernet shield using a USB wall wart. If that wasn't providing enough current, could that cause the network problems I'm seeing?

What is the current rating on the wall wart? If it's 500 mA or more, that is probably not the problem. You can always check the 5 V bus on the Mega and see if it really is at 5 volts. An oscilloscope would be best for this, to catch possible sags in the voltage during Ethernet transmissions.

Yep, the wall wart is rated at 500 mA. Maybe my problem was that I was not releasing the sockets properly when I was done with them. I never did figure out the fix for the problem above, so I changed my approach.

I decided to just simplify my Arduino code: remove NTP, remove Twitter, remove SMTP. I kept the Arduino web server so I can still pull status from a browser. Then for push notifications, I set up a ~15 line PHP script on a web server which sends an email based on a GET string. The notifications I have are simple enough that this works. The Arduino just visits this PHP page as a client with the notification message in the GET string, and PHP appends the current time to the message and forwards the email on to me.
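
For anyone trying the same approach, the one fiddly part is that the message has to be percent-encoded before it goes into the GET string, since spaces and colons are not safe in a URL. Here is a rough sketch of that step. It is written as plain C++ with std::string so it can be tried on a desktop; on an AVR board the same loop would write into a char buffer or an Arduino String instead. The path "/notify.php" and the parameter name "msg" are made up for illustration and are not the actual script.

```cpp
#include <string>

// Percent-encode every byte except the RFC 3986 unreserved characters
// (letters, digits, '-', '_', '.', '~').
std::string urlEncode(const std::string& in) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                      (c >= '0' && c <= '9') ||
                      c == '-' || c == '_' || c == '.' || c == '~';
    if (unreserved) {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += hex[c >> 4];    // high nibble
      out += hex[c & 0x0F];  // low nibble
    }
  }
  return out;
}

// Build the HTTP request line. "/notify.php" and "msg" are hypothetical
// names chosen for this example only.
std::string buildNotifyRequest(const std::string& message) {
  return "GET /notify.php?msg=" + urlEncode(message) + " HTTP/1.1";
}
```

For example, "Washer Turned on" becomes "Washer%20Turned%20on" in the request line. The request still needs a Host header and a blank line after it before the server will answer.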

The project doesn't have Twitter messaging for the time being. If I add it back, I'll just put it in the PHP script.