Ethernet (w5100) sketch hangs

I'm having a problem and I wonder if anyone can share insight.

I have a sketch running on an Uno with an Ethernet (W5100) shield. The sketch monitors a weather station and about once a minute sends an HTTP GET request to a web server.

I'm having trouble keeping the sketch running. After a certain number of iterations it just stops. Initially it was stopping after about 200 or 300 updates. The number would vary from run to run and was as low as 50 once. I changed it so that Ethernet.begin() gets called before each GET request. This has improved reliability somewhat: now I can go about 1500 updates before the sketch hangs. That is still not acceptable, as it means the sketch typically runs for only about 24 hours.

Resetting the arduino gets it running again, until it hangs again.

When it is locked up it does respond to a ping.

I suspect that some flakiness within the ethernet shield is causing the problem -- but of course it's human nature for me to believe that my code can't possibly be the problem. That said, my code is relatively simple, the problem is non-deterministic, and calling Ethernet.begin() made it work better, all of which combine to make me suspicious.

So, a couple of questions: First, is what I'm trying to do uncommon? Are there people out there who use the ethernet shield for tens of thousands or hundreds of thousands of connections without resetting? Does it work? Second, if it is flakiness, is there a way that I can safely have my sketch do a hardware reset periodically?

Any thoughts are appreciated.

Thanks.

Running out of RAM?

I suspect that some flakiness within the ethernet shield is causing the problem -- but of course it's human nature for me to believe that my code can't possibly be the problem. That said, my code is relatively simple, the problem is non-deterministic, and calling Ethernet.begin() made it work better, all of which combine to make me suspicious.

Yes, the hardware rewires itself depending on how it is feeling. :) I'd go with loss of memory as a suspect area, due to failed connections or similar.

Maybe if you posted your code, someone might be able to help.

This post gives a plausible explanation: http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1235991468/14#14

That link is not a plausible explanation to me.

Since you did not post your code, I used mine as a test. I started it last night. Over 8 hours and 525 downloads later, it is still running fine. I did not need to call Ethernet.begin() again or anything like that. I even broke the connection for a few minutes, and it resumed after the reconnect as if nothing had happened.

I use a MemoryFree library to determine if there are memory leaks. It reported 7367 free on startup, and 7369 free after every download.

So, do you want to post your code? I'll show you mine if you'll show me yours! :D

Update: About three hours later, and it just passed 700 downloads. Here is my declaration for the function called by the loop that does the download:

byte getPage(byte *ipBuf,char *page);

OK, some code:

int HTTPClient::Get(const char * host, char * page, byte * hostaddr)
{
  // Hits a web server and discards the returned page.
  // Returns 1 on success, 0 on failure. Success is defined as receiving more than 20 bytes.
  //
  // host is the name of the host -- needed for HTTP 1.1
  // page is the URL minus the host
  // hostaddr is the IP address of the host

  Client client(hostaddr, 80);
  if (client.connect()) {
    client.print("GET ");
    client.print(page);
    client.println(" HTTP/1.1");
    client.print("Host: ");
    client.println(host);
    // HTTP/1.1 connections are persistent by default, so ask the server
    // to close the connection once the response has been sent.
    client.println("Connection: close");
    client.println();
  }
  else {
    client.stop();
    return 0;
  }

  int nread = 0;
  unsigned long lastactivity = millis();

  // Read until the server disconnects, but give up if nothing arrives for
  // 4 seconds, so a stalled connection cannot hang the sketch forever.
  while (client.connected() || client.available()) {
    if (client.available()) {
      client.read();
      nread++;
      lastactivity = millis();
    }
    else if (millis() - lastactivity > 4000) {
      break;
    }
  }
  client.stop();

  if (nread > 20) {
    return 1;
  }

  return 0;
}

Other than calling Ethernet.begin() that’s the extent of my involvement with the Ethernet library.

Fair enough. Here is what I have working. Compile and upload. Open the IDE serial window and watch.
If you close the serial window and reopen it, it will restart the arduino. Mine has been open all night.
Check the code for comments. There are a few places that need editing to your settings.

#include <SPI.h>
#include <Ethernet.h>

byte mac[] = {  0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED };

// change to your localnet settings
byte ip[] = { 192,168,0,2 };
byte gateway[] = { 192, 168, 0, 1 };
byte subnet[] = { 255, 255, 255, 0 };

// change this to your server ip.
byte server[] = { 1,1,1,1 };

void setup() {
  Ethernet.begin(mac, ip, gateway, subnet);
  Serial.begin(9600);
  delay(2000);
}

int totalCount = 0;
int loopCount = 0;

void loop()
{
  if(loopCount < 60)
  {
    delay(1000);
  }
  else
  {
    loopCount = 0;
    totalCount +=1;

// change "/arduino.html" to the page you want to download from the server. No larger than about 100 characters.
// if you want the default page, use "/"
    if(!getPage(server,"/arduino.html")) Serial.print("Fail ");
    else Serial.print("Pass ");
    
    Serial.println(totalCount,DEC);
  }    

  loopCount += 1;
}

byte getPage(byte *ipBuf,char *page)
{
  Client client(ipBuf, 80);

  int inChar;
// If you need a large url, increase this
  char outBuf[128];
  
  Serial.print("connecting...");

  if(client.connect())
  {
    Serial.println("connected");
    // snprintf cannot write past the end of outBuf if the page name is too long
    snprintf(outBuf,sizeof(outBuf),"GET %s HTTP/1.0\r\n\r\n",page);
    client.write(outBuf);
  } 
  else
  {
    Serial.println("failed");
    return 0;
  }

  while(client.connected())
  {
    while(client.available())
    {
      inChar = client.read();
// I only wanted to know if the download worked, not what it contained
// uncomment next line to see the data downloaded
//      Serial.write(inChar);
    }
  }

  Serial.println();
  Serial.println("disconnecting.");
  client.stop();
  return 1;
}

Edit: I forgot to use the gateway and subnet in the Ethernet.begin() call. My bad!

Is it the Ethernet chip that's crashing or the Arduino?

Make the Arduino flash its onboard LED in loop(), write a log to the serial port, etc.

Find out exactly where the jam happens.
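A minimal sketch of that kind of instrumentation, under some assumptions: the heartbeat LED is an external one on pin 7 (a hypothetical choice; on an Uno, pins 10-13 carry SPI traffic to the W5100, so the onboard LED on pin 13 may not be a safe indicator), and `Heartbeat` and `checkpoint` are illustrative names, not part of any library. The last checkpoint number in the serial log shows how far the sketch got before it jammed.

```cpp
#include <stdint.h>

// Toggles a heartbeat state every `interval` milliseconds.
// Unsigned subtraction keeps the comparison correct across millis() rollover.
struct Heartbeat {
  uint32_t interval;
  uint32_t last;
  bool on;

  // Returns true when the LED state changed on this call.
  bool update(uint32_t now) {
    if (now - last < interval) return false;
    last = now;
    on = !on;
    return true;
  }
};

#ifdef ARDUINO
const uint8_t HEARTBEAT_PIN = 7;  // assumption: external LED, not pin 13
Heartbeat beat = { 500, 0, false };

// Prints a numbered checkpoint to the serial log.
void checkpoint(uint8_t id) {
  Serial.print("cp ");
  Serial.println(id, DEC);
}

void setup() {
  Serial.begin(9600);
  pinMode(HEARTBEAT_PIN, OUTPUT);
}

void loop() {
  // If the LED stops blinking, loop() itself has stopped running.
  if (beat.update(millis())) {
    digitalWrite(HEARTBEAT_PIN, beat.on ? HIGH : LOW);
  }
  // Around the once-a-minute request, bracket each step, for example:
  //   checkpoint(1); connect; checkpoint(2); send; checkpoint(3); read; ...
}
#endif
```

If the LED keeps blinking but the updates stop, the Arduino is alive and the trouble is on the network side; if the LED freezes, the last checkpoint printed narrows down where.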

@fungus: I found that LED blink check (pin 13) is "ethernet death" for an Uno. Works ok for a Mega.

I've had substantial trouble with three of the ethernet implementations out there: the WiShield, 5100 and 28whatever (can't remember right now). The 5100 was by far the best of the lot, but it hangs up and fails pretty regularly. I spent hours on the web looking for solutions, and while some of them were promising, none of them actually solved the problems. I finally settled on the 5100 chip and the regular arduino library as the best of the lot because it took a heck of a lot less memory than any of the others and didn't fail as often. However, it does fail.

Since the underlying cause could be in the firmware of the 5100, I decided to take a different tack: I reset the darn thing when it locks up. There were two failure modes that I ran into: 1. it just wouldn't connect when it starts up, and 2. once connected, it would fail after running quite nicely for several hours. I added some wires to the board and cut a couple of traces, and was able to sample what the board was doing and force a reset by pulling an Arduino pin low. This was prompted by the original Arduino 5100 Ethernet board having a reset problem. Since I had to get in there and fix something anyway, I might as well go all the way. Note that the newer Ethernet board released with the Uno has exactly the same problems, but the power-on reset is fixed.

Now, by having some blinking lights on the various projects, I can tell when the board locks up and watch the arduino sense it and reset the ethernet board. Since sometimes this doesn't fix it, I go further and reset the arduino. This all happens automatically and I never have to walk over and do it myself. This works really, really well. Since the arduino can reset and be back at work in a couple of seconds and the ethernet board only takes a few seconds to initialize, the various devices never miss a beat. I've been updating to pachube at minute intervals for months now, and aside from times when I have something off line changing it, or the power company has a problem, it has consistently worked.

Trying to find the problem was just impossible. The boards would fail after an unknown number of hours (sometimes minutes) after an unknown series of events, at an ungodly hour of the night. If I tried logging debugging information, it was too voluminous to try and wade through to find something of value. I spent at least a month trying to figure out what was going on before I just gave up and started cutting wires.
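For anyone who wants to try the same escalation without reading the blog code, here is a rough sketch of the idea. It is not necessarily how it was done above: it swaps in the AVR watchdog timer (`avr/wdt.h`) for the board reset, and it assumes the W5100 reset line has been rewired to pin 9 (a hypothetical pin; on a stock shield the reset is tied to the Arduino's own reset). `Recovery`, `resetShield` and `sendUpdate` are illustrative names, the delays are conservative guesses, and it assumes a bootloader that copes with watchdog resets.

```cpp
#include <stdint.h>

enum Action { ACT_NONE, ACT_RESET_SHIELD, ACT_RESET_BOARD };

// Escalation policy: after `shieldAfter` consecutive failed updates, reset the
// Ethernet shield; after `boardAfter`, let the whole board reset.
struct Recovery {
  uint8_t shieldAfter;
  uint8_t boardAfter;
  uint8_t failures;

  Action report(bool ok) {
    if (ok) { failures = 0; return ACT_NONE; }
    if (failures < 255) failures++;
    if (failures >= boardAfter) return ACT_RESET_BOARD;
    if (failures >= shieldAfter) return ACT_RESET_SHIELD;
    return ACT_NONE;
  }
};

#ifdef ARDUINO
#include <avr/wdt.h>

const uint8_t SHIELD_RESET_PIN = 9;  // assumption: wired to the W5100 reset line
Recovery recovery = { 2, 4, 0 };
uint32_t lastUpdate = 0;

// Placeholder for the sketch's own GET request; must return true on success.
bool sendUpdate() { return true; }

void resetShield() {
  digitalWrite(SHIELD_RESET_PIN, LOW);   // hold the W5100 in reset
  delay(50);
  digitalWrite(SHIELD_RESET_PIN, HIGH);
  delay(1000);                           // give the chip time to come back
  // Call Ethernet.begin(...) again here with the sketch's own settings.
}

void setup() {
  pinMode(SHIELD_RESET_PIN, OUTPUT);
  digitalWrite(SHIELD_RESET_PIN, HIGH);
  wdt_enable(WDTO_8S);  // board resets if wdt_reset() is not called for about 8 s
}

void loop() {
  wdt_reset();  // feed the watchdog on every pass
  if (millis() - lastUpdate < 60000UL) return;
  lastUpdate = millis();

  Action a = recovery.report(sendUpdate());
  if (a == ACT_RESET_SHIELD) resetShield();
  if (a == ACT_RESET_BOARD) while (true) { }  // stop feeding; the watchdog resets the board
}
#endif
```

A side effect of the watchdog is that a request that hangs inside `sendUpdate()` for more than about 8 seconds also triggers a board reset, which covers the case where the sketch never returns to `loop()`.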

SurferTim:
@fungus: I found that LED blink check (pin 13) is “ethernet death” for an Uno. Works ok for a Mega.

Ding! Ding! Ding!

My sketch is blinking the LED…

Exactly the kind of first-hand knowledge I was looking for.

Thanks for the extremely helpful post. Your experience sounds a lot like mine, and I feel like you just saved me a month!

Can you talk a little bit about how you sense that the ethernet board has locked up, and how you reset the arduino?

Sure, take a look at my blog. The instructions aren't perfect for the latest ethernet board, but you can see what is done. Scroll down the page a bit. The code I used is there too, but it's all wrapped up with the other code in the sketch so it may be a little harder to understand.

http://www.desert-home.com/p/super-thermostat.html

MarkT: Running out of RAM?

That could cause a sketch to reset? I just used the MemoryFree library and it reported 92 bytes free at the end of setup(). Is this bad? Could this be the reason my program resets from time to time?

Yes, it sure can. See, the stack is part of memory, and when you overwrite it and then return, you go off into space and eventually some code somewhere does a call to location zero. That's the definition of an unexpected reboot. 92 bytes isn't enough to handle the things you want to do. My experience is that I have to have a couple of hundred bytes minimum to survive the various calls and such that use up memory. Every temporary variable, subroutine call, etc. chews up bytes.
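To make the "stack is part of memory" point concrete, here is a rough sketch of how free RAM is commonly estimated on an AVR Arduino: it is the gap between the top of the heap and the current stack position. `gapBytes` and `freeRam` are illustrative names; `__heap_start` and `__brkval` are avr-libc symbols, declared here the same way the MemoryFree library declares them. This simple version ignores memory returned to malloc's free list, so treat the number as an estimate.

```cpp
#include <stdint.h>

// Free RAM is the gap between the top of the heap and the current stack
// position. `brk` is 0 until the first malloc(); in that case the heap is
// empty and its top is `heapStart`.
int gapBytes(uintptr_t stackAddr, uintptr_t heapStart, uintptr_t brk) {
  uintptr_t heapTop = (brk == 0) ? heapStart : brk;
  return (int)(stackAddr - heapTop);
}

#ifdef ARDUINO
extern unsigned int __heap_start;  // avr-libc: first address of the heap
extern void *__brkval;             // avr-libc: current top of the heap, or 0

int freeRam() {
  char marker;  // a local variable sits at the current top of the stack
  return gapBytes((uintptr_t)&marker, (uintptr_t)&__heap_start, (uintptr_t)__brkval);
}

void setup() {
  Serial.begin(9600);
  Serial.print("free RAM: ");
  Serial.println(freeRam(), DEC);
}

void loop() { }
#endif
```

Calling `freeRam()` from inside the deepest function in the sketch gives a more honest number than calling it from setup(), because every nested call pushes the stack further down toward the heap.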

Wow, you just taught me something that has been driving me crazy with my projects. I didn't know about this problem with RAM and never checked it. I was having random reboots in my project and couldn't work out why; maybe this is the reason. All I was doing was commenting and uncommenting code, but the behavior was very random. Is there anything in the playground (I haven't found it myself) or anywhere else that gives tips on how to save RAM? I need to rewrite part of my code to free up some space. So you say around 200 bytes is enough, right? Thank you a lot! P.S. I tried printing the free memory every 10 seconds (which is the interval for uploading some data to a MySQL database) and the Arduino reboots. So I guess I am over the limit and checking free memory is itself causing the problem you explain, right?

OK, I know you're excited, but there are other things that can cause problems and this is only one of them. But it is a big one if you don't know about it. Now you do, so it should help you a lot. Things like passing an object to a function can take a ton of bytes and you don't have a clue it's happening; concatenating strings using the String library allocates a new string while the old one is still around and can cause an unexpected problem totally behind your back. Heck, I've overrun the packet buffer in the Ethernet library a couple of times doing silly things I should have known not to do. Yes, for me and my coding style, a couple of hundred bytes keeps me out of trouble. You will likely have different results. Test it and play around; learn to work with it.

As to saving memory: yes, there are things you can do, but the fix depends on the situation. The one I use a LOT is to put things like strings in PROGMEM to save space in RAM. So, check the playground for notes on PROGMEM and the macros you can use to save space. This set of macros has made really complex things possible on the Arduino for me. I'd give you examples and such, but I would just be duplicating the stuff in the playground.

Have fun.

Sergegsx: I just used MemoryFree library and reported 92 just at the end of the setup(). Is this bad?

Very.

Sergegsx: Could this be the reason for my program to reset from time to time?

Definitely.

SurferTim: @fungus: I found that LED blink check (pin 13) is "ethernet death" for an Uno. Works ok for a Mega.

Oh, I didn't think of that.

DCContrarian: Ding! Ding! Ding!

My sketch is blinking the LED...

Exactly the kind of first-hand knowledge I was looking for.

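Pin 13 is where the SPI clock pulses go out on an Uno, so in theory driving it from the sketch could affect any SPI device, including the W5100. On a Mega the SPI pins are 50-53, which would explain why pin 13 works OK there.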