Revised Ethernet Library Client code

I've spent the last couple of days beating my head against a number of problems with the official Ethernet library. There are a variety of problems with the Client code that manifest themselves as connections that fail to close, and the inability to open multiple connections without running into a 35 second delay.

While there were a number of issues I addressed, the primary problem was in the stop() method that is meant to disconnect a connected client. The code was erroneously closing the connection without sending a FIN to gracefully disconnect, and then followed that immediately with a disconnect() which tries to gracefully shutdown with a FIN. This was causing the connection to get stuck in a FIN_WAIT state which made it unavailable for use until the state timed out after 35 seconds.

There were also a few scary opportunities where an array out-of-bounds condition could be triggered if the sketch called some of the client methods out of order. This could have led to memory corruption in the sketch.

There were too many changes to list the parts to patch here, so I'm simply making the revised Client.cpp file available. You can download it from:

EDIT Link removed because it is no longer valid EDIT

NOTE! This code is based on the Ethernet library included with Arduino-0015. Compatibility with previous versions is unknown.

There are a lot of comments detailing the changes included in the code.

Simply replace (after you back up) the Client.cpp file in your Ethernet library (hardware/libraries/Ethernet). Be sure to delete the old Client.o file or the code won't recompile.

Here's a sample sketch that demonstrates the improved client code by making 50 connections to a webserver as quickly as possible. Be sure to update the ip[] and server[] with addresses that are appropriate for you.

// Sample sketch to demonstrate the fixes to the Ethernet library Client
// Connect to a webserver 50 times as quickly as possible and read the resulting page

#include <Ethernet.h>

byte mac[] = { 0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED };
byte ip[] = { xxx, xxx, xxx, xxx };
byte server[] = {yyy, yyy, yyy, yyy }; 

uint8_t connectStatus;
uint8_t closeStatus;
int ignore;

Client client(server, 80);

void setup()
{
  Ethernet.begin(mac, ip);
  Serial.begin(9600);
}

void loop()
{
  for (int i=1; i<=50; i++) {
    Serial.print("Connection Attempt ");
    Serial.print(i, DEC);
    ConnectToWebserver();
    delay(1);
  }

// Enter an infinite loop to "stop" the sketch
  while (true);
}


void ConnectToWebserver(){

  uint8_t connectStatus;
  
  if (client.connect()) {
    Serial.print(" - Connected");
    
// Send the HTTP GET to the server
    client.println("GET / HTTP/1.0");
    client.println();

// Read the response
    Serial.print(" - ");
    Serial.print(ReadResponse(), DEC);
    Serial.println(" bytes received");

// Disconnect from the server
    client.flush();
    client.stop();
    
  } else {
// Connection failed
    Serial.println(" - CONNECTION FAILED!");
  }
}


int ReadResponse(){
  int totalBytes=0;
  unsigned long startTime = millis();

// First wait up to 5 seconds for the server to return some data.
// If we don't have this initial timeout period we might try to
// read the data "too quickly" before the server can generate the
// response to the request

  while ((!client.available()) && ((millis() - startTime ) < 5000));

  while (client.available()) {
    char c = client.read();
    totalBytes+=1;
  }
  return totalBytes;
}

I've successfully tested a run of 5000 consecutive connections successfully so the code appears solid. The only thing I've noticed is an occasional failure to connect on the initial attempt. I think this might be related to a reset and how quickly you start trying to make connections. The reality is you need to write your code to be "tolerant" of failed connections. Connection failures happen in the real world - even in a controlled environment with a local web server.

Another common mistake I see in peoples' code is not allowing time for the server to generate a response before attempting to read from the client. Sometimes web servers take time to respond. They almost always take longer than the time it takes the MCU to execute the few lines of code people typically have between their connect() and read()'s. The example sketch demonstrates how to wait for response with a timeout.

Let me know how testing goes. :wink:

Great! Nice work. Can you post this to the developers mailing list: http://arduino.cc/mailman/listinfo/developers_arduino.cc? I'll try to incorporate it into the distribution soon.

I made a complete (but unfortunately long) post to the developer list. So all of the details are there for discussion.

Thanks for looking into all of this.

Is this a good time to merge it into Ethernet2 and use that instead? I don't think this code actually touches the low-level stuff that the Ethernet2 changes do.

How much of an issue is the "SPI busyness" concern? It's not like anything else will be using the SPI bus at the same time is it? Does the WIZnet chip have a problem if it's called repeatedly? If it's not absolutely required I'd suggest removing it as an unnecessary complication.

I wonder about having a private 'isValid()' method rather than repeating the check in the code? I guess this is probably a stylistic question.

The whole close/disconnect thing did always seem to be odd and I remember running into issues with it back when I first worked on WIZnet stuff, so it's good to see it might finally work reliably! :slight_smile:

--Phil.

I'm not really that familiar with the Ethernet2 library, but after a quick look it seems like most of the enhancements apply. From what I see the Client.cpp looks basically unchanged.

My main concerns on the "SPI busyness" have to do with:

  1. The impact on the bus if other SPI devices were being used. The tight loops continually checking the status of the socket would monopolize the bus. Maybe not an issue because the master is busy anyway?

  2. Potential impact on interrupt handlers. I'm not sure about the interrupt priority of the hardware SPI or if it's even an issue. It just didn't seem like a good idea to be thrashing the bus.

Since a loop is needed to keep checking the status anyway, the only issue is whether a delay(1) serves any value. The only negative is adding a millisecond to the connect or disconnect. There's so much other overhead that I wasn't able to make more than a few connections a second anyway so I can't see 1ms making much difference. Maybe it's not enough delay to help items 1 & 2 anyway and therefore moot?

Anyway, that's the reasoning behind the code.

I just tested this out with my Webduino library with my AjaxBuzzer sketch; I'm no longer getting resets trying to load pages, I can open lots of tabs of the main page without an issue, and I don't get dropped events sent to the board. I'm calling it a big success, and I hope this gets into arduino-0016.

I'm very happy with the new ethernet code, I don't have any problems with subsequent connections anymore! My code has been running for more than 48 hours continuously, sending an HTTP request to a webserver every 5 minutes.

However, my Arduino froze two times up to now, both some time after the webserver to which the request was sent became unavailable. I don't know exactly after how much time, but I guess it is around half an hour. Does anybody of you have any idea what could be the cause of this?

BTW, previously I had another problem that when my sketch became too large (> 11k) every connection resulted in a frozen Arduino (using the new client code). I "solved" this my reducing the sketch size (removing debugging Serial.print statements).

mickl - Your Arduino may be freezing up as a result of running out of memory and not the size of the sketch (although altering the sketch may have changed your memory consumption).

Check out this post:

http://www.faludi.com/2007/04/18/arduino-available-memory-test/

Hi etracer,

I am using your new ethernet library above.

When you have issued a client.stop(), what should client.connected() and client.status() return?

I am getting a '1' for both which suggests that the connection is still active.

Thanks etracer, the code above fixes the multiple connection problems. However, it wouldn't compile out of the box.

The following functions declared in Client.h are not included:

virtual void write(const char *str);
virtual void write(const uint8_t *buf, size_t size);

They need to be added to Client.cpp under Client:write(uint8_t b) so that the 3 write functions look like:

// Revisions for write()
// 1. Added a check for a valid socket to prevent a possible
//    array out-of-bounds memory corruption if the method
//    was called when not connected
//
void Client::write(uint8_t b) {
  if (_sock < MAX_SOCK_NUM)     // Valid socket?
    send(_sock, &b, 1);
}


void Client::write(const char *str) {
  if (_sock < MAX_SOCK_NUM)     // Valid socket?
    send(_sock, (const uint8_t *)str, strlen(str));
}

void Client::write(const uint8_t *buf, size_t size) {
  if (_sock < MAX_SOCK_NUM)     // Valid socket?
    send(_sock, buf, size);
}

This worked for me, I'm new to Arduino so your mileage may vary.

It sounds like you're trying to merge in the revised Client.cpp (which was designed for Arduino-0015) into Arduino-0016.

The newest 0016 release already has the changes included along with other enhancements and fixes. Don't try to use the revised Client.cpp code with it.

The newest 0016 (downloaded 6/7/09) didn't work. Same issues mentioned above, would connect fine the first time then fail upon each subsequent connection attempt. The revised code you posted above, with the fixes I mentioned, works fine. Seems like there may still be a bug in 0016?

I just tested the base library included with 0016 and it worked perfectly. Using the sample sketch I posted at the top of this topic I was able to make 50 consecutive connections to my web server and retrieve the page without a single failure. I don't see any problems.

My test environment is OSX, a Duemilanove with a ATmega 328, and an official ethernet shield.

If you're messing around with the source files for the library, be sure you delete the .o files otherwise the library code won't get recompiled and you'll be using an old (and probably incompatible) version.

Hi etracer,

I am using your new ethernet library above.

When you have issued a client.stop(), what should client.connected() and client.status() return?

I am getting a '1' for both which suggests that the connection is still active.

There are bugs in the status() and connected() functions. The good news is that the bugs only manifest themselves when the connection is in fact closed. Basically the functions are failing to check if a socket is bound to the client before checking the status of the connection. When the connection is closed, the socket is released and the code is trying to pass an invalid socket reference in the call to the Wiznet chip.

I'll write up a patch for this and send it to the developer list. In the meantime if you want to fix it yourself, replace the status() and connected() functions in Client.cpp with the code below (be sure to delete the .o files as well).

uint8_t Client::connected() {
  if (_sock == 255) {
    return 0;
  } else {
    uint8_t s = status();
    return !(s == SOCK_LISTEN || s == SOCK_CLOSED || s == SOCK_FIN_WAIT ||
      (s == SOCK_CLOSE_WAIT && !available()));
  }
}

uint8_t Client::status() {
  if (_sock == 255) {
    return SOCK_CLOSED;
  } else {
    return getSn_SR(_sock);
  }
}

Note that these bugs were in the earlier versions of the Ethernet library and weren't addressed by my revisions. As a result, the bugs are still in the current release 0016.

Thanks for the revised Client.cpp.

I had to add the two overloaded methods below to get my sketch to compile in Arduino-0017

void Client::write(const char *str) {
if (_sock != 255)
send(_sock, (const uint8_t *)str, strlen(str));
}

void Client::write(const uint8_t *buf, size_t size) {
if (_sock != 255)
send(_sock, buf, size);
}

They needed to be added because server.cpp called the following:

client.write(buffer, size)

I am not a very experienced programmer so this all may be proven wrong. Is this a good solution?

Hi,

I have similar problems, my client is not able to connect after first connection, but
link to Client.cpp is broken now and I can't find a way to fix client library...

Could anyone help me download revised client please?
All forums, sites and google searches are pointing only to this post :frowning:

All of the revised client code is already included in the current Arduino version (and the previous several releases).

Thanks! It works great now for most sites.
But I still have second connection problem to the one particular site, I hope I'll find a solution..

I noticed that I'm having the same problem with the new IDE 20. It doesn't seem to release the connection, when it has a connection problem. I used the test, increasing it to 150 attempts. It consistently locks up around 56, but then because of the flush command, it recovers. I added the flush and clear commands to my script and am testing them now.