Official Arduino WiFi Shield - Issues - Improvements - Call to action

Hi Folks,

I recently picked up an Arduino WiFi Shield and have bumped into a few major setbacks with what I believe is the library and firmware. I won't post code, because you can use the "WifiChatServer" example sketch that ships with it to demonstrate the problems. Forgive me if I've missed something obvious here, and please point out my error.

I know this is a new project and the code has only recently been made public. I'd like to rally some folks to see if we can wrap up a few bugs, add some functionality, and perhaps get some momentum going (or be informed that there already is some).

Basically I've experienced the following issues (one of which there is a bug report for). You can test this by uploading the example chat server sketch to your board+wifishield and running it.

SETUP:

  • Mac OS X 10.8.2
  • Arduino Uno Rev 2
  • Arduino WifiShield
  • Latest code as of Oct 21, 2012 from GitHub (arduino/wifishield)
  • I re-flashed the shield with the binaries in the GitHub repo to make sure it's the latest. (Consequently, it broke WiFiDrv::getFwVersion(), which now returns an empty string whereas it used to return 1.0.0 when it was stock. Yes, I realize that the GitHub repo contains 1.0.0 as well.)

ISSUES:

  1. Server hangs up and turns into a zombie: Make a socket connection to the chat server. Don't send data. After ~12 seconds, the WiFi server will disconnect with a TCP RESET packet. You will no longer be able to initiate a connection (via nc, socat, telnet) to the server until you reset the Arduino. From a TCP/IP perspective, packets are still sent and ACK'd from the server. It's just that the server is somehow dead. If you put in some debug code and call server.status() after that event, it always returns 0. You cannot run server.begin() on it again, as it has no effect.

If you (the client) initiate the disconnect (ctrl-c, ctrl-d, etc.) at any point, things are fine. You can reconnect and be on your merry way. Just don't let the server disconnect you, or it'll go zombie.

Furthermore, this isn't an issue of the client not sending data. If you comment out all the server.write() calls so that the server NEVER returns data, it will always hang up after ~10-12 seconds, no matter how much client data is received (and logged to serial). This issue is specifically with the shield firmware or library timing out for some reason if the server never sends a byte of data.

WORKAROUND: You can write some heartbeat code to always send (server.write()) a NULL byte every 5 seconds to the client when connected and this issue no longer occurs. You can remain connected without worrying about the server going zombie.
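For anyone wondering what the heartbeat might look like, here is a minimal sketch of just the timing part, written as plain C++ so it can be checked off-board. The Heartbeat struct and its member names are hypothetical (they are not part of the WiFi library). On the Arduino you would feed it millis() and call server.write() whenever it says a byte is due.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: decides when the next keep-alive byte is due.
// All times are in milliseconds, as returned by millis() on the Arduino.
struct Heartbeat {
  uint32_t intervalMs;  // e.g. 5000 for the 5-second heartbeat described above
  uint32_t lastSentMs;  // timestamp of the last byte we sent

  Heartbeat(uint32_t interval, uint32_t startMs)
      : intervalMs(interval), lastSentMs(startMs) {
    assert(interval > 0);
  }

  // True once at least intervalMs has elapsed since the last send.
  // Unsigned subtraction stays correct when the millisecond counter wraps.
  bool due(uint32_t nowMs) const {
    return (uint32_t)(nowMs - lastSentMs) >= intervalMs;
  }

  // Record that a byte was just written to the client.
  void markSent(uint32_t nowMs) { lastSentMs = nowMs; }
};

// Assumed usage inside loop(), while a client is connected:
//   if (hb.due(millis())) {
//     server.write((uint8_t)0);   // the NULL heartbeat byte
//     hb.markSent(millis());
//   }
```

Calling markSent() after every real server.write() as well would avoid sending heartbeat bytes while the server is already talking.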

References: Google Code Archive - Long-term storage for Google Code Project Hosting.

  2. Only one connection at a time: It looks like you should be able to handle up to 4 sockets, but the server never seems to return the additional connections in a server.available() call. Perhaps I misunderstand how to code this, but again, use the chatServer example and you'll see that only one connection at a time works. The additional connections over the initial one will send and receive TCP/IP packets, but the client connection never gets passed to the "application layer".

  3. WiFi disconnects, server goes zombie: If the WiFi signal is lost, you can detect this and have it reconnect. However, after that event, following a WiFi.disconnect() and then a WiFi.begin(), you can call server.begin(), but server.status() always returns 0. I can post modified chat server code for this on request, but this issue seems to be related to issue (1) above.

FUNCTIONALITY REQUEST:

  1. UDP server: Lack of code at the moment.

WHAT I NEED:

  1. Set up environment to compile firmware: Eclipse or otherwise. I've done quite a bit of reading and tracing of the code back to the firmware. I'd like for us to come together and see if we can get to the bottom of some of these issues. I've managed to compile avr32-toolchain for OS X with the help of jsnyder/avr32-toolchain on GitHub (a Makefile and supporting patches/scripts to build an AVR32 toolchain), with some slight modifications. I've got Eclipse installed with the avr-plugin, CDT and AVR32 Studio, but I still can't import the wifiHD firmware project, as I'm getting a "No file system is defined for scheme: framework" message on import. Any help on this would be appreciated.

  2. Expert advice: If anything I've written sounds crazy or I'm doing something wrong or misunderstanding, please point that out. If there is anyone who has some pointers to make my life easier in trying to track this bug down, please provide them.

  3. Development help: Anyone out there who's willing to donate some time. We'd all appreciate it.

Thanks so much! Hope to hear from you.

  1. Only one connection at a time: It looks like you should be able to handle up to 4 sockets, but the server never seems to return the additional connections in a server.available() call. Perhaps I misunderstand how to code this, but again, use the chatServer example and you'll see that only one connection at a time works. The additional connections over the initial one will send and receive tcp/ip packets, but the client connection never gets passed to the "application layer".

Last time I looked at the server.available() code it returned the first client that had data available. If one of the clients always has data available this will starve the others for attention.

This design also causes a problem for applications where the client expects to get an immediate reply from the server, even without sending a 'request'. Since the client doesn't have data to process it will never be returned by server.available().
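To illustrate the starvation, here is a small plain-C++ model. It is not the library's actual code: the function names are made up, and the per-socket "has data" flags stand in for whatever the firmware really reports. The first function models the behaviour as I remember it, and the second is one possible fairer alternative (round-robin).

```cpp
#include <assert.h>

const int kMaxSockets = 4;  // the thread mentions up to 4 sockets

// Behaviour as described above: always return the lowest-numbered socket
// that has data waiting, or -1 if none does.
int firstWithData(const bool hasData[kMaxSockets]) {
  for (int i = 0; i < kMaxSockets; ++i) {
    if (hasData[i]) return i;
  }
  return -1;
}

// Hypothetical alternative: start searching just after the socket that was
// served last, so a busy client cannot starve the others.
// Pass lastServed = -1 when nothing has been served yet.
int nextRoundRobin(const bool hasData[kMaxSockets], int lastServed) {
  assert(lastServed >= -1 && lastServed < kMaxSockets);
  for (int step = 1; step <= kMaxSockets; ++step) {
    int i = (lastServed + step) % kMaxSockets;
    if (hasData[i]) return i;
  }
  return -1;
}
```

With sockets 0 and 1 both holding data, firstWithData() keeps returning 0 for as long as socket 0 stays busy, while nextRoundRobin() alternates between 0 and 1. Neither version helps a client that is connected but has not sent anything, which is the second problem mentioned above.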

Thanks for the reply, John. This is totally true. However, if the first connection is idle and all the data has been read() or flush()'d, it still fails to serve the other client. Unless I'm missing something nuanced here, it still feels like something is broken.

I agree though, the design is not the most robust either. It can surely be improved.

I've constructed a workaround to reset the Arduino (and the Arduino WiFi shield) when the WiFi server is detected to be dead, by rigging up a simple circuit that connects:

digital pin 8 -> 1kohm resistor -> Base pin of NPN transistor
Emitter pin of NPN transistor -> GND on Arduino
Collector pin of NPN transistor -> Reset pin on Arduino

Using this code, I write a HIGH to pin 8 and the whole shebang reboots and reconnects. Obviously this is just a temporary workaround until a fix is found in the example, libraries, or firmware (wherever it may be).

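  // Assumes pin 8 was set up earlier with pinMode(8, OUTPUT) and driven LOW.
  // Reset the Arduino if the server is dead.
  if (server.status() != 1) {
    digitalWrite(8, HIGH); // Turn on the transistor, pulling Reset to GND
  }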

Regarding the idle disconnect in the firmware...

The firmware uses the lwIP (Lightweight IP) tcp_poll function to register the callback functions atcp_poll and atcp_poll_conn.

These are both checked roughly once every two seconds.

There is a value called tcp_poll_retries that is checked in these functions.

The TCP connection is aborted when tcp_poll_retries is > 4 (so 5) in atcp_poll or when tcp_poll_retries is > 8 (so 9) in atcp_poll_conn.

The tcp_poll_retries value is incremented by 1 every two seconds while no data is sent.

The tcp_poll_retries value only appears to get reset to 0 when data is successfully sent [tcp_data_sent].

So, this appears to be an intentional mechanism used to kill the TCP connection when no data has been sent for 10 seconds or 18 seconds respectively.
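The arithmetic behind those two figures can be checked with a tiny model. This is only a sketch of my reading of the code, not the firmware itself: idleSecondsUntilAbort is a made-up name, the 2-second poll interval is approximate, and the real callbacks may check and increment the counter in a different order.

```cpp
#include <assert.h>

// Models the idle counter: every poll with no data sent increments
// tcp_poll_retries, and the connection is aborted once the counter
// exceeds abortThreshold. Returns the idle time in seconds at that point.
int idleSecondsUntilAbort(int abortThreshold, int pollIntervalSec) {
  assert(abortThreshold >= 0 && pollIntervalSec > 0);
  int tcp_poll_retries = 0;
  int elapsedSec = 0;
  while (tcp_poll_retries <= abortThreshold) {  // abort when retries > threshold
    elapsedSec += pollIntervalSec;              // one more poll with nothing sent
    ++tcp_poll_retries;
  }
  return elapsedSec;
}
```

With a 2-second poll, a threshold of 4 (atcp_poll) gives 10 seconds and a threshold of 8 (atcp_poll_conn) gives 18 seconds, which is in line with the ~10-12 second hang-up reported earlier in the thread.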

The firmware and WiFi shield library should be modified to allow the user to configure or disable this timer.

I have tried to import the wifiHD project into Eclipse so I could look at making the change on the firmware side.

I have the AVR-Eclipse plugin installed.

However, like others have mentioned I am not able to successfully import this project into Eclipse.

Arduino Developers please update the github files (or provide additional instructions) so that the user base can start to make changes.

Thanks!

An Arduino developer has committed a fix for the TCP server hang-up issue! You'll have to grab it off of github and flash your shield. I haven't tested it yet, but I'll flash it in a couple hours and update this thread.

I am glad we are starting to see some movement with the firmware! :)

Hopefully the Arduino developers will still provide (shortly) the project files and / or instructions for building the firmware.

Although the issue was written against the server mode, my comments [listed above and in the github issue] reflect that the disconnect problem will happen (in client mode) when tcp_poll_retries is > 8 (so 9) in atcp_poll_conn.

So, I expect the problem with idle disconnect is still present with the client mode of operation.

I will not be able to test this until much later today (at the earliest).

If anyone has a chance to test the client mode for this issue on the new firmware, please do so.

Thanks!

Sounds good. I did flash the new firmware and I've tested the server side of it and it appears to have fixed the server hanging up if the server doesn't send data for ~10-12 seconds. It also seems to have fixed the issue with the wifi connection being lost and it reconnecting, only to have the server be in a zombie state. I'll do more testing later today. I didn't get a chance to test multiple socket connections to the server though. This is nice, as already I'm not relying on the self-reset circuit I added and can take that digital I/O pin back, as I need it for something else.

The developers should be able to provide some documentation on importing the firmware into Eclipse so that we can start fixing this stuff on our own. I'm currently waiting on a response from them, hopefully within the next day or so.

I want to make my board function as both a web client and a web server, so I combined "WifiWebServer" and "WifiWebClient". I used "WifiWebServer" as the base code, tested it, and it worked well. Then I added the following code just before server.begin():
WiFiClient client;
client.connect(someAddr, 80);
However, this time the web server cannot respond to any request.

I also have troubles with my official Arduino Wifi Shield.
How do I upgrade to the newest firmware? I have never tried this before.

Thanks!

Hey,

For me, the disconnect issue is still there with the firmware update.
I currently use just the client part and get disconnects after a few seconds, so, for example, a file transfer is
disconnected before the file is complete (tested with serial print output).

Hi AndreasW79,

Clearly the last update only fixed the server end of it. Your best bet is to create an issue @ Google Code Archive - Long-term storage for Google Code Project Hosting. so it's tracked.

Hi AndreasW79,

I have opened an issue under the GitHub project for the shield:

For those of you having problems upgrading the shield firmware - I just did a writeup on how to do it:

Hope it helps!

J

Jensa,

Thanks for the writeup! That's great.

I've confirmed the issue jia4234594 posted with the shield not being able to act as a client and server. I've opened an issue on GitHub: "Running a wifi server is mutually exclusive to using it as a client" (Issue #12, arduino/wifishield).

Hopefully we can get this resolved. I keep bumping into issues like this with the WiFi shield and it's causing me some pain =\

Hey guys,

I read all the previous threads and tried all the suggestions mentioned.

  1. Upgraded wifi shield's firmware
  2. Send empty byte to all clients every 5 sec when clients are available
  3. Even have a reset circuit for when server.status() = 0

I still have my server show server.status() = 0 at random times. It works for a little bit but then it all goes to mars or somewhere.

I have the arduino UNO + wifi shield + servo all hooked up with a 12 v power supply.

Any of you still experiencing this inconsistent connection to the LAN?

Thank you for your time fellas.

-Andres

Hi,

The good news : you're not alone...
The bad news : you're not alone...

I get the exact same inconsistent connection quite consistently.

What is strange is the fact that the examples alone seem to be more robust but I can't figure out why. I thought it might be related to a conflict between the pins used for the servos and those used for the shield, but after digging into the documentation of both the servo lib and the wifi lib, I settled on pins 5 and 6. Unfortunately without much success.

Best regards,

Benoît

Hi,

I too have a problem with the official wi-fi shield.

I can successfully POST data to a target website. However, when I stop the client and disconnect, even though client.connected() reports that the WiFi shield has disconnected, the LINK LED on the shield remains lit, indicating that the connection is still present(?). This is the code I use to test:

    delay(10000);
    client.stop();
    WiFi.disconnect();

    if (client.connected())
    {
      Serial.println("Client is still connected");
    }
    else
    {
      Serial.println("Client is not connected");
    }

Also note that I am using Arduino IDE 1.0.3 to produce the code.

Note that I include the 10 second delay to ensure that the shield has plenty of time to POST the data (and it does succeed in transmitting data to the target server).

Has anyone else had this issue? Anyone know of a fix or a workaround?

FYI I want to only turn on the shield when I need to POST following a sensor trigger; I want to get as much battery life as possible. If Supermechanical's Twine device can get weeks of wifi operation from 2 AA batteries, then surely we can get the Arduino + wifi shield to behave similarly?

Hello

I am new to the Arduino world and believe that I am having the problem described in these posts. I have the Arduino Uno and the WiFi Shield. When I run the Wi-Fi web server sample code it works for variable intervals from several minutes to several hours. Then, for no apparent reason, it can no longer be reached. It goes dead. I am using the sample code because for the project I have in mind the device must be reliable and not go into a state where it cannot be reached. Can someone point me to a solution? Am I having the same problem as discussed in the thread above? Any advice would be welcome.

Thanks!
Bill

How exactly would one use server.write() to keep the server from going zombie? What would it look like in code?