EthernetUDP and EthernetClient/Server class compatibility

Hi

Does anyone have any knowledge about compatibility between the EthernetUDP, EthernetClient and the EthernetServer classes? I use the EthernetClient class extensively without problem and the EthernetUDP class once a day to refresh my system time. Here is my ReadTimeUDP() function:

boolean ReadTimeUDP() {
  const byte c_proc_num = 24;
  Push(c_proc_num);
  IPAddress G_TimeServer(132, 163, 4, 101); // time-a.timefreq.bldrdoc.gov NTP server
  // IPAddress timeServer(132, 163, 4, 102); // time-b.timefreq.bldrdoc.gov NTP server
  // IPAddress timeServer(132, 163, 4, 103); // time-c.timefreq.bldrdoc.gov NTP server
  const unsigned int C_UDPlocalPort = 8888;      // local port to listen for UDP packets

  if (!G_EthernetOK) {  //Indicates that the EthernetServer is up
	Pop(c_proc_num);
	return false;
  }

  //If the EthernetClient is active just wait for it to timeout
  //Otherwise it seems we will likely get a system crash
  if (G_EthernetClientActive) {
	Pop(c_proc_num);
	return false;
  }

  ActivityWrite(EPSR(E_UPD_NTP_hy_Start_3496));
  boolean l_result = false; //D
  G_Udp.begin(C_UDPlocalPort);
  CheckRAM();

  sendNTPpacket(G_TimeServer); // send an NTP packet to a time server
  ActivityWrite("- Send");

  // wait to see if a reply is available
  delay(1000);
  int l_pck_size = G_Udp.parsePacket();
  ActivityWrite("- Packet");
  if (l_pck_size == DC_NTP_PACKET_SIZE) {
	// We've received a packet, read the data from it
	//Serial.print("Packet Size: ");
	//Serial.println(l_pck_size);
	G_Udp.read(G_PacketBuffer,DC_NTP_PACKET_SIZE);  // read the packet into the buffer
    ActivityWrite("- Read");
	//the timestamp starts at byte 40 of the received packet and is four bytes,
	// or two words, long. First, extract the two words:
	unsigned long l_highWord = word(G_PacketBuffer[40], G_PacketBuffer[41]);
	unsigned long l_lowWord = word(G_PacketBuffer[42], G_PacketBuffer[43]);
	// combine the four bytes (two words) into a long integer
	// this is NTP time (seconds since Jan 1 1900):
	unsigned long l_secsSince1900 = l_highWord << 16 | l_lowWord;

	//Adjust for the millis() offset when we began
	l_secsSince1900 -= (millis() / 1000);

	// now convert NTP time into UNIX time starting Jan 1 1970
	//In seconds, that's 2208988800:
	const unsigned long l_seventyYears = 2208988800UL;
	unsigned long l_epoch = l_secsSince1900 - l_seventyYears;

	//Now add NZ Time Advance (We are still in seconds)
	l_epoch = l_epoch + (long(3600) * 12); //NZST - we record time in NZST and adjust time requests for DST

	//Now adjust to Ardy time starting 01 Jan 2013
	unsigned long l_ArdyOffset = (15705L * 86400L); //
	unsigned long l_ArdyTimeSecs = l_epoch - l_ArdyOffset;

	//Now calculate Ardy days
	unsigned long l_ArdyDays = (l_ArdyTimeSecs / 86400L);

	//Now calculate Ardy seconds
	unsigned long l_ArdySeconds = l_ArdyTimeSecs - (l_ArdyDays * 86400L);
	//Scale Ardy Seconds to Ardy time - a fraction of 100,000 (Ardy units per day)
	long l_ArdyTime = l_ArdySeconds * 1000L / 864L;

	//Finally add Ardy days to Ardy seconds to create Ardy Time
	long l_ArdyDateTime = (100000L * l_ArdyDays) + l_ArdyTime;
	//And initialise the Ardy clock

	//Serial.print(F("NTP Date: "));
	//Serial.println(DateToString(ArdyDateTime));
	//Serial.print(F("NTP Time: "));
	//Serial.println(TimeToHHMMSS(ArdyDateTime));

    ActivityWrite("- Pre Init");
	InitialiseStartDatetime(l_ArdyDateTime);
    ActivityWrite("- Post Init");
	l_result = true;
  }
  CheckRAM();
  G_Udp.stop();
  Pop(c_proc_num);
  return l_result;
}

Everything works fine until, occasionally, my system crashes. It crashed this morning (after about 16 days of continuous running) during its daily UDP NTP time update, sometime after the “ActivityWrite(EPSR(E_UPD_NTP_hy_Start_3496));” code line was executed and before the calling procedure below wrote an OK or failure message to my activity log file:

void CheckForUPDNTPUpdate() {
  const byte c_proc_num = 45;
  Push(c_proc_num);
  //We get time updates at 6:27am each day
  if ((G_NextUDPNTPUpdateTime == 0) || (Now() < G_NextUDPNTPUpdateTime)) {
	Pop(c_proc_num); //Pop off the G_CallHierarchy stack when we return early
	return;
  }
  G_UDPNTPOK = ReadTimeUDP(); //Will error out if G_EthernetOk == false
  CheckRAM();
  if (G_UDPNTPOK) ActivityWrite(EPSR(E_UDP_NTP_OK_83));
  else            ActivityWrite(EPSR(E_UDP_NTP_Failure_94)); //Can happen if returned packet not 48 bytes
  //If the UDPNTP time update failed we update to the next update time and skip this day's update
  G_NextUDPNTPUpdateTime = Date() + 100000 + C_UDPNTPUpdateTime; //Tomorrow @ 6:27am
  CheckRAM();
  Pop(c_proc_num);
}

Ignore these code lines in my ReadTimeUDP() procedure - I only inserted them this morning in an attempt to isolate crashes in the future:

ActivityWrite("- Send");
ActivityWrite("- Packet");
ActivityWrite("- Read");
ActivityWrite("- Pre Init");
ActivityWrite("- Post Init");

In the past, when my system has crashed during UDP NTP time updates, there has been evidence (in my system's various logs) of contemporaneous HTML requests occurring within my application. As the code stands, the EthernetClient object had not been created, but it is likely that EthernetServer.available() was holding buffered EthernetClient data. I was wondering whether the receipt of contemporaneous HTML requests coming into the Arduino Ethernet hardware (for the EthernetServer via port 80) could be corrupting the UDP NTP operation that runs on port 8888.

My UDPNTP time update code is checking for existing EthernetClients before it starts. But it does not attempt to handle, recognise or prevent a new browser html request arriving in the middle of its UDP NTP processing.

I do not think that this is a memory crash problem. My extensive memory checking functionality suggests that I never have less than about 850 bytes of free RAM and my application does not suffer from heap memory fragmentation - even after 16 days.

Any advice would be appreciated. As it stands now my application does not run reliably and I may have to disable daily automatic time updates using UDP NTP.

You can see my application at www.2wg.co.nz, but checking out how it runs will not address this issue unless you want to use your browser to send an HTML request at exactly 6:27AM NZ time and try to force a system crash at a time when my system is known to also be doing a UDP NTP time update.

Cheers

Catweazle NZ
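
For reference, the timestamp arithmetic in ReadTimeUDP() can be pulled out into small functions that also run off-device. This is a minimal sketch rather than the poster's code: the names ntpSecondsFromPacket and ardyDateTimeFromNtp are hypothetical, the constants are copied from the listing above (including the 15705-day offset), and the millis() adjustment is deliberately left out.

```cpp
#include <stdint.h>

// Constants copied from the ReadTimeUDP() listing in the post.
const uint32_t kSeventyYears   = 2208988800UL;  // seconds from 1900 to 1970
const uint32_t kNzstOffsetSecs = 12UL * 3600UL; // NZST is UTC+12
const uint32_t kArdyEpochDays  = 15705UL;       // day offset used in the post
const uint32_t kSecsPerDay     = 86400UL;

// Hypothetical helper: read the 32-bit big-endian seconds field that starts
// at byte 40 of a 48-byte NTP reply.
uint32_t ntpSecondsFromPacket(const uint8_t *packet) {
  return ((uint32_t)packet[40] << 24) | ((uint32_t)packet[41] << 16) |
         ((uint32_t)packet[42] << 8)  |  (uint32_t)packet[43];
}

// Hypothetical helper: convert NTP seconds (since 1900, UTC) into the post's
// "Ardy" date-time: whole days * 100000 + fraction of the day in 1/100000ths.
uint32_t ardyDateTimeFromNtp(uint32_t ntpSecs) {
  uint32_t epoch    = ntpSecs - kSeventyYears + kNzstOffsetSecs;
  uint32_t ardySecs = epoch - kArdyEpochDays * kSecsPerDay;
  uint32_t days     = ardySecs / kSecsPerDay;
  uint32_t secs     = ardySecs % kSecsPerDay;
  uint32_t dayFrac  = secs * 1000UL / 864UL;    // 86400 s -> 100000 units
  return days * 100000UL + dayFrac;
}
```

With these constants, 00:00 UTC on 1 January 2013 (NTP seconds 3565987200) comes out as 150000, i.e. day 1 at midday NZST.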

Are you starting and stopping the UDP socket for a reason?

G_Udp.begin(C_UDPlocalPort);
// your UDP code here
G_Udp.stop();

Why not leave it active? I start the UDP socket in setup() and don’t stop it, and mine works fine.

You might want to check your sockets. Add this code to your sketch and call it occasionally, like before the function call where you think the sketch may be failing. Maybe you are running out of sockets.

#include <utility/w5100.h>

byte socketStat[MAX_SOCK_NUM];

void ShowSockStatus()
{
  for (int i = 0; i < MAX_SOCK_NUM; i++) {
    Serial.print(F("Socket#"));
    Serial.print(i);
    uint8_t s = W5100.readSnSR(i);
    socketStat[i] = s;
    Serial.print(F(":0x"));
    Serial.print(s,16);
    Serial.print(F(" "));
    Serial.print(W5100.readSnPORT(i));
    Serial.print(F(" D:"));
    uint8_t dip[4];
    W5100.readSnDIPR(i, dip);
    for (int j=0; j<4; j++) {
      Serial.print(dip[j],10);
      if (j<3) Serial.print(".");
    }
    Serial.print(F("("));
    Serial.print(W5100.readSnDPORT(i));
    Serial.println(F(")"));
  }
}

A socket status list:
0x0 = available.
0x14 = socket waiting for a connection
0x17 = socket connected to a server.
0x22 = UDP socket.
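
If the status bytes are going into a log rather than being read by eye, a small lookup can turn them into labels and count free sockets. This is only a sketch: sockStatusName and countSocketsInState are hypothetical names, and only the codes in the list above plus 0x1C (the W5100's close-wait state) are covered.

```cpp
#include <stdint.h>

// Map a W5100 socket status byte to a short label. Codes not listed in the
// thread are reported as "other" rather than guessed at.
const char *sockStatusName(uint8_t status) {
  switch (status) {
    case 0x00: return "available";
    case 0x14: return "waiting";
    case 0x17: return "connected";
    case 0x1C: return "close wait";
    case 0x22: return "UDP";
    default:   return "other";
  }
}

// Count how many of the n recorded status bytes equal the given state,
// e.g. to check that at least one socket is free before starting UDP.
uint8_t countSocketsInState(const uint8_t *stats, uint8_t n, uint8_t state) {
  uint8_t count = 0;
  for (uint8_t i = 0; i < n; i++) {
    if (stats[i] == state) count++;
  }
  return count;
}
```

Called as countSocketsInState(socketStat, MAX_SOCK_NUM, 0x00) after ShowSockStatus(), this would give the number of sockets currently available.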

I am starting and stopping UDP to "Disconnect from the server. Release any resource being used during the UDP session.". Arduino is of course a limited memory environment and I cannot waste resources holding open a UDP resource that I use once a day.

This and other links easily found on google refer to what may be my problem:

http://forum.arduino.cc/index.php?topic=162799.0

I can implement socket checking within my application but I will research this issue first through google and see if there is a definitive answer and solution anywhere before I start chasing shadows that I do not understand.

Cheers

Catweazle NZ

I am starting and stopping UDP to "Disconnect from the server. Release any resource being used during the UDP session.".

Disconnecting is not necessary. You never connect to the receiving UDP device. You are sending a packet, which may or may not arrive at its destination. The only resource you are using is the UDP socket.

If you are really worried about resources, use the F() function in your code to keep your static strings in program memory. That will save much more than starting and stopping the UDP socket.

edit: If you are going to start and stop the UDP socket, you should check the return value of the begin call to ensure there was a socket available for UDP.

  if(G_Udp.begin(C_UDPlocalPort) == 0) {
      Serial.println(F("No socket available"));
  }
  else {
      // your UDP send and receive stuff here, then 
      G_Udp.stop();
  }

Hi

This Arduino reference page for EthernetUDP.begin() explicitly states that there is no return value and none of the examples I have seen test the return value.

http://arduino.cc/en/Reference/EthernetUDPBegin

But I have implemented your suggestion and will see if it helps.

I use F() extensively for calls to .print() and .println(). For fixed String parameters passed to my own procedures I typically have the strings in EEPROM like this example:

ActivityWrite(EPSR(E_UDP_NTP_OK_83));

I have over 3,500 bytes of string messages stored in EEPROM. The above code line is retrieving the string "UDP NTP OK" from position (byte) 83 in EEPROM. E_UDP_NTP_OK_83 is a constant for 83.

These code lines were just a quick fix:

ActivityWrite("- Send");
ActivityWrite("- Packet");
ActivityWrite("- Read");
ActivityWrite("- Pre Init");
ActivityWrite("- Post Init");

I will move these strings to EEPROM when I next do some application code housekeeping - or delete the code lines if the problem is resolved.

Cheers

Catweazle NZ

This Arduino reference page for EthernetUDP.begin() explicitly states that there is no return value and none of the examples I have seen test the return value.

http://arduino.cc/en/Reference/EthernetUDPBegin

That page is not correct. There is a return value.

This is from EthernetUdp.h

  virtual uint8_t begin(uint16_t);	// initialize, start listening on specified port. Returns 1 if successful, 0 if there are no sockets available to use

and EthernetUdp.cpp

/* Start EthernetUDP socket, listening at local port PORT */
uint8_t EthernetUDP::begin(uint16_t port) {
  if (_sock != MAX_SOCK_NUM)
    return 0;

  for (int i = 0; i < MAX_SOCK_NUM; i++) {
    uint8_t s = W5100.readSnSR(i);
    if (s == SnSR::CLOSED || s == SnSR::FIN_WAIT) {
      _sock = i;
      break;
    }
  }

  if (_sock == MAX_SOCK_NUM)
    return 0;

  _port = port;
  _remaining = 0;
  socket(_sock, SnMR::UDP, _port, 0);

  return 1;
}

edit: I submitted an error report with the change to webmaster@arduino.cc.

BTW, I really don’t care how examples and others use a function. I try to use all of them the correct way. That is why my stuff works. 8)

Since my last contribution to this post I have been focusing on other things. Along the way I made a few changes to my application and my application reliability got even worse.

However, this weekend I have isolated the new bug recently introduced and I am hopeful of getting back to some reliability. The application's web site seems to be running OK (for one day anyway) but my daily UDP NTP time update may still be failing.

I have now implemented a version of the socket checking routine suggested and have set up to run it immediately before every UDP NTP time update - regular daily ones and ad hoc ones as well. The results of the socket check are written to my activity log file - it all works for ad hoc UDP NTP updates. Now I will wait until tomorrow morning to see if the automated daily UDP NTP update fails again.

I have also made some changes to my UDP NTP procedure that I think will make it deal better with delayed, lost or truncated return packets.

If I am still having problems I will then use the socket statistics (status, remote IP, ports) and the times and activities written to my application logs to possibly identify the cause of my problems.

Fingers crossed then.

Catweazle NZ
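
The thread does not show what was changed to handle delayed, lost or truncated replies. One common approach, sketched here under that caveat, is to replace the fixed delay(1000) followed by a single parsePacket() call with a poll that runs until a deadline. waitForPacket is a hypothetical name, and the two callables stand in for G_Udp.parsePacket() and millis() so the logic can be exercised off-device.

```cpp
#include <stdint.h>

// Poll for a reply until one of the expected size arrives or the timeout
// expires. parsePacket() should return the size of the next pending packet
// (0 if none); nowMs() should return a millisecond tick count.
// Returns true only if a packet of exactly expectedSize was seen in time.
template <typename ParseFn, typename ClockFn>
bool waitForPacket(ParseFn parsePacket, ClockFn nowMs,
                   uint32_t timeoutMs, int expectedSize) {
  const uint32_t start = nowMs();
  // Unsigned subtraction keeps the comparison valid across tick rollover.
  while ((uint32_t)(nowMs() - start) < timeoutMs) {
    int size = parsePacket();
    if (size == expectedSize) return true;  // full reply is ready to read
    // size == 0: nothing yet. Any other size: truncated or stray packet,
    // so keep polling rather than treating it as an NTP reply.
  }
  return false;
}
```

On a toolchain with C++11 lambdas, the call might look like waitForPacket([] { return G_Udp.parsePacket(); }, [] { return millis(); }, 1000, DC_NTP_PACKET_SIZE), which returns as soon as the reply arrives instead of always blocking for a full second.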

Hi all

My recent changes have been running for a few days and my system seems more stable. I am now dumping information for the W5100 ethernet sockets into my system's activity log file every hour. Here is my system's activity log for the last few hours:

18:00:00 ETHERNET SOCKET LIST
18:00:00 #:Status Port Destination DPort
18:00:00 0=avail,14=waiting,17=connected,22=UDP
18:00:00 1C=close wait
18:00:00 Socket#0:0x0 80 D:157.55.36.50 (36881)
18:00:00 Socket#1:0x14 80 D:192.168.1.39 (50263)
18:00:00 Socket#2:0x0 80 D:192.168.1.243 (51446)
18:00:01 Socket#3:0x0 0 D:0.0.0.0 (0)
18:00:01 Climate Update
- FREE RAM: 2719
18:05:19 RAM Checking setting switched to F
19:00:00 ETHERNET SOCKET LIST
19:00:00 #:Status Port Destination DPort
19:00:00 0=avail,14=waiting,17=connected,22=UDP
19:00:00 1C=close wait
19:00:00 Socket#0:0x17 80 D:114.79.12.116 (22108)
19:00:00 Socket#1:0x17 80 D:114.79.12.116 (20916)
19:00:00 Socket#2:0x0 80 D:114.79.12.116 (22100)
19:00:00 Socket#3:0x14 80 D:0.0.0.0 (0)
19:00:00 Climate Update
- FREE RAM: 2719
20:00:00 ETHERNET SOCKET LIST
20:00:00 #:Status Port Destination DPort
20:00:00 0=avail,14=waiting,17=connected,22=UDP
20:00:00 1C=close wait
20:00:00 Socket#0:0x17 80 D:114.79.12.116 (22108)
20:00:00 Socket#1:0x17 80 D:114.79.12.116 (20916)
20:00:00 Socket#2:0x0 80 D:119.63.193.195 (5591)
20:00:00 Socket#3:0x14 80 D:202.46.55.59 (21227)
20:00:01 Climate Update
- FREE RAM: 2719
21:00:00 ETHERNET SOCKET LIST
21:00:00 #:Status Port Destination DPort
21:00:00 0=avail,14=waiting,17=connected,22=UDP
21:00:00 1C=close wait
21:00:00 Socket#0:0x17 80 D:114.79.12.116 (22108)
21:00:00 Socket#1:0x17 80 D:114.79.12.116 (20916)
21:00:00 Socket#2:0x14 80 D:222.154.229.4 (50077)
21:00:00 Socket#3:0x0 80 D:222.154.229.4 (50078)
21:00:01 Climate Update
- FREE RAM: 2719
21:45:00 Auto Alarm OFF
21:58:35 Garage Door Operated
21:58:38 Garage Opening
21:58:51 Garage Open
22:00:00 ETHERNET SOCKET LIST
22:00:00 #:Status Port Destination DPort
22:00:00 0=avail,14=waiting,17=connected,22=UDP
22:00:00 1C=close wait
22:00:00 Socket#0:0x17 80 D:114.79.12.116 (22108)
22:00:00 Socket#1:0x17 80 D:114.79.12.116 (20916)
22:00:00 Socket#2:0x0 80 D:192.168.1.39 (50303)
22:00:01 Socket#3:0x14 80 D:192.168.1.39 (50302)
22:00:01 Climate Update
- FREE RAM: 2719
22:00:17 Garage Closing
22:00:32 Garage Closed

Until this evening I always seemed to have one status 14 socket and the other three at status zero.

And here is an extract of my system's HTML request log for two hours:

18:05:18
- GET /50028/ HTTP/1.1
- CLNT 157.55.33.41
- HOST 219.88.69.69
- PAGE RAM Checking
18:08:31
- GET /PUBLIC/COOKIES.TXT/ HTTP/1.1
- CLNT 114.79.12.116
- HOST WWW.2WG.CO.NZ
- PAGE SD File Display
19:07:59
- GET / HTTP/1.1
- CLNT 202.46.55.59
- HOST WWW.2WG.CO.NZ
- PAGE Dashboard
19:09:13
- GET / HTTP/1.1
- CLNT 119.63.193.195
- HOST WWW.2WG.CO.NZ
- PAGE Dashboard

What I would like to know is what did or could IP address 114.79.12.116 have done at 18:08:31 to hold a permanent connection on two out of four of my W5100 ethernet connection sockets?

The apparent loss of these sockets, and the loss of any more, will I think lead to the blocking of my system's www Ethernet connectivity. (And is likely the cause of recent problems of this nature.)

I presume that this IP address may have used one of the available HTML request command options that overrides my EthernetClient.flush() and .stop() commands.

Any help would be appreciated. Sometime in the next week I will update my code to dump a full listing of html requests into my html log so I can determine if this IP address is doing something different.

Is there any way to force a disconnection on the two locked/connected sockets? I am going to review my code again to see if there is any possible way that I have not flushed and stopped the EthernetClient connections.

Cheers

Catweazle NZ
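
One way to approach the forced-disconnect question is to separate detection from action: record when each socket was first seen in the connected state (0x17) and flag it once it has stayed there longer than some limit. The sketch below covers only the detection half. SocketWatchdog and its members are hypothetical names, and the actual close (for example via the Ethernet library's low-level socket functions) is an assumption to check against your library version.

```cpp
#include <stdint.h>

const uint8_t kNumSockets  = 4;     // the W5100 has four hardware sockets
const uint8_t kEstablished = 0x17;  // "connected" status byte

// Tracks how long each socket has been continuously seen as connected.
struct SocketWatchdog {
  uint32_t connectedSince[kNumSockets];
  bool     wasConnected[kNumSockets];

  SocketWatchdog() {
    for (uint8_t i = 0; i < kNumSockets; i++) {
      connectedSince[i] = 0;
      wasConnected[i] = false;
    }
  }

  // Call periodically with the current status byte of socket i and a
  // millisecond tick. Returns true when the socket has been connected for
  // longer than maxMs and is a candidate for a forced close.
  bool update(uint8_t i, uint8_t status, uint32_t nowMs, uint32_t maxMs) {
    if (status != kEstablished) {
      wasConnected[i] = false;      // any other state resets the timer
      return false;
    }
    if (!wasConnected[i]) {
      wasConnected[i] = true;       // first sighting: start timing
      connectedSince[i] = nowMs;
      return false;
    }
    return (uint32_t)(nowMs - connectedSince[i]) > maxMs;
  }
};
```

Sampling only once an hour, as in the log above, cannot distinguish one long-lived connection from two short ones that happen to be seen at consecutive samples, so a real check would run more often or also compare the remote IP and port.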

Post your code.

The major reason I discovered is the client not sending a double cr/lf before the connection breaks (fails, not closed by the client). This can be accidental or intentional on the client end. I can cause this type fail with PuTTY if the server code does not have a timeout.

I have code on the playground that will prevent that. Look for the loopCount variable. If the client doesn't send the double cr/lf or another packet within 10 seconds, the connection is terminated by the server. You can test it with PuTTY. http://playground.arduino.cc/Code/WebServerST

edit: The UDP.begin() reference page has not been changed yet. The webmaster posted a change request on the Arduino Github site, but it has yet to be acted upon. :(

The second most common cause of the socket not closing is leaving characters in that socket rx buffer and attempting to close the connection with client.stop(). Many times it will not close the socket, and the socket is lost.

SurferTim
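
The idea described above (end the request at the double CR/LF, or give up after a period of silence) can be sketched as a small state tracker. This is not the playground code, which uses a loopCount variable. RequestWatcher is a hypothetical name, and the timestamps are assumed to come from something like millis().

```cpp
#include <stdint.h>

// Detects the blank line (CR LF CR LF) that ends an HTTP request header and
// reports when the client has been silent for too long.
class RequestWatcher {
 public:
  explicit RequestWatcher(uint32_t nowMs) : matched_(0), lastByteMs_(nowMs) {}

  // Feed one received byte. Returns true once CR LF CR LF has been seen.
  bool feed(char c, uint32_t nowMs) {
    lastByteMs_ = nowMs;
    if (matched_ == 4) return true;
    // Even positions of the pattern expect CR, odd positions expect LF.
    const char expected = (matched_ % 2 == 0) ? '\r' : '\n';
    if (c == expected) {
      matched_++;
    } else {
      matched_ = (c == '\r') ? 1 : 0;  // a stray CR may start a new match
    }
    return matched_ == 4;
  }

  // True if no byte has arrived for at least limitMs milliseconds.
  bool timedOut(uint32_t nowMs, uint32_t limitMs) const {
    return (uint32_t)(nowMs - lastByteMs_) >= limitMs;
  }

 private:
  uint8_t  matched_;     // how many characters of CR LF CR LF matched so far
  uint32_t lastByteMs_;  // tick of the most recent received byte
};
```

In a server loop, the connection would be closed either when feed() returns true and the response has been sent, or when timedOut(millis(), 10000) becomes true, so a client that connects and then goes quiet cannot hold the socket indefinitely.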

SurferTim:
Post your code.

How much do you want? After about nine months of coding when I get some time I have more than 8,000 lines. (The executable is 123KB.)

However, regarding Ethernetclient.stop() I have this procedure:

void EthernetClientStop() {
  while (G_EthernetClient.available())
    G_EthernetClient.read(); //discard any remaining input
  //
  G_EthernetClient.flush();
  G_EthernetClient.stop();
  G_EthernetClientActive = false;
  G_EthernetClientTimeout = 0;
}

Before this evening I had essentially the same code in five places. Now I just call this procedure five times.

I think that my way of stopping/closing Ethernetclient instances will correctly release ethernet sockets on the W5100. I could see nothing additional that I might consider in your example at http://playground.arduino.cc/Code/WebServerST.

Tonight I had another www ethernet lockout of my system. The system kept running and this was its 10PM report of my socket status:

22:00:00 ETHERNET SOCKET LIST
22:00:00 #:Status Port Destination DPort
22:00:00 0=avail,14=waiting,17=connected,22=UDP
22:00:00 1C=close wait
22:00:00 Socket#0:0x17 80 D:114.79.12.96 (20308)
22:00:00 Socket#1:0x17 80 D:114.79.13.77 (47957)
22:00:00 Socket#2:0x17 80 D:114.79.12.96 (20303)
22:00:00 Socket#3:0x17 80 D:114.79.13.77 (46852)

Thanks to your assistance with the socket status list function I now have a better indication of where the problem is.

Firstly this function has identified the IP addresses 114.79.X.X as the problem.

Secondly, a review of my system’s html log file (details of html requests processed) reveals that these IP addresses only access one web page on my system - http://www.2wg.co.nz/public/cookies.txt.

I presume that the IP addresses in the range 114.79.X.X belong to a web crawler. That presumably means they use custom software to parse the web pages received when they send the above URL to my Arduino system.

That leads me to suspect that there is (was) something in my cookies.txt file display web page that was a bit of a problem, and also that the web crawler is processing the web page in a way that I did not anticipate, causing the crawler (my client) to hold open (keep-alive) my Arduino W5100 sockets.

So tonight I have made a change to the cookies.txt text file (and therefore the web page) and I will now wait to see what happens next time this web crawler visits (retrieves) the web page.

Stand by for updates.

Cheers

Catweazle NZ

Something is amiss with your code if it is not releasing those sockets, or the clients are reconnecting faster than you can service the requests.

What you posted is not adequate for any troubleshooting on my end. The problem as I indicated earlier may not be the stop call, but the way the client is connecting. If the client connects and doesn't send what you expect, or doesn't send anything at all, your code may not be closing the connection.

I haven't played with the scenario of the client connecting, then sending nothing and breaking the connection. I'll give it a try when I have time. It should time out in the w5100, but I have not verified that.

As a followup, the UDP reference doc has been changed to reflect the correct UDP.begin return value. http://arduino.cc/en/Reference/EthernetUDPBegin

SurferTim

SurferTim: As a followup, the UDP reference doc has been changed to reflect the correct UDP.begin return value. http://arduino.cc/en/Reference/EthernetUDPBegin

Hopefully that will help others - but they will probably be misled by example sketches that do not test the return value of UDP.begin().

The web crawler at 114.79.X.X has not returned since I made my change - so I am unsure if the change I made to address their particular issue has solved the problem. If they do return I will get an answer and try to explain what I think may have been happening.

I also found a problem with invalid passwords during application logins that was losing sockets. That code is seldom executed because I generally enter the correct password and not many hackers and web crawlers try to break in using random passwords.

Since my last post I have made quite a few other tidy ups to my application's Ethernet functionality code. So at this stage I just need to wait to see if the application continues to suffer lost sockets by letting it run for several weeks.

Thanks for your help.

I will post again in a few weeks, whether I am having no more problems or am continuing to experience lost sockets.

Catweazle NZ
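
A socket lost on a rarely used path, such as the invalid-password case mentioned above, is the kind of leak a scope guard can rule out: the cleanup runs on every return, including early ones. The sketch below assumes a client type with available(), read(), flush() and stop(), as EthernetClient has. ClientCloser and handleLogin are hypothetical names, not code from the application.

```cpp
// Scope guard: when it goes out of scope, drain unread input and stop the
// client, so every return path releases the socket.
template <typename Client>
class ClientCloser {
 public:
  explicit ClientCloser(Client &client) : client_(client) {}
  ~ClientCloser() {
    while (client_.available()) client_.read();  // discard remaining input
    client_.flush();
    client_.stop();
  }

 private:
  Client &client_;
  ClientCloser(const ClientCloser &);             // non-copyable
  ClientCloser &operator=(const ClientCloser &);
};

// Hypothetical request handler with an early return on a bad password.
template <typename Client>
bool handleLogin(Client &client, bool passwordOk) {
  ClientCloser<Client> closer(client);
  if (!passwordOk) return false;  // the socket is still released here
  // ... send the logged-in page ...
  return true;
}
```

In the application described in this thread, the destructor could call the existing EthernetClientStop() procedure instead, so that G_EthernetClientActive and G_EthernetClientTimeout are reset as well.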

The web crawler at 114.79.X.X has not returned since I made my change - so I am unsure if the change I made to address their particular issue has solved the problem.

Just a side note, web crawlers seem to visit IP addresses that have been publicly published. If your current server link is different from what it was in the past, you may not have another crawler visit to the new link.

zoomkat: Just a side note, web crawlers seem to visit IP addresses that have been publicly published. If your current server link is different from what it was in the past, you may not have another crawler visit to the new link.

That is my experience. An analysis of the pages that are being visited (I presume 90% by web crawlers) indicates that most only visit the home page dashboard at www.2wg.co.nz and go no further. (oops - more web crawlers coming.) Before I purchased the web site name I was just referring to the web site by its IP address http://219.88.69.69/ (oops more crawlers) and after several months I still get accesses to the web site by the IP address.

I designed my system to mostly use random page numbers that change dynamically every day. I thought that I would outsmart the web crawlers. But now I don't care, so I have kept constant web page numbers for some months, e.g. http://www.2wg.co.nz/19788/ accesses my Recent Climate web page. (Microsoft Bing did capture a list of some of my old web page numbers and keeps trying to access them months after the numbers became invalid web pages.)

Interestingly the web crawlers do not seem to be following the URLs within my web site. Many crawlers just go to the home page and no further. Those that are accessing other pages in many cases are typically accessing web pages that I have listed in this forum.

So the general rule seems to be: to increase your presence on the internet, don't bother creating a huge number of pages on a web site and hoping they get indexed by the web crawlers - just make sure you get your web pages listed as links on other web sites. How many extra accesses am I going to get to my Recent Climate web page because of the URL listed above?

Regarding the issue you referred to: the web page URL is the same - I have changed the content of the web page returned (the HTML) and am waiting to see what happens when the particular web crawler visits that particular web page.

Cheers

Catweazle NZ