Ethernet UDP hang

Hello,
I'm going mad trying to solve a random hang with an Arduino Mega and Ethernet shield...
I tried the latest Ethernet library, and I tried disabling interrupts around every Ethernet library call.
I even tried a watchdog, but when the Arduino hangs the watchdog doesn't reboot it. Sometimes it is also impossible to upload anything and I need to remove power.

A message arrives every 10 ms, and the loop never takes longer than that; it usually takes 7 to 8 ms.

The code is more or less like this....
Any clue?

Thanks!

#include <SPI.h>
#include <Ethernet.h>
#include <EthernetUdp.h>
#include <util.h>
#include <avr/wdt.h>
//
//Other declaration stuff
//
void setup()  
{ //--- Create an interrupt to update a multiplexed display
  //
  TIMSK2 &= ~(1<<TOIE2);
  TCCR2A &= ~((1<<WGM21) | (1<<WGM20));
  TCCR2B &= ~(1<<WGM22);
  ASSR &= ~(1<<AS2);
  TIMSK2 &= ~(1<<OCIE2A);
  TCCR2B |= (1<<CS22)  | (1<<CS20);
  TCCR2B &= ~(1<<CS21);
  TCNT2 = 0; //tcnt2;
  TIMSK2 |= (1<<TOIE2);  
  //---
  Ethernet.begin(mac,ip);
  Udp.begin(localPort);  
  pinMode(4,OUTPUT);
  digitalWrite(4,HIGH);
}
//--------------------------------------------------
ISR(TIMER2_OVF_vect) {
  iCount++;
  if (iCount == 5) { 
    LCD.update();
    iCount = 0;
  }
}
//--------------------------------------------------
char packetBuffer[UDP_TX_PACKET_MAX_SIZE];
//-----------------------------------------
void loop()  
{     
  _dataIn in;
  _dataOut out;
 
  cli();  
  int packetSize = Udp.parsePacket();   // The packet sent is 1 byte long, but packetSize is usually 10, 11 or 12?
  if (packetSize) {
    Udp.read(packetBuffer,UDP_TX_PACKET_MAX_SIZE);
    if (Udp.remotePort() == remotePort) {   // Sometimes the remotePort is different... I'm not sure why.
      memcpy((byte *)&in, packetBuffer, sizeof(_dataIn));
      //Assign incoming value to local variable
      //Prepare the output struct
      Udp.beginPacket(remoteIp, remotePort);
      Udp.write((byte *)&out, sizeof(_dataOut));
      Udp.endPacket();
     }
  }
  sei();
  //
  //Do thing with the incoming values
  //
 }

Just looking at it,
the code will not even compile; take a look at the memcpy line.
Why should people waste their time, if you cannot take the time to supply a valid test case?

HC

The code is more or less like this....

More or less most people will ignore those that won't post the actual code that is causing a problem.

Well, I was trying to make things as simple as possible; that's why I removed everything not related to Ethernet.
Anyway... digging into the problem, I realized that it gets stuck in a loop inside sendUDP. I need to investigate in depth what's happening, but I suppose it can't send the packet and never gets a timeout.

int sendUDP(SOCKET s)
{
  W5100.execCmdSn(s, Sock_SEND);
  /* +2008.01 bj */
  while ( (W5100.readSnIR(s) & SnIR::SEND_OK) != SnIR::SEND_OK ) 
  {
    if (W5100.readSnIR(s) & SnIR::TIMEOUT)
    {
      /* +2008.01 [bj]: clear interrupt */
      W5100.writeSnIR(s, (SnIR::SEND_OK|SnIR::TIMEOUT));
      return 0;
    }
  }

  /* +2008.01 bj */	
  W5100.writeSnIR(s, SnIR::SEND_OK);
  /* Sent ok */
  return 1;
}

Have you tried applying this fix?
http://code.google.com/p/arduino/issues/detail?id=605

It affects any 16-bit register read from the W5100. It has been around long enough to pick up the nickname "the 605 bug".

Yep, actually I'm using the latest git version of the Ethernet library, which already has the patch.

OK. That takes that out of the picture.

Next, this:

    if (Udp.remotePort() == remotePort) {   // Sometimes the remotePort is different... I'm not sure why.
      memcpy((byte *)&in, packetBuffer, sizeof(_dataIn));
      //Assign incoming value to local variable
      //Prepare the output struct
      Udp.beginPacket(remoteIp, remotePort);
      Udp.write((byte *)&out, sizeof(_dataOut));
      Udp.endPacket();
     }

I presume this is the server end. Is that correct? Your "server" is responding to another device's UDP request?

What type of device is on the other end, sending the request?

The reason the remote port is different most of the time is that the client will not normally use the destination port as its own source port. All my clients use ports other than port 80 on the client end to make HTTP requests.

What is the value assigned to remotePort in your code? Maybe you should be using the port sent by the requesting device to reply to the request.

edit: If your ethernet shield is exposed to the internet, it may not be your device making some of the requests. It could be a port scanner looking for open ports to exploit.
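
The suggestion to reply to whatever port the request came from could look roughly like the sketch below. It is only an illustration, not the original poster's code: replyToSender is a hypothetical helper, written as a template so it also compiles off the board, and FakeUdp is a made-up stand-in for desktop testing. The calls it relies on (remoteIP(), remotePort(), beginPacket(), write(), endPacket()) are the ones EthernetUDP provides; the fake uses a plain 32-bit integer where the real class uses IPAddress.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: send `len` bytes back to whoever sent the packet most
// recently accepted by parsePacket(), instead of to a fixed remotePort.
// UdpT is any object offering the EthernetUDP-style calls used below.
template <typename UdpT>
bool replyToSender(UdpT& udp, const uint8_t* data, size_t len) {
  if (!udp.beginPacket(udp.remoteIP(), udp.remotePort())) {
    return false;  // could not start the outgoing packet
  }
  udp.write(data, len);
  return udp.endPacket() != 0;  // non-zero means the send was reported OK
}

// Made-up stand-in for EthernetUDP so the helper can be exercised on a PC.
// It only records where the reply was addressed and how much was written.
struct FakeUdp {
  uint32_t senderIp;
  uint16_t senderPort;
  uint32_t sentToIp;
  uint16_t sentToPort;
  size_t bytesWritten;

  FakeUdp(uint32_t ip, uint16_t port)
      : senderIp(ip), senderPort(port), sentToIp(0), sentToPort(0), bytesWritten(0) {}

  uint32_t remoteIP() const { return senderIp; }
  uint16_t remotePort() const { return senderPort; }
  int beginPacket(uint32_t ip, uint16_t port) {
    sentToIp = ip;
    sentToPort = port;
    return 1;
  }
  size_t write(const uint8_t*, size_t len) {
    bytesWritten += len;
    return len;
  }
  int endPacket() { return 1; }
};
```

On the board, the call would presumably be replyToSender(Udp, (const uint8_t *)&out, sizeof(out)) right after Udp.read(), so the answer goes to the source port of the request whatever it happens to be.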

The remote machine is a Windows machine that sends to the Arduino's port 8888; it is always the same single machine. The Arduino reads the packet, composes the answer and sends it back to the Windows machine on port 8888.

_dataIn is 1 byte in size, and I just checked that "int packetSize = Udp.parsePacket();" does not always return 1.

OK. I see. Not sure I can help you with that.

But it might help if you could be more specific about this?

_dataIn is 1 byte in size, and I just checked that "int packetSize = Udp.parsePacket();" does not always return 1.

How often is "not always"?
What does it return when it isn't 1 byte? 2? 9?

edit: I just saw the comment in the original code. Between 10 and 12. And never less than 9?

You almost got it... 10.
Here you see only the packets arriving on port 8888, because I filtered out all the others beforehand. The "167" digits are different trace points inside sendUDP.

PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 3
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 6
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 4
        Cycle time: 1
        Cycle time: 1
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 4
        Cycle time: 1
        Cycle time: 1
PacketSize: 10  Port: 8888      Cycle time: 5
        Cycle time: 1
        Cycle time: 1
        Cycle time: 1
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
        Cycle time: 1
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
        Cycle time: 1
PacketSize: 10  Port: 8888      Cycle time: 4
        Cycle time: 1
        Cycle time: 1
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
        Cycle time: 2
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 2
        Cycle time: 2
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1
        Cycle time: 1
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 7
        Cycle time: 1
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 9
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 8
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 6
        Cycle time: 3
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 6
PacketSize: 1   Port: 8888      Power: 1      Write: Ok167END Cycle time: 6
        Cycle time: 3
PacketSize: 10  Port: 8888      Cycle time: 3
        Cycle time: 1

I'm not sure where that output came from.

Have you tried the example code here?

Does that serial output help?

I don't see a "Serial.begin(9600);" in the setup function, though.

The Arduino 1.0 UDP sample doesn't have the call to available(); anyway, the available() call is embedded in parsePacket().
I will try it in my next run.

There's a Serial.begin(115200)

Edit: It seems to solve the wrong packet size... now it is always 1, except for a very occasional size of 0, as if data arrives between the parsePacket() and available() calls.
Let's see what happens after a few hours...

Edit2: Nope... the wrong packet size appears again. I think my next step is to allow only a few retries in the loop of death.
Maybe it is some timing issue: the failure rate with the prints enabled is much lower than in the version without serial output.

Edit3: SnIR::TIMEOUT is always 8 when the problem happens, and the RX light on the Ethernet shield is always blinking (http://db.tt/bgB1VQLK; in the video, 1 and 2 are stuck and 3 is running fine).
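
The "only a few retries" idea could be sketched like this. It is a minimal illustration under assumptions, not the library's code: waitForSend, the SendWait enum and the maxPolls limit are all hypothetical names, and readIR stands for any callable that returns the current Sn_IR value (on the board it would wrap W5100.readSnIR(s)). The two bit values are the SEND_OK and TIMEOUT flags that sendUDP tests.

```cpp
#include <stdint.h>

// Bit values of the two W5100 Sn_IR flags that sendUDP checks.
const uint8_t kSendOk = 0x10;
const uint8_t kTimeout = 0x08;

// Hypothetical result type for the bounded wait.
enum SendWait { SEND_DONE, SEND_TIMED_OUT, SEND_GAVE_UP };

// Poll the interrupt register at most maxPolls times.
// readIR is any callable returning the current Sn_IR byte (an assumption of
// this sketch; it keeps the function testable without the hardware).
template <typename ReadFn>
SendWait waitForSend(ReadFn readIR, unsigned long maxPolls) {
  for (unsigned long i = 0; i < maxPolls; ++i) {
    const uint8_t ir = readIR();  // read once so both tests see the same value
    if (ir & kSendOk) return SEND_DONE;
    if (ir & kTimeout) return SEND_TIMED_OUT;
  }
  return SEND_GAVE_UP;  // neither flag appeared within maxPolls reads
}
```

In sendUDP this would replace the open-ended while loop. On SEND_TIMED_OUT or SEND_GAVE_UP the caller would still clear the flags with W5100.writeSnIR(s, SnIR::SEND_OK | SnIR::TIMEOUT) and return 0, as the library code does for a timeout. A sensible value for maxPolls would have to be found by experiment.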

Inside this loop....

while ( (W5100.readSnIR(s) & SnIR::SEND_OK) != SnIR::SEND_OK )
{
  if (W5100.readSnIR(s) & SnIR::TIMEOUT)
  {

can you dump SnIR?

HC

HardCore, you mean dumping the result of W5100.readSnIR(s), no?

Same thing.
I'm asking for the register to be dumped; you're asking for a function returning the register to be dumped.

I wanted to see the bottom 2 bits of SnIR whilst this was going on.

Because UDP does not require an 'ACK', there should be absolutely no reason for it not to report SEND_OK, other than the external network cable being busy or there being no destination.
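
For dumping the register, a small decoder makes the serial output easier to read than a raw number. The sketch below is an illustration only: describeSnIR is a hypothetical helper, and the bit values are taken from the SnIR constants in the library's w5100.h (SEND_OK 0x10, TIMEOUT 0x08, RECV 0x04, DISCON 0x02, CON 0x01). Going by those values, SEND_OK and TIMEOUT sit in bits 4 and 3, so it is worth printing the whole byte rather than only the lowest bits.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: write the names of the flags set in `ir` into `out`,
// separated by spaces, or "none" if no flag is set. Output is truncated to
// fit outSize (including the terminating NUL).
void describeSnIR(uint8_t ir, char* out, size_t outSize) {
  struct Flag {
    uint8_t mask;
    const char* name;
  };
  static const Flag kFlags[] = {
      {0x10, "SEND_OK"}, {0x08, "TIMEOUT"}, {0x04, "RECV"},
      {0x02, "DISCON"},  {0x01, "CON"},
  };

  if (outSize == 0) return;
  out[0] = '\0';
  for (size_t i = 0; i < sizeof(kFlags) / sizeof(kFlags[0]); ++i) {
    if (!(ir & kFlags[i].mask)) continue;
    if (out[0] != '\0') strncat(out, " ", outSize - strlen(out) - 1);
    strncat(out, kFlags[i].name, outSize - strlen(out) - 1);
  }
  if (out[0] == '\0') strncat(out, "none", outSize - 1);
}
```

Inside the stuck loop one could then fill a buffer with describeSnIR(W5100.readSnIR(s), buf, sizeof(buf)) and print it, keeping in mind that the extra serial output changes the timing, which the thread suggests matters here.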

Here is another example of a UDP hang, this time during an NTP request. This is built on the UDPNTPClient code in Arduino 1.0. The sketch was working before code was added to service the LCD117 board and the Dallas temperature sensors. I am looking to add an NTP update of the DS1307 RTC when this gets working. Any help appreciated. The entire sketch does not fit, and a trimmed version may not help much.

[code// Compile with Arduino IDE version 1.0  Comments removed to save space
// Libraries :
#include <SPI.h>
#include <Ethernet.h>
#include <EthernetUdp.h> 
#include <DallasTemperature.h>
#include <SD.h>
#include <Time.h>
#include <DS1307RTC.h>
//#include <DS1307new.h>

#include <Wire.h>
#include <OneWire.h>
#include <SoftwareSerial.h>
#include <Ping.h>

//  Valve Position defines
#define ON  HIGH
#define OFF LOW
#define recirculate LOW
#define heating     HIGH
// This is the control pin for the valve actuator relay
const int Valve_Pin = 12;
const int SolarValve = 15;
// pin 13 LED
const int LEDpin = 13 ; // Winker LED on pin 13
//
#define txLCDPin   14
#define rx1WPin    10
#define tx1WPin     8
//
// Data wire is plugged into port 2 on the Arduino
#define ONE_WIRE_BUS 2
#define TEMPERATURE_PRECISION 9
//
// Setup a OneWire bus instance to communicate with any OneWire
// devices (not just Maxim/Dallas temperature ICs)
OneWire oneWire(ONE_WIRE_BUS);
//
// Pass our OneWire reference to Dallas Temperature Library.
DallasTemperature sensors(&oneWire);
SoftwareSerial myLCDSerial = SoftwareSerial(rx1WPin, txLCDPin);
// Arrays to hold device addresses
// uint8_t poolThermometer[8], airThermometer[8], roofThermometer[8];

// Assign address manually.  The addresses below will need to be changed
// to valid device addresses on your buss.  Device addresses can be retrieved
// by using either the sensors.search(deviceAddresses) or individually via
// sensors.getAddress(deviceAddress, index)
uint8_t poolThermometer[8] = {0x28, 0x26, 0x84, 0x8f, 0x2, 0x0, 0xc2, 0x0};
uint8_t airThermometer[8] = {0x28, 0xd1, 0xa7, 0x8f, 0x2, 0x0, 0x0, 0x83};
uint8_t roofThermometer[8] = {0x28, 0xca, 0x7b, 0x8f, 0x2, 0x0, 0x0, 0x5};

//used by Print Double routine
double x;
double y;
double z;

double LastPoolTemp;
double LastRoofTemp;
byte ValvePositionCurrent;


// Enter a MAC address for your controller below.
// Newer Ethernet shields have a MAC address printed on a sticker on the shield
byte mac[] = { 0xfE, 0xeD, 0xad, 0x06, 0xF0, 0x0D }; // feed a dog food

unsigned int localPort = 8888;      // local port to listen for UDP packets

IPAddress timeServer(192, 43, 244, 18); // time.nist.gov NTP server

const int NTP_PACKET_SIZE= 48; // NTP time stamp is in the first 48 bytes of the message

byte packetBuffer[ NTP_PACKET_SIZE]; //buffer to hold incoming and outgoing packets 

// A UDP instance to let us send and receive packets over UDP
EthernetUDP Udp;

//=========================================================================
//
//=========================================================================
void setup() 
{
  Serial.begin(9600);
  
  pinMode(LEDpin, OUTPUT);
  
  pinMode(rx1WPin, INPUT);
  pinMode(txLCDPin, OUTPUT);
  pinMode(tx1WPin, OUTPUT);
  
  // Solar Valve Actuator
  pinMode(Valve_Pin, OUTPUT);
  // Initialize valve to recirculate
  digitalWrite(SolarValve, recirculate);
 // ValvePositionCurrent = recirculate;
  
  // Start up the Temp Sensor library
  sensors.begin();
  Wire.begin();
  
  
myLCDSerial.begin(9600);
delay(100);
myLCDSerial.print("?G4x20"); // display is 4 lines of 20 char each
delay(100);
myLCDSerial.print("?B40"); // set backlight to half intensity on
delay(100);
//
//

Serial.println("Solar Pool Controler  Version 1.2  02-09-2012CJT");
delay(3000);

//
// locate devices on buss
myLCDSerial.print("?F");
Serial.print("Found ");
Serial.print(sensors.getDeviceCount(), DEC);
Serial.println(" thermal sensors.");
delay(1000);

// report parasitic power requirements
Serial.print("Parasitic power: ");
if (sensors.isParasitePowerMode()) Serial.println("ON");
else Serial.println("OFF");
delay(500);


 setSyncProvider(RTC.get);   // the function to get the time from the RTC
  if(timeStatus()!= timeSet) 
     Serial.println("Unable to sync with the RTC");
  else
     Serial.println("RTC has set the system time");      
     //
  Serial.println("Getting Ethernet IP via DHCP");
  // start Ethernet and UDP
  if (Ethernet.begin(mac) == 0) {
    Serial.println("Failed to configure Ethernet using DHCP");
    // no point in carrying on, so do nothing forevermore:
   
    for(;;)
      ;
 
  }
  
  Udp.begin(localPort);
} //  End of Setup

//===========================================================================
//                       M A I N  
//===========================================================================
void loop()
{
  sendNTPpacket(timeServer); // send an NTP packet to a time server

    // wait to see if a reply is available
  delay(1000);  
  int packetSize = Udp.parsePacket();
  if(Udp.available())
  {
    Serial.print("Received packet of size ");
    Serial.println(packetSize);
    Serial.print("From ");
    IPAddress remote = Udp.remoteIP();
    for (int i =0; i < 4; i++)
    {
      Serial.print(remote[i], DEC);
      if (i < 3)
      {
        Serial.print(".");
      }
    }
    Serial.print(", port ");
    Serial.println(Udp.remotePort());
    // We've received a packet, read the data from it
    Udp.read(packetBuffer,NTP_PACKET_SIZE);  // read the packet into the buffer

    //the timestamp starts at byte 40 of the received packet and is four bytes,
    // or two words, long. First, extract the two words:

    unsigned long highWord = word(packetBuffer[40], packetBuffer[41]);
    unsigned long lowWord = word(packetBuffer[42], packetBuffer[43]);  
    // combine the four bytes (two words) into a long integer
    // this is NTP time (seconds since Jan 1 1900):
    unsigned long secsSince1900 = highWord << 16 | lowWord;  
    Serial.print("Seconds since Jan 1 1900 = " );
    Serial.println(secsSince1900);               

    // now convert NTP time into everyday time:
    Serial.print("Unix time = ");
    // Unix time starts on Jan 1 1970. In seconds, that's 2208988800:
    const unsigned long seventyYears = 2208988800UL;     
    // subtract seventy years:
    unsigned long epoch = secsSince1900 - seventyYears;  
    // print Unix time:
    Serial.println(epoch);                               


    // print the hour, minute and second:
    Serial.print("The UTC time is ");       // UTC is the time at Greenwich Meridian (GMT)
    Serial.print((epoch  % 86400L) / 3600); // print the hour (86400 equals secs per day)
    Serial.print(':');  
    if ( ((epoch % 3600) / 60) < 10 ) {
      // In the first 10 minutes of each hour, we'll want a leading '0'
      Serial.print('0');
    }
    Serial.print((epoch  % 3600) / 60); // print the minute (3600 equals secs per hour)
    Serial.print(':'); 
    if ( (epoch % 60) < 10 ) {
      // In the first 10 seconds of each minute, we'll want a leading '0'
      Serial.print('0');
    }
    Serial.println(epoch %60); // print the second
  }
  // wait ten seconds before asking for the time again
  delay(10000); 
}

// send an NTP request to the time server at the given address 
unsigned long sendNTPpacket(IPAddress& address)
{
  // set all bytes in the buffer to 0
  memset(packetBuffer, 0, NTP_PACKET_SIZE); 
  // Initialize values needed to form NTP request
  // (see URL above for details on the packets)
  packetBuffer[0] = 0b11100011;   // LI, Version, Mode
  packetBuffer[1] = 0;     // Stratum, or type of clock
  packetBuffer[2] = 6;     // Polling Interval
  packetBuffer[3] = 0xEC;  // Peer Clock Precision
  // 8 bytes of zero for Root Delay & Root Dispersion
  packetBuffer[12]  = 49; 
  packetBuffer[13]  = 0x4E;
  packetBuffer[14]  = 49;
  packetBuffer[15]  = 52;

  // all NTP fields have been given values, now
  // you can send a packet requesting a timestamp: 		   
  Udp.beginPacket(address, 123); //NTP requests are to port 123
  Udp.write(packetBuffer,NTP_PACKET_SIZE);
  Udp.endPacket(); 
}










//==========================================================================
//            Service Routines
//==========================================================================
//==============================================================================================

// function to print a device address

void printAddress(uint8_t deviceAddress[])
{
  for (uint8_t i = 0; i < 8; i++)
  {
    Serial.print(deviceAddress[i], HEX);
    if (i < 7) Serial.print(" "); //space at end
  }
}
]
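
The timestamp handling in loop() above can be checked away from the hardware. The sketch below is a desktop illustration of the same arithmetic, with hypothetical names (ntpSecondsSince1900, ntpToUnix, utcTimeOfDay, TimeOfDay): it reads the four bytes starting at offset 40 of the 48-byte reply as a big-endian count of seconds since 1900, subtracts the 2208988800 seconds between 1900 and 1970, and splits the result into hour, minute and second.

```cpp
#include <stdint.h>

// Seconds between 1 Jan 1900 (NTP epoch) and 1 Jan 1970 (Unix epoch).
const uint32_t kSeventyYears = 2208988800UL;

// Read the 32-bit big-endian seconds field at bytes 40..43 of the reply.
// `packet` must point at a full 48-byte NTP message.
uint32_t ntpSecondsSince1900(const uint8_t* packet) {
  return ((uint32_t)packet[40] << 24) | ((uint32_t)packet[41] << 16) |
         ((uint32_t)packet[42] << 8) | (uint32_t)packet[43];
}

// Convert NTP seconds to Unix time.
uint32_t ntpToUnix(uint32_t secsSince1900) {
  return secsSince1900 - kSeventyYears;
}

// Hypothetical holder for the UTC time of day.
struct TimeOfDay {
  uint8_t hour;
  uint8_t minute;
  uint8_t second;
};

// Split Unix time into hour, minute and second (UTC).
TimeOfDay utcTimeOfDay(uint32_t epoch) {
  TimeOfDay t;
  t.hour = (uint8_t)((epoch % 86400UL) / 3600UL);  // 86400 seconds per day
  t.minute = (uint8_t)((epoch % 3600UL) / 60UL);   // 3600 seconds per hour
  t.second = (uint8_t)(epoch % 60UL);
  return t;
}
```

Keeping this part in small functions means it can be tested with a hand-made buffer, which leaves only the UDP send and receive to debug on the board.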

sketch_feb13a.cpp:11:31: error: DallasTemperature.h: No such file or directory
sketch_feb13a.cpp:13:18: error: Time.h: No such file or directory
sketch_feb13a.cpp:14:23: error: DS1307RTC.h: No such file or directory
sketch_feb13a.cpp:18:21: error: OneWire.h: No such file or directory
sketch_feb13a.cpp:20:18: error: Ping.h: No such file or directory
sketch_feb13a:-1: error: 'IPAddress' was not declared in this scope
sketch_feb13a:-1: error: 'address' was not declared in this scope
sketch_feb13a:0: error: expected unqualified-id before '[' token
In file included from sketch_feb13a.cpp:8:
/Volumes/NAS/arduno/Arduino.app/Contents/Resources/Java/libraries/SPI/SPI.h:53: error: 'SPIClass' does not name a type
/Volumes/NAS/arduno/Arduino.app/Contents/Resources/Java/libraries/SPI/SPI.h:55: error: 'SPIClass' has not been declared
/Volumes/NAS/arduno/Arduino.app/Contents/Resources/Java/libraries/SPI/SPI.h:64: error: 'SPIClass' has not been declared
/Volumes/NAS/arduno/Arduino.app/Contents/Resources/Java/libraries/SPI/SPI.h:68: error: 'SPIClass' has not been declared
sketch_feb13a:37: error: 'OneWire' does not name a type
sketch_feb13a:40: error: 'DallasTemperature' does not name a type
sketch_feb13a.cpp: In function 'void setup()':
sketch_feb13a:98: error: 'sensors' was not declared in this scope
sketch_feb13a:129: error: 'RTC' was not declared in this scope
sketch_feb13a:129: error: 'setSyncProvider' was not declared in this scope
sketch_feb13a:130: error: 'timeStatus' was not declared in this scope
sketch_feb13a:130: error: 'timeSet' was not declared in this scope
sketch_feb13a.cpp: In function 'void loop()':
sketch_feb13a:154: error: 'sendNTPpacket' cannot be used as a function
sketch_feb13a.cpp: In function 'long unsigned int sendNTPpacket(IPAddress&)':
sketch_feb13a:220: error: 'long unsigned int sendNTPpacket(IPAddress&)' redeclared as different kind of symbol
sketch_feb13a:-1: error: previous declaration of 'long unsigned int sendNTPpacket'
sketch_feb13a.cpp: At global scope:
sketch_feb13a:267: error: expected unqualified-id before ']' token

Perhaps you can strip all the crap out... and give a succinct example.

@ctaylor1: It appears that code is using the digital pins that the ethernet shield uses.
You cannot use digital pins 4 or 10 (or digital pins 11-13 on the Uno). Those pins are used for the ethernet shield SPI interface.
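
A quick way to check a sketch's pin assignments against that list is sketched below. The helper names (isShieldPinOnUno, countShieldPinConflicts) are hypothetical, and the reserved set is just the one stated above for the Uno: pins 4 and 10, plus 11 to 13 for SPI.

```cpp
#include <stddef.h>
#include <stdint.h>

// True if `pin` is one of the pins named above as taken by the Ethernet
// shield on an Uno: 4, 10, and 11 to 13.
bool isShieldPinOnUno(uint8_t pin) {
  return pin == 4 || (pin >= 10 && pin <= 13);
}

// Count how many of the sketch's own pins collide with the shield's pins.
size_t countShieldPinConflicts(const uint8_t* pins, size_t count) {
  size_t conflicts = 0;
  for (size_t i = 0; i < count; ++i) {
    if (isShieldPinOnUno(pins[i])) ++conflicts;
  }
  return conflicts;
}
```

Fed with the pins assigned in the posted sketch (Valve_Pin 12, SolarValve 15, LEDpin 13, txLCDPin 14, rx1WPin 10, tx1WPin 8, ONE_WIRE_BUS 2), this reports three collisions: 12, 13 and 10.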

I've not read this thread fully so this may well be unrelated, but have you looked at bug 669? (http://code.google.com/p/arduino/issues/detail?id=669)

There are circumstances where the UDP library gets out of sync with the Ethernet chip and starts returning corrupted data.

Dylan

Hmm... sounds interesting.
Just trying the bug 669 fix...

Edit: It has been running all day and is still going (on 6 different Arduinos)... sounds good!