Socket.cpp - Is This A Bug? It Sure Ain't a Feature...

I’ve just spent a frustrating day figuring out why my app sometimes just goes out to lunch. I’ve tracked it down to the following, in socket.cpp:

uint16_t send(SOCKET s, const uint8_t * buf, uint16_t len)
{
  uint8_t status=0;
  uint16_t ret=0;
  uint16_t freesize=0;

  if (len > W5100.SSIZE) 
    ret = W5100.SSIZE; // check size not to exceed MAX size.
  else 
    ret = len;

  // if freebuf is available, start.
  do 
  {
    SPI.beginTransaction(SPI_ETHERNET_SETTINGS);
    freesize = W5100.getTXFreeSize(s);
    status = W5100.readSnSR(s);
    SPI.endTransaction();
    if ((status != SnSR::ESTABLISHED) && (status != SnSR::CLOSE_WAIT))
    {
      ret = 0; 
      break;
    }
    yield();
  } 
  while (freesize < ret);

  // copy data
  SPI.beginTransaction(SPI_ETHERNET_SETTINGS);
  W5100.send_data_processing(s, (uint8_t *)buf, ret);
  W5100.execCmdSn(s, Sock_SEND);

  /* +2008.01 bj */
  while ( (W5100.readSnIR(s) & SnIR::SEND_OK) != SnIR::SEND_OK ) 
  {
    /* m2008.01 [bj] : reduce code */
    if ( W5100.readSnSR(s) == SnSR::CLOSED )
    {
      SPI.endTransaction();
      close(s);
      return 0;
    }
    SPI.endTransaction();
    yield();
    SPI.beginTransaction(SPI_ETHERNET_SETTINGS);
  }

  /* +2008.01 bj */
  W5100.writeSnIR(s, SnIR::SEND_OK);
  SPI.endTransaction();
  return ret;
}

The problem is the second while:

  while ( (W5100.readSnIR(s) & SnIR::SEND_OK) != SnIR::SEND_OK )

If the connection is lost for any reason once execution reaches this point, the loop never exits and everything stops dead in its tracks. The only ways out are SEND_OK being set or the inner check seeing SnSR::CLOSED, so a socket that ends up in any other state spins here forever. I’m thinking the fix might be to add a test to the while condition to make sure the connection state is still ESTABLISHED.

Does that sound right? Alternatively, I could add a timeout of, say, one second before aborting.
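
For what it's worth, here is a rough sketch of how the status check and the timeout idea could be combined in one wait loop. It is not the library's code: waitForSendOk, SendWait and the three callables are names made up for illustration, and the numeric constants are what I believe w5100.h uses for SnSR::ESTABLISHED, SnSR::CLOSE_WAIT and SnIR::SEND_OK, so check them against your copy. The SPI transactions and yield() calls are left out to keep it short.

```cpp
#include <cstdint>

// Hypothetical result codes for this sketch (not part of the Ethernet library).
enum class SendWait { Ok, ConnectionLost, TimedOut };

// Assumed register values; verify against SnSR:: and SnIR:: in w5100.h.
constexpr uint8_t SR_ESTABLISHED = 0x17;
constexpr uint8_t SR_CLOSE_WAIT  = 0x1C;
constexpr uint8_t IR_SEND_OK     = 0x10;

// Waits for SEND_OK, but gives up if the socket leaves a sendable state
// or if timeoutMs elapses. The callables stand in for W5100.readSnIR(s),
// W5100.readSnSR(s) and millis() so the loop can be exercised off-target.
template <typename ReadIR, typename ReadSR, typename NowMs>
SendWait waitForSendOk(ReadIR readIR, ReadSR readSR, NowMs nowMs,
                       uint32_t timeoutMs)
{
  const uint32_t start = static_cast<uint32_t>(nowMs());

  while ((readIR() & IR_SEND_OK) != IR_SEND_OK) {
    // Read the status once per pass so both comparisons see the same value.
    const uint8_t status = readSR();
    if (status != SR_ESTABLISHED && status != SR_CLOSE_WAIT) {
      return SendWait::ConnectionLost;
    }
    // Unsigned subtraction keeps working when the millisecond counter wraps.
    if (static_cast<uint32_t>(nowMs()) - start >= timeoutMs) {
      return SendWait::TimedOut;
    }
  }
  return SendWait::Ok;
}
```

Inside send(), the three callables would be small lambdas wrapping W5100.readSnIR(s), W5100.readSnSR(s) and millis(), and any result other than SendWait::Ok would end the SPI transaction and return 0. Whether the caller should also close(s) in those cases is a judgment call this sketch does not settle.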

Regards,
Ray L.

I think a better fix is to change the condition of the inner if from:

if ( W5100.readSnSR(s) == SnSR::CLOSED )

to

if ( (W5100.readSnSR(s) != SnSR::ESTABLISHED ) && (W5100.readSnSR(s) != SnSR::CLOSE_WAIT ) )

This appears to resolve the problem I'm seeing.
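
One possible refinement: the condition above reads the status register twice over SPI, so the two comparisons could in principle see different values if the status changes between reads. A minimal sketch that tests a single reading follows. The helper name canStillSend is made up for illustration, and the constants are the values I believe w5100.h assigns to SnSR::ESTABLISHED and SnSR::CLOSE_WAIT, so treat them as assumptions.

```cpp
#include <cstdint>

// Assumed values of SnSR::ESTABLISHED and SnSR::CLOSE_WAIT; verify in w5100.h.
constexpr uint8_t kEstablished = 0x17;
constexpr uint8_t kCloseWait   = 0x1C;

// Hypothetical helper: true while the socket is in a state where the
// pending send can still complete, judged from one status reading.
inline bool canStillSend(uint8_t status)
{
  return status == kEstablished || status == kCloseWait;
}
```

In the loop this would be used as: read W5100.readSnSR(s) once into a local uint8_t, then take the early-exit branch when canStillSend(status) is false.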

Regards, Ray L.