ESP32 fails to connect to ThingSpeak after random time & never connects again

The following code runs on an Adafruit ESP32 Feather that connects to the internet via my router. The ESP32 is “remote” and is accessible only via wifi.

It posts to ThingSpeak every 10 minutes and works fine for a few hours, a day or so, sometimes a few days, but then it stops posting and returns error -301 (“failed to connect”) on every attempt. It only starts posting again after a hard reboot.

I suspected heap fragmentation, but free heap is constant at 247k (after an initial quick decline from 250k) and max allocatable heap is constant at 114k from the start.

After posting stops, I can access the ESP32 via the router (and run the “server.on” commands and do updates OTA), so the ESP32 hasn’t lost wifi connectivity.

I also have an ESP8266 posting to ThingSpeak every five minutes and it has been online for months, so the problem probably isn’t with the router or ISP. And I rarely get -401 errors, which mean posting too frequently, so that’s not an issue, either.

After ESP32 posting stops, I can successfully manually post from a browser with https://api.thingspeak.com/update.json?api_key=xyz&field5=199, so it seems the problem is with the code.

I’m running the latest ThingSpeak library, ESP core, and Arduino IDE.

Would appreciate suggestions on things to try or monitor.

Code to follow, in three parts. Full code also attached.

Qslim4help.ino (13.5 KB)

#include <WiFi.h>
#include <WiFiClient.h>
#include <WebServer.h>
#include <ArduinoOTA.h>
#include <ESP_Mail_Client.h>
#include <esp_int_wdt.h> // for hard reboot
#include <esp_task_wdt.h>// ditto

#include "ThingSpeak.h"  // "always put this last in the list of includes"

WebServer server(80); // OTA and server.on
WiFiClient  client;   // TS only

//**** definitions etc ****
#define SMTP_HOST "smtp.gmail.com"
#define SMTP_PORT 465
#define AUTHOR_EMAIL "xyz@gmail.com"
#define AUTHOR_PASSWORD "abc"
SMTPSession smtp;
void smtpCallback(SMTP_Status status);
ESP_Mail_Session session;
SMTP_Message message;

const char * myWriteAPIKey = "efg";  // TS

const byte deltaDecreaseCM = 30;  // threshold in cm... 12" = 30.48 cm
const int distAvg = 1060;	        // average distance
const unsigned long myChannelNumber = 123;  // TS

bool paused = false;
bool savedPaused;
bool intruder = false;
bool alarmSounded = false;
bool snowing = false;
bool snowTriggeredOnce = false;
bool distSaving = true;

byte reqdNumBreaks = 6;
byte lastTSalarmFlag;
byte snowFactor = 1;
byte savedSnowFactor;
byte snowCount;
byte saveDist[100];
byte saveIdx = 99;

int distCurrent;
int savedDistance;
int lastTScode = 200;
int wiFiFailsTS;

unsigned long numIntruders;  // can be very large if beam is blocked for a long time (eg. by parked car)
unsigned long alarmTriggeredTime;
unsigned long prevTSfailTime = 0;
unsigned long startSnowingTime;
unsigned long firstSnowTrigger;
unsigned long pauseStartTime;
unsigned long pauseDuration;

//**** setup ****
void setup()
{
  Serial1.begin(115200); // TF03 default rate = 115200

  WiFi.begin();

  while (WiFi.waitForConnectResult() != WL_CONNECTED) {
    delay(5000);
    ESP.restart();
  }

  setupMail();
  
  server.on("/", handleRoot);
  server.on("/reboot", reBootMe);
  server.on("/postTS", doTSpost);
  server.on("/showTS", showTScode);
  server.onNotFound(handleNotFound);

  ArduinoOTA.begin();
  server.begin();
  ThingSpeak.begin(client);

  readTFxTimes(50);  // clear serial1 buffer
}

//***************************************************************************************
//**** loop ****
//***************************************************************************************
void loop() {

  ArduinoOTA.handle();   // this works even if posting to TS does not work
  server.handleClient(); // ditto

  unsigned long currTime = millis();

  const unsigned long writeTSinterval = 600000UL; // post to TS every 10 min (and upon sounding alarm)
  static unsigned long prevTSwriteTime = 0;
  const unsigned long maxAlertInterval = 600000UL;  // no duplicate alarms for 10 min after an alarm

  // reset pause flag if time is up
  if (paused && (currTime - pauseStartTime > pauseDuration)) {
    paused = false;
  }

  // reset alarm flag if time is up
  if (alarmSounded && (currTime - alarmTriggeredTime > maxAlertInterval)) {
    alarmSounded = false;
  }

  // read TF03 once every loop
  readTFxTimes(1);
  if (! paused && ! alarmSounded) {  // chk for intruder, but only if not paused and not w/in 10 min of an alarm
    chkForIntruder();
    if (intruder && (numIntruders == reqdNumBreaks * snowFactor)) soundAlarm();  // sound alarm if sufficient number of sequential brks
  }

  // post to thingSpeak
  if (prevTSfailTime) { // if an alarmFlag=1 write failed (posted too soon after an alarmFlag=0 post)
    if (currTime - prevTSfailTime > 20000UL) {  // try again after 20 sec (15.1 sec didn't seem to work on 1/27 when there was a collision)
      prevTSfailTime = 0;
      prevTSwriteTime = currTime;
      writeThingSpeak(1, savedDistance, savedSnowFactor, savedPaused);
      //this will only do one re-try.  If this fails again with -401 (for whatever reason)
      //it will just continue on with normal (alarmFlag=0) posts after 10 minutes.
    }
  } else if ((currTime - prevTSwriteTime > writeTSinterval) && (! intruder)) {
    prevTSwriteTime = currTime;
    writeThingSpeak(0, distCurrent, snowFactor, paused);     // zero indicates no alarmFlag
  }
}

Straight from ThingSpeak example…

//**** writeThingSpeak ****
void writeThingSpeak(byte alarmF, int distC, byte snowF, bool pausD) {

  if (WiFi.status() != WL_CONNECTED) { // should already be connected, but check again anyway
    wiFiFailsTS++;  //this has never been > 1
    while (WiFi.status() != WL_CONNECTED) {
      WiFi.begin();
      delay(5000);
    }
  }

  int freeHeap = ESP.getFreeHeap();
  int maxAllocatable = ESP.getMaxAllocHeap();

  ThingSpeak.setField(1, distC);
  ThingSpeak.setField(2, alarmF);  // 0 = no intruder; 1 = intruder; 4 = manual test post
  ThingSpeak.setField(3, snowF);   // 1 = no snow; other = snowing
  ThingSpeak.setField(4, pausD);
  ThingSpeak.setField(5, lastTScode);
  ThingSpeak.setField(6, freeHeap);
  ThingSpeak.setField(7, maxAllocatable);
  ThingSpeak.setField(8, wiFiFailsTS);

  lastTScode = ThingSpeak.writeFields(myChannelNumber, myWriteAPIKey);

  readTFxTimes(50); // in case the above takes "a while".  100 = about one second of reads, so 50 is about half a second

  /*
    https://github.com/mathworks/thingspeak-arduino
    Return Codes
    Value 	Meaning
    200 	OK / Success
    404 	Incorrect API key (or invalid ThingSpeak server address)
    -101 	Value is out of range or string is too long (> 255 characters)
    -201 	Invalid field number specified
    -210 	setField() was not called before writeFields()
    -301 	Failed to connect to ThingSpeak <-------------------------------
    -302 	Unexpected failure during write to ThingSpeak
    -303 	Unable to parse response
    -304 	Timeout waiting for server to respond
    -401 	Point was not inserted (most probable cause is the rate limit of once every 15 seconds)
    0 	Other error
  */
}
//**** chkForIntruder ****
void chkForIntruder() {

  int deltaDist = distAvg - distCurrent;

  if (distSaving) { // not currently accessible (deleted the associated server.on)
    saveIdx = (saveIdx + 1) % 100;
    if (deltaDist < 0) {
      saveDist[saveIdx] = 0;
    } else {
      saveDist[saveIdx] = deltaDist;
    }
  }

  if (deltaDist > deltaDecreaseCM) { // if distance descreases more than the limit, then there's an intruder
    intruder = true;
    numIntruders++; // number of sequential breaks, actually
  } else {
    if (snowing) {
      if (millis() - startSnowingTime < 1800000UL) {
        if ((reqdNumBreaks / 2 < numIntruders) && (numIntruders < reqdNumBreaks)) snowCount++;
      } else { // time is up
        if (! snowCount) { // if snowCount == 0, reset flag and factor
          snowing = false;
          snowFactor = 1;
        } else {  // snowCount was > 0, so need to keep checking...
          startSnowingTime = millis();  // reset time, so check again later
          snowCount = 0;                // restart count for this new period
        } // end "else" (snow count > 0)
      } // end "else" (time is up)
    } else {  // end "if snowing"
      if (snowTriggeredOnce) {
        if (millis() - firstSnowTrigger > 300000UL) { // triggered once, but time expired, so re-set flag
          snowTriggeredOnce = false;
        } else if ((reqdNumBreaks / 2 < numIntruders) && (numIntruders < reqdNumBreaks)) { // triggered once, time not expired, meets criteria...set snowing flag, etc.
          startSnowingTime = millis();
          snowing = true;
          snowFactor = 4;
          snowTriggeredOnce = false;
          distSaving = false;
        } //end snowTriggeredOnce
      } else if ((reqdNumBreaks / 2 < numIntruders) && (numIntruders < reqdNumBreaks)) { // not triggered yet, but meets criteria, so set triggered once flag, etc.
        snowTriggeredOnce = true;
        firstSnowTrigger = millis();
      }  // end not triggered yet but meets criteria
    } // end "not snowing"
    intruder = false;
    numIntruders = 0;
  } // end "else" distance not decreased...so no intruder, and numIntruders reset to zero
}

//**** soundAlarm ****
void soundAlarm() {
  alarmTriggeredTime = millis();
  alarmSounded = true;
  sendMyMailNow();     //send an alert
  if (snowing && (alarmTriggeredTime - startSnowingTime < 5000)) { // snow mode began less than 5 s before this alarm
    snowing = false;
    snowFactor = 1;
  }

  writeThingSpeak(1, distCurrent, snowFactor, paused); // 1 indicates intruder

  if (lastTScode == -401) {
    prevTSfailTime = millis();
    savedDistance = distCurrent;
    savedSnowFactor = snowFactor;
    savedPaused = paused;
  }
}

//**** readTFxTimes ****
void readTFxTimes(byte numOfReads) {
  for (byte i = 0; i < numOfReads; i++) {
    while (! readTF03once()) { //read until a number is obtained
    }
  }
}

//**** readTF03once ****
bool readTF03once() {
  int check;                 // checksum
  byte uart[9];              // stores each byte of data returned by LiDAR (was int... I changed to byte)
  const byte HEADER = 0x59;  // data package frame header...the letter "Y" in ASCII  (was int... I changed to byte)

  if (Serial1.available()) { 					    //check whether the serial port has data input
    if (Serial1.read() == HEADER) { 			// determine data package frame header = 0x59
      uart[0] = HEADER;
      if (Serial1.read() == HEADER) { 		// determine data package frame header = 0x59
        uart[1] = HEADER;
        for (byte i = 2; i < 9; i++) {  	// store rest of data to array
          uart[i] = Serial1.read();
        }

        check = uart[0] + uart[1] + uart[2] + uart[3] + uart[4] + uart[5] + uart[6] + uart[7];

        if (uart[8] == (check & 0xff)) {   // check the received data as per protocols    0xff = 0b11111111
          // "& 0xff" keeps only the low 8 bits of the sum, which is all the one-byte checksum can hold.
          distCurrent = uart[2] + uart[3] * 256; // calculate distance value
          return true; //got a reading
        }
      }
    }
  }
  distCurrent = 0;
  return false; //didn't get a reading
}

void handleRoot() {

  if (server.arg("pause") != "") { // i.e., user entered ...?pause=(a number)
    long pauseMinutes = server.arg("pause").toInt(); // in minutes; signed, so a negative entry can be detected
    paused = true;
    pauseStartTime = millis();

    if (pauseMinutes <= 0) {  // if zero or negative, do nothing
      paused = false;
      pauseMinutes = 0;
    } else {
      if (pauseMinutes > 1200) pauseMinutes = 1200; // if large, limit to 1200 minutes = 20 hours
      intruder = false; // so posting to TS continues during pause
      numIntruders = 0;
    }
    pauseDuration = (unsigned long) pauseMinutes * 60000UL; // convert minutes to milliseconds
    server.send(200, "text/plain", "pausing");

  } else { // not break or pause
    server.send(200, "text/plain", "ESP32 eye .151");
  }
}

void reBootMe() { // run with /reboot
  // see e32hardReset in test_espB folder for basis of this
  server.send(200, "text/plain", "reboot in 2");
  delay(2000);
  esp_task_wdt_init(1, true);
  esp_task_wdt_add(NULL);
  while (true);
}

void doTSpost() { // run with /postTS
  server.send(200, "text/plain", "posting a 2 to TS");
  writeThingSpeak(2, distCurrent, snowFactor, paused);
}

void showTScode() { // run with /showTS
   char myCstr[15];
   snprintf(myCstr, 15, "TScode=%d", lastTScode);
   server.send(200, "text/plain", myCstr);
}

void handleNotFound() {
  server.send(404, "text/plain", "404: Not found");
}

void smtpCallback(SMTP_Status status) {
  Serial.println(status.info());

  if (status.success())
  {
    Serial.println("----------------");
    Serial.printf("Message sent success: %d\n", status.completedCount());
    Serial.printf("Message sent failed: %d\n", status.failedCount());
    Serial.println("----------------\n");
    struct tm dt;

    for (size_t i = 0; i < smtp.sendingResult.size(); i++)
    {
      SMTP_Result result = smtp.sendingResult.getItem(i);
      localtime_r(&result.timesstamp, &dt);

      Serial.printf("Message No: %d\n", i + 1);
      Serial.printf("Status: %s\n", result.completed ? "success" : "failed");
      Serial.printf("Date/Time: %d/%d/%d %d:%d:%d\n", dt.tm_year + 1900, dt.tm_mon + 1, dt.tm_mday, dt.tm_hour, dt.tm_min, dt.tm_sec);
      Serial.printf("Recipient: %s\n", result.recipients);
      Serial.printf("Subject: %s\n", result.subject);
    }
    Serial.println("----------------\n");
  }
}

void setupMail() {
  smtp.debug(0); // 0 = none
  smtp.callback(smtpCallback);

  session.server.host_name = SMTP_HOST;
  session.server.port = SMTP_PORT;
  session.login.email = AUTHOR_EMAIL;
  session.login.password = AUTHOR_PASSWORD;
  session.login.user_domain = "mydomain.net";

  message.sender.name = "ESP Mail";
  message.sender.email = AUTHOR_EMAIL;
  message.subject = "Test sending plain text Email";
  message.addRecipient("Someone", "9075194150@mms.cricketwireless.net");

  message.text.content = "This is simple plain text message";
  message.text.charSet = "us-ascii";
  message.text.transfer_encoding = Content_Transfer_Encoding::enc_7bit;
  message.priority = esp_mail_smtp_priority::esp_mail_smtp_priority_normal;
  message.response.notify = esp_mail_smtp_notify_success | esp_mail_smtp_notify_failure | esp_mail_smtp_notify_delay;
  message.addHeader("Message-ID: <abcde.fghij@gmail.com>");
}

void sendMyMailNow() {
  if (!smtp.connect(&session)) {
    Serial.println("failed to connect to SMTP session");
    return;
  } else if (!MailClient.sendMail(&smtp, &message)) { /* Start sending Email and close the session */
    //Serial.println("Error sending Email, " + smtp.errorReason());
  }
}

My ESP32 WiFi connect routine:

void connectToWiFi()
{
  int TryCount = 0;
  //log_i( "connect to wifi" );
  while ( WiFi.status() != WL_CONNECTED )
  {
    TryCount++;
    WiFi.disconnect();
    WiFi.begin( SSID, PWD );
    vTaskDelay( 4000 );
    if ( TryCount == 10 )
    {
      ESP.restart();
    }
  }
  WiFi.onEvent( WiFiEvent );
  GetTheTime();
  printLocalTime();
} // void connectToWiFi()

Note that WiFi.disconnect() is issued before each connection attempt. WiFi.disconnect() does two things: it disconnects from WiFi, if connected, and it resets the WiFi memory stack to default values.

I use MQTT, and one of the tasks my ESP32s run is an MQTT watchdog (MQTTwdt):

////
void fmqttWatchDog( void * paramater )
{
  int maxNonMQTTresponse = 5;
  for (;;)
  {
    xEventGroupWaitBits (eg, evtDoMQTTwd, pdTRUE, pdTRUE, portMAX_DELAY );
    xSemaphoreTake( sema_mqttOK, portMAX_DELAY );
    mqttOK++;
    xSemaphoreGive( sema_mqttOK );
    if ( mqttOK >= maxNonMQTTresponse )
    {
      ESP.restart();
    }
  }
  vTaskDelete( NULL );
} //void fmqttWatchDog( void * paramater )
////

The ESP32 is supposed to receive regular publications from the broker. As long as the publications continue, they reset the count to 0 and the ESP32 does not reset. If publications stop, the ESP32 resets. That way, if I update the server the MQTT broker runs on, the ESP32s reconnect to MQTT with a new token when the broker comes back, instead of trying to connect with an old token. It saves me the time of going around resetting the ESP32s after doing server work.
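The counting idea can be tried on its own, away from FreeRTOS. Here is a minimal sketch of it in plain C++; the class name `MissedMessageWatchdog` and its methods are made up for illustration, and it uses `std::atomic` in place of the semaphore in the task above. On the ESP32 the caller would invoke `ESP.restart()` when `tick()` returns true.

```cpp
#include <atomic>

// Hypothetical helper (not from the task above): counts watchdog periods
// that have passed since the last publication arrived from the broker.
class MissedMessageWatchdog {
public:
  explicit MissedMessageWatchdog(int maxMissed) : maxMissed_(maxMissed) {}

  // Call from the message-received callback: a publication arrived,
  // so the count starts over.
  void messageReceived() { missed_.store(0); }

  // Call once per watchdog period. Returns true when the count has
  // reached the limit and the caller should restart the device.
  bool tick() { return missed_.fetch_add(1) + 1 >= maxMissed_; }

  // Current count, for logging.
  int missed() const { return missed_.load(); }

private:
  const int maxMissed_;
  std::atomic<int> missed_{0};
};
```

With a limit of 5, as in the task above, the fifth period in a row without a publication is the one that triggers the restart.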

Might be an idea you can use.

Oh yeah.

Checking for a WiFi connection can be confusing:

void MQTTkeepalive( void *pvParameters )
{
  sema_MQTT_KeepAlive   = xSemaphoreCreateBinary();
  xSemaphoreGive( sema_MQTT_KeepAlive ); // found keep alive can mess with a publish, stop keep alive during publish
  // setting must be set before a mqtt connection is made
  MQTTclient.setKeepAlive( 90 ); // setting keep alive to 90 seconds makes for a very reliable connection, must be set before the 1st connection is made.
  for (;;)
  {
    // check both the client connection and the WiFi status; checking both is more reliable than a single check
    if ( (wifiClient.connected()) && (WiFi.status() == WL_CONNECTED) )
    {
      xSemaphoreTake( sema_MQTT_KeepAlive, portMAX_DELAY ); // while MQTTclient.loop() is running, no other MQTT operations should be in progress
      MQTTclient.loop();
      xSemaphoreGive( sema_MQTT_KeepAlive );
    }
    else {
      log_i( "MQTT keep alive found MQTT status %d WiFi status %d", (int) wifiClient.connected(), (int) WiFi.status() );
      if ( !(wifiClient.connected()) || !(WiFi.status() == WL_CONNECTED) )
      {
        connectToWiFi();
      }
      connectToMQTT();
    }
    vTaskDelay( 250 ); //task runs approx every 250 mS
  }
  vTaskDelete ( NULL );
}

Note the WiFi connection test of if ( (wifiClient.connected()) && (WiFi.status() == WL_CONNECTED) ).

wifiClient.connected() and WiFi.status() can disagree, one reporting a good connection and the other not, and the WiFi code can still act as if the connection is good when it isn't.

I physically mount the ESP32s with the antenna pointed up and the circuit board cut away from below and above the antenna.

Add a new tab to your project and call it secrets.h or certs.h or whatever you want. Put your secrets in that file, include it in your sketch, and reference your secrets from it. That way, when you post your code, you're not trying to remember to change your secret things.

Thank you for the suggestions.

I’ll add WiFi.disconnect() to my re-establish wifi connection code, but since I’ve never had (WiFi.status() != WL_CONNECTED) be true, I’m not sure it will do anything. Maybe it will if I require both ( (wifiClient.connected()) && (WiFi.status() == WL_CONNECTED) ) prior to posting. I’ll try that.

I bet wifiClient.connected() is false when this happens…but why…and why do ThingSpeak’s connect attempts fail? Wifi is definitely still connected. This is where the failure occurs (in the TS lib): connectSuccess = client->connect(const_cast<char *>(THINGSPEAK_URL), this->port);
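One thing that could be monitored here: connecting by hostname involves a DNS lookup followed by a TCP connect, and -301 does not say which of the two failed. Below is a small sketch of a probe that separates the steps. The names `ConnectStage` and `probeConnect` are hypothetical, and the ESP32 wiring in the trailing comment is untested and assumes plain HTTP on port 80.

```cpp
#include <functional>

// Which step of a connect-by-hostname attempt failed (hypothetical names).
enum class ConnectStage { Ok, DnsFailed, TcpFailed };

// resolve: returns true if the hostname resolved to an IP address.
// connect: returns true if a TCP connection to that address opened.
// connect is only attempted when resolve succeeded.
ConnectStage probeConnect(const std::function<bool()>& resolve,
                          const std::function<bool()>& connect) {
  if (!resolve()) return ConnectStage::DnsFailed;
  if (!connect()) return ConnectStage::TcpFailed;
  return ConnectStage::Ok;
}

// On the ESP32 the two steps might be wired up roughly like this (untested):
//   IPAddress ip;
//   ConnectStage s = probeConnect(
//     [&] { return WiFi.hostByName("api.thingspeak.com", ip) == 1; },
//     [&] { return client.connect(ip, 80) == 1; });
//   client.stop();  // close the probe connection again
```

Posting the resulting stage to a spare field, or showing it via a server.on handler, would at least narrow down whether name resolution or the TCP connection is what stops working.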

And if I get, say, two “failed to connect” responses in a row, then I’ll also try doing a wifi disconnect and reconnect, or ESP.restart(), or hard reset. I’m sure the hard reset will work…it’s what I do manually (via my server.on “/reboot”) when this problem occurs. Not sure about the first two. Will try them.
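That escalation could be kept out of the posting code with a small counter. This is only a sketch of one possible policy: the names `Recovery` and `FailureEscalator` and the thresholds are made up for illustration, and whether a WiFi reconnect or ESP.restart() actually clears the fault is still to be tested.

```cpp
// Hypothetical recovery policy: escalate as consecutive -301 results pile up.
enum class Recovery { None, ReconnectWiFi, Restart };

class FailureEscalator {
public:
  FailureEscalator(int reconnectAfter, int restartAfter)
      : reconnectAfter_(reconnectAfter), restartAfter_(restartAfter) {}

  // Feed in the value returned by ThingSpeak.writeFields().
  Recovery onResult(int tsCode) {
    if (tsCode != kFailedToConnect) {  // any other result ends the streak
      failures_ = 0;
      return Recovery::None;
    }
    ++failures_;
    if (failures_ >= restartAfter_) return Recovery::Restart;
    if (failures_ >= reconnectAfter_) return Recovery::ReconnectWiFi;
    return Recovery::None;
  }

private:
  static constexpr int kFailedToConnect = -301;  // "Failed to connect to ThingSpeak"
  int reconnectAfter_;
  int restartAfter_;
  int failures_ = 0;
};
```

With `FailureEscalator escalator(2, 4);`, the second -301 in a row asks for a WiFi disconnect and reconnect, the fourth asks for a restart, and any other return code resets the count.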

But since my other ESP never has this problem posting to ThingSpeak, it seems like I’m missing a fundamental problem.

Well, I looked into the ThingSpeak library. The call to “ThingSpeak.writeFields()” initiates a client connection, writes the data, does what is essentially a flush, and then a stop. So client.connected() will always be false, before and after a post. The following little code demonstrates that fact.

So there’s no point in checking client.connected() and resetting the ESP if it is false prior to each post.

Still stumped as to the reason why the ESP32 fails to connect to ThingSpeak after some random time and then can never connect again.

#include <WiFi.h>
#include "secrets.h"
#include "ThingSpeak.h"
WiFiClient  client;

unsigned long myChannelNumber = SECRET_CH_ID;
const char * myWriteAPIKey = SECRET_WRITE_APIKEY;

void setup()
{
  Serial.begin(115200);

  WiFi.begin();
  while (WiFi.waitForConnectResult() != WL_CONNECTED) {
    delay(5000); ESP.restart();
  }

  ThingSpeak.begin(client);
  
  Serial.println(WiFi.status()); // = 3 if connected
  Serial.println(client.connected()); 
  
  ThingSpeak.setField(8, 1);
  int lastTScode = ThingSpeak.writeFields(myChannelNumber, myWriteAPIKey);
  Serial.println(lastTScode);
  
  Serial.println(WiFi.status());
  Serial.println(client.connected());
}

void loop() {
}

Lots of discussion about this issue here: Arduino ESP32 stops posting to ThingSpeak after random times (error -301) - MATLAB Answers - MATLAB Central

The bottom line: looks like a bug in the ESP32 core library WiFiGeneric.cpp which was resolved in September 2020, but the "stable release" (that I assume most people use, like me) doesn't have that fix. An update to the core is expected "any day now" per conversation on Gitter; it will include that bug fix and many others.
