ESP8266WiFi - Connect Fails.... Sometimes....

I have a fairly large, complex 8266 application that implements a custom mesh-like network that is basically a tree-like structure of nodes. There is a top-level AP+STA node that connects to my TP-Link router. That node has several other 8266s, also in AP+STA mode, connected to it. Each of those nodes has a number of 8266 STA-only nodes connected to it. ALL nodes are running the exact same firmware, the only differences being the SSIDs and passwords they use, and their IP addresses. Hard (static) IPs are used everywhere except for the connection to the TP-Link network, and the only communication between the nodes is via UDP packets that are sent between directly-connected nodes.

This all works just great, except for getting the initial STA connections BETWEEN 8266s going. The top-level node always connects to the TP-Link just fine. The others generally connect only if all the nodes are started in the correct order. If I reset the nodes all at once, or first the top-level node, then the two intermediary nodes that connect to it, then the lowest-level nodes, all is well. But, if I then reset any of the intermediary or lowest-level nodes, they will, in most cases, never re-connect, despite all having auto-connect and auto-reconnect enabled. It acts like each node is only allowed to connect one time, and never again, even after resetting the AP to which a node is trying to connect. What's really odd is the AP always works fine for OTHER devices. I can always connect my phone to any of the 8266 APs. Also, every once in a while, a single message will slip through, as though the connection is made for a split-second, then closed again.

Another thing: I NEVER see a status of WL_CONNECTED, no matter what the state of the connection is - whether the connection is up or down, status() ALWAYS returns any state BUT WL_CONNECTED. Could I be using a bad library? I'm using the one installed by board manager, for IDE v1.8.4. I also sometimes see communications in only one direction. For instance, I can sometimes send messages from an AP to a STA, but not the other way around.

What is the trick to making these things actually be able to disconnect and re-connect?

Here is the code used to initialize all the nodes. InitializeAP() is used to initialize those nodes that are both AP+STA, and InitializeTimer() is used to initialize the STA-only nodes. Again, this always works perfectly on the first try when the 8266s are reset in the correct order, so there seems to be no issue with the init sequence, or the values being passed. There must be something I'm missing in the initialization. I am currently running on NODEMCUs, but the behavior is exactly the same running on ESP-01s.

boolean TimerWiFiDeviceAPI::InitializeAP(TimerModes mode,
	char *apssid, char *appassword, IPAddress apip, IPAddress gatewayip, uint8_t apchannel,
	char *stassid, char *stapassword, IPAddress staip,
	uint16_t port)
{
	boolean success = false;

	ConsolePort->printf("Initializing %s AP for Network %s on Channel %d...\n", DeviceName, TimerAPSSID, TimerAPChannel);

	while (!success)
	{
		// Set Mode
		if (!(success = WiFi.mode(WiFiMode_t::WIFI_AP_STA)))
			continue;

		// Configure STA
		if (!(success = WiFi.config(staip, gatewayip, Netmask, gatewayip)))
			continue;

		// Configure AP
		if (!(success = WiFi.softAPConfig(TimerAPIP, GatewayIP, Netmask)))
			continue;

		if (!(success = WiFi.setAutoConnect(true)))
			continue;

		if (!(success = WiFi.setAutoReconnect(true)))
			continue;

		// Start/Connect AP + STA
		if (!(success = WiFi.begin(stassid, stapassword)))
			continue;

		if (!(success = WiFi.softAP(apssid, appassword, apchannel)))
			continue;
	}
	
	// Setup port listener
	UDP = new WiFiUDP();
	ConsolePort->printf("UDP listener%s ready!\n", UDP->begin(UDPPort) ? "" : " not");

	// The function is declared boolean, so report the result to the caller
	return success;
}


void TimerWiFiDeviceAPI::InitializeTimer(char *stassid, char *stapassword, IPAddress staip, IPAddress gatewayip)
{
	boolean success = false;

	// Setup WiFi Client
	ConsolePort->printf("Initializing %s Timer for Network %s...\n", DeviceName, stassid);

	while (!success)
	{
		// Set Mode
		if (!(success = WiFi.mode(WiFiMode_t::WIFI_STA)))
			continue;

		// Configure STA
		if (!(success = WiFi.config(staip, gatewayip, Netmask, gatewayip)))
			continue;

		if (!(success = WiFi.setAutoConnect(true)))
			continue;

		if (!(success = WiFi.setAutoReconnect(true)))
			continue;

		// Connect STA
		if (!(success = WiFi.begin(stassid, stapassword)))
			continue;
	}

	// Setup port listener
	UDP = new WiFiUDP();
	ConsolePort->printf("UDP listener%s ready!\n", UDP->begin(UDPPort) ? "" : " not");
}

Regards,
Ray L.
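
A side note on the two init functions above: in the ESP8266 Arduino core, WiFi.begin() returns a wl_status_t status code rather than a success flag, so `success = WiFi.begin(...)` ends up true for any non-zero code, including failure codes, and says nothing about whether the station is actually connected. The following is only a minimal sketch of that conversion, written so it compiles off-device. The enum and both helper names are hypothetical, and the numeric values are assumed to match the core's wl_status_t.

```cpp
#include <cstdint>

// Assumed to mirror a few wl_status_t values from the ESP8266 Arduino core.
enum StatusCode : uint8_t {
    kIdle = 0,
    kConnected = 3,
    kConnectFailed = 4,
    kDisconnected = 6
};

// What `boolean success = WiFi.begin(...)` effectively tests: zero or non-zero.
constexpr bool asBoolean(StatusCode s) {
    return static_cast<uint8_t>(s) != 0;
}

// What a real connection check would have to test instead.
constexpr bool isConnected(StatusCode s) {
    return s == kConnected;
}

// A failure code still converts to true, yet it is not a connection.
static_assert(asBoolean(kConnectFailed), "non-zero failure code is truthy");
static_assert(!isConnected(kConnectFailed), "failure code is not connected");
```

Under these assumptions, the `while (!success)` loop only repeats when begin() happens to return code 0, so it cannot be relied on to detect a failed or missing connection attempt.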

Here is the simplest test case I could put together, which for reasons I can't understand, does not work at all. Data is sent, but never received. I thought I was starting to understand these things....

[UPDATE] This example is now working - I had forgotten to put a call to RecvUDP() in loop(). But that leaves the question - why does it NOT work in my big application? The ONLY WiFi calls in the rest of the code are UDP send and receive code, identical to what's in this example. Yet, with that code running, it does not re-connect reliably, or hardly at all....

Just program two 8266s, one with AP set to 0, the other with AP set to 1. They SHOULD be sending UDP packets to each other every 10 seconds, and each should be printing any packets it receives to its Serial port. The APs ARE working, and all the IPs, SSIDs, passwords, channels, etc. are correct, but no data seems to pass between them. Why? What am I doing wrong?

#include <Arduino.h>
#include <stdarg.h>
#include <stdio.h>
#include <ESP8266WiFi.h>
#include <WiFiUdp.h>


Stream *ConsolePort = &Serial;

char* GatewaySSID = "TP-LINK_A47DC2";
char* GatewayPassword = "79517515";
uint16_t UDPPort = 8000;
WiFiUDP *UDP = NULL;

#define AP			1

#if		AP
#define STASSID		"QDNetwork0"
#define STAPASSWORD	"QDPassword0"
#define STAIP		IPAddress(192,168,0,2)
#define GATEWAYIP	IPAddress(192,168,0,1)

#define APSSID		"QDNetwork1"
#define APPASSWORD	"QDPassword1"
#define APIP		IPAddress(192,168,1,1)
#define APCHANNEL	6

#define TARGET		IPAddress(192,168,0,1)
#else
#define STASSID		"TP-LINK_A47DC2"
#define STAPASSWORD	"79517515"
#define STAIP		IPAddress(10,0,0,50)
#define GATEWAYIP	IPAddress(10,0,0,1)

#define APSSID		"QDNetwork0"
#define APPASSWORD	"QDPassword0"
#define APIP		IPAddress(192,168,0,1)
#define APCHANNEL	1

#define TARGET		IPAddress(192,168,0,2)
#endif

#define NETMASK		IPAddress(255,255,255,0)

char *statusStrings[] =
{
	"WL_IDLE_STATUS",
	"WL_NO_SSID_AVAIL",
	"WL_SCAN_COMPLETED",
	"WL_CONNECTED",
	"WL_CONNECT_FAILED",
	"WL_CONNECTION_LOST",
	"WL_DISCONNECTED",
};

void setup()
{
	delay(1000);
	((HardwareSerial *)ConsolePort)->begin(115200);
	ConsolePort->println("\n\nNODEMCU Ready\n\n");

	delay(1000);
	WiFi.begin();

	InitializeAP(APSSID, APPASSWORD, APIP, GATEWAYIP, APCHANNEL, STASSID, STAPASSWORD, STAIP);
}


boolean InitializeAP(char *apssid, char *appassword, IPAddress apip, IPAddress gatewayip, uint8_t apchannel,
					 char *stassid, char *stapassword, IPAddress staip)
{
	boolean success = false;

	while (!success)
	{
		// Set Mode
		if (!(success = WiFi.mode(WiFiMode_t::WIFI_AP_STA)))
			continue;

		// Configure STA
		if (!(success = WiFi.config(staip, gatewayip, NETMASK, gatewayip)))
			continue;

		// Configure AP
		if (!(success = WiFi.softAPConfig(apip, gatewayip, NETMASK)))
			continue;

		if (!(success = WiFi.setAutoConnect(true)))
			continue;

		if (!(success = WiFi.setAutoReconnect(true)))
			continue;

		// Start/Connect AP + STA
		if (!(success = WiFi.softAP(apssid, appassword, apchannel)))
			continue;

		if (!(success = WiFi.begin(stassid, stapassword)))
			continue;
	}

	// Setup port listener
	UDP = new WiFiUDP();
	UDP->begin(UDPPort);

	IPAddress STAip = WiFi.localIP();
	IPAddress APip = WiFi.softAPIP();
	ConsolePort->printf("AP Ready, STAIP=%d.%d.%d.%d, APIP=%d.%d.%d.%d\n", STAip[0], STAip[1], STAip[2], STAip[3], APip[0], APip[1], APip[2], APip[3]);
	delay(500);

	// The function is declared boolean, so report the result to the caller
	return success;
}


boolean sendUDP(char *s)
{
	boolean ret = false;

	IPAddress ip = TARGET;
	if (ret = UDP->beginPacket(ip, UDPPort))
	{
		UDP->write(s);
		ret = UDP->endPacket();
	}
	ConsolePort->printf("status()=%s, Sent %s to %d.%d.%d.%d\n", statusStrings[WiFi.status()], s, ip[0], ip[1], ip[2], ip[3]);
	return ret;
}


void RecvUDP(void)
{
	char packetBuffer[128];
	int len;

	while ((len = UDP->parsePacket()))
	{
		// Read the packet into packetBuffer, leaving one byte for the terminator
		len = UDP->read(packetBuffer, sizeof(packetBuffer) - 1);
		if (len < 0)
			len = 0;
		packetBuffer[len] = '\0';
		ConsolePort->printf("status=%s, Received: %s\n", statusStrings[WiFi.status()], packetBuffer);
	}
}


uint32_t start1 = 0;
uint32_t start2 = 0UL - 5000UL;
void loop()
{
	if ((millis() - start1 > 10000) && (!AP))
	{
		sendUDP("Hello from AP0");
		start1 = millis();
	}
	if ((millis() - start2 > 8000) && (AP))
	{
		sendUDP("Hello from AP1");
		start2 = millis();
	}
	WiFi.status() == WL_CONNECTED;

	RecvUDP();
}

Regards,
Ray L.

I am SUCH an idiot! Somehow, somewhere, sometime, the WiFi.begin() call disappeared from my code. It works MUCH better with that included!

Regards,
Ray L.
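
For anyone who lands here with the same symptom, a connect-and-verify step with a timeout would make a missing or failed WiFi.begin() show up at startup instead of as flaky traffic later. This is only a sketch under stated assumptions: waitForConnect() is a hypothetical helper, the status and clock sources are passed in as callables so the logic compiles off-device, and the value 3 for WL_CONNECTED is assumed to match the core's wl_status_t.

```cpp
#include <cstdint>
#include <functional>

// Numeric value of WL_CONNECTED in the ESP8266 core's wl_status_t (assumed).
constexpr uint8_t kWlConnected = 3;

// Hypothetical helper: polls `status` until it reports connected, or until
// `timeoutMs` milliseconds have passed on the `nowMs` clock.
bool waitForConnect(const std::function<uint8_t()>& status,
                    const std::function<uint32_t()>& nowMs,
                    uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (status() != kWlConnected) {
        // Unsigned subtraction stays correct across a millis() rollover.
        if (static_cast<uint32_t>(nowMs() - start) >= timeoutMs) {
            return false;
        }
        // On the device, a short delay() belongs here so the WiFi stack
        // and the watchdog get serviced while waiting.
    }
    return true;
}
```

On the 8266 itself, the two callables would presumably wrap WiFi.status() and millis(), called right after WiFi.begin(stassid, stapassword), with a console message printed whenever the helper returns false.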