My project is on a NodeMCU 1.0 (ESP-12E). It is all about real-time data collection. To keep the data collection running at regular intervals, I created several "low delay state machine" modules for NTP, DDNS, and Sending Email.
The idea is that each module keeps track of whether it is idle, or waiting for a response, or needs to do something. The NTP and DDNS modules work perfectly. The emailSender, not so much.
The sendEmailStateMachine is based on EmailSender, with the State Machine idea coming from the discussion sending e-mails using esp8266 and arduino IDE.
If the WiFiClientSecure::connect() succeeds, everything is fine. But if it fails, the state machine logic tries to call connect() again, and that causes a soft WDT reset.
Normally everything works great. However, I have a cabin in the North Country, with a DSL link that works about as fast as dialup (and worse when there are multiple phones, tablets and computers all competing for bandwidth - no joke!) So quite often, it crashes repeatedly.
Note: full code are below.
Here is a sample console log to demonstrate the problem, with the stack dump replaced by the traceback from the ESP Exception Decoder.
WiFiClientSecure::connect() is first called just after 18:10:27.505, and gracefully times out / fails, at 18:10:50.871, about 30.3 seconds later. Then the state machine has it retry the connection at 18:10:51.380, and it gets a soft WDT reset at 18:10:59.707, only 8.3 seconds later. Why didn't it gracefully fail, as before? and Why is the Watchdog timer not being maintained? [ From many other trials, the time from connect() to WDT reset is usually between 2 and 3 seconds, not 8, as in this example, in case that helps anyone... ]
I've tried this code with the 8266 core 2.6.3 (as shown), and with 2.7.4, with identical results. I'm really hesitant to say its a bug in the core code, but I've tried dozens of modifications to my code with the same result, and I'm at a loss to see the bug. That's why I'm asking the community for help!
18:10:19.060 -> SDK:2.2.2-dev(38a443e)/Core:2.6.3=20603000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-16-ge23a07e/BearSSL:89454af
(shortened to meet length rule)
18:10:27.471 -> loop now=9
18:10:27.471 -> M.Mail Subject: Testing Email StateMachine
18:10:27.471 -> Mail Message: This is the test email body 1
--ME
18:10:27.505 -> SendEmail to: john.burgess.74@alumni.brown.edu
18:10:27.505 -> subject: Email Test ServerTesting Email StateMachine
18:10:27.505 -> body: This is the test email body 1
--ME - EMAIL_SM
18:10:27.505 -> Connecting to: smtp.comcast.net:465
18:10:32.473 -> pm open,type:2 0
18:10:35.860 -> BSSL:_connectSSL: start connection
18:10:38.555 -> BSSL:CERT: 30 82 07 5d 30 82 06 45 a0 03 02 01 02 02 11 00 9a 8f 52 68 32 4d fb 9d f9 94 89 fe 24 96 84 7f 30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05 00 30 81 96 31 0b 30 09 06 03 55 04 06 13 02 47 42 31 1b 30 19 06 03 55 04 08 13 12 47 72 65 61 74 65 72 20 4d 61 6e 63 68 65 73 74 65 72 31 10 30 0e 06 03 55 04 07 13 07 53 61 6c 66 6f 72 64 31 1a 30 18 06 03 55 04 0a 13 11 43 4f 4d 4f 44 4f 20 43 41 20 4c 69 6d 69 74 65 64 31 3c 30 3a 06 03 55 04 03 13 33 43 4f 4d 4f 44 4f 20 52 53 41 20 4f 72 67 61 6e 69 7a 61 74 69 6f 6e 20 56 61 6c 69 64 61 74 69 6f 6e 20 53 65 63 75 72 65 20 53 65 72 76 65 72 20 43 41 30 1e 17 0d 32 30 30 38 31 32 30 30 30 30 30 30 5a 17 0d 32 32 30 38 31 32 32 33 35 39 35 39 5a 30 81 b9 31 0b 30 09 06 03 55 04 06 13 02 55 53 31 0e 30 0c 06 03 55 04
18:10:50.871 -> BSSL:_run_until: Timeout
18:10:50.871 -> BSSL:_wait_for_handshake: failed
18:10:50.871 -> BSSL:Couldn't connect. Error = 'Unknown error code.'
18:10:50.871 -> email client connection failed
18:10:50.871 -> Could not connect to mail server smtp.comcast.net on port 465
18:10:50.904 -> Mail respons: 5 = ES_RETRY
18:10:50.904 -> FREE HEAP! loopEmail(): 19696, was 41608, diff 21912
18:10:51.380 ->
18:10:51.380 -> loop now=33
18:10:51.380 -> M
18:10:51.380 -> pollEmail step: ST_CONNECT
18:10:51.380 -> expect resp: 220
18:10:51.380 -> Connecting to: smtp.comcast.net:465
18:10:57.624 -> BSSL:_connectSSL: start connection
18:10:59.707 ->
18:10:59.707 -> Soft WDT reset
18:10:59.707 ->
Decoding stack results
0x40204edb: BearSSL::WiFiClientSecure::_run_until(unsigned int, bool) at ...esp8266/PolledTimeout.h line 218
0x40220172: br_ssl_engine_hs_reset at src/ssl/ssl_engine.c line 1305
0x402051e4: BearSSL::WiFiClientSecure::_wait_for_handshake() at .../esp8266/2.6.3/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 564
0x402053ef: BearSSL::WiFiClientSecure::_connectSSL(char const*) at .../esp8266/2.6.3/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 1122
0x40100971: check_poison_neighbors(unsigned short) at ...esp8266/umm_malloc/umm_local.c line 71
0x40100ca4: umm_malloc(size_t) at ...esp8266/umm_malloc/umm_malloc.cpp line 511
0x4010085c: get_poisoned(void*, size_t) at ...esp8266/umm_malloc/umm_poison.c line 112
0x4021b038: ip4_output_if_opt at core/ipv4/ip4.c line 1577
0x401002d4: malloc(size_t) at ...esp8266/heap.cpp line 232
0x4010085c: get_poisoned(void*, size_t) at ...esp8266/umm_malloc/umm_poison.c line 112
0x4021b05e: ip4_output_if at core/ipv4/ip4.c line 1550
0x4021bbf7: ip_chksum_pseudo at core/inet_chksum.c line 395
0x40217142: tcp_output at core/tcp_out.c line 1361
0x4021642d: tcp_create_segment at core/tcp_out.c line 190
0x402078e3: __esp_yield() at ...esp8266/core_esp8266_main.cpp line 107
0x40207ef9: __delay(unsigned long) at ...esp8266/core_esp8266_wiring.cpp line 57
0x40207eee: __delay(unsigned long) at ...esp8266/core_esp8266_wiring.cpp line 54
0x402039f8: WiFiClient::connect(IPAddress, unsigned short) at .../esp8266/2.6.3/libraries/ESP8266WiFi/src/include/ClientContext.h line 133
0x40205553: BearSSL::WiFiClientSecure::connect(char const*, unsigned short) at .../esp8266/2.6.3/libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp line 231
0x40201df7: sendEmailStateMachine::connect(char const*, unsigned int) at mypath/libraries/sendEmailStateMachine/sendEmailStateMachine.cpp line 494
0x40206a94: Print::println() at ...esp8266/Print.cpp line 186
0x402021a4: sendEmailStateMachine::pollEmail(long) at mypath/libraries/sendEmailStateMachine/sendEmailStateMachine.cpp line 316
(a bunch of calls deleted because I don't see the relevance, and to reduce post size)
0x402011cd: loopEmail(long) at mypath/Temp_Sensor_Project/testEmailSend_SM/SendEmail.ino line 51
0x40206c64: Print::println(long, int) at ...esp8266/Print.cpp line 228
0x40201557: loop() at mypath/Temp_Sensor_Project/testEmailSend_SM/testEmailSend_SM.ino line 65
0x40206a94: Print::println() at ...esp8266/Print.cpp line 186
0x4010017c: ets_post(uint8, ETSSignal, ETSParam) at ...esp8266/core_esp8266_main.cpp line 160
0x402079f8: loop_wrapper() at ...esp8266/core_esp8266_main.cpp line 180
0x401000c0: app_entry() at ...esp8266/core_esp8266_main.cpp line 303
0x40220b22: br_ssl_hs_client_run at src/ssl/ssl_hs_client.c line 1142
18:11:04.906 -> ets Jan 8 2013,rst cause:4, boot mode:(3,7)
(and reboot, repeat...)
The code including everything from setup() and loop() to sendEmail() and the sendEmailStateMachine module (.cpp and .h) are attached.
sendEmailStateMachine.cpp (20.3 KB)
sendEmailStateMachine.h (3.4 KB)
SendEmail.ino (2.11 KB)
testEmailSend_SM.ino (2.66 KB)