I have a device that can power two LED strips using MOSFETs, plus a short RGB LED strip. It's based on an ESP32 Feather from Adafruit, so I'm using WiFi to communicate, with MQTT as the protocol. On top of that, I'm using one of the pins as a touch switch. The switch lets me turn the light on or off; over MQTT I can do the same, plus send a color to the strip (notification type stuff).
Now all works. For 1:48 or 108 minutes. Then it stops working. I confirm this with a last will message on the MQTT connection: when the device connects it sends a "connected" message on the status topic, and when it dies and the MQTT connection goes away, the broker sends a "disconnected" one. At 1:47 everything works 100% as intended. A bit after, it just stops. No MQTT messages are processed and the touch switch no longer works. It's just dead.
I've gone over the code and done some debugging, removing features etc., but so far nothing has done the trick: 108 minutes and it's dead. Since it's so absolutely consistent, I thought perhaps someone here has encountered something similar before and can point me in the right direction. I checked my access points and WiFi network to see if something there does a weird thing that kills it, but I can't find anything that comes close to matching that time. So I'm totally at a loss.
I'm not averse to posting the code, but decided to just ask generally first and see if someone has an idea, before subjecting people to my spaghetti.
ChristianRiesen:
Now all works. For 1:48 or 108 minutes.
Is there some number that is stored in the wrong type of variable and which increments to the point of overflowing?
For example, on an Uno or Mega an int can only count up to +32767, and the next increment changes it to -32768. If you needed a larger counter on an Uno you would have to define the variable as unsigned int (max 65535) or unsigned long, with a max of about 4 billion (2^32 - 1).
I use the Uno and Mega as examples as I am not sure what sizes the datatypes on an ESP32 have.
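To make the wraparound above concrete, here is a minimal sketch in plain C++ using fixed-width types, so it behaves the same on a desktop compiler as on a board. The function names are made up for illustration. For comparison, `int` on the ESP32 toolchain is 32 bits wide, so a plain `int` counter there would not wrap at 32767 the way it does on an Uno.

```cpp
#include <cstdint>

// Unsigned 16-bit counter: 65535 + 1 wraps around to 0.
// Unsigned wraparound is well-defined modular arithmetic in C++.
uint16_t increment_u16(uint16_t value) {
    return static_cast<uint16_t>(value + 1u);
}

// Signed 16-bit counter, mimicking an int on an Uno or Mega.
// The addition is done in unsigned arithmetic to avoid signed overflow;
// converting the result back to int16_t wraps 32767 + 1 to -32768 on
// two's-complement targets (assumed here; guaranteed from C++20 on).
int16_t increment_i16(int16_t value) {
    return static_cast<int16_t>(static_cast<uint16_t>(value) + 1u);
}

// 32-bit unsigned counter: has room for about 4.29 billion steps.
uint32_t increment_u32(uint32_t value) {
    return value + 1u;
}
```

If a counter in the sketch is stepped once per loop or per millisecond, the type decides when (and whether) it wraps back to zero or goes negative.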
Sounds rather like it is using something from the String (note the capital "S") class, which is pretty much guaranteed to fail. Do not use Strings, only defined character arrays.
Notice how everyone is guessing? Yeah. Probably time to post your code. You may even get advice on other issues or ideas on ways to do some things differently.
Please read the first post in any forum section, entitled "How to use this forum": http://forum.arduino.cc/index.php/topic,148850.0.html
Then look down to item #7 about how to post your code.
Code posted that way is shown in a scrolling window, which makes it easier to read.
I was looking at millis for a culprit, but unless I'm doing something terribly wrong here and didn't fully understand what I found, I have a feeling this is not it. It's only used in the loop at the very bottom of the code.
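For reference, the usual rollover-safe way to time things with millis() looks roughly like the sketch below. It is written as plain C++ with the timestamps passed in as arguments (in a sketch they would come from millis()), and both function names are made up for illustration. Since a 32-bit millisecond counter only rolls over after about 49.7 days, a rollover by itself would not explain a failure at 108 minutes.

```cpp
#include <cstdint>

// Rollover-safe check: true once at least interval_ms has passed since
// last_ms. The unsigned subtraction gives the elapsed time correctly even
// if the counter wrapped around between last_ms and now_ms.
bool interval_elapsed(uint32_t now_ms, uint32_t last_ms, uint32_t interval_ms) {
    return static_cast<uint32_t>(now_ms - last_ms) >= interval_ms;
}

// Naive version, shown only for contrast: last_ms + interval_ms can wrap
// past zero, which makes the comparison fire too early near the rollover.
bool interval_elapsed_naive(uint32_t now_ms, uint32_t last_ms, uint32_t interval_ms) {
    return now_ms >= last_ms + interval_ms;
}
```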
I had a look at String and eliminated that entirely from my code, though left it in place commented out for now to show where I used it. The error persists.
A quick recap of what this is: a lamp that has two LED strips controlled by MOSFETs (I want to implement some dimming later, but first I need this bug ironed out). One is warm white, the other cold white. In the end I want to switch between them with an MQTT message (some server decides when to switch). I also have a single RGB LED in place for some ambient display notifications. The code for that is a placeholder just to see if it works; I intend to do more with it later. At the moment it reacts to the numbers 1-3 to show a color and 0 to turn it off. There is a metal base that I use as a capacitive touch switch, to turn the light on or off when you are standing near it. And of course I can turn it on and off using MQTT messages. It's all built on a Feather ESP32 from Adafruit.
Everything works fine from a second after it gets power. I can use the touch switch, I can send messages, and it reacts instantly as intended. It works that way for about 108 minutes, then it stops working. Touch doesn't work, and it doesn't react to MQTT messages. I got those times by observing when I get the status topic messages for connect and the last will message for disconnect from MQTT. These 108 minutes are extremely consistent. The seconds fluctuate a bit, but it's around the 44-second mark, so it runs for around 6524 seconds in total before it dies.
I had to attach the code as it alone is over 9000 characters long which is the limit this forum allows me.
After a lot of println's, I found that the sketch died at the connect line for the PubSub client, when it tried to reconnect. No idea why it would so consistently disconnect after 108 minutes, but that's another mystery. I found this issue about it on GitHub:
One thing that seems to help is to wait 5 seconds before trying to reconnect, but there are also suggestions to use a newer version of the ESP32 core and an updated version of the PubSub client. I'll try those out and see if I have some success with that.
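A minimal sketch of the "wait 5 seconds before reconnecting" idea, done without a blocking delay so the touch switch keeps working in the meantime. `ReconnectThrottle` is a hypothetical helper, not part of the PubSub client library. In a real sketch, `now_ms` would come from millis(), and a `true` result would be followed by the actual connect call.

```cpp
#include <cstdint>

// Hypothetical helper: allows one reconnect attempt per retry interval.
struct ReconnectThrottle {
    uint32_t retry_interval_ms;   // minimum gap between attempts
    uint32_t last_attempt_ms = 0; // timestamp of the most recent attempt
    bool has_attempted = false;   // lets the very first attempt through

    explicit ReconnectThrottle(uint32_t interval_ms)
        : retry_interval_ms(interval_ms) {}

    // Returns true if a reconnect may be attempted now, and records the
    // attempt. The unsigned subtraction keeps this correct across rollover.
    bool should_attempt(uint32_t now_ms) {
        if (has_attempted &&
            static_cast<uint32_t>(now_ms - last_attempt_ms) < retry_interval_ms) {
            return false;
        }
        has_attempted = true;
        last_attempt_ms = now_ms;
        return true;
    }
};
```

In the main loop this would be checked only while the client reports it is not connected, for example with a `ReconnectThrottle throttle(5000);` declared globally, so the rest of the loop keeps running between attempts.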
I checked the pubsubclient library and it was already at the latest version of 2.7 for me. Then I checked the boards. "esp32" by Espressif was at 1.0.2, but two newer versions were out. I updated to 1.0.4 and it stopped connecting to my hidden SSID, with the same code it used before. I connected it to a visible SSID and instantly everything started to work again. So that's a different thing to debug.
Now the project no longer breaks; it keeps on running and has done so for several hours. It still, however, disconnects (and reconnects) every 108 minutes. Still hunting that one down, but at least it no longer completely dies.
Thanks for all the help, I learned a few new things and I hope this post might help someone else in the future.
ChristianRiesen:
Now the project no longer breaks; it keeps on running and has done so for several hours. It still, however, disconnects (and reconnects) every 108 minutes. Still hunting that one down, but at least it no longer completely dies.
Is it just the PubSub client that disconnects, or is WiFi also disconnecting?
If WiFi is also disconnecting, is the device using a static IP or DHCP?
If DHCP, how long is the lease time set on the router?