WiFiNINA client unreliable on MKR 1010

I have completed a port of a messaging app to the Arduino MKR 1010. The app was developed on Linux using a Raspberry Pi 4.

Without too much trouble, the socket layers were moved to the WiFiNINA WiFi and Client APIs. The app opens a persistent socket to a server and then starts passing messages back and forth. The protocol negotiates, so there are quite a few (like a dozen) very short messages (< 15 bytes) initially.

The Linux/RPi version is rock solid.

The MKR 1010 experiences staged degradation until it simply becomes useless and requires powering off.

Stage 1: The outgoing messages partially stick on the MKR WiFi 1010 somewhere between Client.write() and the wire. No amount of flushing, writing, etc. seems to unjam it. It will sometimes wait 30 seconds or more. The data is all there, because ultimately the messages all get processed. When I restart the 1010, sometimes they'll all come through then. But this behaviour has rendered the app unusable.

Stage 2: The WiFiNINA begins disconnecting after a few minutes. It connects, then disconnects, over and over. This has been excellent for working out the bugs in the reconnect and recovery code in the app, since the RPi handles a lot of that at a lower level. But after that, it becomes inconvenient.

Stage 3: The WiFiNINA simply can no longer create sockets. The WiFi network disconnects. The recovery code reconnects and verifies a CONNECTED status. It then tries to open a socket, which fails with no sockets available. The recovery/reconnect methods are closing the previous sockets, disconnecting the client, and deleting the previous client object, and when the network comes back up, creating a new one.
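
For anyone following along, the recovery order described in Stage 3 can be sketched off-board roughly like this. It is only an illustration under assumptions: `FakeLink`, `recover` and `simulateDrops` are made-up names, the pool of 4 sockets is arbitrary, and the idea that a wifi drop leaves the old socket slot occupied is a guess rather than documented behaviour. On the board, the three steps would presumably map to `client.stop()`, `WiFi.begin(...)` and `client.connect(...)`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the WiFi module so the call order can be tried
// on a desktop compiler. Nothing here is part of the WiFiNINA API.
struct FakeLink {
    std::vector<std::string> log;  // records the order of calls
    bool wifiUp = false;
    int openSockets = 0;
    int maxSockets = 4;            // assumed small pool, for illustration only

    void stopClient() { if (openSockets > 0) --openSockets; log.push_back("stop"); }
    bool joinWifi()   { wifiUp = true; log.push_back("join"); return true; }
    bool openSocket() {
        log.push_back("open");
        if (!wifiUp || openSockets >= maxSockets) return false;
        ++openSockets;
        return true;
    }
    // Assumption: losing the network does NOT free the socket slot.
    void dropWifi() { wifiUp = false; log.push_back("drop"); }
};

// Release the old socket first (even though the link is already down),
// then rejoin the network, then open a fresh socket.
template <typename Link>
bool recover(Link& link) {
    link.stopClient();
    if (!link.joinWifi()) return false;
    return link.openSocket();
}

// Simulates `cycles` wifi drops and returns how many recoveries succeeded.
inline int simulateDrops(bool closeFirst, int cycles) {
    assert(cycles >= 0);
    FakeLink link;
    link.joinWifi();
    link.openSocket();
    int ok = 0;
    for (int i = 0; i < cycles; ++i) {
        link.dropWifi();
        bool good = closeFirst
            ? recover(link)
            : (link.joinWifi() && link.openSocket());  // leaks one slot per drop
        if (good) ++ok;
    }
    return ok;
}
```

With the close step in place the simulated pool never runs dry; without it, the recoveries start failing as soon as the assumed pool is used up, which is the same shape as the Stage 3 symptom.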

While this is occurring, I have the RPi version connected to the same server over the same wifi network, and it stays up with fast response times while the MKR is suffering. The router is 8 ft away in the same room, so the signal is good. I verified the firmware is the latest, made sure I'm using the latest WiFiNINA libraries, etc. I also have several of these 1010s lying around, so I tried another, with the same problem.

I'm rather disappointed. This seems like such a fun board, but the performance has me concerned. Does anyone have any experience or success stories with persistent socket connections to encourage me to keep trying to make this work? I see several forum topics but never any clear resolutions, so I thought I'd post here. I'm especially interested in the outgoing message delays.

Thank you.

I assume this is a software problem. I'm quite sure the hardware is able to hold a connection for a longer time.

A comparison to a Raspberry Pi 4 isn't fair: there you have a full-fledged operating system doing networking in an optimized way. Here you have a microcontroller that does all the network stuff over an SPI connection to a u-blox NINA-W102 module (ESP32-based) running a special firmware.

I cannot guarantee that there isn't a bug in the WiFiNINA firmware that is responsible for the symptoms you describe, but looking at the time frames you mention, I would be very surprised if it isn't a bug in your code, which you're hiding from us, by the way.

The instructions may help.

Thank you for your response! If you've been able to get the MKR 1010 to hold a persistent connection, that was encouragement enough to continue. Thank you for the motivation :)

The code is rather extensive, a port from Linux, as I mentioned. And there is server code as well. There were just so many files to post. It was more courtesy. But thank you for the offer to look it over. I think this general experience you have getting lasting persistent connections was very helpful though.

For those that might encounter similar issues, let me summarize some things I have learned since my encouragement:

  1. Part of the problem seemed to be a buried "printf" trace statement. In the port, I missed a few of these hidden in one file. I wouldn't have expected it to even compile, since I thought there was no printf on the Arduino. But it did. They made no output to the serial console either, so I didn't catch it. When these were ported correctly, things got much smoother with message flow. Not perfect, but much better. Very strange, but Client and printf both depend on streams, so maybe there is some kind of interaction. At the moment, I'm not even sure how printf was resolving.

  2. Closing sockets even after a wifi disconnect is important. The app was attempting to do this, but there was a glitch causing the close not to get executed. The wifi disconnect does not seem to automatically clear the existing sockets from the previous wifi connection. My understanding from my tests is there is a finite number of sockets, 256. After so many wifi disconnects and reestablishing new connections w/o closing the old ones, it just runs out and cannot connect anymore, requiring a hard reset.

  3. With those changes, longer running tests were possible and a regularity to the disconnects became apparent. Exactly one hour. Some things I have read suggest perhaps the WiFiNINA firmware might not be handling DHCP renewals smoothly. My particular router doesn't seem to have any settings for this, but this might be something for me to look at next. A question I have is whether a wifi disconnect automatically invalidates the open socket connection. Right now, when a disconnect is detected, the code immediately closes everything down and reestablishes the connection anew. Is that the best approach? The firmware won't reconnect on its own; I did run some tests in that regard. But can I temporarily stop all messages and just reconnect the wifi and then continue using the same socket connection? In this app, brand new connections require a lot of negotiation, so it would be more efficient not to have to renegotiate every time the wifi glitches.

  4. These changes have made the app much better. The connection has been lost every hour at 32 minutes past the hour, suggesting an external factor such as the router. But that aside the app is still not 100%. The message flow still has small pause glitches, and it seems the server is occasionally even receiving corrupted data or maybe incomplete data due to the pauses. This is detected and retransmitted, so the app recovers, but it's still a problem. In such a simple configuration, there shouldn't be a need for that error handling. Therefore, I have more work to do here.
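
On the "incomplete data due to the pauses" in point 4: TCP only delivers a byte stream, so a pause can split one message across several reads without anything being corrupted. One common way to cope is to frame each message and reassemble it on the receiving side. The sketch below assumes a hypothetical `[1-byte length][payload]` framing, which is not necessarily what this app's protocol uses, and `FrameAssembler` is a made-up name. On the board, the bytes fed in would come from something like `client.read(buf, len)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reassembles frames of the assumed form [1-byte length][payload] from
// arbitrarily sized chunks of a TCP byte stream.
class FrameAssembler {
public:
    // Feed whatever bytes just arrived; returns every frame that is now complete.
    std::vector<std::string> feed(const uint8_t* data, size_t len) {
        assert(data != nullptr || len == 0);
        buf_.insert(buf_.end(), data, data + len);
        std::vector<std::string> frames;
        size_t pos = 0;
        while (pos < buf_.size()) {
            size_t need = buf_[pos];                  // payload length byte
            if (buf_.size() - pos - 1 < need) break;  // payload still incomplete
            frames.emplace_back(buf_.begin() + pos + 1,
                                buf_.begin() + pos + 1 + need);
            pos += 1 + need;
        }
        // Keep only the unconsumed tail for the next call.
        buf_.erase(buf_.begin(), buf_.begin() + pos);
        return frames;
    }

    // Number of bytes received but not yet part of a complete frame.
    size_t pending() const { return buf_.size(); }

private:
    std::vector<uint8_t> buf_;
};
```

With something like this on both ends, a message that arrives in two pieces after a pause is simply held until the rest shows up, so the retransmit path should only trigger on real errors.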

And I agree, the RPi 4 and the Arduino are definitely not equal. The comparison is provided mainly because, while there may yet be more errors in the app port, they manifest because of some difference. I was interested in wisdom from anyone who might have experience with significant considerations for these implementation differences. For example, I cannot seem to find any way to set socket options. How does it handle keepalive, Nagle, etc.? At the moment, I don't think keepalive is my issue, but if it were going across the Internet, it might be. Nagle, on the other hand, could cause transmission delays depending on how it works. I'm happy just to read docs if there are any pointers. The online reference manuals just seem a bit terse when it comes to such nuances.
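
If the library really exposes no keepalive option, one workaround is an application-level heartbeat: send a tiny ping when the connection has been quiet for a while, and treat a long silence from the server as a dead connection. The following is a rough sketch, not anything from WiFiNINA; `Heartbeat` and its method names are invented, the intervals are arbitrary, and the timestamps are assumed to come from Arduino's `millis()`. It does nothing about Nagle.

```cpp
#include <cassert>
#include <cstdint>

// Decides when to send an application-level ping and when to give up on the
// peer. Times are milliseconds as returned by millis(); unsigned subtraction
// keeps the comparisons correct across the 32-bit rollover.
class Heartbeat {
public:
    Heartbeat(uint32_t pingEveryMs, uint32_t deadAfterMs, uint32_t now)
        : pingEvery_(pingEveryMs), deadAfter_(deadAfterMs),
          lastTx_(now), lastRx_(now) {
        assert(pingEveryMs > 0 && deadAfterMs > pingEveryMs);
    }

    void onSent(uint32_t now)     { lastTx_ = now; }  // call after any outgoing message
    void onReceived(uint32_t now) { lastRx_ = now; }  // call after any incoming message

    // True once nothing has been sent for pingEveryMs.
    bool shouldPing(uint32_t now) const { return now - lastTx_ >= pingEvery_; }

    // True once nothing has been received for deadAfterMs.
    bool peerDead(uint32_t now) const { return now - lastRx_ >= deadAfter_; }

private:
    uint32_t pingEvery_, deadAfter_, lastTx_, lastRx_;
};
```

In `loop()` this would be polled with the current `millis()` value: send the ping message when `shouldPing` is true, and run the close-and-reconnect path when `peerDead` is true.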

Also, I am interested in opinions on whether the MKR 1010, or perhaps one of the other Arduino boards, is suitable for more complex IoT environments, and if so, which is the best, most stable board to use. I like not having the complexity of the OS. But at the same time the OS does have its uses. I'm trying to understand what is best.

Again, thank you for your consideration.
