IDE 2.0 : OTA mDNS discovery only intermittently finds nodes

Using IDE 1.8, OTA updates can be reliably used to transfer my sketch program to a couple of ESP8266 NodeMCU devices.

But when I try IDE 2.0 the nodes only appear intermittently in the Tools > Port > Network ports menu. Sometimes they are there, sometimes not.

So with the same test setup, IDE 1.8 can find and update OTA nodes 100% of the time. But IDE 2.0 mostly cannot find the same nodes.

Is this something I might have set up wrong in IDE 2.0? Does the OTA configuration in the sketch code need to be changed to work with IDE 2.0?

TIA for any pointers or fixes.

Other than installing the ESP8266 boards platform via Boards Manager, no other setup should be needed.

You might check to make sure you have the most recent version installed (I have done quite a lot of testing of OTA uploads, but only with the latest ESP8266 platform):

  1. Select Tools > Board > Boards Manager from the Arduino IDE menus to open the "Boards Manager" view in the left side panel.
  2. Scroll down through the list of boards platforms until you see the "esp8266 by ESP8266 Community" entry. Check the installed version number shown on the line below the title of the entry.

If you have the latest version installed, it should say "Version 3.0.2"

No. The same sketch should work for either IDE.

I do have something you can try. Please tell me which operating system you are using (e.g., "Windows" or "Linux") so that I can provide you with the correct instructions.

Thank you for the very quick response :slightly_smiling_face:

I confirm that I am using Version 3.0.2 for the board manager ESP8266.

Sorry - should have mentioned I'm using Windows 10 Home.

Note that I can ping my ESP8266 nodes from the Windows Terminal by the names I set in their code with

ArduinoOTA.setHostname(NODE_NAME);

And as I said, if I simply close IDE 2.0 and load IDE 1.8 the nodes almost immediately appear and stay in the Tools > Port menu.

Excellent. So we know that things are working at a low level. We also know things are not working correctly at the high level of the Arduino IDE 2.x interface. There are multiple components between the low level network layer and the Arduino IDE 2.x interface, and these components are not used by Arduino IDE 1.x. So it is useful to determine exactly which component is not detecting the port. Then the investigation can be focused on that component, which might not be the Arduino IDE 2.x at all.

The lowest level component is a command line tool named mdns-discovery. This is a "discovery" tool, which identifies the Arduino board "ports" of a given protocol ("network" in this case) and then outputs a list of those ports. That list is eventually displayed in the Arduino IDE GUI (the IDE doesn't actually know anything about ports on its own, it only displays the list provided by the discovery tool).

You can run a simple experiment to see whether mdns-discovery is behaving and able to see the port of your board.

:exclamation: NOTE: This will not solve the problem. This is only intended to possibly gather some more information about the problem, which might provide a clue that leads to a solution.

I'll provide the instructions here:

During all this, keep an eye out for anything that doesn't match the expected behavior as described at each step.

  1. Open the following folder in Windows "File Explorer":
    C:\Users\<username>\AppData\Local\Arduino15\packages\builtin\tools\mdns-discovery
    
    Note that the AppData folder is hidden by default in "File Explorer". You can make it visible by opening the "View" menu, then checking the box next to "☐ Hidden items".
  2. The mdns-discovery folder will contain a subfolder for each of the versions of mdns-discovery which are installed on your computer.
    For example:
    mdns-discovery/
    ├── 0.9.2/
    ├── 1.0.2/
    └── 1.0.5/
    
  3. Hold the Shift key while right-clicking the folder with the highest version (1.0.5 in the example above).
  4. From the context menu, click "Open PowerShell window here".
    PowerShell will now open.
  5. Type the following command:
    .\mdns-discovery
    
  6. Press the Enter key.
    mdns-discovery will now start. There won't be any obvious sign of this other than that there is no longer a command prompt at the cursor in the terminal.
  7. Type the following command:
    HELLO 1 "arduino-cli 0.21.0"
    
  8. Press the Enter key.
    You should now see a response printed exactly like this:
    {
      "eventType": "hello",
      "protocolVersion": 1,
      "message": "OK"
    }
    
  9. Disconnect your Arduino board from its power source if you have it powered.
  10. Type the following command:
    START_SYNC
    
  11. Press the Enter key.
    You should now see a response printed exactly like this:
    {
      "eventType": "start_sync",
      "message": "OK"
    }
    
    You might also see some additional objects in the output depending on which network ports are available on your computer.
  12. Power your Arduino board.
    You should eventually see a response printed that looks something like this:
    {
      "eventType": "add",
      "port": {
        "address": "192.168.254.127",
        "label": "esp32-b4e62dbf693d at 192.168.254.127",
        "protocol": "network",
        "protocolLabel": "Network Port",
        "properties": {
          ".": "node32s",
          "auth_upload": "no",
          "board": "node32s",
          "hostname": "esp32-b4e62dbf693d.local.",
          "port": "3232",
          "ssh_upload": "no",
          "tcp_check": "no"
        }
      }
    }
    
    This is only an example of what you might see. The output will be different depending on the board you connected.
  13. Disconnect your Arduino board from its power source.
    You should eventually see a response printed that looks something like this:
    {
      "eventType": "remove",
      "port": {
        "address": "192.168.254.127",
        "label": "esp32-b4e62dbf693d at 192.168.254.127",
        "protocol": "network",
        "protocolLabel": "Network Port",
        "properties": {
          ".": "node32s",
          "auth_upload": "no",
          "board": "node32s",
          "hostname": "esp32-b4e62dbf693d.local.",
          "port": "3232",
          "ssh_upload": "no",
          "tcp_check": "no"
        }
      }
    }
    
    This is only an example of what you might see. The port data should be the same as the "add" event you saw when you powered the board.

You should continue to see the same results if you repeat steps (12) and (13).

Once you are done with your experiments with the "mdns-discovery" tool, follow these instructions to exit:

  1. Type the following command:
    STOP
    
  2. Press the Enter key.
    You should now see a response printed exactly like this:
    {
      "eventType": "stop",
      "message": "OK"
    }
    
  3. Type the following command:
    QUIT
    
  4. Press the Enter key.
    You should now see a response printed exactly like this:
    {
      "eventType": "quit",
      "message": "OK"
    }
    
    You should now be back at the shell command line.

Please let me know if you have any questions or problems while following those instructions.
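For anyone who would rather not type the commands by hand, the same experiment can be sketched in Python. This is only a rough sketch under assumptions: it assumes the tool prints its JSON objects to standard output as shown in the steps above, and the names `EventBuffer` and `watch` are made up for this example. It sends the same HELLO and START_SYNC commands and prints each event with a timestamp, so the delay between powering a board and its "add" event can be read off directly.

```python
import json
import subprocess
import time


class EventBuffer:
    """Collects text from the tool and returns each complete JSON object."""

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._pending = ""

    def feed(self, chunk):
        self._pending += chunk
        events = []
        while True:
            stripped = self._pending.lstrip()
            if not stripped:
                self._pending = ""
                break
            try:
                event, end = self._decoder.raw_decode(stripped)
            except json.JSONDecodeError:
                # Object is not complete yet; keep it until more text arrives.
                self._pending = stripped
                break
            events.append(event)
            self._pending = stripped[end:]
        return events


def watch(exe_path):
    """Run the discovery tool and print timestamped events until Ctrl+C."""
    proc = subprocess.Popen(
        [exe_path], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        text=True, bufsize=1,
    )
    buffer = EventBuffer()
    for command in ('HELLO 1 "arduino-cli 0.21.0"', "START_SYNC"):
        proc.stdin.write(command + "\n")
        proc.stdin.flush()
    try:
        for line in proc.stdout:
            for event in buffer.feed(line):
                label = (event.get("port") or {}).get("label", "")
                print(time.strftime("%H:%M:%S"), event.get("eventType"), label)
    except KeyboardInterrupt:
        pass
    finally:
        try:
            proc.stdin.write("STOP\nQUIT\n")
            proc.stdin.flush()
        except OSError:
            pass  # the tool has already exited
        proc.wait()
```

To use it, call `watch()` with the full path to the mdns-discovery executable in the version folder from step 2, then power the board on and off as in steps 12 and 13.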

I have 1.0.6 installed (only).

Running the mdns-discovery tests worked exactly as you described. Here are my observations:

  1. I have two ESP8266 nodes online on my local wifi network with OTA code enabled.
  2. One node is located a bit remotely so I've left it online and tested with the other node.
  3. I unplugged, plugged, and eventually unplugged the other node per your instructions. When I plugged it in, I eventually received an eventType: add for the node after about 90 seconds to two minutes. Disconnecting the node also eventually results in an eventType: remove.
  4. I left mdns-discovery running and watched the action. Periodically my two nodes would separately generate an eventType: add notification, followed anywhere from 30 seconds to maybe five minutes later by an eventType: remove. Essentially, the nodes appeared to go to add and remove state at random. Sometimes both were "removed", sometimes both "add", and at times one was "add" while the other was "remove".
  5. I tried to watch the IDE 2.0 Tools > Port menu to see if there was a correlation between a node showing as in "add" or "remove" state and the IDE recognizing the node as existing. I was unable to see any immediate correlation between the two but it felt like the IDE eventually mirrored what the mdns-discovery app was showing although a node could go to "add" and then "remove" without the IDE 2.0 noticing.

I can continue to watch things looking for a pattern but thought I'd post this update now.
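To look for a pattern in the add and remove events, it may help to turn a log of them into "visible from / visible until" intervals per node. The following is a minimal sketch, assuming each event has been recorded together with a timestamp in seconds; the function name and the labels in the usage example are made up for illustration.

```python
def visibility_intervals(events):
    """Turn timestamped add/remove events into per-node visibility intervals.

    events: iterable of (timestamp_seconds, event_dict) in time order.
    Returns {label: [(appeared, disappeared), ...]}; disappeared is None
    when the node was still listed at the end of the log.
    """
    open_since = {}
    intervals = {}
    for when, event in events:
        kind = event.get("eventType")
        if kind not in ("add", "remove"):
            continue
        label = event["port"]["label"]
        if kind == "add":
            # A repeated "add" without a "remove" keeps the earliest time.
            open_since.setdefault(label, when)
        elif label in open_since:
            intervals.setdefault(label, []).append((open_since.pop(label), when))
    for label, since in open_since.items():
        intervals.setdefault(label, []).append((since, None))
    return intervals


def _event(kind, label):
    return {"eventType": kind, "port": {"label": label}}


log = [
    (0, _event("add", "node1")),
    (30, _event("add", "node2")),
    (90, _event("remove", "node1")),
    (120, _event("add", "node1")),
    (400, _event("remove", "node2")),
]
result = visibility_intervals(log)
```

Short intervals scattered at random would match the behavior described above, while long unbroken intervals would point to a stable setup.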

That is fine. The IDE will automatically install each new version as it is released, so you might accumulate multiple versions over time, but the IDE only uses the latest installed version so as long as 1.0.6 is there (I wrote the instructions some time ago when 1.0.5 was the latest), then everything is OK.

Very interesting. Does this phenomenon occur even when Arduino IDE 2.x is not running at the same time as mdns-discovery? I don't expect there would be any effect caused by the IDE running, but I ask because if the problem occurs even without the IDE running then we can conclusively rule it out as a factor in the problem and be able to focus our attention entirely on the other layers of the system.

Yes. I initially tested without the IDE running. The add / remove behavior happens without it active.

I figured I'd try for a correlation with the IDE in keeping with your words "keep an eye out for anything that doesn't match the expected behavior as described at each step." I'd suggest the IDE tracks roughly with what mdns-discovery is seeing - nodes slowly appearing and disappearing at random. But it's not reacting to the changes as quickly.

Great. So we know the problem is either in mdns-discovery or lower.

Even though it is possible there is some difference at a lower level (I'm not very knowledgeable about the subject), I would expect it is pretty much the same for Arduino IDE 1.x as for mdns-discovery.

What I am wondering is whether Arduino IDE 1.x might just not be showing the intermittent loss of ports. I have noticed that Arduino IDE 1.x is generally less responsive when it comes to the appearance and disappearance of network ports. I have even observed that occasionally the network port will still be shown in the Tools > Port menu long after I powered off the board that produced the port.

What happens if you try using Arduino IDE 1.x to do an OTA upload to a board immediately after you have seen the "remove" event appear in the mdns-discovery output?

The timing might be a bit tricky because the IDE goes through a whole compilation process before it starts the actual upload, so it wouldn't be an effective experiment if an "add" event happened before the IDE had gotten to the real upload process (which happens after you see the memory usage info in the output panel at the bottom of the Arduino IDE window):

Sketch uses 260089 bytes (24%) of program storage space. Maximum is 1044464 bytes.
Global variables use 27892 bytes (34%) of dynamic memory, leaving 54028 bytes for local variables. Maximum is 81920 bytes.

Yes, definitely a good idea. That is important information.

I've seen that too. In fact, IDE 1.x seems to know about the previously found IP addresses of OTA nodes immediately when it's loaded. Are they cached somewhere?

Tested this - OTA updates still work if IDE 1.x uploads after mdns-discovery has recently reported a "remove" event with no following "add" for the affected node.

But that kind of fits with my comments above. Once IDE 1.x has an IP address for a node it seems to hang onto it like grim death. And it feeds that to espota.py so that it knows where to send the upload. That process does not require mdns-discovery to know about the status of the node one way or the other.

IDE 2.0 doesn't seem to remember node to IP address mapping nearly as well.

As an aside, I'm guessing that simply running espota.py from the command line with my static IP addresses will get me past this issue if I continue to use IDE 2.0.
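A rough sketch of what that could look like, driven from Python. The `-i`, `-p`, `-f` and `-a` options are espota.py's own; the script location, IP address and firmware file name are placeholders to be replaced with real values, and 8266 is the usual ArduinoOTA port for the ESP8266.

```python
import subprocess
import sys


def espota_command(espota_path, ip, firmware, port=8266, password=None):
    """Build the argument list for an espota.py upload to a fixed IP address."""
    cmd = [sys.executable, espota_path, "-i", ip, "-p", str(port), "-f", firmware]
    if password:
        cmd += ["-a", password]  # only needed if the sketch sets an OTA password
    return cmd


def upload(espota_path, ip, firmware, port=8266, password=None):
    """Run the upload and return espota.py's exit code (0 means success)."""
    return subprocess.run(
        espota_command(espota_path, ip, firmware, port, password)
    ).returncode


# Placeholder values for illustration only.
example = espota_command("espota.py", "192.168.1.50", "sketch.ino.bin")
```

Because the IP address is given directly, this path does not depend on mDNS discovery at all, which is why it should keep working even while the node is missing from the IDE 2.0 port menu.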

I left mdns-discovery running for hours and didn't get even a single "remove" event from my ESP8266 board running the "BasicOTA" example sketch from the ESP8266 ArduinoOTA library.

Does the problem still occur if you use that sketch (File > Examples > ArduinoOTA > BasicOTA)?

Do you have an adequate power source for your boards? The ESP8266 has momentary high current draws while operating the WiFi radio and these can cause unstable behavior if the power source can't keep up.

Thank you for doing that. I'm currently working on the assumption that I have an issue with how my ESP8266 NodeMCUs respond to mDNS requests.

It seems that delayed or intermittent mDNS responses cause IDE 2.0 to not find, or to drop, the nodes. IDE 1 seems to keep any response cached locally, so it can map IP addresses to nodes that are not necessarily responding well to mDNS requests.

Can you try one thing for me please? There is another command sequence you can use with mdns-discovery that I think should give you a list of online nodes. I'm guessing that the IDEs may be using this format. Can you tell me what happens with your stable system? Mine only occasionally LISTs my nodes.

  1. HELLO 1 "arduino-cli 0.21.0"
  2. START
  3. LIST
  4. STOP
  5. QUIT
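To put a number on "only occasionally", the sequence above could be repeated from a script that counts how often each node shows up in the LIST output. This is a sketch only: it assumes the tool answers every command with one JSON object whose closing brace ends a line, and the function names are made up for this example.

```python
import json
import subprocess
import time


def read_event(stream):
    """Read lines until they form one complete JSON object, then return it."""
    text = ""
    for line in stream:
        text += line
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue  # object not complete yet
    raise EOFError("discovery tool closed its output")


def poll_list(exe_path, polls=10, interval=5.0):
    """Send HELLO, START, then LIST several times; return the LIST replies."""
    proc = subprocess.Popen(
        [exe_path], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        text=True, bufsize=1,
    )

    def command(text):
        proc.stdin.write(text + "\n")
        proc.stdin.flush()
        return read_event(proc.stdout)

    command('HELLO 1 "arduino-cli 0.21.0"')
    command("START")
    replies = []
    for _ in range(polls):
        time.sleep(interval)
        replies.append(command("LIST"))
    command("STOP")
    command("QUIT")
    proc.wait()
    return replies


def presence_ratio(list_events):
    """Fraction of LIST replies in which each port label appeared."""
    seen = {}
    for event in list_events:
        for label in {port["label"] for port in event.get("ports") or []}:
            seen[label] = seen.get(label, 0) + 1
    total = len(list_events)
    return {label: count / total for label, count in seen.items()}
```

Calling `presence_ratio(poll_list(path_to_exe))` on a stable system should give values close to 1.0 for every node; much lower values would match what I am seeing.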

Yup - I have a few things related to signal strength (RSSI -70 dBm), node too busy, or power that I need to try out. It seems the issue is how well my nodes are responding to mDNS requests. The IDE 2.0 vs 1.8 problem is likely just a side effect of that.

I pretty much eliminated concerns about power supply droop issues using my oscilloscope when I studied why the ESP8266 AtoD converter is so intermittently noisy (readings shift by 6 to 10 counts occasionally over millisecond intervals - presumably when the node transmits over wifi). So I currently don't think that's a problem - I'm using a 5V supply good for at least 2 amps.

EDITS :

  • same issues using BasicOTA as the program. In fact, I further reduced it to essentially just
    this without any improvement:

MDNS.begin("node1"); // in setup() with the wifi startup code

void loop() { MDNS.update(); } // main loop

  • moving a node next to the wifi hub improves signal strength to rssi: -64 but does not improve the situation with IDE 2.0 not reliably seeing the nodes while IDE 1.8 has far fewer issues with finding the same nodes
  • trying to ping a node using its mDNS name (e.g. ping node1.local) usually times out on the first attempt with a message ping: node1.local: Name or service not known but connects properly on the next attempt.
  • it really looks like IDE 2.0 uses the LIST command in mdns-discovery.exe. Running that command in a PowerShell window while IDE 2.0 is also running shows a pretty good correlation between what both programs can "see" or "not see". I wonder if that code is perhaps not waiting long enough for responses to an mDNS poll? (ping seems to have a similar issue though.)
  • I've now left mdns-discovery.exe running all day. Curiously it sees periodic "add" events for both OTA nodes every 10 to 20 minutes or so, but never any "remove" events. I'm not sure if that means anything.
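The "fails on the first attempt, works on the next" ping behavior can be measured with a small retry loop. This is a sketch under the assumption that the operating system's resolver handles .local names (for example through Bonjour on Windows or Avahi on Linux); `resolve_with_retry` is a made-up helper name and node1.local is just the example name from above.

```python
import socket
import time


def resolve_with_retry(hostname, attempts=3, delay=1.0,
                       resolver=socket.gethostbyname):
    """Resolve hostname, retrying on failure.

    Returns (ip_address, attempt_number_that_succeeded) and re-raises the
    last error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return resolver(hostname), attempt
        except OSError as error:  # socket.gaierror is a subclass of OSError
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

Running `resolve_with_retry("node1.local")` a number of times and noting how often the attempt number is greater than 1 would show how frequently the first mDNS lookup is missed.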

As I'm not making much progress other than gradually eliminating things that don't seem to matter, I decided to download, install and configure IDE 1.8 and 2.0 on one of my Linux machines.

Interestingly, the two versions of Arduino IDE do the exact same thing on the Linux machine as they do on my Windows 10 computer. IDE 1.8 finds my ESP8266 OTA nodes right away and holds onto them. IDE 2.0 has trouble finding my OTA nodes and loses them fairly quickly once one or the other is found.

Which seems to rule out the PC O/S as a factor in this puzzle? ESP8266s are really cheap devices so I can see them not being reliable. But that doesn't explain why IDE 1.8 almost always finds my nodes and IDE 2.0 doesn't, or loses them when it does.

The port is shown reliably in the LIST output:

per@HAL MINGW64 ~/AppData/Local/Arduino15/packages/builtin/tools/mdns-discovery/1.0.6
$ ./mdns-discovery.exe
HELLO 1 "arduino-cli 0.21.0"
{
  "eventType": "hello",
  "message": "OK",
  "protocolVersion": 1
}
START
{
  "eventType": "start",
  "message": "OK"
}
LIST
{
  "eventType": "list",
  "ports": [
    {
      "address": "192.168.254.130",
      "label": "esp8266-880a45 at 192.168.254.130",
      "protocol": "network",
      "protocolLabel": "Network Port",
      "properties": {
        ".": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "auth_upload": "no",
        "board": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "hostname": "esp8266-880a45.local.",
        "port": "8266",
        "ssh_upload": "no",
        "tcp_check": "no"
      }
    }
  ]
}
STOP
{
  "eventType": "stop",
  "message": "OK"
}
START
{
  "eventType": "start",
  "message": "OK"
}
LIST
{
  "eventType": "list",
  "ports": [
    {
      "address": "192.168.254.130",
      "label": "esp8266-880a45 at 192.168.254.130",
      "protocol": "network",
      "protocolLabel": "Network Port",
      "properties": {
        ".": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "auth_upload": "no",
        "board": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "hostname": "esp8266-880a45.local.",
        "port": "8266",
        "ssh_upload": "no",
        "tcp_check": "no"
      }
    }
  ]
}
LIST
{
  "eventType": "list",
  "ports": [
    {
      "address": "192.168.254.130",
      "label": "esp8266-880a45 at 192.168.254.130",
      "protocol": "network",
      "protocolLabel": "Network Port",
      "properties": {
        ".": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "auth_upload": "no",
        "board": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "hostname": "esp8266-880a45.local.",
        "port": "8266",
        "ssh_upload": "no",
        "tcp_check": "no"
      }
    }
  ]
}
LIST
{
  "eventType": "list",
  "ports": [
    {
      "address": "192.168.254.130",
      "label": "esp8266-880a45 at 192.168.254.130",
      "protocol": "network",
      "protocolLabel": "Network Port",
      "properties": {
        ".": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "auth_upload": "no",
        "board": "\\\"ESP8266_NODEMCU_ESP12E\\\"",
        "hostname": "esp8266-880a45.local.",
        "port": "8266",
        "ssh_upload": "no",
        "tcp_check": "no"
      }
    }
  ]
}
STOP
{
  "eventType": "stop",
  "message": "OK"
}
QUIT
{
  "eventType": "quit",
  "message": "OK"
}

I ran the command quite a few additional times without ever getting an empty ports array.

Thanks for that information. Definitely not what I am seeing.

So a couple more data points.

On my Linux machine, avahi-browse (a Linux mDNS cli browser) lists my two ESP8266 OTA nodes reliably every time I run it. And the IDE 1.8 version finds my nodes reliably but 2.0 still struggles to find and keep track of my OTA nodes. (EDIT: apparently avahi caches info - reloading a node with a new program & name results in avahi reporting both the old and new node as existing).

On my Windows machine, I took a gamble and removed the Bonjour app that iTunes had installed. That's supposed to be Apple's version of an mDNS interface. This had no effect on IDE 2.0 or the response from mdns-discovery or using ping from a Windows terminal. But curiously IDE 1.8 now struggles to find any OTA nodes where previously it could always find them. Without Bonjour, its performance at finding nodes is similar to IDE 2.0's.

Next step? Maybe Wireshark while ESP8266 debug messages are logged to a terminal so that I can see what packets are actually going back and forth. A quick first look with Wireshark shows IDE 1.8 handles mDNS at the packet level differently than IDE 2.0. But there are a whole lot of other devices on my LAN flinging around mDNS requests and responses and maybe that's part of the problem. I'll try turning everything else off and quiet the LAN down to just IDE 2.0 and the ESP8266 nodes.

EDIT: reinstalling Bonjour restores IDE 1.8's ability to find OTA nodes more reliably. Enabling and disabling the service and reloading the IDE several times confirms this.
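For the packet-level comparison, it may be useful to send one known mDNS query by hand and watch in Wireshark how the nodes answer it. The sketch below builds a PTR query for the _arduino._tcp.local service, which is the service name ArduinoOTA advertises, and sends it to the standard mDNS multicast group. Treat it as an experiment rather than a reference implementation: it sends from an ordinary UDP port, so any replies arrive by unicast and depend on the node answering that kind of one-shot query, and the function names are made up for this example.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"  # standard IPv4 mDNS multicast address
MDNS_PORT = 5353
SERVICE = "_arduino._tcp.local"


def encode_name(name):
    """Encode a DNS name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service=SERVICE):
    """Build a DNS query packet with one PTR question in class IN."""
    # Header: id=0, flags=0, one question, no answer/authority/additional.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service) + struct.pack("!2H", 12, 1)  # PTR, IN
    return header + question


def send_query(timeout=3.0):
    """Send the query once and return (sender_ip, reply_length) pairs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    replies = []
    try:
        sock.sendto(build_ptr_query(), (MDNS_GROUP, MDNS_PORT))
        while True:
            data, addr = sock.recvfrom(9000)
            replies.append((addr[0], len(data)))
    except socket.timeout:
        pass  # no more replies within the timeout
    finally:
        sock.close()
    return replies
```

Calling `send_query()` while Wireshark is capturing gives a single, easily identified request to compare against whatever IDE 1.8 and IDE 2.0 send.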

I was seeing symptoms identical to waterwingz's. I am on Windows 10 using IDE 2.0.3 (I also tried build 2.0.4-nightly-20230129 with the same results). I also read somewhere that IDE 1.x does not use the pluggable mdns-discovery but has built-in mDNS discovery, which would explain the difference. So I will describe the symptoms first, then tell you the workaround.

With IDE 1.8.19 I get very responsive discovery but, as waterwingz found, nodes are not removed very responsively. The Bonjour browser (which has a refresh button) always sees both of my network NodeMCUs, and sees their removal when they are turned off, pretty much in sync with IDE 1.x. With IDE 2.0.x I would wait hours before one would show up, and then it would disappear randomly. Running mdns-discovery.exe showed the same thing: rarely an "add", and usually a "remove" soon after. I tried mdns-discovery 1.0.6 and 1.0.5 with the same results.

I noticed that one of the early build release notes said it queries all network interfaces. Looking through the code, I see it queries every 30 seconds, but those queries time out after 15 seconds. Putting those together made me wonder if the problem was related to multiple network interfaces. IT WAS/IS.

I have a virtual interface for Docker that shows as active even when Docker is shut down. When I went to the network interfaces page in Settings, I was able to right-click and disable it, leaving only one network interface. Bang: mdns-discovery picked up both NodeMCUs right away. I was running both 1.0.5 and 1.0.6; interestingly, 1.0.5 was slightly more responsive than 1.0.6.

Hope this workaround helps someone. I should probably jump on GitHub and report an issue and the workaround.
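A quick way to check whether a machine might be in the same situation is to list its IPv4 addresses and see whether there is more than one. This is only a rough sketch: asking the resolver for the machine's own host name does not report every interface on every system (on some Linux setups it returns only a loopback address), so treat the result as a hint and confirm it in the operating system's network settings. The function names and the addresses in the example are made up.

```python
import ipaddress
import socket


def candidate_addresses(addresses):
    """Keep IPv4 addresses that are not loopback or link-local, no duplicates."""
    kept = []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if ip.version != 4 or ip.is_loopback or ip.is_link_local:
            continue
        if text not in kept:
            kept.append(text)
    return kept


def local_ipv4_addresses():
    """Ask the resolver which IPv4 addresses belong to this host name."""
    _, _, addresses = socket.gethostbyname_ex(socket.gethostname())
    return candidate_addresses(addresses)


# Illustration with made-up addresses: a LAN address plus a virtual one.
example = candidate_addresses(
    ["127.0.0.1", "192.168.254.20", "172.17.0.1", "169.254.10.2", "192.168.254.20"]
)
```

If `local_ipv4_addresses()` returns more than one address, an extra interface such as a virtual adapter may be active, and temporarily disabling it is a cheap thing to try.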

Hi @johnpage. Thanks so much for sharing your findings. Please do submit an issue on GitHub. That would be very valuable:

https://github.com/arduino/mdns-discovery/issues/new/choose
