nRF24L01 mesh network with 16 nodes - healing issue

Hello, I have created a mesh network of 1 master and 16 nodes for controlling LED strips on drums (drumline) from a master controller. I am using Arduino Uno boards with nRF24L01 transceivers. The project works well with fewer than 8 or so nodes, but as I bring additional nodes online and they join the network, some nodes hang and do not rejoin. Since this is a visually synced lighting application, I need the nodes to constantly check that they are in the network and to reliably receive the lighting commands from the master transmitter.

Some other details around the design:
I am using WS2812B LED strips and the FastLED library so I can create chase patterns with custom color palettes to animate around the drums.
To minimize the amount of data transferred, I have stored around 30 scenes on each node. The master simply transmits a signal to all nodes containing instructions for which node should run which scene (e.g., run scene 3 on node 1, scene 6 on node 2, scene 1 on node 3, etc.).
It would be nice to specify a command for each drum/node individually, but in the interest of minimizing data transfer, I grouped all drums into 7 groups and just send a scene command for each group. When I load the code onto each board, I manually set a variable that uniquely identifies the node ID for that board.
Master scenes are stored on the master unit. I use up/down buttons to cycle through scenes, then another button push to transmit the selected scene to the nodes. I send the command multiple times in a loop because I was not able to work out how to use an ACK to ensure each node receives it, so there is an opportunity to improve robustness here.
My priority is to get some expert advice on how to keep these 16 nodes robustly connected to the network and able to receive commands, so I can control the lighting on the drums remotely. You’ll notice I have also connected buttons to 2 pins on each node to allow the drummer to manually cycle up/down through the scenes.
Optimally, I would like the nodes to heal themselves into the network in the most feasible order (since the drummers will change formations and positions), but it would also be acceptable to define a hierarchy and statically address each node if there are issues with nodes stepping on each other or not releasing the last ID from the mesh.
I’m having a really hard time troubleshooting, and I don’t see issues until I have physically added 8 or so nodes to the network, so I really need some expert advice on quick fixes that I could test to get this network working. Thank you for your help! This is my first big Arduino project and I’ve done my best to research and resolve this, but I really just need help getting over the hump and I’ll have a magnificent Arduino project!!

I would not use a network structure in this case; it adds overhead and latency.

You could send the current state vector in a multicast packet of 32 bytes.

Each node could pick its mode/state from a fixed position.
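As a rough sketch of that idea, the payload could be a plain array with one scene byte per node, indexed by node ID. The names `StatePacket`, `setScene` and `sceneFor` are made up for illustration, and node IDs are assumed to start at 0. The radio call is left out; with the RF24 library the packet would typically be handed to `radio.write()` on the master and filled by `radio.read()` on the node.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// nRF24L01 payloads are limited to 32 bytes.
const size_t kPayloadSize = 32;

// Hypothetical layout: one scene byte per node, indexed by node ID (0..31).
struct StatePacket {
    uint8_t scene[kPayloadSize];
};
static_assert(sizeof(StatePacket) == kPayloadSize, "packet must fit one payload");

// Master side: set the scene for one node in the outgoing packet.
void setScene(StatePacket& pkt, uint8_t nodeId, uint8_t scene) {
    assert(nodeId < kPayloadSize);
    pkt.scene[nodeId] = scene;
}

// Node side: pick this node's scene out of a received 32-byte payload.
uint8_t sceneFor(const uint8_t* payload, uint8_t nodeId) {
    assert(nodeId < kPayloadSize);
    StatePacket pkt;
    memcpy(&pkt, payload, sizeof(pkt));
    return pkt.scene[nodeId];
}
```

Because every node reads the same packet and only looks at its own slot, no per-node addressing or mesh routing is needed for the command path.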

Send the current global state a fixed number of times per second, and again immediately whenever the state changes.
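The send decision on the master could look something like this. The 100 ms interval (10 packets per second) is an assumed value, not something from this thread, and `shouldSend` is a hypothetical helper; on an Arduino, `nowMs` would come from `millis()`.

```cpp
#include <stdint.h>

// Assumed rebroadcast interval: 100 ms, i.e. 10 packets per second.
const uint32_t kResendIntervalMs = 100;

// Returns true when the master should transmit: either the state changed,
// or the resend interval has elapsed since the last transmission.
// Unsigned subtraction keeps the comparison correct when the
// millisecond counter wraps around.
bool shouldSend(uint32_t nowMs, uint32_t lastSendMs, bool stateChanged) {
    return stateChanged || (uint32_t)(nowMs - lastSendMs) >= kResendIntervalMs;
}
```

With periodic rebroadcast, a node that misses one packet or powers up late picks up the current state from the next one, so no explicit rejoin step is required.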

So you can control 32 devices with 1 byte of state each, or 16 devices with 2 bytes of state each.

A small header with a packet number to make missed packets detectable and some flags (same state as in last packet, bits to signal state changes per device, ...) could also be useful.

This would obviously reduce the number of controllable devices, or the amount of state data per device, because the header takes space in the 32-byte payload.
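One possible header, purely as an illustration: a 4-byte struct with a wrapping packet number, a flags byte, and a 16-bit change mask, which leaves 28 bytes for state. The field names, the flag value and the helper functions are all assumptions.

```cpp
#include <stdint.h>

// Hypothetical 4-byte header, leaving 28 of the 32 payload bytes for state.
struct PacketHeader {
    uint8_t  seq;          // packet number, wraps from 255 back to 0
    uint8_t  flags;        // e.g. kFlagUnchanged
    uint16_t changedMask;  // bit n set: state of device n changed
};
static_assert(sizeof(PacketHeader) == 4, "header is expected to be 4 bytes");

const uint8_t kFlagUnchanged = 0x01;  // same state as in the last packet

// Number of packets lost between two consecutively received packets.
// The cast to uint8_t handles the 255 -> 0 wraparound. Assumes the two
// packets are distinct; a repeated sequence number would report 255.
uint8_t missedPackets(uint8_t lastSeq, uint8_t seq) {
    return (uint8_t)(seq - lastSeq - 1);
}

// True if the header says device `id` (0..15) changed state.
bool deviceChanged(const PacketHeader& h, uint8_t id) {
    return id < 16 && ((h.changedMask >> id) & 1u) != 0;
}
```

A node can log or blink on a non-zero `missedPackets()` result, which gives a cheap way to see which drums have poor reception as more nodes come online.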

The devices could report their state once a second or on change.
After a local change (for example a button press on the node), changes from the master could be ignored until the master sends the reported state once in a command packet.
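A minimal sketch of that rule on the node side, assuming one scene byte per node. `NodeState`, `onLocalChange` and `onMasterScene` are hypothetical names, and sending the state report to the master is not shown.

```cpp
#include <stdint.h>

// Per-node state for the local-override rule.
struct NodeState {
    uint8_t scene;          // scene currently shown
    bool    localOverride;  // true after a local change, until the master catches up
};

// Drummer pressed a button: switch scene and start ignoring the master.
void onLocalChange(NodeState& n, uint8_t scene) {
    n.scene = scene;
    n.localOverride = true;
}

// A master packet arrived with `masterScene` in this node's slot.
void onMasterScene(NodeState& n, uint8_t masterScene) {
    if (n.localOverride) {
        // Ignore the master until it echoes the state this node reported.
        if (masterScene == n.scene) {
            n.localOverride = false;
        }
        return;
    }
    n.scene = masterScene;
}
```

This keeps a stale rebroadcast from the master from undoing a drummer's button press before the master has seen the node's report.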

The state reports could be sent to the well-known controller address with retry and ACK.