Please review my RF24 IoT network

Hello everyone, after a bit of thinking and playing around with IoT, I've created a very simple mesh-type network layer sitting on top of the RF24 library. I wanted to get it working to ensure my concepts were sound, then see what other people had to say to improve on it. I created this mainly for my own amusement, but with a practical purpose (somewhere down the line) of creating a very reliable and redundant IoT base. Be warned, if you look at the code you should see that I'm more of a BASIC sort of programmer, there are some ugly labels in there and I've completely ignored the usefulness of classes in favor of absolute simplicity. But in my defense, this is just the first working version of it all. Comments, improvements, criticism are welcomed.

github README.md:
RF24 based sensor-mesh (flood, addressless) network

*****This uses the modified RF24 library available here GitHub - gcopeland/RF24: Arduino driver for nRF24L01. You'll need to replace your RF24 library with this one to compile (this fork adds broadcast capability, and also claims minor performance improvements, various bug fixes, not exactly drop-in compatible so be warned).

Most of this was implemented using the original RF24 driver available here RF24: Driver for nRF24L01(+) 2.4GHz Wireless Transceiver
Some ideas and code were also taken from this project GitHub - mic159/ArduinoMesh: A wireless mesh network platform on arduino using the NRF24L01 modules. Work in progress.

It operates using broadcasts and as such does not use or need any addresses.

It is structured as follows:

[nodes >-> relay mesh >-> base]

Nodes only talk to relays; relays talk amongst each other and also talk to the base.

The relays create a mesh network amongst themselves. They broadcast every message they receive to every other relay. With this layout, only one relay has to be within range of the base, and each relay only needs to be within range of one other relay. The relays don't have addresses and broadcast blindly to each other. Logic is set up to prevent infinite rebroadcast loops between the relays.

Nodes broadcast their messages to relays, but not to each other. If more than one relay can listen to a node, each relay will re-transmit that reading along to the base, creating a bit of redundancy. Nodes have delivery acknowledgment.

The base potentially receives multiple copies of the same node message and only accepts the first instance of it.
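
A minimal sketch of how "accept only the first copy" could work, assuming each message is identified by its (src, ID) pair and that remembering the last 10 pairs is enough. SeenCache, kSlots and seenBefore are hypothetical names for illustration; this is not the project's own duplicate check.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: remembers the most recent (src, ID) pairs in a ring buffer.
struct SeenCache {
  static const std::size_t kSlots = 10;  // assumed cache size
  uint32_t src[kSlots];
  uint32_t id[kSlots];
  std::size_t next;  // slot that will be overwritten next
  std::size_t used;  // number of slots holding valid entries

  SeenCache() : next(0), used(0) {}

  // Returns true if (s, i) is already cached.
  // Otherwise records it, evicting the oldest entry if full, and returns false.
  bool seenBefore(uint32_t s, uint32_t i) {
    for (std::size_t k = 0; k < used; ++k) {
      if (src[k] == s && id[k] == i) return true;
    }
    src[next] = s;
    id[next] = i;
    next = (next + 1) % kSlots;
    if (used < kSlots) ++used;
    return false;
  }
};
```

The base (or a relay) would call seenBefore() on every received header and drop the message when it returns true. A pair that has been pushed out of the cache is treated as new again, so the cache has to be larger than the number of messages that can be in flight at once.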

ADVANTAGES:
-simple to set up.
-no addresses needed.
-nodes can be eliminated in favor of an all relay network.
-small, under 8k. Could get smaller with some tinkering (remove printf from RF24)

DISADVANTAGES:
-relay messages flood the network. The number of messages (should be) (number of relays)^2. So with a large number of relays, a huge amount of traffic is generated.
-The logic to stop infinite relay loops probably needs to be worked on more. It is very simple at this point and probably the most likely point of failure.
-At this time, no acknowledgement of [relay >-> base] messages.
-At this time, one way communication only.
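
A rough way to sanity-check the (number of relays)^2 estimate, assuming every relay can hear every other relay and each relay rebroadcasts a new message exactly once. The function and struct names are made up for this sketch; it is a simulation of the idea, not code from the project.

```cpp
#include <cstddef>
#include <vector>

struct FloodCount {
  std::size_t transmissions;  // broadcasts sent
  std::size_t receptions;     // copies heard across all relays
};

// One message enters at relay 0; all relays are assumed to be in range of each other.
FloodCount floodFullyConnected(std::size_t relays) {
  FloodCount c = {0, 0};
  if (relays == 0) return c;
  std::vector<bool> seen(relays, false);
  std::vector<std::size_t> pending;  // relays that still have to rebroadcast
  seen[0] = true;
  pending.push_back(0);
  while (!pending.empty()) {
    std::size_t sender = pending.back();
    pending.pop_back();
    ++c.transmissions;
    for (std::size_t r = 0; r < relays; ++r) {
      if (r == sender) continue;
      ++c.receptions;        // every other relay hears this broadcast
      if (!seen[r]) {        // first copy: queue one rebroadcast
        seen[r] = true;
        pending.push_back(r);
      }                      // later copies are dropped as duplicates
    }
  }
  return c;
}
```

Under these assumptions N relays produce N broadcasts and N*(N-1) receptions per node message, for example 5 broadcasts and 20 receptions with 5 relays, which is where the roughly quadratic growth comes from.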

To get this working, the most straight-forward setup is with three boards. Burn node, relay, and base to each and they should all start talking to each other. Other possible setups would be two or more relays, or relays and a base. There is some testing code in the relays to send out a broadcast (look at the bottom for Serial.read), negating the need for a node.

I tried this with one relay, and one base.

When I type in the Serial Monitor on the relay, I see dots for every character, but I haven't seen any response on the base. Shouldn't I see "Got message from 0x" for each character?

I had trouble compiling with the gerg library until I moved rf24HQ out of my libraries directory. I didn't seem to have this problem with the original RF24 library. Strange.

Looks like an interesting implementation. Hope I can make it work.

-transfinite

Well, it's tough to debug, but the relay seems to be working on the software end. There is no error checking on relay -> base. I would put a node sketch on one board and a relay sketch on another and run that. The node will print ACK if the message was received. If it prints NACK, something is likely wrong with your hardware setup. Make sure your CE and CSN pins are set correctly, as those are the only things that might differ if you had the hardware working under a different library. Another thing to check: because RF24 had to be replaced, in my experience you can't just rename the old folder, as that confuses the IDE; the entire old folder needs to be moved out of the libraries folder. The last thing is something I learned the hard way: unplug your modules and let them power down. If you, for example, open a reading or writing pipe with one library and then install another library without powering down, whatever pipe was left open stays open. This can cause some silent errors.

I got a chance to test this again yesterday with good results.

I did as you suggested, with one node and one relay. As long as they were in range, I was getting ACKs for each transmission, and I was seeing the relay respond that it was receiving the message.

Next I added the base. I moved the location of the relay roughly half way between the node and the base, and it worked perfectly. It really extended the range of the node, and it never missed a transmission. The node and base were in different rooms, and before adding the relay, the reception was hit and miss. Afterward, only hits. Pretty impressive!

Next I'll add a real sensor to the node, and see what kind of range I can get. I think I have one or two more nRF24's. I'll try adding another relay and see how that extends the range.

Thanks justind000.

-transfinite

I am puzzled by the "+ x" in the #define statements.

What is 'x'? I do not see a definition of it. Is it some sort of dynamic array notation that I've not seen before?

@transfinite glad you got it working. It's still pretty lax on real testing. Once I got it working, I've turned my attention to making boards for nodes so I've yet to get it going with more than three. Let me know if you find any problems.

@JohnHoward at the top of each file you'll see these lines

#define BASEBROADCAST(x) (0xBB00000000LL + x)
#define RELAYBROADCAST(x) (0xAA00000000LL + x)
#define NODEACK(x) (0xCC00000000LL + x)

I found that notation in ArduinoMesh, which I mentioned in my original posting. What it does is add whatever digit you specify in ()'s to the address ie

  radio.openReadingPipe( 1, RELAYBROADCAST(2) );    //relays send on this
  radio.openReadingPipe( 2, RELAYBROADCAST(1) );    //Nodes send on this

I set up a base, one relay and 3 nodes. Seems well behaved, although occasionally the data gets scrozzled (from, ID, or Hops turns up a really huge value). They are spread out between 4 rooms of the house but no more than a 25 ft radius from the relay, I'd guess.

I soldered headers on 3 more pro minis this evening and will set up another relay and 2 nodes on that to see what happens.

I'm out of usb cables and computers to plug them into so I have to add 9V battery leads to the next batch.

Still don't understand where 'x' gets a value, though. the '+ x' makes sense as an offset of the address but I guess #define always interprets x = 0?

Hmm, detecting a problem. The relay stops every so often. Nodes keep going, solid NAKs. When I reset the relay, everything starts humming along again. Below are relay messages between resets.

relay starting...
Got message from 0xABCD ID:9341 Hops: 0
Got message from 0x2A89 ID:4492 Hops: 0
Got message from 0xABCD ID:C966 Hops: 0
Got message from 0xABCD ID:6715 Hops: 0
Got message from 0xABCD ID:4E10 Hops: 0
Got message from 0xABCD ID:760B Hops: 0
Got message from 0xABCD ID:B3FA Hops: 0
Got message from 0xABCD ID:194F Hops: 0
Got message from 0xABCD ID:9144 Hops: 0
Got message from 0xABCD ID:D392 Hops: 0
Got message from 0xABCD ID:CF81 Hops: 0
Got message from 0xABCD ID:2C43 Hops: 0
Got message from 0xABCD ID:3973 Hops: 0
Got message from 0xABCD ID:B29A Hops: 0
Got message from 0xABCD ID:7C86 Hops: 0
Got message from 0xABCD ID:E0BA Hops: 0
Got message from 0xABCD ID:A5FD Hops: 0
Got message from 0xABCD ID:834 Hops: 0
Got message from 0xABCD ID:94CA Hops: 0
Got message from 0xABCD ID:31C2 Hops: 0
Got message from 0xABCD ID:8C05 Hops: 0
Got message from 0xABCD ID:4ABE Hops: 0
Got message from 0xABCD ID:6005 Hops: 0
Got message from 0xABCD ID:B8D0 Hops: 0
Got message from 0xABCD ID:CDB2 Hops: 0
Got message from 0xABCD ID:7643 Hops: 0
Got message from 0xABCD ID:8419 Hops: 0
Got message from 0xABCD ID:FA0E Hops: 0
Got message from 0xABCD ID:1DEA Hops: 0
Got message from 0xABCD ID:76A0 Hops: 0
Got message from 0xABCD ID:D8CF Hops: 0
Got message from 0xABCD ID:EAE2 Hops: 0
Got message from 0xABCD ID:187E Hops: 0
Got message from 0xAB0FFCF ID:55FBFFFF Hops: 14723584
Got message from 0xABCD ID:8B8F Hops: 0
Got message from 0xABCD ID:2CE5 Hops: 0
Got message from 0xABCD ID:3E3B Hops: 0
Got message from 0xABCD ID:B03 Hops: 0
Got message from 0xABCD ID:4D58 Hops: 0
Got message from 0xA495BFDF ID:623B5F55 Hops: -1587736028
Got message from 0xABCD ID:9510 Hops: 0
Got message from 0xABCD ID:8307 Hops: 0
Got message from 0xABCD ID:5221 Hops: 0
Got message from 0x80B4BFCD ID:80D97C81 Hops: -2119359984
Got message from 0xABCD ID:D411 Hops: 0
Got message from 0xAAEA5709 ID:0 Hops: -1979056120
Got message from 0xABCD ID:A902 Hops: 0
Got message from 0xABCD ID:A902 Hops: 98323713

I'll try adding some more output to the relay to see if I can figure out what's going on in there when it stops.

Still don't understand where 'x' gets a value, though. the '+ x' makes sense as an offset of the address but I guess #define always interprets x = 0?

Think of the #define as a substitution.

#define NODEACK(x) (0xCC00000000LL + x)

Wherever you see NODEACK(x), the preprocessor will substitute (0xCC00000000LL + x).

So NODEACK(1) becomes (0xCC00000000LL + 1)

The relay stops every so often.

I'm seeing this also. When I added a second relay both relays would tend to report large hop numbers. Sometimes I would have to reset one of the relays to get any response on the base.

With 2 nodes and one relay it happened less often, but the relay did lock up at least once, and had to be reset.

-transfinite

The large values, I suspect, are from messages being corrupted. I would see that every now and then too, especially when I would try sending a byte rather than a long in the message. No idea why.
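
One possible guard against those huge values, sketched under assumptions: reject any payload whose length is not exactly the size of the header struct, and any header whose hop count is outside a plausible range. kMaxHops and looksValid are hypothetical, and int32_t stands in for the 4-byte long on AVR boards so the sketch also compiles on a PC. This only filters obviously bad headers; it does not explain where the corruption comes from.

```cpp
#include <cstddef>
#include <cstdint>

struct Sensor {
  float temp;
  float humidity;
  float pressure;
};

// Same field order as the HEADER struct used in the sketches in this thread.
struct Header {
  int32_t type;
  int32_t hops;
  int32_t src;
  int32_t ID;
  Sensor sensor;
};

const int32_t kMaxHops = 16;  // assumed upper bound, pick one that fits your network

// Returns true only if the payload has the expected size and a sane hop count.
bool looksValid(const Header& h, std::size_t payloadLen) {
  if (payloadLen != sizeof(Header)) return false;        // short or oversized payload
  if (h.hops < 0 || h.hops > kMaxHops) return false;     // e.g. Hops: 14723584
  return true;
}
```

A relay could call looksValid() with the size reported for the received payload before printing or rebroadcasting the message, and silently drop anything that fails.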

As for the relay needing to be restarted, I don't really have any solid ideas. What is the output prior to it not working? If the relay isn't getting anything at all, I don't know, but if a message is making it, my best guess would be the DupID function isn't working. I've suspected that the logic to prevent infinite relays needs reworking. If you make any progress, let me know and we can merge any fixes on github.

I'll be trying to debug my problem in the next few days, to see whether it's a memory issue, whether the large numbers come from the sender or from the relay, and whether it could just be related to Serial.print(). I found that I could restart the relay traffic just by pressing 'enter' on the keyboard.

I saw where you took numerous ~100 millisecond delays out of RF24.cpp. Maybe it is a timing issue stemming from that.

I'll contemplate the DupID, too.

Sorry if I'm being dense but are you saying:

#define NODEACK(x) (0xCC00000000LL + x)

should be edited to replace 'x' with a number? I compiled verbatim, with the 'x'. I presume #define doesn't know what 'x' is and ends up using zero? I'm astonished there is no compiler error due to x being undefined, which it seems to be as I could not find anything anywhere declaring it in the code files.

Google "C macro tutorial". #define is not just for constant values.

x is an argument in the macro definition, analogous to an argument declared in a function header. It doesn't have any value until you instantiate the macro, e.g., NODEACK(1) gets translated to (0xCC00000000LL + 1) by the preprocessor, and that's what the compiler sees.
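
A small sketch of that substitution. NODEACK is the macro from this thread; NODEACK_SAFE and the three helper functions are hypothetical additions that show why wrapping the argument in parentheses is a good habit once the argument is an expression rather than a single number.

```cpp
#define NODEACK(x) (0xCC00000000LL + x)          // as defined in the sketches
#define NODEACK_SAFE(x) (0xCC00000000LL + (x))   // hypothetical variant with (x)

// The preprocessor pastes the argument text in place of x before compiling.
long long ackAddress(int n) {
  return NODEACK(n);            // becomes (0xCC00000000LL + n)
}

// With an expression as the argument, operator precedence matters:
long long shiftedUnsafe() {
  return NODEACK(1 << 4);       // becomes (0xCC00000000LL + 1 << 4),
}                               // parsed as (0xCC00000000LL + 1) << 4

long long shiftedSafe() {
  return NODEACK_SAFE(1 << 4);  // becomes (0xCC00000000LL + (1 << 4))
}
```

For the plain digits used in the sketches, such as NODEACK(1) or RELAYBROADCAST(2), both forms give the same address. A compiler may warn about the unparenthesized shift, which is a hint that the expansion is not what it looks like.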

JohnHoward, how are you supplying 3.3v to the RF module from the pro minis? Or are you using the 3.3v version of the mini?

I'm mostly using nanos, but since the pro minis are so cheap on ebay these days, I'm thinking I won't even use headers. I'll just solder wires directly from the mini to the rf24 module and to a temp sensor to minimize cost and space.

-transfinite

I found some small 5v-to-3v regulator breakouts a while back

http://www.aliexpress.com/snapshot/215915954.html

and am using those.

Since ultimately I plan to connect some PIR sensors that need a 5v supply, I run the minis at 5v (which means faster clock speed, too). I failed to notice the minis didn't have 3v3 output when I bought them, otherwise I'd have a bunch of nanos.

pico:

Google "C macro tutorial". #define is not just for constant values.

x is an argument in the macro definition, analogous to an argument declared in a function header. It doesn't have any value until you instantiate the macro, e.g., NODEACK(1) gets translated to (0xCC00000000LL + 1) by the preprocessor, and that's what the compiler sees.

Ah, the light comes on. I wasn't thinking in terms of a macro but as a constant declaration. After 15 years of writing macro assembler code on PDP-11's in the olden days, you'd think I'd have made the connection.

Maybe I found the reason for the relay stopping. If there is a dup detected, then the radio will be left in .stopListening() condition.

Now that I've said that and hit Post, I'll probably see it lock up again, but so far (15 minutes or so of watching it) it has not.

I added a radio.startListening() right after this line, at around line 72:

radio.write( &header.ID, sizeof(header.ID), true ); //send out ack with the id of our received message

Edit: OK, it ran all night without locking up.
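
A sketch of one way to make sure the radio always goes back to listening, whichever path the handler takes. FakeRadio is a stand-in written for this example so it runs without hardware (the real RF24 class has startListening() and stopListening() with the same names); ListenGuard and handleMessage are hypothetical and are not the relay's actual code.

```cpp
// Stand-in for the radio object, tracking only the listening state.
struct FakeRadio {
  bool listening;
  FakeRadio() : listening(true) {}
  void startListening() { listening = true; }
  void stopListening() { listening = false; }
};

// Stops listening when created and resumes listening when it goes out of scope,
// so every return path of the enclosing function restores the radio.
struct ListenGuard {
  FakeRadio& radio;
  explicit ListenGuard(FakeRadio& r) : radio(r) { radio.stopListening(); }
  ~ListenGuard() { radio.startListening(); }
};

// Hypothetical relay handler: sends an ack, then drops duplicates early.
// Returns true if the message would be rebroadcast.
bool handleMessage(FakeRadio& radio, bool isDuplicate) {
  ListenGuard guard(radio);        // stopListening() now, startListening() on exit
  // ... the ack write would go here ...
  if (isDuplicate) return false;   // early return no longer leaves the radio deaf
  // ... the rebroadcast would go here ...
  return true;
}
```

Adding a single radio.startListening() after the ack write, as described above, fixes the one path that was found. The guard is just a way to cover any other early exits that get added later.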

Looked at the node example after noting it would occasionally stop sending. Found a problem:

      if (src == header.ID)
        Serial.print("ACK: ");Serial.println(src, HEX);
        retries = 0;

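There should be braces enclosing the three statements that are meant to form the body of the 'if'. Without them, only the first Serial.print belongs to the 'if'; the println and retries = 0 run every time, whether or not the ACK matched.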
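
To make the difference concrete, here is a hardware-free sketch of the two readings of that snippet. Result, asParsed and asIntended are names invented for this example, and the printed counter stands in for the Serial calls.

```cpp
struct Result {
  int retries;  // retry counter after the check
  int printed;  // how many Serial calls would have run
};

// How the compiler reads the version without braces:
// only the first statement is controlled by the if.
Result asParsed(long src, long id, int retries) {
  Result r = {retries, 0};
  if (src == id) {
    r.printed++;   // Serial.print("ACK: ")
  }
  r.printed++;     // Serial.println(src, HEX) runs every time
  r.retries = 0;   // the reset runs every time
  return r;
}

// What was intended: all three statements depend on the ACK matching.
Result asIntended(long src, long id, int retries) {
  Result r = {retries, 0};
  if (src == id) {
    r.printed += 2;  // both Serial calls
    r.retries = 0;
  }
  return r;
}
```

With a mismatched ACK and retries at 3, asParsed() resets the counter to 0 while asIntended() leaves it at 3, so the unbraced version never reaches its retry limit on bad ACKs.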

I noticed the missing braces yesterday while I was getting rid of the gotos from node.ino. I just couldn't deal with them any longer.
This seems to be working for me. I mostly just added a while loop. Posted in its entirety for clarity.

#include <SPI.h>
#include "nRF24L01.h"
#include "RF24.h"

RF24 radio(9,10);

#define RELAYBROADCAST(x) (0xAA00000000LL + x)
#define NODEACK(x) (0xCC00000000LL + x)

struct SENSOR{
  float temp;
  float humidity;
  float pressure;
};

struct HEADER{
  long type;
  long hops;
  long src;
  long ID;
  SENSOR sensor;
};

  HEADER header;

  long cnt[10] = {};
  byte cntID = 0;
  byte retries = 0;             //how many times have we tried to rx
  const byte MAX_RETRIES = 5;  //how many times will we try?
  

void setup(void){
  Serial.begin(57600);
  radio.begin();
  radio.setRetries(15,15);
  radio.enableDynamicPayloads();
  randomSeed(analogRead(0));
  Serial.println("node starting...");
  radio.openReadingPipe(1 ,NODEACK(1));
  radio.openWritingPipe(RELAYBROADCAST(2));
  radio.startListening();
 }

void loop(void){
  header.ID = random(1, 0xffff);    //new ID for each reading; retries below resend the same ID
  while (retries < MAX_RETRIES) {

    radio.stopListening();
    header.type = 3;
    header.hops = 0;
    header.src = 0xabcd;
    
    header.sensor.temp = 78.8;
    Serial.println(header.ID, HEX);
    //send a relay broadcast to any relay that can hear
    radio.openWritingPipe(RELAYBROADCAST(1));
    radio.write( &header, sizeof(header), true );
    radio.startListening();
    // Wait here until we get a response, or timeout (200 ms)
    unsigned long started_waiting_at = millis();
    bool timeout = false;
    while ( ! radio.available() && ! timeout )
      if (millis() - started_waiting_at > 200 )
        timeout = true;

    // Describe the results
    if ( timeout ){
      Serial.print("NACK");
      retries++;
    }
    else {
      long src = 0;  //ack returns just the header.ID, so compare what came back with what was sent
      radio.read( &src, sizeof(src) );  //never read more bytes than src can hold
      if (src == header.ID) {
        Serial.print("ACK: "); Serial.println(src, HEX);
        retries = 0;
        break;
      }
    }
  }
  
  retries = 0;
  delay(3000);

  //testing
  if ( Serial.available() )
  {
    radio.stopListening();
    Serial.read();
    header.type = 3;
    header.hops = 0;
    header.src = 0xabcd;
    header.ID = random(1, 0xffff);

    Serial.println(header.ID, HEX);
    //send a relay broadcast
    radio.openWritingPipe(RELAYBROADCAST(2));
    bool ok = radio.write( &header, sizeof(header), true );
    radio.startListening();
  }
}

-transfinite