nRF24L01+ received message lost before it gets read?

Background: I have a project with 2 Arduinos and 2 nRF24L01+ modules using the RF24 library (RF24: Driver for nRF24L01(+) 2.4GHz Wireless Transceiver). One listens all the time, and the other sends a message when an interrupt is triggered. I have been having trouble with what I think is interference. One day everything works very reliably, and the next I can rarely receive the messages (about 1% success). I've tried using different frequencies (channels), but which ones work well changes often.

I decided to write my code so the transmitter and receiver try several channels and hopefully work more often. The transmitter sends on one channel a few times, checking for success (ack) each time. If it worked, it stops sending. If not, it changes to a different channel and tries again. It keeps running through a list of channels until a send works or the timeout is reached. The receiver listens on each channel long enough for the transmitter to send the message on all the channels, then goes to the next one. The transmitter and the receiver both cycle through the same list of channels.

Problem: Most of the time this seems to work fine, but occasionally the transmitter will start sending a message and stop before the timeout (which should mean the message was received properly and the transmitter got an ack back), yet the receiver code never reads a message. I don't know what could cause this. Is something I'm doing clearing the receive buffer before I read it? I looked around in RF24.cpp, but did not find anything that would do this. I included all my code below with comments to try to make it clearer. Any ideas about what's happening here, a better way to implement the channel switching, or how to avoid interference are welcome.

/*
 Copyright (C) 2011 J. Coliz <maniacbug@ymail.com>
 This program is free software; you can redistribute it and/or
 modify it under the terms of the GNU General Public License
 version 2 as published by the Free Software Foundation.
 
 TMRh20 2014 - Updates to the library allow sleeping both in TX and RX modes:
 TX Mode: The radio can be powered down (.9uA current) and the Arduino slept using the watchdog timer
 RX Mode: The radio can be left in standby mode (22uA current) and the Arduino slept using an interrupt pin
 */
 
 //Modified by David Hoff Aug-2014

#include <SPI.h>
#include <avr/sleep.h>
#include <avr/power.h>
#include "nRF24L01.h"
#include "RF24.h"
#include "printf.h"
// Set up nRF24L01 radio on SPI bus plus pins 10 & 9
RF24 radio(10,9);
// sets the role of this unit in hardware
// Connect to GND to set role bit to 0
// Leave open to set role bit high
// 3 role pins will set role
const byte role_pin1 = 14; // most significant bit in role
const byte role_pin2 = 15; // middle bit in role
const byte role_pin3 = 16; // least significant bit in role 
const byte knock_sensor_pin = 2; // on tx board, interrupt 0 will be triggered by piezo sensor here
// Radio pipe address for the nodes to communicate.
const uint64_t pipes[1] = { 
  0x7C3D43386BLL};   

// channels to hop to (trying to get around unpredictable interference problems)
const byte ch[5] = { 
  5, 25, 45, 65, 85 };
//which one to use right now
byte ch_index = 0;

unsigned long last_hop = 0;
unsigned long hop_time = 0;

// Role management
// Set up role.  This sketch uses the same software for all the nodes
// in this system.  Doing so greatly simplifies testing.  The hardware itself specifies
// which node it is.
const byte base_station = 0; // makes the loop below easier to read
// current role in this sketch (will get set to correct role later)
byte role = 0;

// Sleep declarations
void go_to_sleep(void);
void wake_up(void);

void setup(){
  // set up debugging light blinker pin
  pinMode(19, OUTPUT);
  // set up relay output pin
  pinMode(18, OUTPUT);
  // set up knock sensor pin
  pinMode(knock_sensor_pin, INPUT);
  // set up the role pins
  pinMode(role_pin1, INPUT);
  digitalWrite(role_pin1,HIGH);
  pinMode(role_pin2, INPUT);
  digitalWrite(role_pin2,HIGH);
  pinMode(role_pin3, INPUT);
  digitalWrite(role_pin3,HIGH);
  delay(100); // Just to get a solid reading on the role pins
  // read the role pins, establish our role
  role = digitalRead(role_pin1) << 1;
  role = role | digitalRead(role_pin2);
  role = role << 1;
  role = role | digitalRead(role_pin3);

  // disconnect internal pullups to save power (otherwise grounded pins would draw current)
  digitalWrite(role_pin1,LOW);
  digitalWrite(role_pin2,LOW);
  digitalWrite(role_pin3,LOW);

  // show role with debugging led
  if (role == 0){
    role = 8;
  }
  for(byte count = 1; count <= role; count ++){
    digitalWrite(19,HIGH);
    delay(300);
    digitalWrite(19,LOW);
    delay(300);
  }
  if (role == 8){
    role = 0;
  }
  
  // blink debugging led / relay to show program starting
  delay(1000);
  digitalWrite(19,HIGH);
  digitalWrite(18,HIGH);
  delay(1000);
  digitalWrite(19,LOW);
  digitalWrite(18,LOW);
  delay(1000);
  ///////////////////////

  Serial.begin(9600);
  printf_begin();
  //  printf("\n\rRF24/examples/pingpair_sleepy/\n\r");
  //  printf("David's corn cannon project mod\n\r");
  //  printf("ROLE: %s\n\r",role);

  // Setup and configure rf radio
  radio.begin();
  radio.setAutoAck(1);
  radio.setPayloadSize(sizeof(role));	
  radio.setPALevel(RF24_PA_HIGH);
  radio.setDataRate(RF24_250KBPS);
  radio.setRetries(5,4);
  // hop_time is how long base station waits on each channel listening
  // has to wait long enough for tx to send messages on all the channels so it has a chance to receive the message
  hop_time = 38; // (delay 5 = 1500us) * (1 send + 4 retry ) * 5 channels = 37500us = approx 38ms 
  radio.setChannel(ch[ch_index]); // set first channel to use
  
  // setup pipe. rx listens on it, tx sends on it
  if ( role == base_station ) { // if role is base station
    radio.openReadingPipe(1, pipes[0]); // listen on reading pipe 1 with address pipes[0]
    radio.startListening(); // make base station listen    
    last_hop = millis(); // set start time for listening on this channel          
  } 
  else { // if role is target
    radio.openWritingPipe(pipes[0]); // transmit to address pipes[0]
  }
  // Dump the configuration of the rf unit for debugging
  // radio.printDetails();
}

void loop(){
  if (role == base_station) {
    byte target_hit = 0; // reset this each time
    // I only care about most recent message, so...
    while (radio.available()) { // while there is something to read
      radio.read( &target_hit, sizeof(target_hit) ); // read it and put value in target_hit
      //        printf("Target %d hit",target_hit);     
    }
    if(target_hit){ // if anything was read (should never get a message of 0, so this is ok)
      switch(target_hit){
      case 1: // only message it should ever receive (only roles I'm using are 0 and 1, and 0 is rx)
        digitalWrite(19,HIGH); // led
        digitalWrite(18,HIGH); // relay
        delay(500);
        digitalWrite(19,LOW); // led
        digitalWrite(18,LOW); // relay
        delay(500);
        break; 
      default: // this means it was a junk message as only message it should receive is 1
        // this is just so I can tell if it somehow got a bad message
        digitalWrite(19,HIGH); // led
        delay(5000);
        digitalWrite(19,LOW); // led
        delay(500);
        break;  
      }
    }
    if (millis() - last_hop >= hop_time){ // if rx has been on this channel long enough for tx to send on all channels
      // go to the next channel
      if(ch_index<4){
        ch_index++;
      }
      else{
        ch_index = 0;
      }
      radio.setChannel(ch[ch_index]);
      last_hop = millis(); // reset last_hop to now
    }
  }   
  else { // if role is not base station
    radio.powerUp(); // Power up the radio after sleeping                                 
    //    printf("Now sending... %d \n\r",role); // print role
    unsigned long trytime = 3000; // max transmit time
    unsigned long startsend = millis(); // when transmit started
    while( millis() - startsend < trytime){ // for up to trytime, keep sending
      if(radio.write(&role, sizeof(role))){ // if it got an ack, stop trying to send
        break;
      }
      else{ // if it didn't get an ack
        // pick the next frequency
        if(ch_index<4){
          ch_index++;
        }
        else{
          ch_index = 0;
        }
        // go to that frequency
        radio.setChannel(ch[ch_index]);
      }
    }
    // by now, it either sent ok or timed out
    
    delay(100);                   // why delay here? it was in an example, so I copied it.
    // Power down the radio.  
    radio.powerDown();            // NOTE: The radio MUST be powered back up again manually
    go_to_sleep();                // Sleep the MCU.
  }
}


void wake_up(void){
  // disable sleep when mcu wakes up
  sleep_disable();
  // force interrupt pin low again (knock sensor has some capacitance)
  pinMode(knock_sensor_pin, OUTPUT);
  digitalWrite(knock_sensor_pin, LOW);
  pinMode(knock_sensor_pin, INPUT); // return to normal
}


void go_to_sleep(void)
{
  digitalWrite(19,LOW); // turn off led before sleeping
  set_sleep_mode(SLEEP_MODE_PWR_DOWN); // sleep mode is set here
  sleep_enable();
  attachInterrupt(0,wake_up,RISING); // this will let it wake up again
  sleep_mode();                        // System sleeps here
  // interrupt 0 (knock sensor) wakes the MCU from here
  // redundant sleep_disable()? oh well. won't hurt. other one is in wake_up()
  sleep_disable(); // System continues execution here when knock sensor triggers interrupt 0  
  detachInterrupt(0); // to avoid triggering several times
  digitalWrite(19,HIGH); // turn led on to show mcu is awake
}

Not sure if this is the entire problem, but the receiving radio is only properly using 1 channel since it can't change channels when actively listening.

    if (millis() - last_hop >= hop_time){ // if rx has been on this channel long enough for tx to send on all channels
      // go to the next channel
      if(ch_index<4){
        ch_index++;
      }
      else{
        ch_index = 0;
      }
      radio.stopListening();     //< -- Need this
      radio.setChannel(ch[ch_index]);
      radio.startListening();    //< -- And This
      last_hop = millis(); // reset last_hop to now
    }

Thanks for the info. I had wondered about that, but didn't see anything about it in the documentation. I'll give it a try. I wonder if I make it stop listening, and then check if anything was received one more time after it stops, if that would make sure I always read it out.

I tried changing the receiver code to the following. Now it stops listening, checks for anything received one more time, then changes channels, and starts listening again. I still have instances where the transmitter thinks the message went through, but the receiver doesn't read the message out. Ideas?

if (role == base_station) {
    byte target_hit = 0; // reset this each time
    // I only care about most recent message, so...
    while (radio.available()) { // while there is something to read
      radio.read( &target_hit, sizeof(target_hit) ); // read it and put value in target_hit
      //        printf("Target %d hit",target_hit);     
    }
    if (millis() - last_hop >= hop_time){ // if rx has been on this channel long enough for tx to send on all channels
      // go to the next channel
      if(ch_index<4){
        ch_index++;
      }
      else{
        ch_index = 0;
      }
      radio.stopListening();
      // check for messages again before changing channels
      while (radio.available()) { // while there is something to read
        radio.read( &target_hit, sizeof(target_hit) ); // read it and put value in target_hit
        //        printf("Target %d hit",target_hit);     
      }
      radio.setChannel(ch[ch_index]);
      radio.startListening(); //start listening again
      last_hop = millis(); // reset last_hop to now
    }
    if(target_hit){ // if anything was read (should never get a message of 0, so this is ok)
      switch(target_hit){
      case 1: // only message it should ever receive (only roles I'm using are 0 and 1, and 0 is rx)
        digitalWrite(19,HIGH); // led
        digitalWrite(18,HIGH); // relay
        delay(500);
        digitalWrite(19,LOW); // led
        digitalWrite(18,LOW); // relay
        delay(500);
        break; 
      default: // this means it was a junk message as only message it should receive is 1
        // this is just so I can tell if it somehow got a bad message
        digitalWrite(19,HIGH); // led
        delay(5000);
        digitalWrite(19,LOW); // led
        delay(500);
        break;  
      }
    }
  }

I'm not sure. In testing with a delay instead of sleep, the issue doesn't happen for me, so I would lean towards it being hardware related. If that is the case, there are many threads regarding power supplies and capacitors being used to improve these situations, so I won't go into details.

Very interesting. I'll try it with a delay and see if that works for me too. Maybe it will help me figure something out.

I changed the sleep section of the code to delay a random amount of time instead of sleeping (the section I changed is below). It does not seem to change the behavior for me. I let it go through the transmit loop 50 times, and the receiver only reacted 41 of those. All 50 times the led to indicate when the transmitter is not sleeping came on and went off very quickly instead of staying on for 3 seconds as it should if it never receives an ack. TMRh20, was this working all the time for you? Can you post the exact code you used to test, please? I appreciate your help.

void go_to_sleep(void)
{
  digitalWrite(19,LOW); // turn off led before sleeping
  set_sleep_mode(SLEEP_MODE_PWR_DOWN); // sleep mode is set here
  
 // sleep_enable();
 // attachInterrupt(0,wake_up,RISING); // this will let it wake up again
 // sleep_mode();                        // System sleeps here
 
 delay(random(1000,3000));
  // interrupt 0 (knock sensor) wakes the MCU from here
  // redundant sleep_disable()? oh well. won't hurt. other one is in wake_up()
  sleep_disable(); // System continues execution here when knock sensor triggers interrupt 0  
  detachInterrupt(0); // to avoid triggering several times
  digitalWrite(19,HIGH); // turn led on to show mcu is awake
}

Sure, below is the code I'm using, and I'm at 19684 payloads without missing a beat. You would likely just have to change the radio CE/CS pins from 7 & 8. As I mentioned, many people suggest a filter and decoupling capacitor to reduce issues.

Output:

Device 1: 
Ch 45
19684
Ch 65
19685
Ch 85
19686
Device 2:
Ch 45
19684
Ch 65
19685
Ch 85
19686

/*
 Copyright (C) 2011 J. Coliz <maniacbug@ymail.com>
 This program is free software; you can redistribute it and/or
 modify it under the terms of the GNU General Public License
 version 2 as published by the Free Software Foundation.
 
 TMRh20 2014 - Updates to the library allow sleeping both in TX and RX modes:
 TX Mode: The radio can be powered down (.9uA current) and the Arduino slept using the watchdog timer
 RX Mode: The radio can be left in standby mode (22uA current) and the Arduino slept using an interrupt pin
 */
 
 //Modified by David Hoff Aug-2014

#include <SPI.h>
#include <avr/sleep.h>
#include <avr/power.h>
#include "nRF24L01.h"
#include "RF24.h"
#include "printf.h"
// Set up nRF24L01 radio on SPI bus plus pins 7 & 8
RF24 radio(7,8);
// sets the role of this unit in hardware
// Connect to GND to set role bit to 0
// Leave open to set role bit high
// 3 role pins will set role
//const byte role_pin1 = 14; // most significant bit in role
//const byte role_pin2 = 15; // middle bit in role
//const byte role_pin3 = 16; // least significant bit in role 
//const byte knock_sensor_pin = 2; // on tx board, interrupt 0 will be triggered by piezo sensor here
// Radio pipe address for the nodes to communicate.
const uint64_t pipes[1] = { 0x7C3D43386BLL};   

// channels to hop to (trying to get around unpredictable interference problems)
const byte ch[5] = { 
  5, 25, 45, 65, 85 };
//which one to use right now
byte ch_index = 0;

unsigned long last_hop = 0;
unsigned long hop_time = 0;

// Role management
// Set up role.  This sketch uses the same software for all the nodes
// in this system.  Doing so greatly simplifies testing.  The hardware itself specifies
// which node it is.
const byte base_station = 0; // makes the loop below easier to read
// current role in this sketch (will get set to correct role later)
byte role = 0;

// Sleep declarations
void go_to_sleep(void);
void wake_up(void);
uint32_t sendCounter = 0;

void setup(){
  // set up debugging light blinker pin
  //pinMode(19, OUTPUT);
  // set up relay output pin
  //pinMode(18, OUTPUT);
  // set up knock sensor pin
  //pinMode(knock_sensor_pin, INPUT);
  // set up the role pins
  //pinMode(role_pin1, INPUT);
  //digitalWrite(role_pin1,HIGH);
  //pinMode(role_pin2, INPUT);
  //digitalWrite(role_pin2,HIGH);
  //pinMode(role_pin3, INPUT);
  //digitalWrite(role_pin3,HIGH);
  delay(100); // Just to get a solid reading on the role pins
  // read the role pins, establish our role
  //role = digitalRead(role_pin1) << 1;
  //role = role | digitalRead(role_pin2);
  //role = role << 1;
  //role = role | digitalRead(role_pin3);

  // disconnect internal pullups to save power (otherwise grounded pins would draw current)
  //digitalWrite(role_pin1,LOW);
  //digitalWrite(role_pin2,LOW);
  //digitalWrite(role_pin3,LOW);

  // show role with debugging led
  if (role == 0){
    role = 8;
  }
  for(byte count = 1; count <= role; count ++){
    //digitalWrite(19,HIGH);
    //delay(300);
    //digitalWrite(19,LOW);
    //delay(300);
  }
  if (role == 8){
    role = 0;
  }
  
  // blink debugging led / relay to show program starting
  //delay(1000);
  //digitalWrite(19,HIGH);
  //digitalWrite(18,HIGH);
  //delay(1000);
  //digitalWrite(19,LOW);
  //digitalWrite(18,LOW);
  //delay(1000);
  ///////////////////////

  Serial.begin(115200);
  printf_begin();
  //  printf("\n\rRF24/examples/pingpair_sleepy/\n\r");
  //  printf("David's corn cannon project mod\n\r");
  //  printf("ROLE: %s\n\r",role);

  // Setup and configure rf radio
  radio.begin();
  radio.setAutoAck(1);
  radio.setPayloadSize(sizeof(role));	
  radio.setPALevel(RF24_PA_HIGH);
  radio.setDataRate(RF24_250KBPS);
  radio.setRetries(5,4);
  // hop_time is how long base station waits on each channel listening
  // has to wait long enough for tx to send messages on all the channels so it has a chance to receive the message
  hop_time = 38; // (delay 5 = 1500us) * (1 send + 4 retry ) * 5 channels = 37500us = approx 38ms 
  radio.setChannel(ch[ch_index]); // set first channel to use
  
  // setup pipe. rx listens on it, tx sends on it
  if ( role == base_station ) { // if role is base station
    radio.openReadingPipe(1, pipes[0]); // listen on reading pipe 1 with address pipes[0]
    radio.startListening(); // make base station listen    
    last_hop = millis(); // set start time for listening on this channel          
  } 
  else { // if role is target
    radio.openWritingPipe(pipes[0]); // transmit to address pipes[0]
  }
  // Dump the configuration of the rf unit for debugging
  // radio.printDetails();
}

void loop(){
  if (role == base_station) {
    byte target_hit = 0; // reset this each time
    // I only care about most recent message, so...
    while (radio.available()) { // while there is something to read
      radio.read( &target_hit, sizeof(target_hit) ); // read it and put value in target_hit
      //        printf("Target %d hit",target_hit);    
      sendCounter++;
      Serial.println(sendCounter);
    }
    if(target_hit){ // if anything was read (should never get a message of 0, so this is ok)
      switch(target_hit){
      case 1: // only message it should ever receive (only roles I'm using are 0 and 1, and 0 is rx)
        //digitalWrite(19,HIGH); // led
        //digitalWrite(18,HIGH); // relay
        delay(500);
        //digitalWrite(19,LOW); // led
        //digitalWrite(18,LOW); // relay
        delay(500);
        break; 
      default: // this means it was a junk message as only message it should receive is 1
        // this is just so I can tell if it somehow got a bad message
        //digitalWrite(19,HIGH); // led
        delay(5000);
        //digitalWrite(19,LOW); // led
        delay(500);
        break;  
      }
    }
    if (millis() - last_hop >= hop_time){ // if rx has been on this channel long enough for tx to send on all channels
      // go to the next channel
      if(ch_index<4){
        ch_index++;
      }
      else{
        ch_index = 0;
      }
      radio.stopListening();
      radio.setChannel(ch[ch_index]);
      radio.startListening();
      printf("Ch %d\n",ch[ch_index]);
      last_hop = millis(); // reset last_hop to now
    }
  }   
  else { // if role is not base station
    radio.powerUp(); // Power up the radio after sleeping                                 
    //    printf("Now sending... %d \n\r",role); // print role
    unsigned long trytime = 3000; // max transmit time
    unsigned long startsend = millis(); // when transmit started
    while( millis() - startsend < trytime){ // for up to trytime, keep sending
      if(radio.write(&role, sizeof(role))){ // if it got an ack, stop trying to send
        sendCounter++;
        break;
      }
      else{ // if it didn't get an ack
        // pick the next frequency
        if(ch_index<4){
          ch_index++;
        }
        else{
          ch_index = 0;
        }
        // go to that frequency
        printf("Ch %d\n",ch[ch_index]);
        radio.setChannel(ch[ch_index]);
      }
    }
    // by now, it either sent ok or timed out
    
    delay(1000);                   // why delay here? it was in an example, so I copied it.
    // Power down the radio.  
    radio.powerDown();            // NOTE: The radio MUST be powered back up again manually
    Serial.println(sendCounter);
    //go_to_sleep();                // Sleep the MCU.
  }
}


//void wake_up(void){
//  // disable sleep when mcu wakes up
//  sleep_disable();
//  // force interrupt pin low again (knock sensor has some capacitance)
//  pinMode(knock_sensor_pin, OUTPUT);
//  digitalWrite(knock_sensor_pin, LOW);
//  pinMode(knock_sensor_pin, INPUT); // return to normal
//}


void go_to_sleep(void)
{
  digitalWrite(19,LOW); // turn off led before sleeping
  set_sleep_mode(SLEEP_MODE_PWR_DOWN); // sleep mode is set here
  sleep_enable();
  //attachInterrupt(0,wake_up,RISING); // this will let it wake up again
  sleep_mode();                        // System sleeps here
  // interrupt 0 (knock sensor) wakes the MCU from here
  // redundant sleep_disable()? oh well. won't hurt. other one is in wake_up()
  sleep_disable(); // System continues execution here when knock sensor triggers interrupt 0  
  detachInterrupt(0); // to avoid triggering several times
  digitalWrite(19,HIGH); // turn led on to show mcu is awake
}

*Edit to add: I also set the role manually: byte role = 0;

Hey TMRh20, first of all, thanks for your help. I feel like I'm starting to get somewhere. I got another chance to work on this today and found out something I don't understand. Could you test one more thing for me? Everything seems to work great for me (with your code and mine) if the delay between transmissions is always the same, so your code works. However, my transmissions in the real application will not be at regular intervals (they are triggered by a sensor). As soon as I simulate this by putting a random delay in the transmit loop, like below, I start to get instances where the tx increments sendCounter and the rx does not. Can you replace this section of your code with the code below and see if it still works without glitches?

else { // if role is not base station
   radio.powerUp(); // Power up the radio after sleeping                                 
   //    printf("Now sending... %d \n\r",role); // print role
   unsigned long trytime = 3000; // max transmit time
   unsigned long startsend = millis(); // when transmit started
   while( millis() - startsend < trytime){ // for up to trytime, keep sending
     if(radio.write(&role, sizeof(role))){ // if it got an ack, stop trying to send
       sendCounter++;
       break;
     }
     else{ // if it didn't get an ack
       // pick the next frequency
       if(ch_index<4){
         ch_index++;
       }
       else{
         ch_index = 0;
       }
       // go to that frequency
       printf("Ch %d\n",ch[ch_index]);
       radio.setChannel(ch[ch_index]);
     }
   }
   // by now, it either sent ok or timed out
   
   delay(random(1000,2000));  /////////////// Randomized delay ///////////////// I changed this part///////////
   // Power down the radio.  
   radio.powerDown();            // NOTE: The radio MUST be powered back up again manually
   Serial.println(sendCounter);
   //go_to_sleep();                // Sleep the MCU.
 }

It definitely seems like there are a few issues here, but it gets a bit complicated. From the nRF24L01+ datasheet:

7.3.3.2 PID (Packet identification)

The 2 bit PID field is used to detect if the received packet is new or retransmitted. PID prevents the PRX device from presenting the same payload more than once to the receiving host MCU. The PID field is incremented at the TX side for each new packet received through the SPI. The PID and CRC fields (see section 7.3.5 on page 30) are used by the PRX device to determine if a packet is retransmitted or new. When several data packets are lost on the link, the PID fields may become equal to the last received PID. If a packet has the same PID as the previous packet, nRF24L01+ compares the CRC sums from both packets. If the CRC sums are also equal, the last received packet is considered a copy of the previously received packet and discarded.

To test the above, I changed the following lines on the sender and got much better results:

uint8_t test = random(1,254);
if(radio.write(&test, sizeof(test))){ // if it got an ack, stop trying to send

What does this mean? Essentially, you are flipping through the channels at a very high rate and introducing a large number of errors. When this happens, the chances of the above scenario increase, because the sketch sends the same value (1) every time, so the CRC will always match. The PID is only 2 bits, so it wraps around after four packets: a packet that arrives after three lost ones can carry the same PID and CRC as the last one received, and the receiver then acknowledges it but discards it as a retransmission. Synchronizing the timing of the channel changes can reduce this issue as well. The above change appears to reduce or resolve the errors, but now I am getting 'extra' payloads on the receiving end.
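To make the datasheet behaviour concrete, here is a minimal host-side C++ sketch (C++14 or later) of the receiver's duplicate filter. It is a toy model, not the radio's actual implementation: `toyCrc`, `ToyReceiver`, `ToyTransmitter` and `finalPacketDelivered` are hypothetical names, `toyCrc` is only a stand-in for the real CRC, and the model assumes that every send lost on a wrong channel uses up one PID value.

```cpp
#include <cstdint>

// Stand-in for the on-air CRC (an assumption for illustration only). Like a
// real CRC, it gives equal results for equal PID and payload.
constexpr std::uint16_t toyCrc(std::uint8_t pid, std::uint8_t payload) {
  return static_cast<std::uint16_t>((pid << 8) | payload);
}

// Toy PRX: remembers the PID and CRC of the last packet it accepted.
class ToyReceiver {
 public:
  // true  -> payload is handed to the MCU
  // false -> packet is treated as a retransmission and discarded
  constexpr bool receive(std::uint8_t pid, std::uint8_t payload) {
    const std::uint16_t crc = toyCrc(pid, payload);
    if (haveLast_ && pid == lastPid_ && crc == lastCrc_) {
      return false;
    }
    haveLast_ = true;
    lastPid_ = pid;
    lastCrc_ = crc;
    return true;
  }

 private:
  bool haveLast_ = false;
  std::uint8_t lastPid_ = 0;
  std::uint16_t lastCrc_ = 0;
};

// Toy PTX: the 2-bit PID advances for every new packet and wraps after 4.
class ToyTransmitter {
 public:
  constexpr std::uint8_t nextPid() {
    pid_ = static_cast<std::uint8_t>((pid_ + 1) & 0x03);
    return pid_;
  }

 private:
  std::uint8_t pid_ = 0;
};

// One packet arrives, then `lostPackets` packets are never heard by the
// receiver, then one more arrives. Returns whether that last packet would
// be handed to the MCU.
constexpr bool finalPacketDelivered(int lostPackets, bool varyPayload) {
  ToyTransmitter tx{};
  ToyReceiver rx{};
  std::uint8_t payload = 1;
  rx.receive(tx.nextPid(), payload);         // first packet arrives
  for (int i = 0; i < lostPackets; ++i) {
    if (varyPayload) ++payload;
    tx.nextPid();                            // sent, but never heard
  }
  if (varyPayload) ++payload;
  return rx.receive(tx.nextPid(), payload);  // this one arrives
}

// Checked at compile time:
static_assert(!finalPacketDelivered(3, false), "constant payload: discarded after 3 losses");
static_assert(finalPacketDelivered(3, true), "varying payload: delivered after 3 losses");
static_assert(finalPacketDelivered(2, false), "PID has not wrapped yet: delivered");
```

In this model, a constant payload that follows exactly three lost packets is dropped as a duplicate, while a varying payload changes the CRC and gets through.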

The resulting issue, I believe, is due to the small number of retries, since there is an increased chance that the receiver will get a payload, but the transmitter will not receive the auto-ack. To test this, change the following lines:

radio.setRetries(10,8);
hop_time = 160;

The result is that the number of 'extra payloads' on the receiver is reduced as the hop time and retry values are increased. These extra payloads also seem to be reduced when the receiver calls stopListening() and startListening() around the channel change.
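The hop_time arithmetic from the sketch's comment can be written out as a small host-side C++ helper. This is a rough estimate under stated assumptions: the function names are hypothetical, the delay setting n is taken to mean (n + 1) * 250us as in the RF24 setRetries() documentation, and packet air time and SPI overhead are ignored, as they are in the original comment.

```cpp
#include <cstdint>

// Delay between retries for a setRetries() delay setting n: (n + 1) * 250 us.
constexpr std::uint32_t retryDelayMicros(std::uint32_t delaySetting) {
  return (delaySetting + 1u) * 250u;
}

// Estimated duration of one write() that never gets an ACK:
// (1 send + retryCount retries) * retry delay, as in the sketch's comment.
constexpr std::uint32_t failedWriteMicros(std::uint32_t delaySetting,
                                          std::uint32_t retryCount) {
  return retryDelayMicros(delaySetting) * (1u + retryCount);
}

// Estimated time for the transmitter to try every channel once.
constexpr std::uint32_t fullSweepMicros(std::uint32_t delaySetting,
                                        std::uint32_t retryCount,
                                        std::uint32_t channelCount) {
  return failedWriteMicros(delaySetting, retryCount) * channelCount;
}

// Checked at compile time:
static_assert(fullSweepMicros(5, 4, 5) == 37500,
              "setRetries(5,4): about 38 ms, matching hop_time = 38");
static_assert(fullSweepMicros(10, 8, 5) == 123750,
              "setRetries(10,8): about 124 ms, below hop_time = 160");
```

By this estimate, setRetries(10,8) needs roughly 124 ms for a full sweep of the five channels, so hop_time = 160 leaves some margin, whereas the original 38 ms left almost none.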

All in all, when working in a noisy environment, I would recommend using the automatic retry features of the radio to their full extent, and possibly the extended retry functions of the library, instead of changing the frequency so rapidly. I assume rapid hopping also results in a higher chance of corrupt packets, invalid CRCs, etc., and is probably part of why odd things happen when the devices are not in sync.

Note: It also helps to move the last Serial.println(sendCounter); above the delay(random...

Forgot to add: the long delays on the receiver should be removed if possible as well.

Wow! How in the world did you figure out the PID issue? Sounds like you've given me several pieces of info I can definitely use there. I'll try to work them into my code and check back when I get a chance to test it out this weekend. Thanks again for your help here.

Using your suggestions has it working great! Instead of sending just the same message (1) all the time, I have it increment the number it sends back each time it changes channels. I also increased the number of retries slightly. I have not had a failure since making the changes. Thanks so much for the help. I learned quite a bit more than I knew before about how these modules work.
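One way to vary the payload along these lines is a small counter that skips 0, since the receiver sketch treats target_hit == 0 as "nothing received". This is only a sketch of the idea: `nextPayload` is a hypothetical helper, not the code actually used in the project, and the receiver's switch would also need to accept any nonzero value rather than only 1.

```cpp
#include <cstdint>

// Next payload value: counts 1, 2, ..., 255, then wraps back to 1.
// 0 is never produced, so the receiver can keep using 0 as "no message".
constexpr std::uint8_t nextPayload(std::uint8_t previous) {
  return previous == 255 ? std::uint8_t{1}
                         : static_cast<std::uint8_t>(previous + 1);
}

// Checked at compile time:
static_assert(nextPayload(1) == 2, "normal increment");
static_assert(nextPayload(255) == 1, "wraps to 1, skipping 0");
```

On the transmitter, calling something like payload = nextPayload(payload); before each radio.write(&payload, sizeof(payload)) guarantees that consecutive packets differ, which random values alone do not.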

Glad to hear you got things working. I'm not sure if you are really interested, or if I can really explain how I figured it out. People often ask me to explain, and most of the time I just laugh. It's not a secret or anything, just a bit different.

I basically just visualize the entire system as though it were mechanical, and watch the packets going back and forth and being handled, etc. At first, after physical testing, all it looked like in the middle was chaos, since nothing seemed to make sense. With some testing and more watching of the results, the image becomes clearer as the process is understood in more detail. This can be done with process diagramming, etc., but is much easier and faster if done mentally. In any case, I could see clearly that packets were basically disappearing mid-flight, and that led me to the theory that payloads were being dumped, since quantum payloads are not logical. A quick consultation of the data sheet regarding the potential causes of dumped payloads (CRC checking and payload identification) led to the discovery that it was indeed just payloads being dumped.

In any case, this works with any system, whether electrical, mechanical, or logical. It is just a matter of understanding the system, breaking down the processes into individual, detailed steps, and finding the point of failure. Again, not sure this is really useful to anybody, just thought I would share the solution.
