NRF24L01+ Getting Started Sketch Timing Problem

I have a pair of NRF24L01+ radios connected to two Mega boards using cables I built. I loaded the getting_started sketch and the RF24 library from GitHub (maniacbug/RF24: Arduino driver for nRF24L01).

Pinout of Cable

Mega          Radio   Radio pin
GND      <--> GND     1
3.3V     <--> 3.3V    2
DIO 9    <--> CE      3
DIO 10   <--> CS      4
SCK 52   <--> SCK     5
MOSI 51  <--> MOSI    6
MISO 50  <--> MISO    7

There appears to be a timing problem. Here's the output I am seeing from the two radios. From the output below, it looks like the receiver is getting all the transmissions, but the sender times out from time to time because it does not receive the response. According to the code, it waits about 250 ms before timing out.

The radios are about 1 foot apart.

Any ideas on what the problem may be?
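
For readers who have not opened the sketch, the wait described above is a poll-until-deadline loop. The following is a minimal sketch of that pattern, not the sketch's actual code: waitForReply, its parameters, and the injected clock are hypothetical names chosen so the logic can be tried off-target. On an Arduino the clock would be millis() and the predicate would wrap radio.available().

```cpp
// Hypothetical helper illustrating a poll-with-timeout loop.
// Available: callable returning true once a reply is waiting.
// Clock:     callable returning the current time in milliseconds.
template <typename Available, typename Clock>
bool waitForReply(Available available, Clock nowMs, unsigned long timeoutMs) {
    const unsigned long start = nowMs();
    while (!available()) {
        // Unsigned subtraction keeps the comparison valid across counter rollover.
        if (nowMs() - start > timeoutMs) {
            return false;  // gave up: this is the "response timed out" case
        }
    }
    return true;  // a reply arrived before the deadline
}
```

Note that a loop like this only reports that nothing was readable before the deadline. It cannot tell whether the reply was never sent, was lost on air, or was discarded locally.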

Transmitter on Restart

RF24/examples/GettingStarted/
ROLE: Pong back
*** PRESS 'T' to begin transmitting to the other node
STATUS = 0x0e RX_DR=0 TX_DS=0 MAX_RT=0 RX_P_NO=7 TX_FULL=0
RX_ADDR_P0-1 = 0xf0f0f0f0d2 0xf0f0f0f0d2
RX_ADDR_P2-5 = 0xc3 0xc4 0xc5 0xc6
TX_ADDR = 0xf0f0f0f0d2
RX_PW_P0-6 = 0x20 0x20 0x00 0x00 0x00 0x00
EN_AA = 0x3f
EN_RXADDR = 0x03
RF_CH = 0x4c
RF_SETUP = 0x07
CONFIG = 0x0f
DYNPD/FEATURE = 0x00 0x00
Data Rate = 1MBPS
Model = nRF24L01+
CRC Length = 16 bits
PA Power = PA_HIGH

Receiver on Restart
RF24/examples/GettingStarted/
ROLE: Pong back
*** PRESS 'T' to begin transmitting to the other node
STATUS = 0x0e RX_DR=0 TX_DS=0 MAX_RT=0 RX_P_NO=7 TX_FULL=0
RX_ADDR_P0-1 = 0xf0f0f0f0e1 0xf0f0f0f0d2
RX_ADDR_P2-5 = 0xc3 0xc4 0xc5 0xc6
TX_ADDR = 0xf0f0f0f0e1
RX_PW_P0-6 = 0x20 0x20 0x00 0x00 0x00 0x00
EN_AA = 0x3f
EN_RXADDR = 0x03
RF_CH = 0x4c
RF_SETUP = 0x07
CONFIG = 0x0f
DYNPD/FEATURE = 0x00 0x00
Data Rate = 1MBPS
Model = nRF24L01+
CRC Length = 16 bits
PA Power = PA_HIGH
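
As an aid to reading the dumps above, here is a small decoder for the STATUS byte, based on the bit layout in the nRF24L01+ datasheet. StatusBits and decodeStatus are hypothetical names, not part of the RF24 library.

```cpp
#include <cstdint>

// Fields of the nRF24L01+ STATUS register.
struct StatusBits {
    bool rx_dr;            // bit 6: a payload has arrived in the RX FIFO
    bool tx_ds;            // bit 5: payload sent (with auto-ack: ACK received)
    bool max_rt;           // bit 4: maximum number of retransmits reached
    std::uint8_t rx_p_no;  // bits 3:1: pipe of the next RX payload, 7 = RX FIFO empty
    bool tx_full;          // bit 0: TX FIFO is full
};

// Hypothetical helper: split a raw STATUS byte into its fields.
constexpr StatusBits decodeStatus(std::uint8_t s) {
    return StatusBits{
        (s & 0x40) != 0,
        (s & 0x20) != 0,
        (s & 0x10) != 0,
        static_cast<std::uint8_t>((s >> 1) & 0x07),
        (s & 0x01) != 0
    };
}
```

Decoding 0x0e this way gives RX_P_NO=7 with every flag clear, which matches the STATUS line printed by both boards: an idle radio with an empty RX FIFO.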

Transmitter
Now sending 73069...ok...Got response 73069, round-trip delay: 29
Now sending 74098...ok...Failed, response timed out.
Now sending 75407...ok...Got response 75407, round-trip delay: 53
Now sending 76462...ok...Got response 76462, round-trip delay: 67
Now sending 77530...ok...Failed, response timed out.
Now sending 78842...ok...Failed, response timed out.
Now sending 80146...ok...Failed, response timed out.
Now sending 81454...ok...Failed, response timed out.
Now sending 82761...ok...Failed, response timed out.
Now sending 84065...ok...Failed, response timed out.
Now sending 85372...ok...Got response 85372, round-trip delay: 45
Now sending 86418...ok...Failed, response timed out.
Now sending 87731...ok...Failed, response timed out.
Now sending 89038...ok...Got response 89038, round-trip delay: 32
Now sending 90071...ok...Got response 90071, round-trip delay: 45
Now sending 91117...ok...Got response 91117, round-trip delay: 27
Now sending 92145...ok...Failed, response timed out.

Receiver

Got payload 73069...Sent response.
Got payload 74098...Sent response.
Got payload 75407...Sent response.
Got payload 76462...Sent response.
Got payload 77530...Sent response.
Got payload 78842...Sent response.
Got payload 80146...Sent response.
Got payload 81454...Sent response.
Got payload 82761...Sent response.
Got payload 84065...Sent response.
Got payload 85372...Sent response.
Got payload 86418...Sent response.
Got payload 87731...Sent response.
Got payload 89038...Sent response.
Got payload 90071...Sent response.
Got payload 91117...Sent response.
Got payload 92145...Sent response.

Update:

The getting_started sketch that is included with the library has some lines commented out. I found another version of this code at http://maniacbug.github.com/RF24/GettingStarted_8pde-example.html. One of the differences is the line that calls the setPayloadSize method, which is commented out in the library's copy. With that line enabled, the payload size appears to change from 32 bytes to 8 bytes.

This seems to have improved things greatly but I am still seeing timeouts. I am wondering if there is some interference on the default channel.

I find that RF24 communications can be quite flakey sometimes, and then work perfectly fine for several days, and then go back to being a bit flakey for no apparent reason. I assume it's just interference.

The pingpair example sketch doesn't check or print the returned status of the write() call sending the reply. It might be useful to include that in the output so you can see whether the RF24 thinks the response was transferred successfully. You may also find that reliability can be improved by increasing the retry count and interval between retries using setRetries(15, 15) and by reducing the data rate.
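
To put numbers on the setRetries(15, 15) suggestion: per the nRF24L01+ datasheet, the SETUP_RETR register holds the retransmit delay (ARD) in bits 7:4 and the retransmit count (ARC) in bits 3:0, and the delay between attempts is (ARD + 1) * 250 us. The helpers below are hypothetical and only do that arithmetic. They are not library calls.

```cpp
#include <cstdint>

// Register byte for a given delay/count pair (each limited to 0..15).
constexpr std::uint8_t setupRetrValue(std::uint8_t delay, std::uint8_t count) {
    return static_cast<std::uint8_t>(((delay & 0x0F) << 4) | (count & 0x0F));
}

// Delay between retransmissions in microseconds: (ARD + 1) * 250.
constexpr std::uint32_t retryDelayMicros(std::uint8_t delay) {
    return (static_cast<std::uint32_t>(delay & 0x0F) + 1u) * 250u;
}

// Upper bound on the time spent waiting between retries.
// Time on air for each attempt is ignored, so the real figure is a little higher.
constexpr std::uint32_t worstCaseWaitMicros(std::uint8_t delay, std::uint8_t count) {
    return retryDelayMicros(delay) * static_cast<std::uint32_t>(count & 0x0F);
}
```

With delay = 15 and count = 15 this gives 4000 us between attempts and roughly 60 ms of waiting before the radio gives up, which is still well inside the sketch's timeout of about 250 ms.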

I've seen comments suggesting that a smaller payload can improve reliability, but it's not something I've tested.

You are using PA_HIGH transmit power.

Try putting more distance between the two RF24 antennas. You may see better results.

I have used these radios for two years at ranges of up to 80 meters (at 250 kbps) and have never had trouble with them (SparkFun module with RP-SMA antenna: SparkFun Transceiver Breakout - nRF24L01+ (RP-SMA) - WRL-00705 - SparkFun Electronics).

Hope this helps.

Thanks for the information.

I cranked the speed down to 250 kbps and the receiver is getting every single packet, but the RF24 library is still not properly processing the ACK messages and is reporting a timeout. I don't think it is timing out at all.
Looking deeper into the RF24 library, I think there may be an issue with the write method. I noticed that after every write operation it puts the radio to sleep. I am wondering if this is causing a timing issue. I'm going to keep digging and see if I can resolve this. I'll post my findings.

Several people have reported issues with the nRF24XX and Mega 2560s. It seems that there isn't enough bypassing on the 3V3 rail. A 22 uF electrolytic and a 0.1 uF ceramic capacitor should fix the issue, or at least that is what was reported. It seems that the Mega doesn't have enough bypassing, because in every case the Uno worked with the RF24LXX module. Substituting a 3V3 source from an Uno is what located the Mega 3V3 source fault.

Bob

So you're saying there need to be two capacitors tied from 3.3V to ground?

Several people in another thread have indicated that, yes. It is very believable, in that I have had other issues with the Mega 3V3 supply and radios in general. In virtually every situation where a complex device like a radio is connected by a "cable", meaning a set of wires more than about 3 cm in length, a bypass capacitor is needed across the power leads at the end of the cable that connects to the external device. In the cases where my sensors were not working properly, a 10 uF tantalum (what I had to hand) and a 100 nF cap fixed everything but my bad code... Didn't help that one bit.

Bob

Here's more detail on the exact problem I have observed and my solution for it.

In the GettingStarted sketch the pong responder does not check the status of the write method. I added a retry loop to re-attempt the write. The ping sender checks the status of the write method but continues on regardless. I have not corrected this but intend to at some point.

Here's the simple change made to facilitate this:

// First, stop listening so we can talk
radio.stopListening();
// add delay J. Isaac 02/15/2013
delay(50);

// Add retry logic for write J. Isaac 02/15/2013
// Send the final one back.
int i = 0;
bool ok = false;
while (i < 30) {
  i++;
  delay(30);
  ok = radio.write( &got_time, sizeof(unsigned long) );
  if (ok) break;
}

printf("Sent response.\n\r");
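
Note that the loop above still prints "Sent response." even when all 30 attempts fail. One way to make the outcome visible is to wrap the retry in a small helper that returns both the result and the attempt count. This is only a sketch under stated assumptions: writeWithRetry, RetryResult and FlakyRadio are hypothetical names, and the helper assumes the radio object exposes a write(buffer, length) method returning bool, as radio.write is used above.

```cpp
#include <cstddef>

// Outcome of a bounded retry: did any attempt succeed, and how many were made.
struct RetryResult {
    bool ok;
    int attempts;
};

// Hypothetical helper. Radio is any type with a bool write(const void*, length)
// method; Pause is any callable taking milliseconds (delay on an Arduino).
template <typename Radio, typename Pause>
RetryResult writeWithRetry(Radio& radio, const void* buf, std::size_t len,
                           int maxAttempts, unsigned pauseMs, Pause pause) {
    RetryResult result{false, 0};
    while (result.attempts < maxAttempts) {
        pause(pauseMs);  // same order as the loop above: wait, then write
        ++result.attempts;
        if (radio.write(buf, len)) {
            result.ok = true;
            break;
        }
    }
    return result;
}

// Stand-in for the radio, used only to exercise the helper off-target.
// It fails a fixed number of writes and then succeeds.
struct FlakyRadio {
    int failuresLeft;
    bool write(const void*, std::size_t) {
        if (failuresLeft > 0) {
            --failuresLeft;
            return false;
        }
        return true;
    }
};
```

In the sketch, the caller could then print "Sent response." only when ok is true and report the attempt count otherwise, which would show how often the first write is the one reported as failed.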


I opened up the RF24 library and uncommented a printf command that shows the tx status bits in the status register

printf("%u%u%u\r\n",tx_ok,tx_fail,ack_payload_available);

This command shows the TX bits in the status register after the write operation.

Looking at the status registers after each write, it appears that there is some type of race condition or bug that doesn't properly set the TX status bits in the status register. It appears that some successful writes set TX_FAIL. If the issue is a power problem as described in the previous threads, then this could be the root cause of the issue I am seeing. At any rate the write retry reduced the ping failure rate considerably.

I also commented out the radio powerDown call in the write method. This seemed to improve reliability as well.

After making these minor tweaks I am having good success.

Docedison:
Several people in another thread have indicated that, yes. It is very believable, in that I have had other issues with the Mega 3V3 supply and radios in general. In virtually every situation where a complex device like a radio is connected by a "cable", meaning a set of wires more than about 3 cm in length, a bypass capacitor is needed across the power leads at the end of the cable that connects to the external device. In the cases where my sensors were not working properly, a 10 uF tantalum (what I had to hand) and a 100 nF cap fixed everything but my bad code... Didn't help that one bit.

Very interesting. Could you please give a link to that discussion?

I recently ordered some of these devices but have not yet used any. Without the above in mind, I was looking at putting them directly onto a proto shield anyway but, even then, I could be struggling to keep the cables below 3 cm.

It seems that a temporary lashup with flying leads is not such a good idea.....

Another update on this topic.

I believe I found the issue in the RF24 library that was causing some reliability issues.

The startListening and stopListening methods flush both the TX and RX FIFOs. If the RX FIFO has data in it before we issue a read, the flush will discard it. So if we send data and then wait for a response on the same pipe, the response gets flushed from the FIFO whenever it arrives before the flush command, which I believe happens on a fairly regular basis.

The FIFOs should clear automatically during normal operation, so the flush commands are not necessary when switching the radio between modes (RX/TX). It's not a bad idea to flush the TX buffer after a write completes successfully, or the RX buffer on reading, so I left this logic in the code. I did, however, remove the flush_rx command from startListening and the flush_tx command from stopListening, and my reliability went from about 75% to over 99.99999%, with an average round-trip time of about 112 ms at 1 Mbps using a payload size of 8 bytes for 200,000 pings.

The modified methods:

void RF24::startListening(void)
{
  write_register(CONFIG, read_register(CONFIG) | _BV(PWR_UP) | _BV(PRIM_RX));
  write_register(STATUS, _BV(RX_DR) | _BV(TX_DS) | _BV(MAX_RT) );

  // Restore the pipe0 address, if exists
  if (pipe0_reading_address)
    write_register(RX_ADDR_P0, reinterpret_cast<const uint8_t*>(&pipe0_reading_address), 5);

  // Flush buffers
  // remove flush_rx 02/17/2013 jgi
  //flush_rx();
  flush_tx();

  // Go!
  ce(HIGH);

  // wait for the radio to come up (130us actually only needed)
  delayMicroseconds(130);
}

/****************************************************************************/

void RF24::stopListening(void)
{
  ce(LOW);
  // remove flush_tx jgi 02/17/2013
  //flush_tx();
  flush_rx();
}
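
The ordering problem described above can be shown with a toy model. This is a deliberately simplified sketch of the hypothesis, not the RF24 library and not the real chip's behaviour: ToyRadio and all of its members are hypothetical, and the only hardware fact it borrows is that the RX FIFO holds up to three payloads.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy model: a radio whose startListening() may or may not flush the RX FIFO.
class ToyRadio {
public:
    explicit ToyRadio(bool flushRxOnStartListening)
        : flushOnStart_(flushRxOnStartListening) {}

    // A reply lands in the RX FIFO. The hypothesis is that this can happen
    // before the sketch gets around to calling startListening().
    void receiveFromAir(std::uint32_t payload) {
        if (rxFifo_.size() < kFifoDepth) {
            rxFifo_.push_back(payload);
        }
    }

    // Original behaviour flushes the RX FIFO here; the patched one does not.
    void startListening() {
        if (flushOnStart_) {
            rxFifo_.clear();
        }
    }

    bool available() const { return !rxFifo_.empty(); }

    // Caller must check available() first.
    std::uint32_t read() {
        const std::uint32_t value = rxFifo_.front();
        rxFifo_.pop_front();
        return value;
    }

private:
    static constexpr std::size_t kFifoDepth = 3;
    bool flushOnStart_;
    std::deque<std::uint32_t> rxFifo_;
};
```

With flushing enabled, a reply that arrives before startListening() is gone by the time the sketch polls available(), so the sender reports a timeout even though the other side logged "Sent response." With flushing disabled, the same reply is still there to be read.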

That does sound like a bug - good work tracking it down. I recommend that you let maniacbug know, if you haven't already, so that he can incorporate this fix into the published library.

Hello. Any idea if this fix has been pulled into maniacbug's code?