Another update on this topic.
I believe I found the bug in the RF24 library that was causing the reliability problems.
The startListening and stopListening methods flush both the TX and RX FIFOs. If the RX FIFO already holds data when the flush runs, that data is wiped before we ever issue a read. So if we send data and then wait for a response on the same pipe, the response is lost whenever it arrives before the flush command, which I believe happens on a fairly regular basis.
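To make the ordering problem concrete, here is a rough host-side sketch of the scenario described above. It is not RF24 code: MockRadio, packetArrives and replySurvives are hypothetical names made up for illustration, and a single byte stands in for a whole payload. The only hardware detail assumed is that the nRF24L01+ RX FIFO holds three payloads.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <deque>

// Hypothetical host-side model of the radio's RX FIFO.
// Not part of the RF24 library; it only illustrates the ordering problem.
struct MockRadio {
    static constexpr std::size_t kFifoDepth = 3;  // RX FIFO holds 3 payloads
    std::deque<uint8_t> rxFifo;                   // one byte stands in for one payload
    bool flushOnStartListening;                   // true = old behaviour, false = modified

    explicit MockRadio(bool flushOnStart) : flushOnStartListening(flushOnStart) {}

    // Models a payload landing in the RX FIFO; dropped if the FIFO is full.
    void packetArrives(uint8_t payload) {
        if (rxFifo.size() < kFifoDepth) rxFifo.push_back(payload);
    }

    // Only the flush is modelled here; register writes and CE are left out.
    void startListening() {
        if (flushOnStartListening) rxFifo.clear();
    }

    bool available() const { return !rxFifo.empty(); }

    uint8_t read() {
        assert(available());  // caller must check available() first
        uint8_t payload = rxFifo.front();
        rxFifo.pop_front();
        return payload;
    }
};

// Returns true if a reply that is already in the FIFO when
// startListening() runs can still be read afterwards.
bool replySurvives(bool flushOnStart) {
    MockRadio radio(flushOnStart);
    radio.packetArrives(0x42);  // reply lands before the flush
    radio.startListening();
    return radio.available();
}
```

In this model replySurvives(true) reports the reply as lost and replySurvives(false) reports it as still readable, which is the difference I am describing.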
The FIFOs should clear automatically during normal operation, so the flush commands are not necessary when switching between modes (RX/TX) on the radio. It's not a bad idea to flush the TX buffer after a write completes successfully, or the RX buffer after a read, so I left this logic in the code. I did, however, remove the flush_rx call from startListening and the flush_tx call from stopListening, and my reliability went from about 75% to over 99.99999%, with an average round-trip time of about 112 ms at 1 Mbps using a payload size of 8 bytes for 200,000 pings.
The modified methods:
void RF24::startListening(void)
{
  write_register(CONFIG, read_register(CONFIG) | _BV(PWR_UP) | _BV(PRIM_RX));
  write_register(STATUS, _BV(RX_DR) | _BV(TX_DS) | _BV(MAX_RT) );

  // Restore the pipe0 address, if exists
  if (pipe0_reading_address)
    write_register(RX_ADDR_P0, reinterpret_cast<const uint8_t*>(&pipe0_reading_address), 5);

  // Flush buffers
  // remove flush_rx 02/17/2013 jgi
  //flush_rx();
  flush_tx();

  // Go!
  ce(HIGH);

  // wait for the radio to come up (130us actually only needed)
  delayMicroseconds(130);
}
/****************************************************************************/
void RF24::stopListening(void)
{
  ce(LOW);
  // remove flush_tx jgi 02/17/2013
  //flush_tx();
  flush_rx();
}
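
For anyone who wants to try this, here is a rough sketch of the send-then-wait pattern that exposed the problem for me. Treat it as an illustration under assumptions, not the exact test I ran: pingOnce and EchoRadio are hypothetical names, EchoRadio is only a stand-in so the sketch compiles off-target, and nowMs is any millisecond clock you pass in (millis() on an Arduino). The radio calls it relies on (stopListening, write, startListening, available, read) are the usual RF24 ones.

```cpp
#include <cassert>
#include <stdint.h>
#include <string.h>

// Sends one payload, then waits up to timeoutMs for a reply.
// Radio is any type exposing the RF24-style calls used below.
template <typename Radio, typename Clock>
bool pingOnce(Radio& radio, Clock nowMs, const uint8_t* out, uint8_t* in,
              uint8_t len, unsigned long timeoutMs) {
    assert(out != nullptr && in != nullptr);
    radio.stopListening();             // leave RX mode before transmitting
    const bool sent = radio.write(out, len);
    radio.startListening();            // modified version no longer calls flush_rx()
    if (!sent) return false;           // transmit failed

    const unsigned long start = nowMs();
    while (!radio.available()) {
        // Unsigned subtraction keeps the comparison valid across clock wraparound.
        if (nowMs() - start > timeoutMs) return false;
    }
    radio.read(in, len);
    return true;
}

// Hypothetical stand-in for a real radio: it echoes whatever was
// written back as the "reply" once listening starts.
struct EchoRadio {
    uint8_t buf[32] = {0};
    uint8_t pending = 0;     // bytes waiting to be read
    bool listening = false;
    bool replies = true;     // set to false to model a silent remote

    void stopListening() { listening = false; }
    void startListening() { listening = true; }
    bool write(const void* data, uint8_t len) {
        if (len > sizeof(buf)) return false;
        memcpy(buf, data, len);
        pending = replies ? len : 0;
        return true;
    }
    bool available() const { return listening && pending > 0; }
    bool read(void* data, uint8_t len) {
        memcpy(data, buf, len < pending ? len : pending);
        pending = 0;
        return true;
    }
};
```

On real hardware you would pass your RF24 object and a small wrapper around millis() in place of EchoRadio and the test clock, then count how many calls return true over a long run.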