i2c bus occasionally losing data packets

Hi!

I have created an i2c bus for one Arduino Uno R3 master to communicate with two Colorduino slaves. Most of the time the communication is fine, but occasionally I lose a packet. The timing appears to be random. My i2c bus runs at 400 kHz and my pull-up resistors are currently 10k; I have also tried 4.7k and 2.2k. Power is supplied by a 5V, 1200mAh power supply.

I have no idea what the problem could be, does anyone have any idea where to start looking?

Details

I added a diagram (i2c_bus.png) of my current setup. In this diagram the two Unos on the right are actually Colorduinos.

The master has the following code: master.c · GitHub

The slaves have the following code (one of them has i2c address 2, the other 3):

The master receives commands (SERIAL_RGB_COMMAND) from a PC via the serial port and replies with a SERIAL_ACK_COMMAND for each SERIAL_RGB_COMMAND it receives. This way I can check whether the commands are received and processed by the master. In all cases they are, so it seems to me that the problem is not in the serial communication between master and PC.

Upon receiving a SERIAL_RGB_COMMAND (which includes an 8x8 array of RGB LED data), the master sends three packets over i2c: one 8x8 red packet, one 8x8 green packet and one 8x8 blue packet. I have set up a serial connection between the slave and the PC to check whether all packets arrive (with a Serial.println upon receiving an R, G or B packet). I notice that sometimes one of these R, G or B packets is not received (it is not printed to the serial port).

TLDR

Upon receiving a Serial SERIAL_RGB_COMMAND packet from the PC, the master generates 3 i2c packets, but occasionally one of these i2c packets is lost. Most of the time it works, the failures seem very much random.

I hope I included everything, if there is some information missing please let me know!

does anyone have any idea where to start looking?

The "diagram" shows the Arduinos about 1/2" apart. Are they really?

PaulS: The "diagram" shows the Arduinos about 1/2" apart. Are they really?

In practice they are not. The bus is probably 10-15cm (4-6 inches) in length, with the Arduinos evenly spaced in between. I apologise if my jargon is a little off.

#define I2C_MASTER_ADDRESS 1
 Wire.begin(I2C_MASTER_ADDRESS); // join i2c bus (address optional for master)

The address is not optional for the master; it must not be specified. If you do specify one, you should at least use the master address, which is 0, not 1. This is probably not the source of your problems though.

You don't check if the I2C transmission was successful. Although you return the result code of Wire.endTransmission() you don't handle any error. BTW, the delay() after the return is never reached.
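A minimal sketch of what handling the result could look like, assuming a simple retry is acceptable for this application. The helper name sendWithRetry and the attempt limit are made up for illustration; the only Wire fact it relies on is that endTransmission() returns 0 on success and a non-zero code otherwise. The actual bus access is passed in as a callable so the helper itself does not depend on Wire.

```cpp
#include <stdint.h>

const uint8_t I2C_OK = 0;  // endTransmission() reports 0 on success

// Calls sendOnce() until it reports success or maxAttempts is used up.
// sendOnce is any callable that performs one complete transmission and
// returns the endTransmission() result code (hypothetical interface).
// Returns the last result code, so the caller can still log or react to it.
template <typename SendFn>
uint8_t sendWithRetry(SendFn sendOnce, uint8_t maxAttempts) {
  uint8_t result = 4;  // "other error" if maxAttempts is 0
  for (uint8_t attempt = 0; attempt < maxAttempts; ++attempt) {
    result = sendOnce();
    if (result == I2C_OK) break;  // stop as soon as one attempt succeeds
  }
  return result;
}

// Possible use on the master (address, buffer and length are placeholders):
//   uint8_t rc = sendWithRetry([&]() -> uint8_t {
//     Wire.beginTransmission(address);
//     Wire.write(buffer, length);
//     return Wire.endTransmission();
//   }, 3);
```

Retrying hides the symptom rather than the cause, so it is worth printing the returned code whenever it is non-zero.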

if(Wire.available()>67)

Did you patch the Wire library? If not, this will never be true as the default Wire buffer size is 32.

On the slave you should do the reception of the bytes in the ISR method (receiveEvent() in your case) and just process it in the loop. Don't forget to declare all shared variables volatile and use a flag to tell the loop that data has arrived. That way you don't need the begin and end marker bytes.

pylon: The address is not optional for the master; it must not be specified. If you do specify one, you should at least use the master address, which is 0, not 1. This is probably not the source of your problems though.

Thank you, I was unaware!

pylon: You don't check if the I2C transmission was successful. Although you return the result code of Wire.endTransmission() you don't handle any error. BTW, the delay() after the return is never reached.

This is a very good point, I will have to check the return value of Wire.endTransmission, and probably Wire.write() too. Thanks for spotting the delay() mistake!

pylon: Did you patch the Wire library? If not, this will never be true as the default Wire buffer size is 32.

I did indeed patch Wire.h and twi.h to 70 bytes, iirc.

pylon: On the slave you should do the reception of the bytes in the ISR method (receiveEvent() in your case) and just process it in the loop. Don't forget to declare all shared variables volatile and use a flag to tell the loop that data has arrived. That way you don't need the begin and end marker bytes.

What is the advantage of performing the collection in the ISR method? Marking it volatile makes sense, but I don't fully understand how it prevents me from requiring begin and end markers though. Is it because the ISR method's parameter specifies the number of bytes?

This is a very good point, I will have to check the return value of Wire.endTransmission, and probably Wire.write() too.

Checking the result code of Wire.write() isn’t that important because if you don’t overrun the send buffer you always get a 1 back.

What is the advantage of performing the collection in the ISR method?

You know how many bytes the master sent. You get exactly the same message as the master sent. I2C is not the same as a serial (UART) connection where you must have some protocol to know where the message starts and where it ends. I2C has a defined start condition and a defined end condition, so the slave knows where the message starts and where it ends.

Marking it volatile makes sense, but I don’t fully understand how it prevents me from requiring begin and end markers though.

Declaring the variables volatile shows the compiler that they are accessed from the main program and at interrupt level. This is important to choose the correct optimization methods. It’s unrelated to the begin and end markers.

Is it because the ISR method’s parameter specifies the number of bytes?

Yes, and I explained above why this is the case.
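To make the "receive in the ISR, process in the loop" idea concrete, here is a rough sketch under a few assumptions: the names storePacket, takePacket and PACKET_MAX are invented for this example, the buffer size of 70 is taken from the patched Wire buffer mentioned earlier, and a packet that arrives while the previous one is still unprocessed is simply dropped. The bytes are handed in as an array so the logic does not depend on Wire.

```cpp
#include <stddef.h>
#include <stdint.h>

const size_t PACKET_MAX = 70;         // assumed to match the patched Wire buffer

volatile uint8_t packet[PACKET_MAX];  // written in the ISR, read in loop()
volatile size_t packetLength = 0;
volatile bool packetReady = false;    // flag that tells loop() data has arrived

// Call from receiveEvent(int howMany) with the bytes of one I2C write.
// Returns false if the packet was dropped (too long, or loop() is behind).
bool storePacket(const uint8_t *data, size_t count) {
  if (packetReady || count > PACKET_MAX) return false;
  for (size_t i = 0; i < count; ++i) packet[i] = data[i];
  packetLength = count;
  packetReady = true;
  return true;
}

// Call from loop(). Copies a pending packet into out (which must hold at
// least PACKET_MAX bytes) and returns its length, or 0 if nothing is waiting.
size_t takePacket(uint8_t *out) {
  if (!packetReady) return 0;
  size_t count = packetLength;
  for (size_t i = 0; i < count; ++i) out[i] = packet[i];
  packetReady = false;  // cleared last, so the ISR cannot overwrite mid-copy
  return count;
}
```

In the real sketch, receiveEvent(int howMany) would read howMany bytes with Wire.read() into a local array and pass that to storePacket. Because howMany is the length of one complete I2C write, no begin or end marker bytes are needed.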

As for the pull-ups: 10k is too high a value if you use a 400kHz clock. Use 4k7 or better 3k3 to allow a faster return to Vcc on the two signal lines. 2k2 is outside the specs at 5V so sensitive devices might get damaged.

So, I applied all your suggestions, and it improved the stability quite a bit. Thanks for helping me understand; coming from a software engineering background, this is quite a different perspective.

However, what I have noticed is that endTransmission occasionally returns 2 ("received NACK on transmit of address"). This is indeed the "packet loss" I have observed. But I am unsure how to interpret this or find the root cause. What is more, the situation becomes much worse when I introduce the second slave (one slave is more stable than two slaves).

What could cause this? Should I move away from i2c to a different bus-like solution?

What is more, the situation becomes much worse when I introduce the second slave (one slave is more stable than two slaves).

That leads me to suspect that the capacitance of the bus is too high for the high speed. Have you tried going back to 100kHz? Is that better for the stability? Did you check the signal with a scope? What pull-up do you use currently?

pylon: That leads me to suspect that the capacitance of the bus is too high for the high speed. Have you tried going back to 100kHz? Is that better for the stability? Did you check the signal with a scope? What pull-up do you use currently?

That indeed worked a lot better, 100kHz with resistors between 3.3k and 10k (currently using 3.3k) as pull-up produces stable results. The downside is that there is now a visible delay between each Colorduino as the speed of the bus is not high enough, I assume. At 400kHz there was no visible delay, but packet loss occurred.

Is there a way to somehow decrease the delay while still remaining at 100kHz? Or is there a way I can have a stable 400 kHz bus?

Unfortunately I do not have a scope to visualise what is going on.
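As a rough sanity check on the delay, the raw time one packet occupies the bus can be estimated from the clock rate. This is only a back-of-the-envelope sketch: the function name transferMicros is invented, the 68-byte payload used in the comment is an assumption based on the Wire.available()>67 check quoted earlier, and start/stop conditions, clock stretching and any delay() calls or processing time in the sketches are ignored.

```cpp
#include <stdint.h>

// Estimated bus time in microseconds for one I2C write:
// one address byte plus the payload, each 8 data bits plus 1 ACK bit.
uint32_t transferMicros(uint32_t payloadBytes, uint32_t clockHz) {
  uint32_t bits = (payloadBytes + 1) * 9;
  return (uint32_t)((uint64_t)bits * 1000000ULL / clockHz);
}

// With an assumed 68-byte payload:
//   transferMicros(68, 100000) gives 6210 (about 6.2 ms per packet)
//   transferMicros(68, 400000) gives 1552 (about 1.6 ms per packet)
```

Three such packets per Colorduino come to roughly 19 ms at 100kHz by this estimate, so it may be worth measuring how much of the visible delay comes from the bus itself and how much from the code around it.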

To increase the speed, reduce the capacitance of the bus and make the pull-ups stronger (reduce the resistance of the pull-ups).

Capacitance is a function of area and distance. If you have a ribbon cable with SCL and ground running close together then they have a small distance over the length of the cable. If you split the cable into separate wires and try to route those wires as far apart as possible, the capacitance will be reduced.

10-15cm should be no problem with capacitance unless you used coax or twisted-pair. I would not spend a lot of effort resolving that. Try the pullups first.

Or is there a way I can have a stable 400 kHz bus?

If you have nothing other than the Arduinos on the I2C bus you can go as low as 1k5 for the pull-up. This should give you considerably better results. Trying the tips of MorganS to decrease the capacitance is definitely a good idea too.
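The usual way to bracket the pull-up value can be sketched as below. The formulas and limits are the ones given in the I2C-bus specification: a device only has to sink 3 mA while holding the line at 0.4 V, and the 30% to 70% rise time of an RC line is 0.8473 * R * C, with a maximum rise time of 300 ns in fast mode (400kHz) and 1000 ns in standard mode (100kHz). The function names are made up, and the 100 pF bus capacitance in the comment is purely an assumed figure, since nobody has measured this bus.

```cpp
// Smallest allowed pull-up: the resistor must not force a device to sink
// more than iSinkAmps while it holds the line down at vLowVolts.
double minPullupOhms(double vccVolts, double vLowVolts = 0.4,
                     double iSinkAmps = 0.003) {
  return (vccVolts - vLowVolts) / iSinkAmps;
}

// Largest usable pull-up: the line must rise within riseTimeSeconds
// against the total bus capacitance.
double maxPullupOhms(double riseTimeSeconds, double busCapacitanceFarads) {
  return riseTimeSeconds / (0.8473 * busCapacitanceFarads);
}

// With Vcc = 5 V and an assumed 100 pF of bus capacitance:
//   minPullupOhms(5.0)              is about 1533 ohms
//   maxPullupOhms(300e-9, 100e-12)  is about 3541 ohms  (400kHz)
//   maxPullupOhms(1000e-9, 100e-12) is about 11802 ohms (100kHz)
```

Under that assumed capacitance, 10k would be acceptable at 100kHz but well above the limit at 400kHz, which would fit what was observed in this thread.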