Data received on UART RX causes I2C to hang (Photos)

Hey all,

TL;DR
I2C on the slave device hangs when data is received on the slave's UART RX.

Project:
I'm developing a device that has a master Arduino Mega2560 and two slave Arduino Mega2560s. The slaves each communicate with their own MPU6050 IMU sensor over I2C (using a logic level converter) and are located remotely (~5m) from the master. The master and slaves communicate over RS485, using a MAX485 UART to RS485 breakout board to convert UART TTL to RS485. The master will have two of these boards, so that each slave has its own separate connection to the master. The master will request motion data at 20Hz from the slaves, compare the values and then drive some LEDs based on these values. The request from the master is in the form of the character "1" being sent to the slave using Nick Gammon's RS485 library. I'm using the blocking version on the master, and the non-blocking version on the slave.

Libraries:
Libraries being used are as follows;

Schematic:
The schematic for the project, which shows only one slave, can be seen below. The cable used in my test setup is 3m CAT5E, unshielded twisted pair. I'm using just the one pair. My pullup resistor configuration results in a sink current of ~2.5 mA for the driving I2C device. Maximum is 3 mA so I can't go much lower with my pullup resistor values.

Photos of Test Setup:
These photos show the test setup on my desk. The Uno and connected hardware on one half of the breadboard are not being used.

Problem:
I'm testing with just one slave at the moment. The issue I'm seeing is that the 'request' data packet that is sent from the master to the slave at 20Hz causes the I2C on the slave to hang. The slave hangs a large majority of the time in the loop() part of the sketch in calls which request data from the MPU6050 sensor, primarily 'mpu.getFIFOCount()' and 'mpu.getFIFOBytes'. Other times it hangs at startup, after a reset for example, on initializing the sensor during setup() such as 'mpu.dmpInitialize'. I've tried using the sensor without the built in DMP feature (so just reading raw accel and gyro values) and still get the hangs.

I'm using a timeout in the Wire library which I believe worked in a previous project, but it is not 'timing out' the hang in this case. Whether this is because the timeout isn't implemented properly, or because the issue at hand is not the type of problem that the timeout solves, I'm not sure. [Update: My Arduino IDE was not recompiling the modified timeout version of the Wire library when uploading to the Mega2560, that's why the timeout wasn't working. It's now fixed and the timeout works.] However, the timeout is a band-aid fix, and I2C hangs are often due to a hardware issue, so I don't want to rely on it; I want to find the underlying issue. The same goes for watchdog timers.

I'm viewing the I2C signals and the data requests on UART RX on the slave using an oscilloscope. If I disable the request signal, such as by turning off the master, the I2C runs fine indefinitely. Also, the slower I send the request, the less often it hangs.

Here are some oscilloscope screenshots showing the hang. The channels from top to bottom are SDA (yellow), SCL (light blue), UART RX (purple), UART TX (dark blue) on the slave. Note the oscilloscope mode I used allowed me to capture the hang (I don't have timeout trigger) but resulted in low resolution/sample-rate display, so some information may not be shown (noise/spikes).

There are others who have had the same problem with UART RX signals causing I2C hangs:

An Apparent Solution:
Something that seems to work is to use a resistor (am testing ~300 ohms) in series on SDA and SCL. I'm also trialing another resistor on the RX line (am testing ~1k ohms). This approach was taken from this thread here and also this webpage under the heading 'Don't "despike" your signal lines, add a resistor instead'. So far in testing, this approach allows the device to run for hours without hanging while accepting the request signal. It seems that the series resistor and the line capacitance create a low pass filter of sorts, which helps reduce noise on the line.
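To put a rough number on the "low pass filter of sorts" idea, here is a minimal sketch that computes the corner frequency of a first-order RC filter, f_c = 1 / (2 * pi * R * C). The function name is made up for illustration, and the 100 pF line capacitance in the note below is purely an assumed figure, not something I measured on this setup.

```cpp
#include <cassert>
#include <cmath>

// Corner (-3 dB) frequency of a first-order RC low-pass filter:
//   f_c = 1 / (2 * pi * R * C)
// seriesOhms: the series resistor value.
// lineFarads: ASSUMED capacitance of the line after the resistor (hypothetical,
//             should be replaced with a measured or estimated value).
double rcCutoffHz(double seriesOhms, double lineFarads) {
    assert(seriesOhms > 0.0 && lineFarads > 0.0);
    const double kPi = 3.14159265358979323846;
    return 1.0 / (2.0 * kPi * seriesOhms * lineFarads);
}
```

With an assumed 100 pF, rcCutoffHz(300.0, 100e-12) comes out at about 5.3 MHz and rcCutoffHz(1000.0, 100e-12) at about 1.6 MHz. Both are well above a 100kHz I2C clock, so under that assumption the resistors would mostly attenuate fast spikes rather than the signals themselves.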

Understanding the Cause:
So the reason for this post is that I don't understand why this solution works so far and I want to discuss it here for my sake and for future enthusiasts who come across the same issue. This device will also be something that will be used by others and I don't feel comfortable moving forward with it without understanding the issue properly and ensuring it is solved properly.

Two of the oscilloscope screenshots show the SCL signal starting to go LOW, but stopping before it reaches LOW and returning to HIGH. I have seen this many times when looking at these hangs on my oscilloscope, though it doesn't always happen. It's almost as if the Mega2560 releases SCL whilst in the middle of pulling it LOW, because of the incoming signal on UART RX. I thought perhaps it's an interrupt issue, but the series resistor solution seems to rule that out (?).

The series resistor solution seems to suggest noise being the issue, but then does that mean the RX signal is noisy? And if so why does a noisy UART RX signal hang the I2C? They must be electrically coupled somewhat?

I have tested leaving the RS485 transmit enabled on the master but with no actual request signal being sent to the slave to see if external EMI/RF noise is being coupled into the 3m CAT5E cable but I didn't get any hangs with this test.

Any input much appreciated.

Check whether your MPU6050 board is 5V compliant, and remove the level shifter.

I just read about RS-485 drivers burning out after a short time. For testing purposes I'd also remove these, and connect the RX/TX lines directly. Then you know whether the hardware is your only problem.

DrDiettrich,
Thanks for the suggestions. Good idea to simplify the setup down to try and find any hardware issues.

DrDiettrich:
Check whether your MPU6050 board is 5V compliant, and remove the level shifter.

The MPU6050 sensor itself is not 5V tolerant when powered with Vcc at 3.3V. The MPU6050 breakout board schematic shows that 5V is supplied to the VCC pin of the breakout board, which is regulated down to 3.3V for supply to the MPU6050 sensor. The MPU6050 datasheet states the absolute maximum input/VLOGIC voltage on SDA/SCL is Vcc+0.5V, therefore 3.3V+0.5V, so 3.8V. So while I couldn’t run the I2C bus with pullups to 5V in order to remove the logic level shifter, I was able to pull it up to 3.3V instead by doing the following.

Mega2560 Internal and External Pullups to 5V:
I wanted to make a note for anybody reading this in the future: the Mega2560 has internal pullup resistors inside the processor to 5V (with a value of 20k-50k) and external pullup resistors on SCL/SDA on the board to 5V (with a value of 10k). To avoid pulling the I2C bus up to 5V and possibly damaging the MPU6050, I desoldered the external pullups on the Mega2560 board (they are in the form of a small package of four resistors, but only two are used for SDA/SCL; the other two are not connected, so the package is safe to remove). The internal pullups are disabled by changing the following lines in the ‘twi.c’ library file that comes with Arduino IDE. Simply replace the 1’s with 0’s as follows.

Before:

// activate internal pullups for twi.
  digitalWrite(SDA, 1);
  digitalWrite(SCL, 1);

After:

// do not activate internal pullups for twi.
  digitalWrite(SDA, 0);
  digitalWrite(SCL, 0);

Then save it, restart your Arduino IDE and reupload your sketch. When ‘Wire.begin()’ is called in your sketch, this code is part of the code that is executed. If one was to not make this change and instead just add these lines after ‘Wire.begin()’ in their sketch like so;

In sketch:

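#include <Wire.h>

void setup() {
    Wire.begin();          // unmodified twi.c enables the internal pullups here
    digitalWrite(SDA, 0);  // ...and they are only disabled again afterwards
    digitalWrite(SCL, 0);
}

void loop() {}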

… then what happens is the internal pullups are enabled when ‘Wire.begin()’ is executed (if you didn’t modify the twi.c file) and then immediately disabled when the two ‘digitalWrite…’ lines are executed. The result is a quick voltage spike which could damage your non-5V-tolerant sensor. Here is an oscilloscope screenshot of the voltage spike shown on the top two channels SDA and SCL. They got to 4.4V and 4.2V respectively which is above the 3.8V limit of my sensor.

Re-testing with Logic Level Shifter removed:
So by disabling all pullups to 5V I ran the bus with 1.8k pullups only to 3.3V. The VIH of the Mega2560 is 0.6*Vcc, so 3V, so it was low enough to work. I am running the I2C at 100kHz clock rate and the measured rise time was ~500ns, where the required is <1000ns for 100kHz, so my setup is within spec. With all this, it did not change the hanging situation. It still behaved as before where it would hang within 1 to 50 seconds without the series resistors on SCL/SDA/RX, and hasn’t hung yet with the series resistors.
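As a sanity check on those numbers, here is a small sketch of the arithmetic. It assumes the rise time is measured between 30% and 70% of the pullup voltage and that the line charges as a simple RC circuit, in which case t_r = ln(0.7/0.3) * R * C, or about 0.8473 * R * C. The function names are hypothetical, and if the ~500ns was measured between different thresholds the capacitance estimate will be off.

```cpp
#include <cassert>
#include <cmath>

// Estimated bus capacitance from a 30%-to-70% rise time on an RC-charged line:
//   t_r = ln(0.7 / 0.3) * R * C   =>   C = t_r / (ln(0.7 / 0.3) * R)
// ASSUMPTION: the rise time was measured between the 30% and 70% levels.
double busCapacitanceFromRiseTime(double riseTimeS, double pullupOhms) {
    assert(riseTimeS > 0.0 && pullupOhms > 0.0);
    return riseTimeS / (std::log(0.7 / 0.3) * pullupOhms);
}

// Margin between the pulled-up HIGH level and the receiver's V_IH threshold,
// where V_IH is given as a fraction of the receiver's supply voltage.
double highLevelMarginV(double pullupVolts, double vihFraction, double vccVolts) {
    return pullupVolts - vihFraction * vccVolts;
}
```

With the figures quoted above, busCapacitanceFromRiseTime(500e-9, 1800.0) gives roughly 330 pF, and highLevelMarginV(3.3, 0.6, 5.0) gives about 0.3V of headroom above VIH, which is not a lot of noise margin.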

DrDiettrich:
I just read about RS-485 drivers burning out after a short time. For testing purposes I’d also remove these, and connect the RX/TX lines directly. Then you know whether the hardware is your only problem.

I removed the UART to RS485 breakout boards from the test setup and connected TX to RX and RX to TX directly with ~20cm jumper wires and the test results are unchanged. The hanging behaviour is the same as above, where it hangs quickly without series resistors, and hasn’t hung yet with series resistors.

New Test Setup Schematic:
The new test setup is now as follows (the series resistors are not shown).

Test Setup Power Arrangement:
I forgot to mention the power supply arrangement, which isn’t shown in the schematics. I have used two different power supply sources but both result in the exact same behaviour as described above. One is powering both master and slave by separate USB cables to my Windows PC, and the other is powering the slave via a 12V, DC, centre-pin-positive, 1A power adapter (wall-wart) and passing power to the master by connecting Vin and GND on the slave to Vin and GND on the master via ~20cm jumper wires.

I2C Timeout in Wire Library:
The reason my Wire timeout wasn’t working was that my Arduino IDE was using previously compiled Wire library files when uploading sketches to the Mega2560, and not the newly modified ones that included the timeout code. I managed to fix this by enabling ‘verbose output’ in File > Preferences for ‘compilation’ and ‘upload’ in Arduino IDE. It showed where it was grabbing the previously compiled Wire library files from, so I just modified them there. The new Wire timeout code worked and the I2C connection reset shortly after each hang. But this is somewhat of a band-aid fix and I would prefer to understand the underlying issue.
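For anyone wondering what a timeout like this boils down to: the hang comes from a busy-wait loop that never sees its exit condition, and the fix is to bound that loop with a clock check. Below is a minimal sketch of the idea only. It is not the actual patch I applied to the Wire library, and the names waitWithTimeout and WaitResult are made up for illustration. The clock is passed in so the logic can be tried off-target; on an Arduino it could be micros().

```cpp
#include <cassert>
#include <stdint.h>

// Outcome of a bounded wait.
enum class WaitResult { Done, TimedOut };

// Polls `done` until it returns true or `timeoutUs` microseconds have elapsed.
// `nowUs` is any callable returning a free-running microsecond counter.
// The unsigned subtraction keeps the elapsed-time check correct when the
// 32-bit counter wraps around.
template <typename Done, typename Clock>
WaitResult waitWithTimeout(Done done, Clock nowUs, uint32_t timeoutUs) {
    assert(timeoutUs > 0);
    const uint32_t start = nowUs();
    while (!done()) {
        if (static_cast<uint32_t>(nowUs() - start) >= timeoutUs) {
            return WaitResult::TimedOut;  // caller can reset the bus and retry
        }
    }
    return WaitResult::Done;
}
```

On a timeout the caller would typically reinitialise the I2C peripheral before retrying, which is what lets the connection recover shortly after each hang.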

I’ll continue to test other things to see if I can find any other clues about the issue.

All input welcome!

If a timeout is ever required, it indicates defective hardware to me, by design or hazard. With today's fake chips and boards I'd try to find a better replacement.

Because SCL and SDA are actively driven low only, I see no reason for series resistors. An output cannot draw more current than supplied by the pullup resistors, and AFAIR up to 3mA is allowed. The data sheet also says that the SCL/SDA lines must not be configured for output, instead the TWEN bit activates the special open-drain TWI circuitry. But if you have series resistors in place, you can use them to measure the direction and amount of current to the bus lines. Removing the on-board pullup resistors is not a good idea, except possibly for devices in the middle of a bus.

I've never used an MPU6050 myself, just for the 5V/3.3V issue, but now it might be time to get my hands dirty...

Until then the series resistors suggest to me very bad wiring, or a defective voltage regulator on the module. The voltage regulator may be broken in several ways, supplying (much) more or less than 3.3V to the MPU, or it may oscillate. I'd think that you can check the voltage, supplied to the MPU, just in case of a hanging connection. If you use the 3.3V from the master, to power the 3.3V pullups, that voltage also may be faulty. I'd prefer to power the 3.3V side of the level shifter, and the related pullups, from the 3.3V output of the MPU board.

Every bus should be terminated at both ends, and if a level shifter is used, also at both sides of the level shifter. Otherwise reflections may occur on the bus lines, with possibly dangerous consequences. Did you check the signal shapes and levels at either end of the bus?

DrDiettrich,

Thanks for your reply.

DrDiettrich:
If a timeout is ever required, it indicates defective hardware to me, by design or hazard. With today's fake chips and boards I'd try to find a better replacement.

I agree with your I2C timeout remark. I'm not comfortable with using a timeout as a 'solution' and instead prefer to find the underlying issue. One Mega2560 I own is a clone with no brand, and the other Mega2560 is a 'Funduino' branded board. A third board that I'm using as a second slave at times is a 'Duinotech' branded Uno, which exhibits the same hanging behaviour as described and is 'fixed' by using series resistors also. I'll be using genuine Arduino boards in the final setup.

DrDiettrich:
I've never used an MPU6050 myself, just for the 5V/3.3V issue, but now it might be time to get my hands dirty.

If you were interested in duplicating my test setup and seeing if you have the same issue that would be very helpful. I'd be happy to purchase the MPU6050 and have it sent to you, they're very cheap. The breakout board is a GY-521 model.

DrDiettrich:
Until then the series resistors suggest to me very bad wiring, or a defective voltage regulator on the module. The voltage regulator may be broken in several ways, supplying (much) more or less than 3.3V to the MPU, or it may oscillate.

I have multiple of these MPU6050 breakout boards (GY-521) and of the three I've tested so far, each exhibits the same behaviour. Not ruling the possibility out however.

DrDiettrich:
I'd think that you can check the voltage, supplied to the MPU, just in case of a hanging connection. If you use the 3.3V from the master, to power the 3.3V pullups, that voltage also may be faulty. I'd prefer to power the 3.3V side of the level shifter, and the related pullups, from the 3.3V output of the MPU board.

Unfortunately there is no readily accessible 3.3V pin or node on the MPU6050 breakout board, and the components are too small for me to solder directly onto. Therefore I can't attach my oscilloscope probe anywhere for measuring the 3.3V supply on the sensor, nor can I attach additional pullups to that same 3.3V supply.

I'd like to point out that my test setup schematic in post #2 shows 1.8k pullup resistors to 3.3V. This is actually 1.8k equivalent - a combination of the 4.7k pullup resistor already on the MPU6050 breakout board (which is connected to the on board regulated 3.3V supply), and an additional 2.2k external pullup resistor that I installed which connects to the 3.3V supply from the Mega2560. When I used the level shifter I powered the LV side of the shifter from the Mega2560 3.3V supply.
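For anyone checking the pullup arithmetic, here is a short sketch of the two formulas involved: the parallel combination R = R1 * R2 / (R1 + R2), and the current a device has to sink when it pulls the line LOW, I = (Vpullup - VOL) / R. The function names are hypothetical and VOL is assumed to be 0V as a worst case. Note that with exactly 4.7k and 2.2k the formula gives a value nearer 1.5k, so the 1.8k figure should be treated as approximate.

```cpp
#include <cassert>
#include <cmath>

// Equivalent resistance of two pullups in parallel: R = R1 * R2 / (R1 + R2).
double parallelOhms(double r1Ohms, double r2Ohms) {
    assert(r1Ohms > 0.0 && r2Ohms > 0.0);
    return (r1Ohms * r2Ohms) / (r1Ohms + r2Ohms);
}

// Current the driving device must sink while holding the line LOW:
//   I = (V_pullup - V_OL) / R
// ASSUMPTION: pass volVolts = 0.0 for a worst-case (highest current) estimate.
double sinkCurrentA(double pullupVolts, double volVolts, double pullupOhms) {
    assert(pullupOhms > 0.0);
    return (pullupVolts - volVolts) / pullupOhms;
}
```

parallelOhms(4700.0, 2200.0) is about 1499 ohms, and sinkCurrentA(3.3, 0.0, 1499.0) is about 2.2 mA, which is still under the 3 mA maximum.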

Moving to a Better Test Setup:

I purchased a new MPU6050 sensor today, soldered header pins to it and put it straight into the test setup replacing the previous MPU6050 sensor and it didn't hang in a couple of tests. I thought ok maybe it's sensor related, so I put the old sensor back in and it also didn't hang in a couple of tests. This has left me a little bewildered and is making me think again it could be noise related or connections on the breadboard or something. My current test/prototype setup uses a lot of cheap and somewhat long jumper wires as shown in the photo in post #1, an eBay breadboard, an unshielded CAT5E cable, I believe an unshielded power adapter which is plugged into a power board (which is plugged into another power board, and that power board is plugged into another power board which plugs into the wall. 3 power boards total, eek!). There are around 9 devices connected to these power boards. When I had an I2C hang I was able to 'unhang' it by turning off my fan in the room which provided a large enough spike to, I believe, provide an SCL clock signal. I've also been able to hang the I2C by turning the fan on and off sometimes, but not always.

So I've decided to move away from the current test/prototype setup and begin moving towards the final setup, which will use genuine Arduino boards, house each board in its own shielded case, with all hardware on PCBs with no jumper wires where possible, using shielded RS485 cables, and a better regulated, noise free power supply. If the problem persists with the new setup then I'll pursue more tests again.

DrDiettrich:
Every bus should be terminated at both ends, and if a level shifter is used, also at both sides of the level shifter. Otherwise reflections may occur on the bus lines, with possibly dangerous consequences. Did you check the signal shapes and levels at either end of the bus?

Another good point. As above I'll begin moving towards the final setup and check all signal waveforms and levels then. Should be within the next week or so, I'll keep updating the thread as I go.

Many thanks.