Strange Problem with I2C Communication

Hi,

I wrote a library for the MCP3221 (a 12-Bit ADC converter) some time ago and published it on Github.

The library works perfectly with a single MCP3221 device connected to the Arduino, but when 2 MCP3221s (with different I2C address, of course) are connected in parallel, communication with the first of the 2 devices is lost.

I’ve spent a lot of time trying to track down the exact issue causing this, and I think I managed to narrow things down quite significantly.

However, for the life of me, I can’t figure out what the exact problem is or how to fix it.

Here a summary of what I’ve discovered so far in my investigation:

  • I tested the 2 devices independently and without the library (i.e. by interfacing with them directly via Wire) and verified that both are working correctly.

  • Each of the devices has a different I2C address which is pre-set in hardware (0x4D & 0x4E), so definitely no address clash.

  • I tested the 2 devices in parallel but again without the library and they work perfectly together - so I know for certain that the problem lies somewhere in the library code itself.

  • I’m almost certain the problem lies with the constructor function of the library.

The reason I say this is that when I connect just 1 of the 2 devices and run the library’s built-in I2C device info sketch (included in the ‘examples’) , there is no problem as can be seen in the attached screen-shots below.

However, if I connect both devices in parallel and run a modified version of the said device info sketch (tweaked for checking 2 devices):

#include "MCP3221.h"
#include "utility/MCP3221InfoStr.h"

MCP3221 slave1(0x4D);
MCP3221 slave2(0x4E);

void setup() {
    Serial.begin(9600);
    Wire.begin();
    while(!Serial);
    print_current_settings(slave1);
    print_current_settings(slave2);
}

void loop() {}

void print_current_settings(MCP3221 slave) {
    Serial.print(F("\n\nCURRENT SETTINGS:\n"));
    Serial.print(MCP3221InfoStr(slave));
    Serial.print(F("\n\n"));
}

Then the feedback is that the I2C address of the first device (0x4D) has been deleted somehow (or set to ‘0’) by the mere presence of the constructor for the second device (0x4E) - also see attached screen-shot below.

Reversing the order of construction yields the exact same results only for the other device (communication with 0x4E is good, while the I2C address of 0x4D gets erased).

So it looks like constructing a second MCP3221 instance somehow messes-up the first MCP3221, but I’m totally baffled as to why that should happen :o

The library’s constructor defines each MCP3221 instance as having its own unique address, and as far as I can tell, it shouldn’t be interfering with existing ones.

Help please?

mcp3221-slave1.png

mcp3221-slave2.png

Help please?

Where is your code? How are we supposed to help you determine what you did wrong in your code when you haven't posted your code?

PaulS:
Where is your code? How are we supposed to help you determine what you did wrong in your code when you haven't posted your code?

But I have posted it... look above again: there's a link to Github (hyper-linked with the word: Github)

Regardless, here it is again more explicitly: https://github.com/nadavmatalon/MCP3221

  1. Does the problem exhibit itself even if you don't use the function MCP3221InfoStr(slave) ? That function adds a considerable complexity to any troubleshooting.
  2. Have you tried the I2C scanner with both MCP322 devices connected simply to rule out problems with I2C pull up resistors etc ?

I totally agree with 6v6gt, I get the same feeling.

Looking into the code: The function MCP3221InfoStr() creates a buffer on the stack. Then it creates an object that uses that buffer, which is on the stack. Then it returns that object… the object with a pointer to the buffer on the stack that just has vanished when the function returns.

6v6gt:

  1. Does the problem exhibit itself even if you don't use the function MCP3221InfoStr(slave) ? That function adds a considerable complexity to any troubleshooting.

  2. Have you tried the I2C scanner with both MCP322 devices connected simply to rule out problems with I2C pull up resistors etc ?

To Question 1:

Yep, unfortunately the error occurs even if the 'info str' function isn't used.

I only used 'info str' to see what the device settings looked like and that's where I spotted the problem with the I2C address of the first device getting deleted when the second device is constructed as described above.

When I started investigating, I've tried the simplest of sketches using the library, i.e. construct 2 x MCP3221 devices and just use the 'getVoltage()' function and the problem persisted (i.e. couldn't get a voltage reading from the first device).

To Question 2:

Yep. I've used I2C scanner and it identifies the 2 devices at the correct I2C addresses without a problem.

We like to see what you are trying.

That is how a forum works. Don't tell us that you have run some tests, but show your tests and show the result. You have to show us what is not working and what is working. You could be making the same mistake over and over again. That is why you ask our help in the first place :wink:

Can you use the function getRawData() and nothing else ? and determine if the data is indeed from each seperate sensor. Maybe some initialization is needed. Please show us the sketch.

You use a lot of SRAM memory, can you try a Arduino Mega 2560 instead of an Arduino Uno / Nano ?

Koepel:
I totally agree with 6v6gt, I get the same feeling.

Looking into the code: The function MCP3221InfoStr() creates a buffer on the stack. Then it creates an object that uses that buffer, which is on the stack. Then it returns that object... the object with a pointer to the buffer on the stack that just has vanished when the function returns.

You may be right and there may be a problem there too (though notice that everything works well with a single device).

However, as I noted in my response to 6v6gt, the problem persists even if the whole 'com str' extension isn't used at all, just the core library functions.

Koepel:
We like to see what you are trying.

That is how a forum works. Don't tell us that you have run some tests, but show your tests and show the result. You have to show us what is not working and what is working. You could be making the same mistake over and over again. That is why you ask our help in the first place :wink:

Can you use the function getRawData() and nothing else ? and determine if the data is indeed from each seperate sensor. Maybe some initialization is needed. Please show us the sketch.

You use a lot of SRAM memory, can you try a Arduino Mega 2560 instead of an Arduino Uno / Nano ?

Ok, Koepel, though a slightly less aggressive and condescending tone would be nice...

I didn't just say I 'run some tests' as you put it. I actually explained in detail what tests I run and the results I got. I'd be very happy to run additional tests, and post the results in any way that might be helpful - all you need to do is ask.

Btw, I'm not a newbie to arduino or to programming. Before posting this thread, I've spent many hours running multiple tests and checking both the code and hardware. That, of course, doesn't mean that I haven't commited the same mistake repeatedly, but if this was a simple bug, I'm fairly certain I would have pinpointed it by now.

In general, please notice that the whole 'info str' and 'com str' structures are merely optional add-ons and aren't necessary for running the core library. I've done most of the testing using just the core functions and only used the 'info str' in the descriptions above as it gives a clear and concise overview of the problem.

To your specific question, I did test a setup with the 2 devices connected and just the 'getRawData()' to retrieve voltage readings, however, the same problem occurred, i.e. connection with the first device failed.

Well, you could try the library used in this example to see if you get the same effect :

And, if so, maybe then try stronger pullup resistors on SDA and SCL and some decoupling capacitors.

@SnowCrashAr sorry if I did something wrong. I was only trying to help and English is not my native language. Can you tell what could be better, because I have no clue :smiley_cat:

Some on this forum have the punchline: the problem is in the part that you are not showing.
I made up Koepel's law: "Fifty percent chance that the problem is in the section that was ignored from the beginning, because you thought that the problem could not be caused by that section" :smiley:
Showing the complete sketch and showing the result is something that is also mentioned a lot on forums.
When you do not show the sketch that you used for a test, I can not follow the tree of functions that are called.

Again, I totally agree with 6v6gt, try another library.

Koepel:
@SnowCrashAr sorry if I did something wrong. I was only trying to help and English is not my native language. Can you tell what could be better, because I have no clue :smiley_cat:

Some on this forum have the punchline: the problem is in the part that you are not showing.
I made up Koepel's law: "Fifty percent chance that the problem is in the section that was ignored from the beginning, because you thought that the problem could not be caused by that section" :smiley:
Showing the complete sketch and showing the result is something that is also mentioned a lot on forums.
When you do not show the sketch that you used for a test, I can not follow the tree of functions that are called.

Again, I totally agree with 6v6gt, try another library.

No prob, Koepel, and thanks for this. There's no doubt that sometimes things that sound very reasonable in one language, can sound very differently in another. I think it'd be best if we put it down to miscommunication and forget all about it :slight_smile:

As for 6v6gt's suggestion of trying another library, I'm not sure what the benefit will be as I'll explain in the next post.

Also, I've got another reason to suspect the constructor above anything else as I'll explain in the post after that.

6v6gt:
Well, you could try the library used in this example to see if you get the same effect :

https://github.com/SparkysWidgets/MCP3221-Library/blob/master/examples/MCP3221Example.ino

And, if so, maybe then try stronger pullup resistors on SDA and SCL and some decoupling capacitors.

Thanks for the suggestion, 6v6gt, but I can’t see how using another library would help figuring out the problem with the current library. Here’s why:

If we put all libraries to the side and use direct communication with the 2 MCP3221 devices in the following way:

#include <Wire.h>

unsigned long timeNow;

void setup() {
      Serial.begin(9600);
      Wire.begin();
      timeNow = millis();
}

void loop() {
  if (millis() - timeNow >= 1000) {  
      Wire.requestFrom((byte)0x4D, byte(2));
      byte howmanyA = Wire.available();
      Serial.print("got ");
      Serial.print(howmanyA);
      Serial.print(" bytes from slave A\n\n");
      Serial.println((Wire.read()) << 8 | (Wire.read()));      
      Wire.requestFrom((byte)0x4E, byte(2));
      byte howmanyB = Wire.available();
      Serial.print("got ");
      Serial.print(howmanyB);
      Serial.print(" bytes from slave B\n\n");
      Serial.println((Wire.read()) << 8 | (Wire.read()));

      timeNow = millis();
  }
}

Then everything works fine.

That means there isn’t a problem with the hardware-side (pull-up resistors, etc.).

The problem must be in the software-side, and more specifically, with the core library code.

We also know that functions like ‘getRawData()’, ‘getVoltage()’, etc. aren’t problematic because they work with a single device (and even with 2 devices hooked-up, they always run per device, not per multiple devices at the same time).

All that only leaves the functions that relate to the use of more than one device and, unless I’m totally missing something, that is only the constructor.

In the next post, I’ll elaborate on another reason to suspect the constructor as the culprit.

Here is the other reason for suspecting the constructor as the source of the problem.

Compare the two codes below which are identical apart from the location of the constructors of the 2 x MCP3221 devices.

In the first example, the constructors of the 2 slaves are places at the beginning of the sketch:

#include "MCP3221.h"

const byte SLAVE1 = 0x4D;
const byte SLAVE2 = 0x4E;

unsigned long timeNow;

MCP3221 mcp3221A(SLAVE1);
MCP3221 mcp3221B(SLAVE2);

void setup() {
    Serial.begin(9600);
    Wire.begin();
    timeNow = millis();
}

void loop() {
    if (millis() - timeNow >= 750) {
        
        Serial.print(F("Slave A reading:\t"));
        Serial.print(mcp3221A.getVoltage());
        Serial.print(F("mV\n\n"));
        Serial.print(F("Slave B reading:\t"));
        Serial.print(mcp3221B.getVoltage());
        Serial.print(F("mV\n\n"));
        timeNow = millis();
    }
}

The above code doesn’t work, specifically: communication with the first device is lost and no readings can be obtained from it (see first attached screen-shot).

However, if we move the constructors into loop(), thereby constructing 2 new slave devices at each cycle like so:

#include "MCP3221.h"

const byte SLAVE1 = 0x4D;
const byte SLAVE2 = 0x4E;

unsigned long timeNow;

void setup() {
    Serial.begin(9600);
    Wire.begin();
    timeNow = millis();
}

void loop() {
    if (millis() - timeNow >= 750) {
        MCP3221 mcp3221A(SLAVE1);
        Serial.print(F("Slave A reading:\t"));
        Serial.print(mcp3221A.getVoltage());
        Serial.print(F("mV\n\n"));
        MCP3221 mcp3221B(SLAVE2);
        Serial.print(F("Slave B reading:\t"));
        Serial.print(mcp3221B.getVoltage());
        Serial.print(F("mV\n\n"));
        timeNow = millis();
    }
}

This code works fine and, that is, we obtain correct readings from both devices (see second screen-shot attached below).

Can anyone make sense of this?

bad_reading.png

good_reading.png

Thanks for the suggestion, 6v6gt, but I can't see how using another library would help figuring out the problem with the current library. Here's why:. . .

I’d have thought it would make perfect sense. It is easy to do. If the Sparkywidget version works, then look for differences between the two e.g. yours uses an explicit destructor and namespaces and his does not. I’m not saying it is relevant but as an example.

Edit:
In your latest test where you have created an object in the loop immediately before use, you could have demonstrated more. Better would have been also to create the two objects together at the top of the loop, then attempt to use them. This would demonstrate if the creation of the second object destroys the environment of the first.
My suggestion would be to start hacking stuff out of your library until only the bare minimum is left for a basic functionality test, checking each time if starts working. Then putting stuff back until it breaks again.

'static' members of a class are shared with all objects. I'm not sure if that is also for a static variable in a function of a object.
In the function smoothData() there is a "static unsigned int emAvg".

Can you turn off the smoothing by setting it to NO_SMOOTHING and try again ?

I'm not sure if that is the only problem, perhaps there is something else that I'm missing. Could you also print the sizeof( mcp3221A) ?

In the function smoothData() there is a "static unsigned int emAvg".

Well spotted ! Static variables and multiple instances don't mix well. Fortunate is that it failed so cleanly instead of sporadically exhibiting mysterious behaviour.

6v6gt:
I’d have thought it would make perfect sense. It is easy to do. If the Sparkywidget version works, then look for differences between the two e.g. yours uses an explicit destructor and namespaces and his does not. I’m not saying it is relevant but as an example.

Edit:
In your latest test where you have created an object in the loop immediately before use, you could have demonstrated more. Better would have been also to create the two objects together at the top of the loop, then attempt to use them. This would demonstrate if the creation of the second object destroys the environment of the first.
My suggestion would be to start hacking stuff out of your library until only the bare minimum is left for a basic functionality test, checking each time if starts working. Then putting stuff back until it breaks again.

As suggested, tested the 2 x device setup with the 2 constructors at the beginning of loop() as follows:

#include "MCP3221.h"

const byte SLAVE1 = 0x4D;
const byte SLAVE2 = 0x4E;

unsigned long timeNow;

void setup() {
    Serial.begin(9600);
    Wire.begin();
    timeNow = millis();
}

void loop() {
    if (millis() - timeNow >= 750) {
        MCP3221 mcp3221A(SLAVE1);
        MCP3221 mcp3221B(SLAVE2);
        Serial.print(F("Slave A reading:\t"));
        Serial.print(mcp3221A.getVoltage());
        Serial.print(F("mV\n\n"));
        Serial.print(F("Slave B reading:\t"));
        Serial.print(mcp3221B.getVoltage());
        Serial.print(F("mV\n\n"));
        timeNow = millis();
    }
}

No difference in the result - same problem (the second constructor messes-up the first object) - see attached screen-shot of the result.

Edit: forgot to mention that I downloaded the ‘SW_MCP3211’ library and tried to run their own example, but got a bunch of compiling errors so gave up on it (no point in trying to diagnose a bug in my library by comparing it to another buggy library).

bad_reading_2.png

Koepel:
‘static’ members of a class are shared with all objects. I’m not sure if that is also for a static variable in a function of a object.
In the function smoothData() there is a “static unsigned int emAvg”.

Can you turn off the smoothing by setting it to NO_SMOOTHING and try again ?

I’m not sure if that is the only problem, perhaps there is something else that I’m missing. Could you also print the sizeof( mcp3221A) ?

Thanks for the suggestions, Koepel.

I’ve noticed the ‘static’ as well a few days ago and removed it, but it made no difference to the problem.

I’ve also tried testing with no smoothing and still no difference.

Although you’re absolutely right that the ‘static’ should be removed, it can’t be the source of the problem here because the problem persists even if I convert ‘getRawVoltage()’ to a public function and run it without any filtering.

Regarding the ‘sizeof()’ question, the response is ‘53’ for both object when using this code:

#include "MCP3221.h"

const byte SLAVE1 = 0x4D;
const byte SLAVE2 = 0x4E;

unsigned long timeNow;

void setup() {
    Serial.begin(9600);
    Wire.begin();
    timeNow = millis();
}

void loop() {
    if (millis() - timeNow >= 750) {
        MCP3221 mcp3221A(SLAVE1);
        Serial.print(F("Slave A:\t"));
        Serial.print(sizeof(mcp3221A));
        MCP3221 mcp3221B(SLAVE2);
        Serial.print(F("\n\n"));
        Serial.print(F("Slave B:\t"));
        Serial.print(sizeof(mcp3221B));
        Serial.print(F("\n\n"));
        timeNow = millis();
    }
}[\code]

See a printout of the result in the attached screenshot.

Btw, did you get a chance to have a close look at the constructor() function in the library? I still believe that’s where the issue truly lies, but can’t find anything wrong with it…

sizeof.png

OK. This is how I would consider proceeding:

Make getRawData() public.
Change your test sketch to use getRawData() instead of getVoltage(). Add #include <Wire.h> to this sketch.
Strip everything out of the .h and .cpp file that is not absolutely necessary to support getRawData().
If you succeed in getting a working version of your library, then add back the code until it breaks.
Post both the “pre-break” code and the “post-break” versions of your sketch, .h and .cpp code.

If you don’t even get that far, post the most minimal version of your test sketch, .h and .cpp code
which demonstrates the problem.