Is this a bug?

Hello,

I have a sizeable (30K) chunk of code running on a mega and after adding the latest tweaks, I started getting odd behaviour. So I took the 'bones' of the problem code and ran it in a new sketch.

So, here goes ....

With code looking like this ..

char * loc[3] = {"abcd", "abcd", "abcd"};

char name [5];

void setup () { 
  Serial.begin(9600);  
  name = {'1','2','3','4','\0'};  
  strcpy(loc[0], name);
  Serial.println(loc[0]);
  Serial.println(loc[1]);
  Serial.println(loc[2]);
}

void loop () {
  
  // whatever
  
}

The output is this ....

1234
1234
1234

When I change the declaration of 'loc' to .....

char * loc[3] = {"abcd", "efgh", "ijkl"};

The output is ...

1234
efgh
ijkl

When I change the declaration of 'loc' back to the original (all initialisers the same) and replace the strcpy with ...

loc[0] = "1234";

I get this ...

1234
abcd
abcd

Am I missing something? :fearful:

You're probably missing some RAM. We're missing your code.

this doesn't look right name = {'1','2','3','4','\0'};

pretty sure array definitions must be beside the declaration

char name[] = {'1','2','3','4','\0'};

EDIT: also

strcpy(loc[0], name);

writes to whatever the byte value is eg. (char*) '1'.

I think you need the address

strcpy(&loc[0], name);

You're probably missing some RAM. We're missing your code.

My code uses plenty of strings (it generates a web page when it gets a GET) - the 'constant' strings are held in flashmem. I do a check on the 'spare' ram on startup - and for the moment it's fine.

As for the code, do you mean the original? I didn't want to post 1600 lines of code - anyway, in trying to figure out what was going on, I boiled it all down to the example given - which seems to be acting oddly.

The project is monitoring 7 sht71 temp/humidity sensors and presenting the readings as a web page. The sensors are named as per their locations and I initialise the location array as

char* loc[nSHT] = {" NC ", " NC ", ... and so on (NC = Not Connected)

I have a config file which indicates their actual location (Shed, Atelier, Tree, etc.) and I replace the 'NC' locations with the actual location in setup (I use a config file, because I don't want users messing with the core code - setup does all kinds of integrity checks to make sure the configurer hasn't done something daft).

My problems started then. I discovered that the array 'loc' (well, the items pointed to) had all taken the last value 'overwritten' with strcpy. In fact after posting this, I've changed the initialisation of 'loc' to ....

char* loc[] = {" NC1", " NC2", and so on ...

.. and it works fine.

It looks as though the declaration of 'loc' - where all the initialisers are identical - generates a single string and makes all the pointer elements point to this same (single) string - so replacing one replaces all.

The last case works because an assignment (as opposed to a strcpy) generates a new string, pointed to by loc0

@pYro_65

Thanks for the advice. In fact I'm sure you're correct about the initialiser - sadly either way has no effect. I've also tried all possible combinations of * and & (clutching at straws) to no avail.

As mentioned above, it seems to me that ...

a) Defining a char* array where all initialisations strings are identical results in an array of pointers all pointing to the same string. Thus changing any one results in them all (apparently) changing.

b) Doing the above with all initialisers different, results in separate and distinct strings.

c) The fact that loc[0]="newstr" gives the result it does, must mean that a new string is created and pointer[0] now points to that. loc[1] and loc[2] still point to the same string.

I hadn't realised strings were so tricky.

I hadn't realised strings were so tricky.

You misspelled pointers.

g8vkv: As for the code, do you mean the original? I didn't want to post 1600 lines of code - anyway, in trying to figure out what was going on, I boiled it all down to the example given - which seems to be acting oddly.

Let me use the analogy of visiting a doctor. Posting only a bit of code is like going to the doctor and saying "Here is what is wrong with me, please prescribe something." No Doctor I have ever visited let that pass.

You might be right about the code that is acting up. The trouble is, if you are running out of RAM then the code snippets you are posting may very well be just fine. When a microcontroller runs out of RAM its behavior becomes unpredictable.

Attaching your .pde to a post is an option.

Keep in mind the Mega 2560 only has 8K of RAM. Each string (and variable) you use, consumes RAM. If you are using the Ethernet library that takes a healthy chuck of RAM as well.

Let me use the analogy of visiting a doctor. Posting only a bit of code is like going to the doctor and saying "Here is what is wrong with me, please prescribe something." No Doctor I have ever visited let that pass.

You might be right about the code that is acting up. The trouble is, if you are running out of RAM then the code snippets you are posting may very well be just fine. When a microcontroller runs out of RAM its behavior becomes unpredictable.

Attaching your .pde to a post is an option.

Keep in mind the Mega 2560 only has 8K of RAM. Each string (and variable) you use, consumes RAM. If you are using the Ethernet library that takes a healthy chuck of RAM as well.

Thank you for the analogy. However, if you go to the quack with a pain in the neck, he normally doesn't demand to remove your shoes and examine your feet. Correspondingly, the behaviour of the code I posted was, in itself, strange - at least to me. It replicated the puzzling behaviour of the complete program and the solution I eventually found was the same for both.

The full code is pretty hefty and in 8 different files and I don't think posting it now would be especially useful.

Anyway, thanks for all the helpful and enlightening comments, but I think I have it figured out myself, as I stated previously.

Once again, the fact that the behaviour is exactly the same in the extracted code as in the complete gubbins and the solution is exactly the same for both, is a pretty good indicator that I've nailed the cause and found the solution - quod erat demonstrandum and all that.

The crux of the matter is that char* [n] array elements, when initialised, "share" identical string initialisers. Maybe this is a well-known fact, but I certainly hadn't realised (or even considered) it.

Also, as stated previously, I am well aware of the cost of strings - that's why I hold as many as I can in flash memory (thanks to Arduiniana, run by a really helpful and knowledgeable guy). I also keep an ongoing check on the amount of free RAM and it's doing pretty well at the moment, but I guess it might eventually turn into a problem.

Ciao

What are you using to build your example code? It does not compile for me when using the IDE.

The compiler does not like this line:

  name = {'1','2','3','4','\0'};

--- bill

What are you using to build your example code?
It does not compile for me when using the IDE.

Hi bill,

I’m using 0022 on ubuntu 11.10 - it goes OK for me - strange.

I think it really should be …

char* name[5] = {‘1’, ‘2’, ‘3’, ‘4’, ‘\0’};

… and remove “name =” which might be an issue.

I’m using standard libraries, but have had to patch MsTimer2 - but I can’t see that having an effect.

(I don't use Windoze either)

Post the exact .pde file that you are using.

--- bill

OK - attached (I hope)

The output I get from this file is …

1234
1234
1234

Here’s the code (just in case)…

char * loc[3] = {"abcd", "abcd", "abcd"};

char name [5] = {'1','2','3','4','\0'};

void setup () {
  
  Serial.begin(9600);
  
  strcpy(loc[0], name);
  
  Serial.println(loc[0]);
  Serial.println(loc[1]);
  Serial.println(loc[2]);
       
     

}


void loop () {
  
  // whatever
  
}

sketch_nov14d.pde (290 Bytes)

I had to go look this up.
In C, literal strings like “abcd” are considered constant so attempting to write to them
is undefined.

The problem that you are seeing is that the compiler/linker, with the const assumption in mind,
is smart enough to recognize the duplication of strings and collapses them down into a single
storage area. So while there are 3 pointers, they all point to the same RAM location.
So if you modify one (which you not supposed to do as they are literal constants), you end up modifying them all,
which is allowed since modifying literal strings is not allowed and doing so creates undefined
behavior.

If you want/need to allocate storage that gets filled in with the string contents that
can be later be modified, you need to declare the storage differently.

Example:
Use this:

char loc[][5] = { "abcd",  "abcd", "abcd"};

instead.

This will allocate an array of pointers that points to locations in RAM that
are each 5 bytes in size and pre-fills each 5 byte location with the specified string data.

— bill

Hi bill,

That's absolutely brilliant - thank you (have a virtual beer on me!).

It's many years since I last used C in anger (and it was pretty good at making me angry) but your thoughts stir a deep and distant memory.

Thanks again for responding to my question in such a helpful way.

Cheers Keith