Why does the compiler overwrite 4 bytes with $00 when a string array is initialized?

I just got a new UNO R4 WiFi and started to look at the examples.

Something weird happened when I played with the Game Of Life example. I added a fifth starting pattern of my own, and suddenly the first 4 bytes of the first pattern name "Glider" got overwritten with $00s. I reduced the problem to the following demo, where "Glider" has been replaced by "@@@@@" and the name of the fifth pattern is empty (""):

  String patternNames[] = {
    "@@@@@",
    "Light-weight spaceship",
    "R-Pentomino",
    "Diehard",
    ""
  };

void setup() {
  Serial.begin(9600);
  delay(1000);

  Serial.println("Looking at the first five characters of the first element in the String-Array");
  Serial.println(String(patternNames[0].charAt(0), HEX));
  Serial.println(String(patternNames[0].charAt(1), HEX));
  Serial.println(String(patternNames[0].charAt(2), HEX));
  Serial.println(String(patternNames[0].charAt(3), HEX));
  Serial.println(String(patternNames[0].charAt(4), HEX));

  Serial.end();
}

I expected to see 5x $40 for the "@@@@@" on the Serial Monitor.

But I got:

Looking at the first five characters of the first element in the String-Array
0
0
0
0
40

When I remove the fifth (empty="") element of the array, I get the expected result:

Looking at the first five characters of the first element in the String-Array
40
40
40
40
40

(This is also the original state of the Game Of Life example code, before I added my own pattern.)

And the - at least for me - surprising bit: when I move the definition of the 5-Element String array into the setup() function, I also get the expected result.

Any idea what is going on? If it was my code, I would probably have put all the definitions into the setup() function. I don't know why the author has the definition outside setup(). But why would the definition be treated differently (and not correctly) by the compiler when it is outside setup()?

IDE version: 2.1.1
Date: 2023-06-30T16:04:40.277Z
CLI Version: 0.32.3

Version 1.0.2 of the Arduino UNO R4 Boards

Memory issue? You did not post loop(); does it do anything?

If you try this, what do you see?

String patternNames[] = {
  "@@@@@",
  "Light-weight spaceship",
  "R-Pentomino",
  "Diehard",
  ""
};

void setup() {
  Serial.begin(115200);
  while (!Serial);
  Serial.println(patternNames[0].charAt(0), HEX);
  Serial.println(patternNames[0].charAt(1), HEX);
  Serial.println(patternNames[0].charAt(2), HEX);
  Serial.println(patternNames[0].charAt(3), HEX);
  Serial.println(patternNames[0].charAt(4), HEX);
}

void loop() {}

Nothing happens in loop(). And waiting for Serial as you suggest does not make a difference.

When you write "memory issue" what do you mean? It must be some pointer or indexing error behind the scenes, yes, but the example does not use too much memory so it should not be related to a lack of memory.

If you use a string other than "@@@@@" (e.g. "@ABCD"), you can see that the $00s actually replace the characters of the initializing string:

Looking at the first five characters of the first element in the String-Array
0
0
0
0
44

where $44 corresponds to the "D".
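For anyone who wants to reproduce these dumps away from the board, here is a minimal sketch in plain C++ (not Arduino code). `toHex` and `dumpHex` are hypothetical helpers; they only imitate the format seen in the Serial Monitor output above (lowercase hex, no leading zero) and say nothing about how `String` is implemented.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format one byte as lowercase hex without zero padding,
// matching the format of the Serial Monitor output in this thread.
std::string toHex(char c) {
  char buf[3];  // at most two hex digits plus the terminator
  std::snprintf(buf, sizeof buf, "%x",
                static_cast<unsigned>(static_cast<unsigned char>(c)));
  return std::string(buf);
}

// Hypothetical helper: dump the first n bytes of a buffer, one hex value per line.
std::string dumpHex(const char* data, std::size_t n) {
  std::string out;
  for (std::size_t i = 0; i < n; ++i) {
    out += toHex(data[i]);
    out += '\n';
  }
  return out;
}
```

Feeding it the bytes 00 00 00 00 'D' gives the same five lines as the dump above, which makes it easy to compare an expected buffer against what the board prints.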

I suspect it is a bug in the compiler related to how arrays are initialized with a list. And it appears to be relevant only to global variables, since defining the array inside setup() works without problems.

When I declare the array in the global scope and then define the 5 names separately in the setup() like so

  patternNames[0] = "Glider";
  patternNames[1] = "Light-weight spaceship";
  patternNames[2] = "R-Pentomino";
  patternNames[3] = "Diehard";
  patternNames[4] = "New Pattern";

it also works fine.

What does my exact code print?

I don't have this Arduino, so I can't try it out.

Regarding the memory comment: I was wondering whether you had other code in loop() allocating static memory, which could cause your (needless) use of String to fail.

The result of your code is exactly the same with the $00's replacing the first 4 "@", i.e.

0
0
0
0
40

The demo in post #1 above is exactly the code that I am running (with an empty void loop() {}) on the Arduino.

And if you do

const char * patternNames[] = {
  "@@@@@",
  "Light-weight spaceship",
  "R-Pentomino",
  "Diehard",
};

void setup() {
  Serial.begin(115200);
  while (!Serial);
  Serial.println(patternNames[0][0], HEX);
  Serial.println(patternNames[0][1], HEX);
  Serial.println(patternNames[0][2], HEX);
  Serial.println(patternNames[0][3], HEX);
  Serial.println(patternNames[0][4], HEX);
}

void loop() {}

Do you see the same problem?

That is a good one. No, the problem disappears. I used the string "@ABCD" and get the correct ASCII values back:

40
41
42
43
44

So the bug is related to the implementation of the String type.

Interesting

The compiler should guarantee that global instances have their constructors called (are initialized) before main() is called. Apparently that's not the case for the String class, which may mean there is an issue with the way its constructor works.

The initialization process is described here

https://en.cppreference.com/w/cpp/language/initialization

It may be worth looking at it in detail.

If you move the empty String to a different position in the array, does anything change? Do all the Strings print out correctly?
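To make that guarantee concrete, here is a minimal plain C++ sketch (not the Arduino `String` implementation) of a global array whose elements allocate heap memory in their constructors. `TinyString` and `gNames` are hypothetical names made up for this illustration. On a conforming toolchain, every element is fully constructed before main() runs, which is the behaviour the sketch in this thread depends on.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a heap-backed string class; not Arduino's String.
class TinyString {
public:
  TinyString(const char* s) { assign(s); }  // allocates on the heap
  TinyString(const TinyString& other) { assign(other.c_str()); }
  TinyString& operator=(const TinyString& other) {
    if (this != &other) {
      std::free(buf_);
      assign(other.c_str());
    }
    return *this;
  }
  ~TinyString() { std::free(buf_); }

  const char* c_str() const { return buf_ ? buf_ : ""; }
  std::size_t length() const { return len_; }

private:
  void assign(const char* s) {
    len_ = std::strlen(s);
    buf_ = static_cast<char*>(std::malloc(len_ + 1));
    if (buf_) {
      std::memcpy(buf_, s, len_ + 1);  // copy including the terminator
    } else {
      len_ = 0;                        // allocation failed: behave as empty
    }
  }

  char* buf_ = nullptr;
  std::size_t len_ = 0;
};

// Dynamic initialization: these constructors run before main() is entered.
TinyString gNames[] = {"@ABCD", "Light-weight spaceship", "R-Pentomino",
                       "Diehard", ""};
```

If a check like `gNames[0].c_str()` returns anything other than "@ABCD" at the top of main(), either the heap was not usable while the constructors ran or something overwrote the buffer afterwards. Those are the two cases worth separating here.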

What happens if you use a random number as an index into the array, instead of a fixed patternNames[0]? You can still use element 0 by subtracting an appropriate number from the random number, since random() always produces an identical sequence, but this prevents the compiler from attempting to optimize the array away completely.
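A variation on this idea that does not need random(): read the index from a `volatile` variable, so the compiler has to treat it as a runtime value. Note that this swaps in a different technique from the one suggested above. `gIndex` and `charAtRuntimeIndex` are hypothetical names, and the snippet is plain C++ using C strings rather than `String`.

```cpp
#include <cstddef>

const char* const kNames[] = {"@ABCD", "Light-weight spaceship", "R-Pentomino",
                              "Diehard", ""};
const std::size_t kNameCount = sizeof(kNames) / sizeof(kNames[0]);

// volatile forces an actual load at run time, so the index cannot be
// constant-folded and the array cannot be optimized away.
volatile std::size_t gIndex = 0;

// Returns the character at position pos of the name selected by gIndex,
// or '\0' if either index is out of range.
char charAtRuntimeIndex(std::size_t pos) {
  const std::size_t i = gIndex;            // single volatile read
  if (i >= kNameCount) return '\0';
  const char* name = kNames[i];
  for (std::size_t k = 0; k < pos; ++k) {  // never read past the terminator
    if (name[k] == '\0') return '\0';
  }
  return name[pos];
}
```

The same pattern should carry over to the sketch by indexing patternNames with a volatile variable instead of the literal 0.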

String patternNames[] = {
    "@ABCD",
    "Light-weight spaceship",
    "R-Pentomino",
    "Diehard",
    ""
  };

void setup() {
  Serial.begin(115200);
  while (!Serial);

  Serial.println("Looking at the first five characters of the first element in the String-Array");
  Serial.println(String(patternNames[0].charAt(0), HEX));
  Serial.println(String(patternNames[0].charAt(1), HEX));
  Serial.println(String(patternNames[0].charAt(2), HEX));
  Serial.println(String(patternNames[0].charAt(3), HEX));
  Serial.println(String(patternNames[0].charAt(4), HEX));
  Serial.println(patternNames[0]);
  Serial.println(patternNames[1]);
  Serial.println(patternNames[2]);
  Serial.println(patternNames[3]);
  Serial.println(patternNames[4]);
  Serial.println("End of test");
  Serial.println();

  Serial.end();
}

void loop() {}

results in

Looking at the first five characters of the first element in the String-Array
0
0
0
0
44
☐☐☐☐D
Light-weight spaceship
R-Pentomino
Diehard

End of test

When I replace the empty string with "New Pattern", I get:

Looking at the first five characters of the first element in the String-Array
0
0
0
0
44
☐☐☐☐D
Light-weight spaceship
R-Pentomino
Diehard
New Pattern
End of test

When I move the empty string into the middle (index 2), I get:

Looking at the first five characters of the first element in the String-Array
0
0
0
0
44
☐☐☐☐D
Light-weight spaceship

R-Pentomino
Diehard
End of test

I suspect that the empty element simply returns a null pointer.
Here is an example of that code with an extra pattern called rubbish at the end. This is the first pattern that gets called when it is run.

/*
  Example developed starting from Toby Oxborrow's sketch
  https://github.com/tobyoxborrow/gameoflife-arduino/blob/master/GameOfLife.ino
*/
// adding an extra pattern - Grumpy Mike
#include "Arduino_LED_Matrix.h"

// grid dimensions: 8 rows by 12 columns, matching the UNO R4 WiFi LED matrix
#define MAX_Y 8
#define MAX_X 12

// time to wait between turns
#define TURN_DELAY 200

// how many turns per game before starting a new game
// you can also use the reset button on the board
#define TURNS_MAX 60

// number of patterns in predefined list
#define MAX_PATTERNS 5

// how many turns to wait if there are no changes before starting a new game
#define NO_CHANGES_RESET 4

int turns = 0;       // counter for turns
int noChanges = 0;  // counter for turns without changes

// game state. 0 is dead cell, 1 is live cell
uint8_t grid[MAX_Y][MAX_X] = {
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
};

int currentPattern = 4;

String patternNames[] = {
  "Glider",
  "Light-weight spaceship",
  "R-Pentomino",
  "Diehard",
  "Rubbish"
};

// custom starting grid patterns
boolean cGrids[][MAX_Y][MAX_X] = {
    { /* Glider */
        {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    },
    { /* Light-weight spaceship */
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
        {0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0},
        {0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    },
    { /* R-Pentomino */
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    },
    { /* Die hard */
        {0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
        {1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    },
    { /* Rubbish */
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
        {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    }
};


ArduinoLEDMatrix matrix;

void setup() {
  Serial.begin(9600);
  delay(1000);

  Serial.println("Conway's game of life on Arduino LED Matrix");
  matrix.begin();
  
  resetGrid();
  displayGrid();

}

void loop() {
  delay(TURN_DELAY);

  playGoL();

  turns++;

  // reset the grid if no changes have occurred recently
  // for when the game enters a static stable state
  if (noChanges > NO_CHANGES_RESET) {
    resetGrid();

  }
  // reset the grid if the loop has been running a long time
  // for when the game cycles between a few stable states
  if (turns > TURNS_MAX) {
    resetGrid();
  }

  displayGrid();
}

// play game of life
void playGoL() {
  /*
    1. Any live cell with fewer than two neighbours dies, as if by loneliness.
    2. Any live cell with more than three neighbours dies, as if by
    overcrowding.
    3. Any live cell with two or three neighbours lives, unchanged, to the next
    generation.
    4. Any dead cell with exactly three neighbours comes to life.
    */

  boolean newGrid[MAX_Y][MAX_X] = {
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, 
      {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
  };

  for (int y = 0; y < MAX_Y; y++) {
    for (int x = 0; x < MAX_X; x++) {
      int neighbours = countNeighbours(y, x);
      if (grid[y][x] == 1) {
        if ((neighbours == 2) || (neighbours == 3)) {
          newGrid[y][x] = 1;
        } else {
          newGrid[y][x] = 0;
        }
      } else {
        if (neighbours == 3) {
          newGrid[y][x] = 1;
        } else {
          newGrid[y][x] = 0;
        }
      }
    }
  }

  // update the current grid from the new grid and count how many changes
  // occurred
  int changes = 0;
  for (int y = 0; y < MAX_Y; y++) {
    for (int x = 0; x < MAX_X; x++) {
      if (newGrid[y][x] != grid[y][x]) {
        changes++;
      }
      grid[y][x] = newGrid[y][x];
    }
  }

  // update global counter when no changes occurred
  if (changes == 0) {
    noChanges++;
  }
}

// count the number of neighbour live cells for a given cell
int countNeighbours(int y, int x) {
  int count = 0;

  // -- Row above us ---
  if (y > 0) {
    // above left
    if (x > 0) {
      count += grid[y - 1][x - 1];
    }
    // above
    count += grid[y - 1][x];
    // above right
    if ((x + 1) < MAX_X) {
      count += grid[y - 1][x + 1];
    }
  }

  // -- Same row -------
  // left
  if (x > 0) {
    count += grid[y][x - 1];
  }
  // right
  if ((x + 1) < MAX_X) {
    count += grid[y][x + 1];
  }

  // -- Row below us ---
  if ((y + 1) < MAX_Y) {
    // below left
    if (x > 0) {
      count += grid[y + 1][x - 1];
    }
    // below
    count += grid[y + 1][x];
    // below right
    if ((x + 1) < MAX_X) {
      count += grid[y + 1][x + 1];
    }
  }

  return count;
}

// reset the grid
void resetGrid() {
  Serial.print("Current pattern: ");
  Serial.println(patternNames[currentPattern]);
  noChanges = 0;
  turns = 0;

  for (int y = 0; y < MAX_Y; y++) {
    for (int x = 0; x < MAX_X; x++) {
      grid[y][x] = cGrids[currentPattern][y][x];
    }
  }
  currentPattern++;
  if(currentPattern >= MAX_PATTERNS){
    currentPattern = 0;
  }
}

// display the current grid to the LED matrix
void displayGrid() {
  matrix.renderBitmap(grid, 8, 12);
}

I do not think it is the empty string. If I use "New Pattern 1" instead, the problem is still there:

Looking at the first five characters of the first element in the String-Array
0
0
0
0
44
☐☐☐☐D
Light-weight spaceship
New Pattern 1
R-Pentomino
Diehard
End of test

When I add a sixth element "New Pattern 2", the problem disappears:

Looking at the first five characters of the first element in the String-Array
40
41
42
43
44
@ABCD
Light-weight spaceship
New Pattern 1
R-Pentomino
Diehard
New Pattern 2
End of test

@Grumpy_Mike, what does the serial monitor print when you add the rubbish pattern? That is exactly what I did, and then "Glider" became " ☐ ☐ ☐ ☐er" with four $00 replacing the "Glid".

(Will need to sign off now, will come back later to see if somebody has found the root cause.)

It is only the first string that gives you zeros, all the other names print fine. This is what I get when I print out all the names.

Looking at the first five characters of names
0
0
0
0
0
0
end
1
4c
69
67
68
74
end
2
52
2d
50
65
6e
end
3
44
69
65
68
61
end
4
64
0
0
0
0
end

Light-weight spaceship
R-Pentomino
Diehard
d
End of test

It's really weird.

The number of overwritten characters happens to be the number of remaining items in the array. Is this coincidence?

Has anyone tried it on another Arduino board?

Does it work properly on it?

This may be a long shot, but I wonder whether there is an issue with calling malloc/realloc from the constructor of a global object: when the startup code calls these constructors, the heap may not be fully or properly initialized.

We were running into issues with classes that called attachInterrupt in their constructor not working. About the only interesting thing attachInterrupt does is allocate a new buffer object to store the data, and this data was corrupted.

I was investigating this earlier but punted after many people suggested that you should not do that and should instead do it from a begin/init method. I wonder whether the String class should be changed so that it does not allocate memory in its constructor.
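A minimal sketch of what that begin/init pattern could look like, in plain C++ and assuming nothing about the real `String` class. `DeferredName` and `gFirstName` are hypothetical names. The constructor only stores a pointer to the literal, so it does not touch the heap during static initialization; the allocation happens later in begin().

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical class: the constructor performs no heap allocation, so it is
// harmless to run before the runtime is fully set up.
class DeferredName {
public:
  explicit DeferredName(const char* literal) : literal_(literal) {}
  DeferredName(const DeferredName&) = delete;             // avoid double free
  DeferredName& operator=(const DeferredName&) = delete;
  ~DeferredName() { std::free(buf_); }

  // Call once from setup() (or main). Returns false if allocation fails.
  bool begin() {
    if (buf_) return true;  // already initialized
    const std::size_t n = std::strlen(literal_) + 1;
    buf_ = static_cast<char*>(std::malloc(n));
    if (!buf_) return false;
    std::memcpy(buf_, literal_, n);  // copy including the terminator
    return true;
  }

  bool ready() const { return buf_ != nullptr; }
  const char* c_str() const { return buf_ ? buf_ : ""; }

private:
  const char* literal_;   // points at a string literal with static lifetime
  char* buf_ = nullptr;   // heap copy, created in begin()
};

// Global instance: construction only stores the pointer.
DeferredName gFirstName("Glider");
```

This is in the same spirit as the workaround reported earlier in the thread, where assigning the names inside setup() gave the correct output.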