Pointers, References and Bears, Oh My! Help me understand

#ifndef MyTypes_h
#define MyTypes_h

typedef struct {
  int main_menu_selection;
  char* sub_menu_items;
} MENU;

#endif

========================

#include "MyTypes.h"

char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};

MENU menu1 = {0, *sub_items};

void setup() {
  // Debugging output
  Serial.begin(9600);

  for(int i = 0; i<8; i++){
    Serial.println(sub_items[i]);
  }
  Serial.print(main_items[menu1.main_menu_selection]);
  Serial.print(&menu1.sub_menu_items[0]);
  Serial.print(menu1.sub_menu_items[0]);
}
void loop() {
}

(Note: the dashed line indicates separate files)

In the code, above, the line Serial.println(sub_items[i]); prints the 8 elements of sub_items.

The line Serial.print(&menu1.sub_menu_items[0]); prints “Sub1”. (note the & - dereference)

The following line (no &), Serial.print(menu1.sub_menu_items[0]); prints “S”.

I am supposing that this is the intended behavior, but I am not understanding why and how. Why/how does dereferencing the struct member get me the behavior I want (treating the char array as, essentially, an array of strings), while without the & the member is treated as a ‘simple’ array of char?

Code tags round your example code lines would have been a good idea

You have mixed things up. Your struct definition has an int and a char pointer.

You have initialized the instance of it with an int and an array of char pointers. The two need to be consistent.

Got it.

In C and C++ array indexing can be thought of as "syntactic sugar" for pointer operations.

There's no real difference between

*(a+4) and a[4]
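
A minimal sketch of that equivalence in plain C++ (not Arduino-specific; the function names by_index and by_pointer are made up for illustration):

```cpp
#include <stddef.h>

// Read element i using array-index syntax.
int by_index(const int *a, size_t i) {
  return a[i];
}

// Read the same element using explicit pointer arithmetic.
int by_pointer(const int *a, size_t i) {
  return *(a + i);
}
```

Both functions should compile to the same thing; the index form is simply easier to read.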

So when you declare

char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};

and then do:

MENU menu1 = {0, *sub_items};

You are doing the equivalent of

MENU menu1 = {0, sub_items[0]};

In other words the menu1 sub_menu_items field is set to the string "Sub1".

I think you mean to declare the struct thus:

typedef struct {
  int main_menu_selection;
  char ** sub_menu_items;
} MENU;

and then initialize with

MENU menu1 = {0, sub_items};

The subtlety is that array types have lengths fixed at compile time, so the struct member needs to be declared as pointer-to-pointer, as the size isn't knowable to the struct at runtime.

[] is syntactic sugar for * in types - it's not the same as giving an explicit size

So, when I execute char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"}; I am creating (and initializing) 8 arrays of char, an array of 8 pointers (one for each of the char arrays), and a pointer to the array of pointers?

Since, for types, the [] is "syntactic sugar for *" I could also have written char **sub_items = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"}; ??

To avoid the danger of your code accidentally changing the pointer values (and stranding the string literals in memory) or changing the string literals themselves, you should use:

const char * const sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};

Re-writing the code as follows:

#ifndef MyTypes_h
#define MyTypes_h

typedef struct {
  int main_menu_selection;
  char* sub_menu_items[];
} MENU;

#endif

=======================

#include "MyTypes.h"

const char * const sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};

MENU menu1 = {0, *sub_items};

void setup() {
  // Debugging output
  Serial.begin(9600);

  Serial.println("Serial.println(sub_items[i])");
  for(int i = 0; i<8; i++){
    Serial.print(i);
    Serial.print(": ");
    Serial.println(sub_items[i]);
  }
  Serial.println("Serial.println(menu1.sub_menu_items[i])");
  for(int i = 0; i<8; i++){
    Serial.print(i);
    Serial.print(": ");
    Serial.println(menu1.sub_menu_items[i]);      
  }
  
}
void loop() {
}

produces the following result:

Serial.println(sub_items[i])
0: Sub1
1: Sub2
2: Sub3
3: Sub4
4: Sub5
5: Sub6
6: Sub7
7: Sub8
Serial.println(menu1.sub_menu_items[i])
0: Sub1
1: Sub1
2: Sub2
3: Sub3
4: Sub4
5: Sub5
6: Sub6
7: Sub7

How did I lose the char array containing “Sub8”?

MarkT:
[] is syntactic sugar for * in types - it's not the same as giving an explicit size

This is confusing at best.
[] might be the same as * in function parameters, but not in many other cases, such as the following:

wwoofbum:

typedef struct {
  int main_menu_selection;
  char* sub_menu_items[];
} MENU;

Your sub_menu_items member variable is a flexible array member, which in C++ is a GCC extension and highly discouraged. (See "Arrays of Length Zero" in Using the GNU Compiler Collection (GCC).)

Using standard C++, you'll get a warning:

sketch.ino: warning: ISO C++ forbids flexible array member 'sub_menu_items' [-Wpedantic]
   const char* sub_menu_items[];
                              ^

Unfortunately, Arduino suppresses warnings like these by default ...

If you want to store a pointer to your array of menu items, the member should be a pointer: const char ** sub_menu_items;
If you want to store the menu items inside of the struct, the array size has to be fixed and known at compile time, either by hardcoding it, or by making it a template argument.

Pieter

Using a ‘const char * const * const’ just to be sure pointers and literals don’t get changed:

typedef struct {
  int main_menu_selection;
  const char * const * const sub_menu_items;
} MENU;

const char * const sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};

MENU menu1 = {0, (const char * const * const) sub_items};

void setup() {
  // Debugging output
  Serial.begin(115200);
  delay(1000);

  Serial.println("Serial.println(sub_items[i])");
  for (int i = 0; i < 8; i++) {
    Serial.print(i);
    Serial.print(": ");
    Serial.println(sub_items[i]);
  }
  Serial.println("Serial.println(menu1.sub_menu_items[i])");
  for (int i = 0; i < 8; i++) {
    Serial.print(i);
    Serial.print(": ");
    Serial.println(menu1.sub_menu_items[i]);

  }


}
void loop() {
}

This produces the following result:

Serial.println(sub_items[i])
0: Sub1
1: Sub2
2: Sub3
3: Sub4
4: Sub5
5: Sub6
6: Sub7
7: Sub8
Serial.println(menu1.sub_menu_items[i])
0: Sub1
1: Sub2
2: Sub3
3: Sub4
4: Sub5
5: Sub6
6: Sub7
7: Sub8

If you want to store the menu items inside of the struct, the array size has to be fixed and known at compile time, either by hardcoding it, or by making it a template argument.

To hardcode the array size I can either

#ifndef MyTypes_h
#define MyTypes_h

typedef struct {
  int main_menu_selection;
  char* sub_menu_items[8];
} MENU;

#endif

, in which case I had better always have 8 elements in the array, or may I change the struct to include a length, as in

#ifndef MyTypes_h
#define MyTypes_h

typedef struct {
  int main_menu_selection;
  int sub_menu_elements;
  char* sub_menu_items[sub_menu_elements];
} MENU;

#endif

??

wwoofbum:
or may I change the struct to include a length

No, the array size must be a compile-time constant. In your example, sub_menu_elements is a variable, not a compile-time constant.

If you want a "variable" number of elements, you either have to use a template (creates different types), or dynamic memory allocation (can cause heap fragmentation).
Alternatively, don't store the array in the struct at all, just store a pointer.

The type of your array is still wrong. It needs to be at least const char * sub_menu_items[8], and probably const char * const sub_menu_items[8], see gfvalvo's replies on the matter.

Thanks to all for your responses!

I have applied the suggestions of gfvalvo and I am getting the results I was looking for.

Given that I do not propose to alter the contents of the sub_items arrays (one for each main menu option), it seems that storing a pointer to the sub_items array in the struct is the most sensible option, and will use the least of the limited memory space available on my Arduino board.

wwoofbum:
So, when I execute char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"}; I am creating (and initializing) 8 arrays of char, an array of 8 pointers (one for each of the char arrays), and a pointer to the array of pointers?
Since, for types, the [] is "syntactic sugar for *" I could also have written char **sub_items = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"}; ??

You are not initializing a pointer to the array of pointers. An initializer is not the same thing as an expression.

The subtlety is that when an identifier that is the name of an array is evaluated as an expression, its value has type 'pointer to (the first element of the array)'.

char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8"};
char **sub_ptr = sub_items;

Serial.println((int)(void*)&sub_items); // the address of the whole array (type char *(*)[8]) - same numeric value as the address of its 0th element; 'sub_items' itself is a name the linker knows about, not a pointer variable stored in memory
Serial.println((int)(void*)&sub_ptr); // the address of the sub_ptr variable
Serial.println((int)(void*)sub_items); // the address of the 0th element of the array
Serial.println((int)(void*)sub_ptr); // also the address of the 0th element of the array
Serial.println((int)(void*)*sub_items); // the address of the string literal "Sub1"
Serial.println((int)(void*)*sub_ptr); // also the address of the string literal "Sub1"
Serial.println((int)**sub_items); // ascii value of character 'S'
Serial.println((int)**sub_ptr); // also ascii value of character 'S'

Serial.println(sub_items[1]); // prints "Sub2"
Serial.println(sub_ptr[1]); // prints "Sub2"
Serial.println(sub_ptr ==  sub_items); // prints 1 (true)
Serial.println(sub_ptr ==  &sub_items[0]); // prints 1 (true)
// sub_items++; // not allowed to do this - an array name is not a modifiable lvalue, so this line does not compile
sub_ptr++; // this is ok
Serial.println(sub_ptr[1]); // now prints "Sub3"
Serial.println(sub_ptr ==  sub_items); // prints 0 (false)
Serial.println(sub_ptr ==  &sub_items[1]); // prints 1 (true)

Serial.println(sizeof(sub_items)); // prints 16 - 8 * 2-byte pointers (assuming a board with 2-byte pointers)
Serial.println(sizeof(sub_ptr)); // prints 2 - a 2-byte pointer
Serial.println(sizeof(*sub_items)); // prints 2 -  a 2-byte pointer
Serial.println(sizeof(*sub_ptr)); // prints 2 - a 2-byte pointer

It can help to draw this stuff on paper. The address of 'sub_items' is something known to the linker. Wherever the name appears in code, the linker fills in the address that it has allocated for the array.
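
The sizeof lines above also show why an element count can be computed from the array but not from a pointer to it. A small sketch in plain C++, with a shortened item list; the names item_count and not_a_count are made up for this example:

```cpp
#include <stddef.h>

const char *sub_items[] = {"Sub1", "Sub2", "Sub3", "Sub4"};
const char **sub_ptr = sub_items;  // the array name decays to a pointer to element 0

// Element count from the array itself: total bytes / bytes per element.
constexpr size_t item_count = sizeof(sub_items) / sizeof(sub_items[0]);

// The same arithmetic on the pointer only measures the pointer, not the array.
constexpr size_t not_a_count = sizeof(sub_ptr) / sizeof(sub_ptr[0]);
```

Once the array has decayed to a pointer, its length is no longer recoverable from the pointer alone, which is why a struct that stores only a pointer usually needs the length supplied some other way.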