Inconsistent behavior of bitfield in union

Hi,

I'm experimenting with unions and bitfields and am not getting expected results. Here's a piece of code:

/* Declaration of a 32-bit type, where 31 bits are used for a Time value and 1 bit as a flag (Act).
   It's set up as a union, so that all 32 bits can be accessed at once. */
typedef union {
  struct {
    uint32_t Act: 1;
    uint32_t Time: 31;
  } Activity;
  uint32_t All;
} timelast;

/* Declaration of a nearly identical type, where only the order of Time and Act is reversed. */
typedef union {
  struct {
    uint32_t Time: 31;
    uint32_t Act: 1;
  } Activity;
  uint32_t All;
} timefirst;

void setup() {
  Serial.begin(115200);

  timelast tl;
  timefirst tf;
  
  tl.All = 0xFEDD;
  tf.All = 0xFEDD;
  
  Serial.println("timelast:");
  // Send the 32 bit value:
  Serial.print(" tl.All:           "); Serial.print(tl.All, BIN); Serial.print(" "); Serial.println(tl.All, HEX);
  //send the 31 bit which represent Time (works as expected):
  Serial.print(" tl.Activity.Time: "); Serial.print(tl.Activity.Time, BIN); Serial.print("  "); Serial.println(tl.Activity.Time, HEX);
  //send the 1 bit which represents Act (works as expected):
  Serial.print(" tl.Activity.Act: "); Serial.println(tl.Activity.Act, BIN);

  Serial.println();

  Serial.println("timefirst:");
  // Send the 32 bit value:
  Serial.print(" tf.All:           "); Serial.print(tf.All, BIN); Serial.print(" "); Serial.println(tf.All, HEX);
  //send the 31 bit which represent Time (**does not work as expected**):
  Serial.print(" tf.Activity.Time: "); Serial.print(tf.Activity.Time, BIN); Serial.print(" "); Serial.println(tf.Activity.Time, HEX);
  //send the 1 bit which represents Act (**does not work as expected**):
  Serial.print(" tf.Activity.Act: "); Serial.println(tf.Activity.Act, BIN);
}

void loop() {}

This is the result:

timelast:
 tl.All:           1111111011011101 FEDD
 tl.Activity.Time: 111111101101110  7F6E
 tl.Activity.Act: 1

timefirst:
 tf.All:           1111111011011101 FEDD
 tf.Activity.Time: 1111111011011101 FEDD
 tf.Activity.Act: 0

The union "timelast" works as expected: I get a 31-bit value when I access the 31-bit bitfield.
With "timefirst" this does not work: I get a 32-bit value when accessing the 31-bit bitfield, and the 1-bit flag returns a wrong value.

What am I overlooking?

TIA,
Christian

What am I overlooking?

The fact that the C++ standard leaves the layout to the compiler: you should not second-guess what the compiler does or where the data goes.

The union is only as big as necessary to hold its largest data member. The other data members are allocated in the same bytes as part of that largest member. The details of that allocation are implementation-defined, and it's undefined behavior to read from a member of the union that wasn't the most recently written one. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.

Allocation varies. For example, if you try this code:

/* Declaration of a 32bit type, where 31 bit are used for a Time value, and 1 bit as a flag (Act).
   It's set up as a union, where all 32 bit can be accessed at once.*/
typedef union {
  struct {
    uint32_t Act: 1;
    uint32_t Time: 31;
  } Activity;
  uint32_t All;
} timelast;

/* Declaration of a nearly identical type, where only the sequence of Time and Act are the other way round.*/
typedef union {
  struct {
    uint32_t Time: 31;
    uint32_t Act: 1;
  } Activity;
  uint32_t All;
} timefirst;

void setup() {
  Serial.begin(115200);

  timelast tl;
  timefirst tf;

  tl.All = 0xFEDD;
  // tf.All = 0xFEDD;
  tf.Activity.Time = 0x7F6E;
  tf.Activity.Act = 1;

  Serial.print(F("Size of timelast = ")); Serial.println(sizeof(timelast));
  Serial.println("timelast:");
  // Send the 32 bit value:
  Serial.print(" tl.All:           "); Serial.print(tl.All, BIN); Serial.print(" "); Serial.println(tl.All, HEX);
  //send the 31 bit which represent Time (works as expected):
  Serial.print(" tl.Activity.Time: "); Serial.print(tl.Activity.Time, BIN); Serial.print("  "); Serial.println(tl.Activity.Time, HEX);
  //send the 1 bit which represents Act (works as expected):
  Serial.print(" tl.Activity.Act: "); Serial.println(tl.Activity.Act, BIN);

  Serial.println();


  Serial.print(F("Size of timefirst = ")); Serial.println(sizeof(timefirst));
  Serial.println("timefirst:");

  // Send the 32 bit value:
  Serial.print(" tf.All:           "); Serial.print(tf.All, BIN); Serial.print(" "); Serial.println(tf.All, HEX);
  //send the 31 bit which represent Time (**does not work as expected**):
  Serial.print(" tf.Activity.Time: "); Serial.print(tf.Activity.Time, BIN); Serial.print(" "); Serial.println(tf.Activity.Time, HEX);
  //send the 1 bit which represents Act (**does not work as expected**):
  Serial.print(" tf.Activity.Act: "); Serial.println(tf.Activity.Act, BIN);
}

void loop() {}

you'll see this in the console:

Size of timelast = 4
timelast:
 tl.All:           1111111011011101 FEDD
 tl.Activity.Time: 111111101101110  7F6E
 tl.Activity.Act: 1

Size of timefirst = 4
timefirst:
 tf.All:           10000000000000000111111101101110 80007F6E
 tf.Activity.Time: 111111101101110 7F6E
 tf.Activity.Act: 1

So you can see from tf.All that the compiler decided to put the 1-bit field in the MSB and the 31-bit field in the lower bits of the 32-bit word. In your first code the whole value landed in the low bits of Time, and the top 15 bits of Time (as well as Act) stayed at 0 because you were using 16-bit data.
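As a side note on reading inactive union members: one way to look at the stored bits without reading a member that was not written last is to copy the struct into a plain integer with memcpy. This is only a minimal sketch, assuming the struct occupies exactly 4 bytes; the names TimeFirstBits and rawBits are made up for this example and do not come from the thread.

```cpp
#include <stdint.h>
#include <string.h>

// Same field order as "timefirst" in the thread (hypothetical name).
struct TimeFirstBits {
  uint32_t Time : 31;
  uint32_t Act : 1;
};

// Copy the object representation into a plain uint32_t.
// Copying bytes is well defined, unlike reading tf.All after writing
// tf.Activity. Which bits hold which field is still up to the compiler.
uint32_t rawBits(const TimeFirstBits &b) {
  static_assert(sizeof(TimeFirstBits) == sizeof(uint32_t),
                "unexpected padding in TimeFirstBits");
  uint32_t raw;
  memcpy(&raw, &b, sizeof raw);
  return raw;
}
```

With GCC on a little-endian target, Time = 0x7F6E and Act = 1 should give 0x80007F6E, the same pattern as the tf.All line above. Other compilers or targets may place the fields differently.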

Have you tried using the "DEC" formatter (or none at all) instead of "BIN" for Serial.print and checking the results?

What J-M-L said. If you want to be specific about what bits go where, then you'll need to use the bit-fiddling operators.
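A minimal sketch of what the bit-fiddling version could look like. The layout (Act in bit 31, Time in bits 30..0) is an assumption picked for this example, and the names TIME_MASK, ACT_SHIFT, packTimeAct, getTime and getAct are made up here. Because the positions are written out with shifts and masks, they no longer depend on how the compiler lays out bitfields.

```cpp
#include <stdint.h>

// Assumed layout for this sketch: bit 31 = Act, bits 30..0 = Time.
const uint32_t TIME_MASK = 0x7FFFFFFFUL;
const uint8_t ACT_SHIFT = 31;

// Combine a 31-bit time value and a flag into one 32-bit word.
// Bits of 'time' above bit 30 are discarded by the mask.
uint32_t packTimeAct(uint32_t time, bool act) {
  return (time & TIME_MASK) | ((uint32_t)(act ? 1 : 0) << ACT_SHIFT);
}

// Extract the 31-bit time value.
uint32_t getTime(uint32_t all) {
  return all & TIME_MASK;
}

// Extract the flag stored in the top bit.
bool getAct(uint32_t all) {
  return (all >> ACT_SHIFT) & 1;
}
```

In a sketch this would be used as, for example, uint32_t all = packTimeAct(0x7F6E, true); followed by Serial.println(getTime(all), HEX);.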

Thank you all for your replies, especially J-M-L for his example code. That made clear where my wrong thinking actually was:

CTreffler:
The union "timelast" works as expected: I get a 31-bit value when I access the 31-bit bitfield.
With "timefirst" this does not work: I get a 32-bit value when accessing the 31-bit bitfield, and the 1-bit flag returns a wrong value.

If you check my example, you can see that I was using only a 16 bit value (0xFEDD) to test my union. I was expecting to see a truncated return of that value for tl.Activity.Time all the time.
(This forum editor needs a facepalm smiley :confused:)

I should have used something like 0xFEDDFEDD, then I would have seen truncation for both versions of the union.
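The truncation can be checked without going through the union at all, by assigning to the 31-bit field directly. This is a small sketch; the names TimeField and storeInTime are made up for this example. For an unsigned bitfield the stored value is the input modulo 2^31, so only bit 31 of the input is lost.

```cpp
#include <stdint.h>

// A 31-bit field plus a 1-bit flag (hypothetical name).
struct TimeField {
  uint32_t Time : 31;
  uint32_t Act : 1;
};

// Store a full 32-bit value into the 31-bit field and read it back.
// The assignment keeps the low 31 bits and drops bit 31.
uint32_t storeInTime(uint32_t value) {
  TimeField b = {};
  b.Time = value;
  return b.Time;
}
```

storeInTime(0xFEDD) comes back unchanged because 0xFEDD fits into 31 bits, while storeInTime(0xFEDDFEDD) comes back as 0x7EDDFEDD.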

In the end I don't care how the compiler organizes the different parts of the union, as long as I get a data structure with 4 bytes. I'll not store one member format and read in another.
J-M-L's example shows that I need to initialize the biggest member of the union in the declaration, otherwise the compiler might optimize it to something smaller.

Cheers,
Christian

In the end I don’t care how the compiler organizes the different parts of the union

The compiler is allowed to add "padding" to align members to "convenient" memory address boundaries (even addresses, addresses divisible by 4, etc.). If you really want to specify that the bits should be contiguous, you have to use the packed attribute:

    typedef union {
      struct {
        uint32_t Act: 1;
        uint32_t Time: 31;
      } __attribute__((packed))    // <----
          Activity ;
      uint32_t All;
    } timelast;

Use sizeof to see how big the compiler actually made your type:

    Serial.print( sizeof(timelast) );

In your example, the compiler has added padding if the size of the union is bigger than 4.

The packed attribute is a GCC compiler extension (not part of the standard language definition). See this Stack Overflow Q&A.
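Instead of printing sizeof at run time, the size can also be checked at compile time. This is a minimal sketch, assuming a C++11 compiler (which current Arduino cores use); the union is the timelast type from the first post, without the packed attribute.

```cpp
#include <stdint.h>

typedef union {
  struct {
    uint32_t Act : 1;
    uint32_t Time : 31;
  } Activity;
  uint32_t All;
} timelast;

// Compilation fails with this message if the compiler made the union
// anything other than 4 bytes, for example by adding padding.
static_assert(sizeof(timelast) == 4, "timelast is not exactly 4 bytes");
```

This only checks the total size. It says nothing about which bits Act and Time end up in.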