Naming a union

I am definitely not a C++ programmer. I need to use a union so I can send floating variables as serial bytes. In the code below, the union seems to be named "u_tag", but that name doesn't seem to have any function. The variables are addressed using the "u." prefix, so why is "u_tag" needed? I'm building a standard message that sends 18 variables to the network, so something like this code fragment will be used 18 times to build the memory space for the variables.

union u_tag {
   float temp_float ; 
   byte temp_byte[4] ;
} u;

u.temp_float = 14.3;
Serial.write(u.temp_byte[0]); 
Serial.write(u.temp_byte[1]);
Serial.write(u.temp_byte[2]);
Serial.write(u.temp_byte[3]);

Instead of using a union, why can't I use a pointer, like so:

int *temp_float ;
temp_float = 14.3 ;
*temp_float = &temp_float ;
Serial.write(*temp_float, 4) ;

Naming the union is optional. This post also includes the pointer alternative which you may wish to compare with your version:

The tag (u_tag in your code) does have a function: if you want to define another variable of the same union type later in the code, you just write “union u_tag u2”.
Yes, you can use a cast with pointers instead of a union, but the union is the preferred solution since it avoids some logical mistakes. In addition, in the 8-bit world it is an efficient way to assemble multi-byte data from single bytes without additional overhead.
For example:

byte hb, lb;
...
int i = ((int)hb << 8) | lb;

This will be compiled as an arithmetic operation with some overhead, whereas with a union it is just a matter of filling a register pair.
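To make the point about the tag concrete, here is a minimal sketch of reusing a named union type in two helper functions. The names FloatBytes, floatToBytes and bytesToFloat are made up for illustration and are not from the original code. One caveat: strictly speaking, ISO C++ does not define the result of reading a union member other than the one last written, but GCC (which the Arduino toolchain is based on) documents this use as supported, so treat this as a sketch under that assumption.

```cpp
#include <stdint.h>

// Naming the union type lets later code declare more variables of it.
// "FloatBytes" is a hypothetical name standing in for u_tag.
union FloatBytes {
    float   value;
    uint8_t bytes[sizeof(float)];
};

// Copy the bytes of f into out, in the order they sit in memory.
void floatToBytes(float f, uint8_t out[sizeof(float)]) {
    FloatBytes fb;                 // C++ does not need the "union" keyword here
    fb.value = f;
    for (unsigned i = 0; i < sizeof(float); ++i) {
        out[i] = fb.bytes[i];
    }
}

// Rebuild the float on the receiving side from bytes in the same order.
float bytesToFloat(const uint8_t in[sizeof(float)]) {
    FloatBytes fb;                 // a second variable of the same union type
    for (unsigned i = 0; i < sizeof(float); ++i) {
        fb.bytes[i] = in[i];
    }
    return fb.value;
}
```

With something like this, each of the 18 variables could go through the same helper instead of repeating the union definition 18 times.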

Dr_Quark:
Instead of using a union, why can’t I use a pointer, like so:

int *temp_float ;

temp_float = 14.3 ;
*temp_float = &temp_float ;
Serial.write(*temp_float, 4) ;

Well let’s just say something very vaguely like that. :wink:

Try this code Dr Quark

  float f1 = 14.3;
  byte* fb = (byte*) &f1;

  Serial.println(fb[0]);    // Take a look at the bytes
  Serial.println(fb[1]);
  Serial.println(fb[2]);
  Serial.println(fb[3]);

Thanks, BudVar10, I thought it might be something like that. That will shorten the code text significantly. It does seem that using a union instead of a cast is preferred.

Stuart0, I've looked and I can't find what the following asterisk does, eg, what's the diff between "byte" and "byte*"? It looks to me like your code would print addresses (pointers) instead of values.

byte* means pointer to type byte.

BTW. Essentially byte* fb and byte *fb mean the same thing. That is, you can either postfix the '*' to the variable type or you can prefix it to the variable name in a declaration and it has the same meaning. I usually prefer the former.

It looks to me like your code would print addresses (pointers) instead of values.

Then you should run the code and see what it actually does.

BTW. Note that fb is a pointer in my above example, however the array brackets "dereferences" the pointer in much the same way as *fb would.

fb[0] dereferences to the same thing as *fb. Similarly with fb[1] and *(fb+1) for a byte pointer.
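A quick way to convince yourself of that equivalence is the check below. It is only a sketch, and the function name indexMatchesOffset is invented for this example.

```cpp
#include <stddef.h>

// True when p[i] and *(p + i) refer to the same element and the same value.
bool indexMatchesOffset(const unsigned char* p, size_t i) {
    return &p[i] == (p + i) && p[i] == *(p + i);
}
```

It should return true for every valid index, because the language defines p[i] as *(p + i).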

I don't think that using a union instead of a cast is preferred in general. A cast is an advanced technique and has to be used carefully. Modern compilers have an option to generate a warning (or not) when a simple cast is used between different data types. Arduino does it this way: it warns you about the use of a cast.

Using a union, in cases such as the one I described, takes a little more C code and looks slightly more complicated, but the result is more efficient machine code.

stuart0:
byte* means pointer to type byte.

BTW. Essentially byte* fb and byte *fb mean the same thing.

Even byte * fb is the same. :slight_smile: Personally, I’m using byte *fb style.

stuart0:
byte* means pointer to type byte.

...

OK, this throws me a bit. "Type byte" constructs an 8-bit number. I don't understand how a pointer, which is 16-bits or longer, can have any meaning when it's truncated to 8 bits. I say this because my understanding of "pointer" is that it contains an address that is the smallest addressable memory location, usually, in a microprocessor, that would be a byte location. Never-the-less, the pointer itself is always wider than 8 bits. Are you telling me that in C++ a "pointer" has an inherent quality that includes whether it's pointing at a byte, word, or long word?

I know some systems have instructions that have an inherent limitation to operate on word boundaries only, but their pointers still address bytes.

I usually prefer the former.

Until you do something like:

   int* ptr1, ptr2;

and then can't figure out why ptr2 isn't a pointer.

int *ptr1, ptr2;

makes it clearer, to me, that only ptr1 is a pointer.

YMMV.

(Not that I recommend defining multiple variables of different types like that...)
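If you want the compiler to settle the argument, something like the following sketch works. It uses the same declaration as above and asks the compiler what type each variable ended up with.

```cpp
#include <type_traits>

int* ptr1, ptr2;   // the '*' binds to ptr1 only

// Both checks are evaluated at compile time; the file will not build if either is wrong.
static_assert(std::is_same<decltype(ptr1), int*>::value, "ptr1 is a pointer to int");
static_assert(std::is_same<decltype(ptr2), int>::value,  "ptr2 is a plain int");
```

Both assertions pass, which is exactly the surprise being described: ptr2 is an int, not a pointer.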

PaulS:
(Not that I recommend defining multiple variables of different types like that...)

Newlines are free, after all.

Jiggy-Ninja:
Newlines are free, after all.

New lines can make things worse.

   int* ptr1,
          ptr2;

Makes it look even more like ptr2 is a pointer.

I meant like this:

int* ptr1;
int* ptr2;

I was agreeing with you.

I was agreeing with you.

I wasn’t feeling agreeable then.

Dr_Quark:
OK, this throws me a bit. “Type byte” constructs an 8-bit number. I don’t understand how a pointer, which is 16-bits or longer, can have any meaning when it’s truncated to 8 bits. I say this because my understanding of “pointer” is that it contains an address that is the smallest addressable memory location, usually, in a microprocessor, that would be a byte location. Never-the-less, the pointer itself is always wider than 8 bits. Are you telling me that in C++ a “pointer” has an inherent quality that includes whether it’s pointing at a byte, word, or long word?

I know some systems have instructions that have an inherent limitation to operate on word boundaries only, but their pointers still address bytes.

A pointer is an address, and its size depends on the system. It can be 16 bits, 32, 64, etc. C/C++ distinguishes between data types: the type is a kind of check that says how many bytes at the address (which is the starting point) belong to the variable and how they have to be interpreted. Let’s talk about the UNO (ATmega328P). It has 16-bit addresses, so any pointer is 16 bits wide; byte *_i is a pointer to a byte in memory, but the pointer's own size is exactly 2 bytes. If the statement i = *_i is used, the compiler knows that there is a byte at address _i which has to be copied to i.
However, C/C++ allows ‘advanced techniques’ (like a cast) which can bypass this check. Like this:

int i = *(int *)_i;

From the above, there is one byte of data at _i, but 2 bytes are copied into i. I don't know what the second one is. It may belong to some other variable.

EDIT:
I remember from MS-DOS that there were short pointers (16 bits) and far pointers (32 bits). It is the same or similar on other systems. A short pointer saved code, but memory beyond 64k was not reachable.

Dr_Quark:
Never-the-less, the pointer itself is always wider than 8 bits. Are you telling me that in C++ a "pointer" has an inherent quality that includes whether it's pointing at a byte, word, or long word?

Yes. Exactly that!

A pointer may be just an address, but C/C++ generally needs to know what type of variable it points to in order to perform type checking and also to correctly dereference it. (BTW dereferencing just refers to the act of getting the actual data from the pointer reference).

So for example, byte* bp declares bp as a pointer (16 bit or 32 bit or whatever depending on the architecture), however it also tells C/C++ that the thing that is being pointed to is a byte.

You may well ask: why does C/C++ even need to know what type it points to? Well aside from the obvious reason of type checking, C also uses this information for doing pointer arithmetic.

For example, with the above definition of bp:

*bp accesses the byte at that location
*(bp+1) accesses the next byte and so on.

HOWEVER, if the pointer was defined as pointing to long, for example long* lp, then it would work completely differently.

Specifically:

*lp would access the first long integer (32 bits) at the address pointed by lp.
*(lp+1) would access the next long integer. In other words the +1 would actually advance the pointer by four bytes!

BTW. Both *(lp+1) and lp[1] give the same result.
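The stride can be measured directly. The sketch below uses a made-up helper, strideOf, and substitutes uint8_t and int32_t for byte and long, since long is 32 bits on the AVR boards but may be wider on a desktop compiler.

```cpp
#include <stddef.h>
#include <stdint.h>

// Number of bytes the address moves when 1 is added to a pointer to T.
template <typename T>
size_t strideOf() {
    T items[2] = {};
    const unsigned char* first  = reinterpret_cast<const unsigned char*>(&items[0]);
    const unsigned char* second = reinterpret_cast<const unsigned char*>(&items[0] + 1);
    return static_cast<size_t>(second - first);
}
```

strideOf<uint8_t>() comes out as 1 and strideOf<int32_t>() as 4, which is the "+1 advances by four bytes" behaviour described above.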

Dr_Quark:
OK, this throws me a bit. "Type byte" constructs an 8-bit number. I don't understand how a pointer, which is 16-bits or longer, can have any meaning when it's truncated to 8 bits.

The pointer isn't being truncated to 8 bits.
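One way to see this is to compare the size of the pointer with the size of the thing it points to. This is just a sketch: the helper names are invented, uint8_t stands in for byte, and the actual pointer size depends on the platform (2 bytes on an UNO, typically 4 or 8 on a PC).

```cpp
#include <stddef.h>
#include <stdint.h>

// Size in bytes of the pointer variable itself.
size_t pointerSize(const uint8_t* p) {
    return sizeof p;
}

// Size in bytes of the object the pointer refers to.
size_t pointeeSize(const uint8_t* p) {
    return sizeof *p;
}
```

pointeeSize always gives 1 here, while pointerSize gives the full address width, so the "byte" in byte* describes the target and says nothing about how wide the address is.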