ANDing 16bit values

Hello everyone

I am trying to split a 16bit value into 2 eight bit values.

I am ANDing the 16bit value with 0x00FF and then 0xFF00, and putting the resulting value into an 8bit variable.

The LSB works fine but the MSB returns zero and I don't understand why.

An explanation would be gratefully received - code below
Thanks

uint16_t ADDR_FRAM = 0x1234;
uint16_t ADDR_MSB_MASK  = 0xFF00; //1111 1111 0000 0000
uint16_t ADDR_LSB_MASK = 0x00FF;  //0000 0000 1111 1111
uint8_t val;
uint8_t val2;

void setup() {
  Serial.begin(9600);

  val = ADDR_FRAM & ADDR_MSB_MASK;
  Serial.print(val, HEX);  // Print val in hex
  Serial.print(" ");       // Print space
  val2 = ADDR_FRAM & ADDR_LSB_MASK;
  Serial.print(val2, HEX); // Print val2 in hex
  Serial.println();        // Print new line
}

You need to shift (ADDR_FRAM & ADDR_MSB_MASK) right by 8 bits; val is set to the lower 8 bits of (ADDR_FRAM & ADDR_MSB_MASK), and those are always 0.

Try:

val = (ADDR_FRAM & ADDR_MSB_MASK) >> 8;

You won't even have to mask it; the LSB is "pushed out" to the right, so:

val = ADDR_FRAM >> 8;

will also do.

val2 = ADDR_FRAM; will automatically discard the MSB, so no need for masking there either.

val2 = (uint8_t)ADDR_FRAM; is a bit clearer and won't generate a warning.
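
To make the above concrete, here is a minimal sketch in plain C++ (no Arduino needed). The helper names are made up for illustration. 0x1234 & 0xFF00 is 0x1200, and storing that in a uint8_t keeps only its low 8 bits, which are zero. Shifting first moves 0x12 down into the low byte so it survives the narrowing.

```cpp
#include <stdint.h>

// Hypothetical helper names, for illustration only.

// What the original code did: mask, then narrow to 8 bits.
// Narrowing keeps only the low 8 bits of 0x1200, which are all zero.
constexpr uint8_t msb_masked_only(uint16_t v) {
  return static_cast<uint8_t>(v & 0xFF00);
}

// The fix: shift the high byte down into the low 8 bits before narrowing.
constexpr uint8_t msb_shifted(uint16_t v) {
  return static_cast<uint8_t>(v >> 8);
}

// The low byte needs no shift; the cast discards the high byte.
constexpr uint8_t lsb(uint16_t v) {
  return static_cast<uint8_t>(v);
}

static_assert(msb_masked_only(0x1234) == 0x00, "masked-only high byte is lost");
static_assert(msb_shifted(0x1234) == 0x12, "shifted high byte survives");
static_assert(lsb(0x1234) == 0x34, "low byte survives the cast");
```

The static_assert lines are checked at compile time, so if the file compiles, the three statements hold.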

val and val2 are bytes. A simple shift instead of ANDing will do for the MSB. You can also use lowByte() and highByte() to get the two parts.

uint16_t ADDR_FRAM = 0x1234;
uint16_t ADDR_MSB_MASK  = 0xFF00; //1111 1111 0000 0000
uint16_t ADDR_LSB_MASK = 0x00FF;  //0000 0000 1111 1111
uint8_t val;
uint8_t val2;

void setup() {

  Serial.begin(115200);

  val = ADDR_FRAM & ADDR_LSB_MASK;
  Serial.print(val, HEX); //Print val in hex

  Serial.print(" ");  //Print space

  val2 = (ADDR_FRAM & ADDR_MSB_MASK) >> 8;
  // or val2 = ADDR_FRAM >> 8;
  Serial.print(val2, HEX); //Print val2 in hex
  Serial.println();     //Print new line
}

void loop() {}

Hi ocrdu

Thank you for the reply

I'll try that after posting this. Would you mind explaining a little more to aid my understanding: is this an 8bit thing or a compiler thing?

Looking at your reply, do I understand correctly that even though I might declare a 16, 32 or even 64bit value, I am only ever able to access the lower 8 bits 'natively'?

Thank you again

The only way to obtain individual byte access to a multibyte variable in this environment is with assembly language. In C, you have to use the shift operator. Behind the scenes, the compiler may optimize a >> by a multiple of 8 into a direct byte access if that is possible. That is hidden from view.
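
As a rough illustration that the shift approach is not tied to 8-bit MCUs or to 16-bit values, here is a sketch that picks any byte out of a 32-bit value. The name byte_at is hypothetical, not an Arduino or standard function.

```cpp
#include <stdint.h>

// Hypothetical helper: returns byte number `index` of a 32-bit value,
// where index 0 is the least significant byte and index 3 the most.
// `index` must be in the range 0..3; a larger shift would be undefined.
constexpr uint8_t byte_at(uint32_t value, unsigned index) {
  return static_cast<uint8_t>(value >> (8u * index));
}

static_assert(byte_at(0x12345678UL, 0) == 0x78, "byte 0 is the LSB");
static_assert(byte_at(0x12345678UL, 1) == 0x56, "byte 1");
static_assert(byte_at(0x12345678UL, 2) == 0x34, "byte 2");
static_assert(byte_at(0x12345678UL, 3) == 0x12, "byte 3 is the MSB");
```

The result depends only on the numeric value, not on how the processor lays the bytes out in memory.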

It looks like you went back and edited the code in the original post. Never do that. That will be extremely confusing for anyone who tries to read the thread now. Especially if you expect explanations of what you did wrong.

Also, your edit is wrong. It seems like you didn't understand any of the posts that followed. Take some time to sit down and grasp it. The code I posted actually works.

Hi just noticed other replies

Thank you all, so what I alluded to in my previous post holds true. Because I am on an 8bit MCU, presumably I am only ever able to access the lower 8 bits directly.

Regarding the lowByte and highByte mention: are these an Arduino thing or a C/C++ thing, and what are the implications for code size/speed of using these types of functions?

Thanks again

Docara:
Because I am on an 8bit MCU, presumably I am only ever able to access the lower 8 bits directly.

Nonsense. In C, the same thing is true on a 16,24,32,64 or 256 bit processor. Slow down and read what's posted.

Also, nobody now can explain what you did wrong, because you substituted another mistake for your first mistake, in the posted code that you modified.

As for your question about performance: highByte() and lowByte() are Arduino-specific macros that you can be sure will work as fast as, and probably with less code overhead than, anything you can write.
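
For anyone curious what such macros amount to, here is a sketch written from memory of how the Arduino core defines them: a mask for the low byte and a shift for the high byte. Treat the exact form as an assumption. The MY_ names are made up so they cannot clash with the real lowByte()/highByte().

```cpp
#include <stdint.h>

// Stand-ins for illustration only, not the real Arduino macros.
#define MY_LOW_BYTE(w)  (static_cast<uint8_t>((w) & 0xFF))
#define MY_HIGH_BYTE(w) (static_cast<uint8_t>((w) >> 8))

static_assert(MY_LOW_BYTE(0x1234) == 0x34, "low byte of a 16-bit value");
static_assert(MY_HIGH_BYTE(0x1234) == 0x12, "high byte of a 16-bit value");

// With a wider argument, a shift by 8 yields the second-lowest byte,
// not the topmost one.
static_assert(MY_HIGH_BYTE(0x12345678UL) == 0x56, "second-lowest byte");
```

Since each macro expands to a single mask or shift plus a cast, there is nothing extra for the compiler to generate compared with writing the expression by hand.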

No need for the attitude and belittling me

I posted an innocent enough question and was trying to more fully understand something - the time line of posting to the internet made things out of order.

Your original post answered my question for which you have my thanks - no further replies are necessary!

You have to have some consideration for forum members who are trying to help you. That requires thinking in terms of what it is like to be someone else reading the thread. The time line of posting didn't change by itself. You changed it and made it seem out of order. I'm only telling you about a mistake you made, which has often resulted in confusion in the past. I'm also mentioning it not just for your information, but for the clarification of any other people who encounter this thread.

Think carefully about your question, to paraphrase, "why does this code behave this way"... it's fixed for you, so you could go and look, but you ask for specific clarification, "where exactly did I go wrong"... but by then you've erased your mistake.

Do you not see how that could frustrate someone?

You can view a 16-bit quantity as two separate 8-bit quantities (Fig-1) using union concept of set theory.
Figure-1: [image: intToBtes.png]

Coding:

union
{
   int x;
   byte myArray[2];
};


A C union has very little to do with the union in set theory. For example, in a C union the order is important, in set theory it isn't. Also, using a union to split bytes correctly in this way, is not guaranteed by any C standard. It can fail to take into account processor endian-ness. It can fail to "pack" data the way you expect. It just happens to work in AVR and maybe some other compilers. The safest, most portable, and readable way, is to use the shift operator. It will be optimized to byte selection in most cases, as I already mentioned.
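
A minimal sketch of the shift-based approach, splitting a 16-bit value and putting it back together. BytePair, split16 and join16 are hypothetical names chosen for this example. Because only arithmetic on the value is involved, the result does not depend on endianness or on how the compiler packs data.

```cpp
#include <stdint.h>

// Hypothetical type holding the two halves of a 16-bit value.
struct BytePair {
  uint8_t msb;
  uint8_t lsb;
};

// Split: shift for the high byte, mask for the low byte.
constexpr BytePair split16(uint16_t v) {
  return BytePair{static_cast<uint8_t>(v >> 8),
                  static_cast<uint8_t>(v & 0xFF)};
}

// Join: shift the high byte back up and OR in the low byte.
constexpr uint16_t join16(uint8_t msb, uint8_t lsb) {
  return static_cast<uint16_t>((static_cast<uint16_t>(msb) << 8) | lsb);
}

static_assert(split16(0x1234).msb == 0x12, "high byte");
static_assert(split16(0x1234).lsb == 0x34, "low byte");
static_assert(join16(0x12, 0x34) == 0x1234, "round trip");
```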

In set theory, the union (denoted by ∪) of a collection of sets is the set of all elements in the collection. It is one of the fundamental operations through which sets can be combined and related to each other.

The concept of union data structure/keyword in C fantastically agrees with the above quote. In the following union declaration, the same storage space can be viewed as one piece of 16-bit data, and it can also be viewed as 2 pieces of separate 8-bit data.

union
{
     int x;
     byte myArray[2];

};

To me the "Union" concept of Set Theory does not contradict with the "union" concept of C Language.

Unions were not designed for data conversion. They were designed for memory re-use in an era when 4k of RAM was never enough. If you look below the surface convenience, you will find that unions are recommended for that purpose. It just so happens that with most compilers you can use them for conversion without problems, but there are not enough guarantees in the C spec to ensure that it will always work reliably and be fully portable.

aarg:
but there are not enough guarantees in the C spec to ensure that it will always work reliably and be fully portable.

This has been hashed over many times here. Still some folks refuse to believe it. From K&R, 2nd Edition, Section 6.8: "... the results are implementation-dependent if something is stored as one type and extracted as another."

Obviously, not very portable.

Granted, we're talking C++ here, not C. But, I believe the same caution applies.

Would it be reliable to create the int and byte array separately, then use memcpy to copy the int into the array?

The union method would always have the problems of being dependent on the endian-ness, and the size of int, on a specific processor, although you could account for that in the code. The shift and & method would offer more portability.

gfvalvo:
This has been hashed over many times here. Still some folks refuse to believe it. From K&R, 2nd Edition, Section 6.8: "… the results are implementation-dependent if something is stored as one type and extracted as another."

Obviously, not very portable.

Granted, we’re talking C++ here, not C. But, I believe the same caution applies.

IIRC, in C11, type punning using unions is allowed, in earlier versions, it’s not.

But like you said, Arduino uses C++ for its sketches, so C is irrelevant here.

In C++, it has never been, and never will be, allowed. It undermines the type system, is inconsistent with the deterministic lifetimes specified by the C++ standard, and is much more confusing and error-prone than the correct way to type pun: using memcpy or std::bit_cast.
This has been pointed out to GolamMostafa multiple times, but for some reason he keeps spreading the same misinformation.

In this case, you don’t even need special type punning constructs, the standard allows you to inspect the underlying representation of any type as an array of characters or bytes:

uint16_t value = 0x4321;
uint8_t *as_byte_array = reinterpret_cast<uint8_t *>(&value);
uint8_t lsb = as_byte_array[0];
uint8_t msb = as_byte_array[1];

This is both shorter and easier to understand than the (illegal) union approach. It is clear that the memory is being re-interpreted as an array of bytes.

As a first remark, I agree with aarg that the approach using bit shifts is preferable, because it handles things like Endianness correctly.

I want to stress that using reinterpret_cast is only valid here because we’re casting to a character/byte type (uint8_t). In almost every other circumstance, you are not allowed to use reinterpret_cast for type punning.

david_2018:
Would it be reliable to create the int and byte array separately, then use memcpy to copy the int into the array?

Yes. That is in fact the only correct way to perform general type punning.
In C++20, you can use std::bit_cast, which just calls memcpy under the hood.
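
A minimal sketch of the memcpy approach, with hypothetical helper names to_bytes and from_bytes. Note that the order of the bytes in the array is whatever the machine uses in memory (LSB first on little-endian targets such as AVR), so the sketch makes no claim about which array element holds which half.

```cpp
#include <stdint.h>
#include <string.h>

// Hypothetical helper: copies the in-memory bytes of `value` into `out`,
// which must have room for sizeof(uint16_t) == 2 bytes.
inline void to_bytes(uint16_t value, uint8_t out[2]) {
  memcpy(out, &value, sizeof value);
}

// Hypothetical helper: rebuilds the value from bytes produced by to_bytes
// on the same machine.
inline uint16_t from_bytes(const uint8_t in[2]) {
  uint16_t value;
  memcpy(&value, in, sizeof value);
  return value;
}
```

If you need a fixed byte order, for example when sending the bytes to another device, the shift approach is the one that gives it to you regardless of the processor.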

Pieter

PieterP:
This has been pointed out to GolamMostafa multiple times, but for some reason he keeps spreading the same misinformation.

C++ contains C.

The union data structure of C works well in my ATmega328P and Arduino IDE. How am I spreading misinformation to others while I am enjoying the benefit of using "union structure" in my limited applications?

I am always safe if I know what I am doing and the limitations of my environment.

GolamMostafa:
C++ contains C.

It doesn't. They are two separate languages.

C++ might have originated from C in the 80s, but since then, both languages diverged. C++ is not a superset of C.
The rules for unions in C are irrelevant when programming in C++: C allows type punning using unions, whereas C++ does not.

GolamMostafa:
The union data structure of C works well in my ATmega328P and Arduino IDE. How am I spreading misinformation to others while I am enjoying the benefit of using "union structure" in my limited applications?

I am always safe if I know what I am doing and the limitations of my environment.

What you do in your own code doesn't matter, and if you really like unions and you know what you're doing, go ahead.

The point is that you're presenting a solution that is 1) invalid C++ code and 2) less readable than the valid alternative.
On top of that, you don't mention that it's invalid, neither do you specify in which cases it would be valid.
People reading your answer will learn to abuse unions following your example, while being unaware of the caveats and the valid alternatives that exist.

PieterP:
It doesn’t. They are two separate languages.

C++ might have originated from C in the 80s, but since then, both languages diverged. C++ is not a superset of C.

C is a classic low-level procedural programming language while C++ is a superset of C that is both procedural and object-oriented. Both C and C++ are commonly used languages, and though C++ is derived from C, the two languages need to be approached differently.