Converting a 5-byte big number to a number?

Hey great to see you, I thought we had lost you!

You did that fast /10 thread years ago, right?

If one were to make a union of one unsigned long long (ULL for short) and an 8-byte array, the byte order can be checked by setting the union to 0x0123456789ABCDEF and then reading the bytes back as hex.
Once that is known, the order in which to set the bytes to build a ULL is known.

I'm sorry to have asked that question, but without seeing the OP's code I had to, or thought I did.

Nothing tricky is needed to see that shifting bytes into an integer, in all the ways we did it, should have worked fine. Endianness was the reddest of herrings, unless the bytes were coming in backwards. So I thought to try to see if they were backwards at that point.

The actual problems were the sign bits and a cast to long where long long was necessary.

So I've also learned how to spell endianness. :expressionless:

a7

Backward incompatibility to 4-bit days?

That's why I mapped the union bytes in Post 22: set the 64-bit ULL to a hex pattern and read the same union bytes back as hex. Swaps show up.

3 posts were split to a new topic: Days o' yore

Nope. C++ type punning rules make that undefined behaviour. memcpy or bitwise operations are the correct choice.

1 Like

I believe one of the few allowed exceptions to the strict aliasing rules lets you use a byte pointer to access an object's bytes.

#include "Arduino.h"

template<typename T>
void printHexBytes(const T &val) {
  const uint8_t *bytePtr = reinterpret_cast<const uint8_t *>(&val);
  for (size_t i = 0; i < sizeof(T); i++) {
    if (*bytePtr < 0x10) {
      Serial.print("0");
    }
    Serial.print(*bytePtr++, HEX);
    if (i < sizeof(T) - 1) {
      Serial.print(" ");
    }
  }
  Serial.println();
}

void setup() {
  Serial.begin(115200);
  delay(1000);
  uint64_t x = 0x24C9A75A2E;
  printHexBytes(x);
}

void loop() {
}

Output:

2E 5A A7 C9 24 00 00 00

2 Likes

Hey great to see you, I thought we had lost you!
You did that fast /10 thread years ago, right?

Sometimes I get a mail from the system and have a look at the forum and my PMs. I spend my (Arduino) time mostly on maintaining libraries and creating new ones. See - RobTillaart (Rob Tillaart) · GitHub -

The fast /10 was indeed a thread to remember. It brings back good memories.

It is part of my fast_math library in which more "unreadable fast" code is collected.

1 Like

Hi, @AKnuts

A very important question.

What is the device you are getting this data stream from?
Can you please post a link to specs/date of the device?

Thanks.. Tom.. :grinning: :+1: :coffee: :australia:

1 Like

The undefined behavior is consistent for a given compiler and program. Not random. The byte order is always the same.

I addressed what to do to map that and use it. Do that, and "nope" becomes "yes you can, and it will work". Undefined != random. It does mean: check, and work with what you find. The code won't instantly port, but so what? It CAN be made to work, and that's not a bad thing.

1 Like

Site bookmarked.

When someone's maths are better than mine, I don't forget that easily!
If I get time, I'd like to go through that one again, but to get it all into long-term memory, I think I've gotten too old. You did some things going between bases that I didn't get locked in on but still appreciate just having seen.

1 Like

Until the smallest unrelated change is made in the code and the entire thing blows up ... even with the same compiler. This has been discussed here many times before. "Proof by Working Example" is simply not a valid argument.

2 Likes

Please show me an EXAMPLE of float byte order changing in the same program on the same machine and what small unrelated change makes that happen.

What compiler is inconsistent in how it stores type float?
What compiler would do that? GCC does not, does it? If so, then demo it.

Disproof does work. Show the byte order changing as the result of "a small unrelated change".

If I see a white van, that doesn't mean I should get a tinfoil hat.

"Proof by Lack of Counter Example" is also an invalid argument.

The language standard and guidelines are quite clear on the matter:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c183-dont-use-a-union-for-type-punning

https://en.cppreference.com/w/c/language/union

You, of course, are free to disregard such guidance. But, I'm glad you're not writing the code for my Auto Pilot.

2 Likes

There was a time when I used something like the following code:

void unsafe(void)
{
        struct foo {
                struct foo *next;
        };

        struct foo blah = { 0 };
        struct foo *head = 0;
        struct foo *node = (struct foo *)&head;
        node->next = &blah;
}

Yes, it's a bit of a hack, and it invokes undefined behavior (I didn't know it was undefined behavior at the time), but it "should" work because node->next should be equivalent to head (they are the same type, and &node->next == &head, after all), so it should just assign &blah to head, right? And it did work, until it didn't. What was the unrelated change that made it not work anymore? From what I remember, moving to another architecture with a newer version of GCC (both architectures were 32-bit so it wasn't a 32->64 bit incompatibility).

Relying on undefined behavior is like travelling in basketball. It's illegal but sometimes you can get away with it if the referee doesn't notice (like using a compiler that doesn't break the assumptions you made when relying on the undefined behavior).

I've been around the block long enough to see a number of coding assumptions I've made being broken due to undefined behavior or even just implementation-defined behavior (e.g., size of data types or default signedness of char), and all of those assumptions were broken because of "unrelated changes" like a newer compiler, different compiler flags, or a change in the architecture. I've gotten wiser over the years and now try to avoid undefined behavior whenever I learn that something I might be doing has undefined behavior.

Since it's easy enough to not rely on undefined behavior with the union hack, why not just do it the right way (with memcpy)?

3 Likes

Unrelated to what? Not the layout of variables.
For sure you should not try to predict that layout when you change compilation.

And if you do check? Well, should I just shiver at the idea that even if I check and change nothing, the byte order may randomly change at runtime?

What happens if I read a constant stored high-low but changed to low-high? In my reality, that doesn't happen and I can count on the same behavior.

I show an apple and it gets compared to any can of soda. That doesn't change.

I'm enjoying this fascinating discussion, but by post #5 all questions about endianness were answered. It simply does not enter into the difficulties encountered in reassembling five bytes into a long long integer.

So… refresh my memory, or even just say clearly: what was the point here, for the OP's issue, of introducing type pinning or union busting or whatever it is called, this thing you are doing that seems to be either Clever or a Bad Idea?

I've done the union thing, I think, and made bad use of things like type casting. But not for some time. Like years. Not since it became clear that I was asking for a kind of trouble I didn't even know I could get into.

It's easier for me, usually, to just play by the new rules than to fully understand nuances amongst them that allow the heavies to break them.

a7

Your misconception about Undefined Behavior in this thread (and others) is that you naively believe that the compiler emits assembly language code that follows a literal line-by-line reading of the C++ source code. It doesn't. The compiler is free to emit completely different (better and more efficient) code as long as that code produces the same observable behavior as a literal translation of the source code.

But, that convenient only holds if you obey the rules because many times the compiler can't tell when you're breaking them. So the compiler optimizes along, but if you've broken the rules, the convenient no longer holds. Thus you'll get machine code that doesn't do what an examination of the source code would have you believe. For example, you may think you're still accessing the same variable, but you're not because the compiler assumed you were obeying the rules so it produced a better way to access it. But, you broke the rules so the assumption wasn't valid.

Here's a better explanation by someone who knows more than I.

EDIT:
I meant "covenant", not "convenient" .

2 Likes

Did you mean convention?