4 Byte Little Endian -> How to shift Bits on Arduino?

GolamMostafa:
There is another way of changing the order of the given data bytes using union:

I know that Arduino is C++ and not classic C, but from K&R (2nd Edition) regarding unions: “…the results are implementation-dependent if something is stored as one type and extracted as another.”
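For what it's worth, the usual way around the implementation-dependence that K&R warns about is to copy the bytes with memcpy instead of reading a different union member. This is only a minimal sketch in standard C++; the helper names bytesOf and valueOf are made up for illustration, and it assumes a 32-bit value (which is what unsigned long is on AVR-based Arduinos).

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the in-memory bytes of a 32-bit value into an array.
// The order of the bytes in the result is whatever the host CPU uses.
std::array<std::uint8_t, 4> bytesOf(std::uint32_t value) {
    std::array<std::uint8_t, 4> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// Hypothetical helper: rebuild the 32-bit value from bytes in host order.
std::uint32_t valueOf(const std::array<std::uint8_t, 4>& bytes) {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}
```

On a little-endian machine, bytesOf(0x34860000) gives 00 00 86 34 in memory order, the same view the union gives. As with the union, nothing is reordered; the bytes are only exposed.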

GolamMostafa:
There is another way of changing the order of the given data bytes using union:

34 86 00 00 ---> 00 00 86 34

Serial.begin(9600);

union {
  byte xAr[4];
  unsigned long Q;
} value;

value.Q = 0x34860000;

Serial.println(value.xAr[3], HEX);  // shows 34
Serial.println(value.xAr[2], HEX);  // shows 86

//xAr[] contains 00 00 86 34

You didn't change the order of anything here as promised. You just extracted some bytes. I don't see how a union could help with this.

gfvalvo:
I know that Arduino is C++ and not classic C, but from K&R (2nd Edition) regarding unions: “…the results are implementation-dependent if something is stored as one type and extracted as another.”

Of course the GCC philosophy is "K&R? We don't need no stinking K&R!"

You didn't change the order of anything here as promised.

In Science, there is no such thing as a 'promise'. A proposed theory can be nullified, or even proved wrong and discarded, by anybody at any time. There should not be authoritative verdicts like 'flat-out WRONG' or 'flat-out RIGHT'; things can always be adjusted or corrected provided we leave aside the attitude of winning and losing. I showed a possible way of reversing the order without using shifting. The union data structure is a remarkable contribution to Computer Science: its members share a common memory space. It has helped me a lot in many of my computations.


The union was originally intended just to re-use memory space. Sharing types was a side effect. You can use it to gain access to constituent bytes for byte swapping provided that you know the size of the datatypes. It's worth mentioning because of the increased demand for efficiency on a small machine compared with the desire for portability. So you could do something like:

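unsigned long swapEndian(unsigned long val)
{
  // Overlay the 32-bit value with its four constituent bytes.
  union {
    byte b[4];
    unsigned long Q;
  } value;
  byte temp;
  value.Q = val;

  // Swap the outer bytes (0 and 3).
  temp = value.b[0];
  value.b[0] = value.b[3];
  value.b[3] = temp;

  // Swap the inner bytes (1 and 2).
  temp = value.b[1];
  value.b[1] = value.b[2];
  value.b[2] = temp;

  return value.Q;
}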
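Since the thread title asks about shifting, here is the same swap done with shifts and masks, with no union at all. It is a sketch rather than the one true way; the name swapEndianShift is made up, and it uses uint32_t so the width is explicit instead of relying on unsigned long being 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: reverse the four bytes of a 32-bit value using shifts.
std::uint32_t swapEndianShift(std::uint32_t val) {
    return ((val & 0x000000FFu) << 24) |  // byte 0 moves to byte 3
           ((val & 0x0000FF00u) << 8)  |  // byte 1 moves to byte 2
           ((val & 0x00FF0000u) >> 8)  |  // byte 2 moves to byte 1
           ((val & 0xFF000000u) >> 24);   // byte 3 moves to byte 0
}
```

Because it works on the value rather than on memory, the result does not depend on the endianness of the CPU: swapEndianShift(0x34860000) is 0x00008634 everywhere, and applying it twice returns the original value.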

Excellent post; it keeps us together for round after round despite differences of opinion.

Um. All the Arduino platforms are natively little-endian.
If you have a 4-byte little endian number starting at array[40], you can get the 32bit long (and print it) with

Serial.print(*((long *)&array[40]));

(which means "access a long from the address of the byte at index 40 of the array.")
No shifting or weird access order required.

Normally this would be hidden in a macro or function:

#define GETLONG(p) (*(long*)(p))
   :
Serial.print(GETLONG(&array[40]));

If you happened to be running on a big-endian CPU, or on an ARM where you had to account for alignment problems, you could change the definition of GETLONG().
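One possible replacement definition, sketched here as a function rather than a macro, assembles the value from individual bytes. The name getLongLE is made up for illustration. It only ever reads single bytes, so it should not care about the CPU's endianness or about alignment.

```cpp
#include <cstdint>

// Hypothetical helper: read a 4-byte little-endian number starting at p.
// p[0] is the least significant byte, p[3] the most significant.
std::uint32_t getLongLE(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

With the bytes 34 86 00 00 from the start of the thread, getLongLE returns 0x00008634. The casts to a 32-bit type before shifting matter on AVR, where int is only 16 bits and the larger shifts would otherwise lose bits.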

Serial.print(*((long *)&array[40]));
(which means "access a long from the address of the byte at index 40 of the array.")

Since reading the above post, I have been wondering (seeing the output 78EFBC9A on the Serial Monitor for the given memory space byte array[8] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xEF, 0x78};) whether the data bytes are really reversed or are simply stored exactly as they should be.
My own post #19 is questionable to me?

Figure-1: Serial Monitor showing the output of Serial.println(*((long *)&array[4]), HEX);

1. The computer on which I am working now runs Windows on an AMD processor. AMD/Intel processors follow the little-endian convention, in which the low-order data byte of a multi-byte operand (not the opcode) is stored in the low-order memory location. This is a hard-wired rule that the assembler (compiler), linker, and lod186 keep in mind while creating executable binary code for these little-endian processors.

2. Assume that in the Real Mode operation of the 80386, we wish to store 78EFBC9A (arranged from MSByte to LSByte) into the four memory locations 0000:2467, 0000:2466, 0000:2465, and 0000:2464. Because we know that the 80386 is little-endian, we execute the following instructions:

             .386
             .org 1000h
mov       bx, 2464h                                    // List File Codes: 66 BB 24 64
mov       DWORD ds:[bx], 78EFBC9Ah       // List File Codes: 67 C7 07 78 EF BC 9A

After assembly, we observe that the executable binary bytes (only the operands, not the opcodes) are arranged in little-endian style: 64 24 and 9A BC EF 78 in the row at offset 0030 below.

0000 - 80 09 00 07 74 7A 74 2E 61 73 6D 9F 96 09 00 00 ;
0010 - 06 4D 59 43 4F 44 45 9A 98 07 00 61 0B 10 02 01 ;
0020 - 01 E1 88 04 00 00 A2 00 D2 A1 11 00 01 00 10 00 ;
0030 - 00 66 BB 64 24 67 C7 07 9A BC EF 78 A4 8B 09 00 ;
0040 - C1 00 01 01 00 10 00 00 99

If someone asks me to present the 32-bit (4-byte) content of these four memory locations, I will present it as 78EFBC9A. So, the bytes are not swapped or shifted; they are stored exactly as they should be. @aarg has already commented on this issue in Post #21 with respect to Post #19.

//--------------------------------------------------------------------------------------------------------------

3. Now, let us play a bit with the Serial.print(*((long *)&array[40])); instruction.

(a) Given instruction: Serial.println(*((long *)&array[4]), HEX); (I have changed 40 to 4.)

(b) Given data space: byte array[8] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xEF, 0x78};

4. Reading/presenting one-byte data from memory location array[4].

byte *yb;                             //L10
yb = (byte*)&array[4];         //L20      
byte zb = *yb;                     //L30      equivalent to:    mov     al, BYTE PTR ds:[bx]
Serial.println (*yb, HEX);     //L40       Serial Monitor shows: 9A

L10 (Line-10) declares a pointer variable yb. It will hold the offset address of the memory location array[4]. It is pointing to an object (the data value) which is 8-bit (1-byte), and it is indicated by the keyword byte.

L20 uses the cast (byte*) to tell the compiler that dereferencing the pointer variable yb yields one byte from the memory location whose address it holds.

//-------------------------------------------------------------------------------------

5. Reading/presenting two-byte data from two consecutive memory locations where array[4] is the low-order memory location.

unsigned int *yw;                             //L11
yw = (unsigned int*)&array[4];         //L21      
unsigned int zw = *yw;                     //L31      equivalent to:    mov     ax, WORD PTR ds:[bx]
Serial.println (zw, HEX);                   //L41       Serial Monitor shows: BC9A

L11 (Line-11) declares a pointer variable yw. It holds the offset address of the memory location array[4]. It is pointing to an object (the data value) which is 16-bit (2-byte), and it is indicated by the keyword unsigned int.

L21 uses the cast (unsigned int*) to tell the compiler that dereferencing the pointer variable yw yields 16-bit (two-byte) data from two consecutive locations; yw holds the offset address of the low-order memory location, and the other byte is at the next higher address.

//---------------------------------------------------------------------------------------------------------------

6. Reading/presenting 4-byte (32-bit) data from four consecutive memory locations where array[4] is the low-order location.

long *ylong;                             //L12
ylong = (long*)&array[4];      //L22      
long zlong = *ylong;              //L32      equivalent to:    mov     eax, DWORD PTR ds:[bx]
Serial.println (zlong, HEX);     //L42       Serial Monitor shows: 78EFBC9A
//---------------------

All the above four lines have been compacted into a single line: 
Serial.println(*((long *)&array[4]), HEX); by @westfw.

L12 (Line-12) declares a pointer variable ylong. It will hold the offset address of the memory location array[4]. It is pointing to an object (the data value) which is 32-bit (4-byte), and it is indicated by the keyword long.

L22 uses the cast (long*) to tell the compiler that dereferencing the pointer variable ylong yields 32-bit (four-byte) data from four consecutive memory locations; ylong holds the offset address of the lowest-order memory location. The addresses of the next three memory locations are found by consecutive additions of 1.
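The three reads above (one byte, two bytes, four bytes from the same address) can also be sketched with memcpy instead of pointer casts, which sidesteps alignment and aliasing concerns. This is an illustration in standard C++ only; loadAs and hostIsLittleEndian are made-up names, and the fixed-width types stand in for byte, unsigned int (16-bit on AVR) and long.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read sizeof(T) bytes starting at p and return them
// as a T, in whatever byte order the host CPU uses.
template <typename T>
T loadAs(const std::uint8_t* p) {
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Hypothetical helper: true when the low-order byte is stored at the
// lowest address, i.e. the host is little-endian.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    std::uint8_t first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

For array[8] = {0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xEF, 0x78}, reading at &array[4] on a little-endian host gives 9A, BC9A and 78EFBC9A for the 8-, 16- and 32-bit types, the same values as the Serial Monitor outputs quoted above. A big-endian host would give 9A, 9ABC and 9ABCEF78 from the very same bytes.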

The use of long* in the cast perhaps has the following effect at the hardware level. The memory system of the 80386 is organized as four byte-wide banks. The presence of long* directs the compiler to produce code that selects all four banks at the same time, so that the 32-bit data enters the destination in one go.

Figure-2: 32-bit memory system (organized as 4x8-bit banks) of Real Mode 80386 Microprocessor
//----------------------------------------------------------------------------------------------------------

Would appreciate comments.

GolamMostafa:
Would appreciate comments.

TL;DR?

My own post #19 is questionable to me?

Yes. You said that a union can be used to do byte-swapping, but it doesn't really do anything other than permitting the data to be accessed in different ways.
If you have:

union {
  byte B[4];
  unsigned long Q;
}  value;

void setup() {
  Serial.begin(9600);
  value.Q = 0x12345678;

  // value.Q is an unsigned long, so take its address through an explicit cast.
  long *Qp = (long *) &value.Q;
  long *Qp2 = (long *) &value.B[0];

  Serial.println(value.Q, HEX);
  Serial.println(*Qp, HEX);
  Serial.println(*Qp2, HEX);
  Serial.println(*((long*)(&value.B[0])), HEX);
}

void loop() {}

Then you'll get "12345678" printed 4 times, but nothing has actually changed byte order anywhere, except as the result of ASCII/Hex conversion.

I don't see anything wrong with your longer message, but I'm not sure what you accomplished.

In x86 format, if you have a pointer to your byte "array" in BX and want to do math on a value in the array, you would load the value into another register:

  mov AL, byte ptr [BX + 4]   ;; get a byte
  mov EAX, dword ptr [BX + 4] ;; get a "long"

Or something like that. C's cast determines whether you get AL or EAX as a destination, and (more directly) whether the assembler syntax has "byte ptr" or "dword ptr". (I always rather disliked Intel's assembler. If I wanted my pointers to be strongly typed, I would have written C code!)

On something like an ARM, you don't have byte-addressable registers, so you'd get either:

  ldr   r0, [r1, #4]  ;; load 32bits
  ldrb r0,[r1, #4]  ;; load (and extend to 32 bits) 8bits

On AVR, you'd have

   ld  r24, Z
   ldd r25, Z+1
   ldd r26, Z+2
   ldd r27, Z+3  ;; load 32bit number from memory

At least this thread is still on topic. I have a comment about the presentation of address bits to external memory - there can be nothing big, little or mixed up endian about physical memory address bits. You can scramble them all randomly and it will make no difference to the software because no matter how they are presented to the memory array, each address is unique and so whatever is written to a given address can also be retrieved from the same address. So for example, I could reverse all the bits, so A15 goes to A0, A14 goes to A1 and so on, and it would make absolutely no difference to the functioning of the computer (provided peripherals were decoded correctly). Even the use of banks can introduce no endian bias because those bits can be swapped arbitrarily as well.
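To make the address-line argument concrete, here is a small simulation sketch. It models a 64 KB memory whose 16 address lines are wired in reverse order (A15 to A0, A14 to A1, and so on). The class and function names are invented for the illustration; real hardware is obviously not a std::vector.

```cpp
#include <cstdint>
#include <vector>

// Reverse the 16 address bits: A15 <-> A0, A14 <-> A1, and so on.
std::uint16_t reverseAddressBits(std::uint16_t addr) {
    std::uint16_t out = 0;
    for (int bit = 0; bit < 16; ++bit) {
        out = static_cast<std::uint16_t>((out << 1) | ((addr >> bit) & 1u));
    }
    return out;
}

// Hypothetical model of a memory chip behind scrambled address lines.
// Software uses the logical address; the chip sees the reversed one.
class ScrambledMemory {
public:
    void write(std::uint16_t addr, std::uint8_t data) {
        cells_[reverseAddressBits(addr)] = data;
    }
    std::uint8_t read(std::uint16_t addr) const {
        return cells_[reverseAddressBits(addr)];
    }
private:
    std::vector<std::uint8_t> cells_ = std::vector<std::uint8_t>(65536);
};
```

Writing 9A, BC, EF, 78 to logical addresses 2464h through 2467h and reading them back returns the same four bytes at the same logical addresses, even though they sit in widely scattered physical cells. The scrambling is a one-to-one mapping, so the software cannot tell the difference.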

Maybe we should analyze it in BCD ;D

I don't see anything wrong with your longer message, but I'm not sure what you accomplished.

To me, accomplishment comes from satisfaction and vice versa. When a reply turns out to be rich enough to be expanded into many A4 pages that expose the hidden details, the joy knows no bounds.

The detail brings the form;

The form brings the beauty;

The beauty brings the absoluteness;

The absoluteness brings:
Oneness or
Nothingness or
He/NotHe.


//-----------------------------------------------------------------------------------------------------------------
1. There can be nothing big, little or mixed up endian about physical memory address bits.

2. You can scramble them all randomly and it will make no difference to the software because no matter how they are presented to the memory array, each address is unique and so whatever is written to a given address can also be retrieved from the same address.

3. I could reverse all the bits, so A15 goes to A0, A14 goes to A1 and so on, and it would make absolutely no difference to the functioning of the computer (provided peripherals were decoded correctly).

4. Use of banks can introduce no endian bias because those bits can be swapped arbitrarily as well.
//-----------------------------------------------------------------------------------------------------------------

More than 40 years ago, my teacher Prof. Dr. A.M. Patwary was delivering a lecture on Information Theory; he said that he was not delivering anything new, but that the style of presentation was his own and that this created the 'Information'.
//-----------------------------------------------------------------------------------------------------------------

Through a unique style of presentation, @aarg has brought to light a basic fact of computer architecture.

Maybe we should analyze it in BCD ;D

In this thread, we don't have that opportunity; still, there is a little of it in @westfw's instruction Serial.print(*(long *) &array[40]);, which prints in decimal by default. As we have remained in line with the context of the thread, we have not teleported off-topic.

If you are interested in analysing some programming aspect in terms of BCD, you may start with the lcd.print(0xFFFF, DEC); function, where Isaac Newton and William George Horner have been comfortably sleeping.
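The mention of Horner presumably alludes to Horner's rule for evaluating a number from its digits. The sketch below shows that rule, plus the reverse step of peeling off decimal digits by repeated division. Both function names are invented here, and this is an illustration of the arithmetic only; how lcd.print does it internally is not something this sketch assumes.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Horner's rule: value = ((d0 * base + d1) * base + d2) * base + ...
// digits are given most significant first.
std::uint32_t hornerValue(const std::vector<std::uint8_t>& digits, std::uint32_t base) {
    std::uint32_t value = 0;
    for (std::uint8_t d : digits) {
        value = value * base + d;
    }
    return value;
}

// Repeated division by 10 yields the decimal digits, least significant first,
// so each new digit is inserted at the front of the string.
std::string toDecimalString(std::uint32_t value) {
    if (value == 0) return "0";
    std::string out;
    while (value > 0) {
        out.insert(out.begin(), static_cast<char>('0' + value % 10));
        value /= 10;
    }
    return out;
}
```

Feeding the hex digits F, F, F, F (15, 15, 15, 15) with base 16 into hornerValue gives 65535, and toDecimalString(0xFFFF) gives the text "65535".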