EEPROM.put() endianness

When I call EEPROM.put(0, (int16_t) 50) and later dump the first few EEPROM cells, I get

cell #0 == 0x32
cell #1 == 0x00

therefore, EEPROM.put() appears to save the value little-endian.

Is there a particular reason for this choice, or was it just a matter of developer's preference? Does it have anything to do with the endianness of the µC itself?

Do you think there's going to be a performance penalty if, for readability and ease of manipulation, I implement my own, big-endian routine? I'm only interested in saving int16_t values, if that makes any difference.

Apart from academic interest, my first question is: who cares which order the data is saved in, as long as I can get it back intact? You do, of course, have access to the source code for the put() and get() functions.

Do you have a particular reason for wanting to save the 16-bit values the other way round?

As I said in the OP, readability is my main concern. My plan was to mix .put() and .get() with single-byte access via .read() and .write(), and I feel like having to take endianness into account adds one more layer of complication and possible bugs. To a machine it may not matter, but to a human (myself, for instance) big-endian is more straightforward.

Have you looked at the source? It casts the data to a byte array and saves the bytes in order in a loop, so whatever endianness the Arduino is using will be the endianness in the EEPROM.

The eeprom memory representation just matches the SRAM byte order. That makes sense as this way it’s easy to restore the values back into their original form in memory.
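To illustrate the byte-copy idea described above, here is a minimal sketch that runs off-board. The `cells` array and the names `putBytes`/`getBytes` are made up for the illustration; this is not the library source, only a model of "copy the bytes in ascending address order".

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical stand-in for the EEPROM cells (not the real EEPROM object).
static uint8_t cells[16];

// Copy the object's bytes into the cells, lowest address first.
template <typename T>
void putBytes(size_t addr, const T &value) {
  const uint8_t *src = (const uint8_t *) &value;
  for (size_t i = 0; i < sizeof(T); i++) {
    cells[addr + i] = src[i];
  }
}

// Copy the bytes back into the object in the same order.
template <typename T>
void getBytes(size_t addr, T &value) {
  uint8_t *dst = (uint8_t *) &value;
  for (size_t i = 0; i < sizeof(T); i++) {
    dst[i] = cells[addr + i];
  }
}
```

Because both directions walk the addresses the same way, the cells always mirror the in-memory layout of the value, whatever byte order the compiler uses.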

Using put() or get() with any type is much more readable than messing around with the underlying memory. Don’t do it!

That sounds like a recipe for a pile of problems. What advantage do you see in doing it?

I already have some working code based around .read() and .write(). My original implementation is big-endian because it felt more natural when I wrote it (and it still does). Later, I thought I could add in some calls to .put() and .get() for their ease of use, but then I stumbled upon this endianness thing that does not make things easier at all and perhaps adds more problems than it takes away.

So I must either:

  1. Go on the hard way with .read() & .write() but no endianness issues, or
  2. Have it easy with .put() & .get(), but suffer the overhead of endianness conversion.

Thinking about it, I begin to feel like the matter with performance is of no real concern, considering that EEPROM access is orders of magnitude slower than byte manipulation.

At least, I now have a clearer idea about why the EEPROM methods are implemented the way they are.

One major advantage of put() is that each EEPROM byte is read first and only written if its value has changed. That can be important because EEPROM cells can only be written a fairly limited number of times.
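For anyone rolling their own big-endian int16_t routines, a rough sketch of the "write only if changed" idea might look like this. The `cells` array, `writeCount`, and the function names are hypothetical and only model the behaviour off-board; on an Arduino the cell access would presumably go through EEPROM.read() and EEPROM.update() instead.

```cpp
#include <stdint.h>
#include <stddef.h>

static uint8_t cells[16];        // hypothetical stand-in for EEPROM cells
static unsigned writeCount = 0;  // counts simulated cell writes

// Write one cell only if its content differs from the new value.
void updateCell(size_t addr, uint8_t value) {
  if (cells[addr] != value) {
    cells[addr] = value;
    writeCount++;
  }
}

// Store an int16_t most significant byte first (big-endian).
void writeInt16BE(size_t addr, int16_t value) {
  uint16_t u = (uint16_t) value;
  updateCell(addr, (uint8_t) (u >> 8));
  updateCell(addr + 1, (uint8_t) (u & 0xFF));
}

// Rebuild the int16_t from two cells stored most significant byte first.
int16_t readInt16BE(size_t addr) {
  uint16_t u = (uint16_t) (((uint16_t) cells[addr] << 8) | cells[addr + 1]);
  return (int16_t) u;
}
```

Writing the same value twice costs no extra cell writes the second time, which is the wear-saving behaviour being described.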

Reversing the endianness would be complex with arrays or structs.

In other words: The EEPROM has no endianness. A block from the memory is read/written to EEPROM in the same order.

Yes, I'm aware of it: that's part of what I called "ease of use" earlier. I took care of that with my bespoke functions that use .read() and .write().

Right; to the EEPROM, a byte is a byte and there is no higher-order data structure to be concerned about. But if I want to treat 2 or more bytes from EEPROM as a single entity (e.g. an int16_t), then I have to know what is what and endianness does matter. I can have higher-order methods do that for me (but the endianness is hidden), or I can do it myself with lower-order methods (but I have to decide on endianness for my own data).

No, you don't need to know and it doesn't matter. Simply transfer the bytes between RAM and EEPROM in the same order in both memory areas, i.e. read the bytes from RAM incrementing the addresses low to high and write them to EEPROM incrementing the addresses low to high. Same thing when you transfer from EEPROM to RAM.

How come? Say I have 2 bytes in EEPROM as in my OP:

cell #0 == 0x32
cell #1 == 0x00

Do they make 50 or do they make 12800? If endianness was taken care of for me by the writing method, and I use the same kind of method for the reading, then I will get the correct result, but there is still endianness at work behind the scenes, though transparent to me as the caller of the methods. If, however, I have my 2 bytes, I access them individually as bytes, and I want to reconstruct an integer manually, then I have to decide on endianness, don't I? Shall I make it byte_0 + (byte_1 << 8) or (byte_0 << 8) + byte_1?
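To make the two readings concrete, here is a small sketch of both reconstructions. The function names are invented for the illustration.

```cpp
#include <stdint.h>

// Treat the cell at the lower address as the least significant byte.
uint16_t fromLittleEndian(uint8_t byte0, uint8_t byte1) {
  return (uint16_t) (byte0 | ((uint16_t) byte1 << 8));
}

// Treat the cell at the lower address as the most significant byte.
uint16_t fromBigEndian(uint8_t byte0, uint8_t byte1) {
  return (uint16_t) (((uint16_t) byte0 << 8) | byte1);
}
```

With the cells above, fromLittleEndian(0x32, 0x00) gives 50 and fromBigEndian(0x32, 0x00) gives 12800, so the reader has to agree with whatever convention the writer used.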

Of course. But, as already noted, your reason for wanting to do so seems unsound:

Weird. I always thought the AVR was BigEndian (MSB first) but this seems to show that it is LittleEndian (LSB first).

#include <EEPROM.h>

void setup()
{
  Serial.begin(115200);
  delay(200);
  Serial.println("setup()");

  const unsigned long value = 0x12345678;

  Serial.println(value, HEX);

  EEPROM.put(0, value);

  Serial.print(EEPROM[0], HEX);
  Serial.print(EEPROM[1], HEX);
  Serial.print(EEPROM[2], HEX);
  Serial.println(EEPROM[3], HEX);

  const byte *valuePtr = (const byte *) &value;
  
  Serial.print(valuePtr[0], HEX);
  Serial.print(valuePtr[1], HEX);
  Serial.print(valuePtr[2], HEX);
  Serial.println(valuePtr[3], HEX);

  unsigned long value2;

  EEPROM.get(0, value2);
  Serial.println(value2, HEX);
}

void loop() {}

Results:

setup()
12345678
78563412
78563412
12345678

Reading EEPROM or RAM as bytes, the unsigned long comes out LSB First.

For common Arduinos, it's more a compiler (avr-gcc) decision than the chip's, as it's only an 8-bit architecture. If I remember correctly, instructions that use register pairs, as well as peripheral registers, use the little-endian convention. The stack (the return address, for example) might seem big-endian, but that's because the stack is LIFO, thus getting LSB first as well.

Best you leave it to the machine how your data pointer is cast to a byte *.

int data = 50;  // or any struct or other data type
const byte* bp = (const byte*) &data;  // explicit cast: C++ will not convert int* to byte* implicitly
for (size_t i = 0; i < sizeof(data); i++)
  Serial.println(*(bp++));  // or any other action, e.g. writing to EEPROM

The AVR and avr-gcc are pretty uniformly little-endian. That’s mostly arbitrary, since data memory is only 8 bits wide, but it does show up with respect to register pairs and the ADIW and SBIW instructions.

Since the architecture itself does not justify little-endianness, wouldn't big-endian have been the most natural and user-friendly choice?

You can ask Intel the same question; their 16 bit processors are little endian.