Hacking EEPROM content

Hi guys
I have some industrial equipment which stores info on a 24C04W EEPROM, which I'd like to hack.
So I used the excellent I2C_EEPROM library to read its content.
Now I have a series of bytes and need to figure out what they mean, see below the first bytes:

       0    1    2    3    4    5    6    7    8    9
00000: 086  053  071  000  000  212  065  012  242  252  
00010: 068  255  255  255  255  000  224  073  068  000  
00020: 000  148  065  010  215  163  060  255  255  255  
00030: 255  255  255  255  255  255  255  255  255  255 
00040: 255  255  255  255  255  255  255  255  255  255
00050: 255  255  255  255  255  255  255  255  255  255

I know more or less what I am looking for, i.e. some hour counters and some voltages, so they are probably stored as integers and floating point values. There could be a char header as well.

I can check from the equipment the values which are actually stored so I know what I should be getting in the end. Nevertheless the job can be time-consuming as I don't know at what address they are stored, over how many bytes, in what order...

Does anyone have any advice on perhaps how to automate the task? Perhaps there is some software that can display all possible combinations of a series of bytes?

Nice display, but I and many others generally like to work in hex format, and I also show ASCII as well. From 27 onward it is 0xFF, which I would guess is a fill value. It probably means the toner has no room or the tortions setting is 68; I do not have a clue because I have no idea where this came from. Each program can put in the memory whatever it wants; there are no rules to follow as to the content.

The first 3 bytes might read (see e.g. https://www.asciitable.com/)

V5G

Does that ring any bells?

Please provide the values on the machine and provide the matching output in hex.

Do you know if the processor in the equipment is 8/16/32/64/... bits?

PS

  1. I would never print in blocks of 10; rather 8 or 16.
  2. I would never print in decimal; hexadecimal is far more convenient. You can combine two/three/four/whatever bytes, put them in a calculator and see the decimal equivalent (e.g. Microsoft's calculator in programmer mode). Might not work for floats or doubles.
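
For what it's worth, a dump in that style (16 bytes per row, hex plus an ASCII column) could look roughly like the sketch below. The function name hexdump and the column layout are my own choices, not something from the I2C_EEPROM library; the data is the first 32 bytes posted above.

```python
def hexdump(data, width=16):
    """Return one text line per row: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of the hex column, so short rows still line up
    lines = []
    for start in range(0, len(data), width):
        row = data[start:start + width]
        hex_part = ' '.join(f'{b:02X}' for b in row)
        # Show printable ASCII as-is, everything else as a dot.
        text = ''.join(chr(b) if 32 <= b < 127 else '.' for b in row)
        lines.append(f'{start:04X}: {hex_part:<{pad}}  {text}')
    return lines


# First 32 bytes of the dump posted at the top of the thread.
data = bytes([
    86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255, 255, 0,
    224, 73, 68, 0, 0, 148, 65, 10, 215, 163, 60, 255, 255, 255, 255, 255])

for line in hexdump(data):
    print(line)
```

The ASCII column makes a possible text header such as V5G stand out without looking anything up in a table.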

Looking at the data in hex rather than decimal or in groups of 8 bytes rather than 10 is very unlikely to cause some miraculous revelation!

What you need is a "Rosetta Stone". An example set of data where you know exactly what values it represents. Only that will allow you to piece together what encoding has been used.

If we assume that the values are aligned, something like the following could be of help.

from struct import iter_unpack


data = bytes([
    86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255,
    255, 0, 224, 73, 68, 0, 0, 148, 65, 10, 215, 163, 60, 255,
    255, 255, 255, 255, 255, 255, 255])

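# Decode everything from index 3 onward, once per byte order and data type.
for endianity in ['<', '>']:  # '<' = little endian, '>' = big endian
    for data_type in ['h', 'l', 'H', 'L', 'f', 'd']:
        fmt = endianity + data_type
        # iter_unpack yields 1-tuples, so take element 0 of each.
        print(fmt + ': ' + ' '.join(str(x[0]) for x in iter_unpack(fmt, data[3:])))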

With output:

<h: 0 16852 -3572 17660 -1 -1 -8192 17481 0 16788 -10486 15523 -1 -1 -1 -1
<l: 1104412672 1157427724 -1 1145692160 1100218368 1017370378 -1 -1
<H: 0 16852 61964 17660 65535 65535 57344 17481 0 16788 55050 15523 65535 65535 65535 65535
<L: 1104412672 1157427724 4294967295 1145692160 1100218368 1017370378 4294967295 4294967295
<f: 26.5 2023.56396484375 nan 807.5 18.5 0.019999999552965164 nan nan
<d: 2.187060108900579e+24 9.546195687644226e+20 1.376676567650078e-16 nan
>h: 0 -11199 3314 -956 -1 -1 224 18756 0 -27583 2775 -23748 -1 -1 -1 -1
>l: 54337 217250884 -1 14698820 37953 181904188 -1 -1
>H: 0 54337 3314 64580 65535 65535 224 18756 0 37953 2775 41788 65535 65535 65535 65535
>L: 54337 217250884 4294967295 14698820 37953 181904188 4294967295 4294967295
>f: 7.614235465601759e-41 3.7437830573689093e-31 nan 2.0597433893386908e-38 5.318348061651978e-41 2.0765148849577988e-32 nan nan
>d: 1.153029926298783e-309 nan 8.0536196127115e-310 nan

Values 1, 4 and 5 in the line starting with <f look interesting.

Wow that looks awesome and definitely in the direction of what I am looking for :star_struck: :star_struck:

Now I need to understand what your code is doing exactly, lol. I am not a complete beginner at coding but looking at it I can see I'll need to work a bit on it to understand it, haha.

For the next phase, @PaulRB is right, I need a "Rosetta stone", and the good news is I can get it, more or less. The EEPROM I am currently working on was pulled out of a non-functional unit, so I don't know what's in it. But once I have working code and a good strategy, I'll dump the EEPROM content of a working unit (which requires some work to access and carries the risk of damaging perfectly functional equipment...), and record the content of the equipment display, some of which reflects what is stored in the EEPROM.

Is this python?

It simply decodes everything from the fourth position onward using one specific endianness and data type. So in the line starting with <f, all values are decoded as little-endian 32-bit floats, and in the line starting with >H, all values are decoded as big-endian 16-bit unsigned integers.
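
To make that concrete, here is a small sketch that decodes just the four bytes at positions 3 to 6 of the posted dump (00 00 D4 41) in three different ways. The variable names are only for illustration.

```python
import struct

raw = bytes([0, 0, 212, 65])  # bytes 3..6 of the dump: 00 00 D4 41

# '<f' = little-endian 32-bit float: last byte is most significant -> 0x41D40000.
little_float = struct.unpack('<f', raw)[0]
# '>L' = big-endian unsigned 32-bit integer: the same bytes read as 0x0000D441.
big_uint = struct.unpack('>L', raw)[0]
# '<2H' = two little-endian unsigned 16-bit integers from the same four bytes.
little_shorts = struct.unpack('<2H', raw)

print(little_float)   # 26.5
print(big_uint)       # 54337
print(little_shorts)  # (0, 16852)
```

These are the first values of the <f, >L and <H lines in the output above.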

Yes, this is written in Python 3.

OK thanks.
So the lines starting with <d are decoded over 64 bits?
I guess I would also need to rerun this code with a different starting index, in case there is a header of some sort. The initial "V5G" string mentioned above could indeed mean something, but then, once you start trying to make sense of a few letters, the harder you look the more stuff you find.

OK thanks, I have started to modify the dump() routine to obtain the EEPROM content in a better format. The question about the equipment's processor (8/16/32/64/... bits?) is a good point and I would have to open a unit to find out. I guess your question is meant to figure out over how many bits a float or an int is represented?
What I know so far is that this equipment was designed in the '90s.

I remember helping someone who was attempting to analyse the contents of the EEPROM of an IR remote control device which stored patterns it had learned from recording other devices. This was not very successful and it turned out that much of what it stored was parameter sets for configuring a specific chip used. About the only concrete thing I was able to discover was that the data was protected by a mod 256 checksum!
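
In case it is useful, a check for that kind of mod 256 sum could be sketched as below. Whether your equipment uses a checksum at all is unknown; the record here is made up purely to show the idea, and the function name and the min_len parameter are my own.

```python
def checksum_candidates(data, min_len=4):
    """Yield (start, pos) where data[pos] equals sum(data[start:pos]) mod 256."""
    for start in range(len(data)):
        running = 0  # sum of data[start:pos]
        for pos in range(start, len(data)):
            if pos - start >= min_len and data[pos] == running % 256:
                yield start, pos
            running += data[pos]


# Made-up record: four payload bytes followed by their sum modulo 256.
payload = bytes([0x10, 0x20, 0x30, 0xF0])     # sum is 0x150
record = payload + bytes([sum(payload) % 256])  # checksum byte is 0x50

print(list(checksum_candidates(record)))
```

Expect false positives on real dumps, for instance in runs of 0x00, so treat any hit as a hint that needs confirming against a second dump.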

Yes, you will have to do some sort of differential analysis as already implied. Change one parameter in the environment (you've mentioned hours and voltage), force the device to store its data and see which bytes change.
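
A comparison of two dumps can be sketched in a few lines. The "after" dump below is hypothetical: it is the start of the posted data with the float at offset 3 changed from 26.5 to 27.0, just to show what the output looks like.

```python
def changed_bytes(before, after):
    """Return (offset, old, new) for every byte that differs."""
    if len(before) != len(after):
        raise ValueError('dumps must have the same length')
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]


def changed_runs(before, after):
    """Group changed offsets into (first, last) runs of adjacent bytes."""
    runs = []
    for offset, _, _ in changed_bytes(before, after):
        if runs and offset == runs[-1][1] + 1:
            runs[-1][1] = offset  # extend the current run
        else:
            runs.append([offset, offset])  # start a new run
    return [(first, last) for first, last in runs]


before = bytes([86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68])
# Hypothetical second dump: 00 00 D4 41 (26.5) became 00 00 D8 41 (27.0).
after = bytes([86, 53, 71, 0, 0, 216, 65, 12, 242, 252, 68])

print(changed_bytes(before, after))  # [(5, 212, 216)]
print(changed_runs(before, after))   # [(5, 5)]
```

Note that a changed 4-byte value may differ in only one byte, so the width of a run is a lower bound on the field size, not the size itself.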

The good thing is that this EEPROM is small: a 24C04 holds 4 Kbit, i.e. only 512 bytes.

Indeed.

That is what I assumed too. This is why we start decoding from the fourth position onward. If you want to play around with the offset, perhaps consider the following.

from struct import iter_unpack


data = bytes([
    86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255,
    255, 0, 224, 73, 68, 0, 0, 148, 65, 10, 215, 163, 60, 255,
    255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255])

for offset in range(0, 8):
    print(f'offset: {offset}') 

    chunk = data[offset:offset + 32]
    for endianity in ['<', '>']:
        for data_type in ['h', 'l', 'H', 'L', 'f', 'd']:
            fmt = endianity + data_type
            print(f'{fmt}: ' + ' '.join(map(lambda x: str(x[0]), iter_unpack(fmt, chunk))))

    print()

Yes, changing parameters and looking for changes is definitely on my to-do list. Currently my strategy is:
1/ look for certain values of variables which I know for sure are stored in the EEPROM (such as the number of running hours)
2/ if unsuccessful, vary some parameters, force an EEPROM writing, dump and compare to previous state

Great thank you, I was able to run it in a Python shell :slight_smile:
One thing I am wondering is whether a float is perhaps stored on 32 bits rather than 64 bits, depending on the processor used. Is there any way in Python to dictate over how many bits the data types will be stored? Perhaps some equivalent to the int8_t, int16_t datatypes in C?

Yes, please see the links in post #8.
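
As a quick illustration, the struct module can report the size of each format code itself. With a '<' or '>' prefix the sizes are fixed regardless of the machine running Python; the C type names in the table are only the usual equivalents.

```python
import struct

# Format code -> usual C equivalent (for orientation only).
codes = {
    'b': 'int8_t', 'B': 'uint8_t',
    'h': 'int16_t', 'H': 'uint16_t',
    'l': 'int32_t', 'L': 'uint32_t',
    'q': 'int64_t', 'Q': 'uint64_t',
    'f': 'float (32-bit)', 'd': 'double (64-bit)',
}

# With an explicit byte-order prefix, struct uses its standard sizes.
sizes = {code: struct.calcsize('<' + code) for code in codes}

for code, c_type in codes.items():
    print(f'{code}: {sizes[code]} byte(s), {c_type}')
```

So 'f' is always a 32-bit float and 'd' a 64-bit double here. Without the prefix, struct falls back to the native sizes of the host, where for example 'l' can be 8 bytes on some platforms.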

Ah yes sorry. OK perfect, I will implement all types into the script; that will yield a large dataset but then I can automate the search once I have some numbers that I definitely know should be there. Feels like I have a strategy falling in place :star_struck:
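
That automated search could be sketched roughly as follows: try every offset, byte order and type, and keep whatever decodes close to a value read from the display. The function name find_value and the tolerance are assumptions on my part; the data is the test dump from the top of the thread.

```python
import math
import struct


def find_value(data, target, rel_tol=1e-3):
    """Return (offset, fmt, value) for every decoding close to target."""
    hits = []
    for endian in '<>':
        for code in 'bBhHlLfd':
            fmt = endian + code
            size = struct.calcsize(fmt)
            # Try every byte offset, aligned or not.
            for offset in range(len(data) - size + 1):
                value = struct.unpack_from(fmt, data, offset)[0]
                # isclose() is False for nan, so garbage floats drop out.
                if math.isclose(value, target, rel_tol=rel_tol):
                    hits.append((offset, fmt, value))
    return hits


data = bytes([
    86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255,
    255, 0, 224, 73, 68, 0, 0, 148, 65, 10, 215, 163, 60, 255,
    255, 255, 255, 255, 255, 255, 255])

for hit in find_value(data, 807.5):
    print(hit)
```

On the test dump, searching for 807.5 reports a hit at offset 15 with format <f. The tolerance matters because the display probably rounds what it shows.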

This is an "Arduino" version. It works a little different :wink:

It starts from the first byte and treats that byte (and consecutive bytes) as a char, an 8-bit int, a 16-bit int (2 bytes), a 32-bit int (4 bytes) and a float (4 bytes).
Next it takes the next byte as the starting point and repeats.

If you get nan or ovf for the float at a given index, you know that those 4 bytes don't represent a float.

The htons() etc in the beginning are borrowed from a network library (file w5100.h).

The sample data is a subset of yours.

#ifndef htons
// The host order of the Arduino platform is little endian.
// Sometimes it is desired to convert to big endian (or
// network order)

// Host to Network short
#define htons(x) ( (((x)&0xFF)<<8) | (((x)>>8)&0xFF) )

// Network to Host short
#define ntohs(x) htons(x)

// Host to Network long
#define htonl(x) ( ((x)<<24 & 0xFF000000UL) | \
                   ((x)<< 8 & 0x00FF0000UL) | \
                   ((x)>> 8 & 0x0000FF00UL) | \
                   ((x)>>24 & 0x000000FFUL) )

// Network to Host long
#define ntohl(x) htonl(x)

#endif // !defined(htons)



#define NUMELEMENTS(x) (sizeof(x) / sizeof(x[0]))

byte data[] = {86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255, 255, 0, 224, 73, 68, 0,
               0, 148, 65, 10, 215, 163, 60, 255, 255, 255};


int16_t i = 0x1234;

void setup()
{
  Serial.begin(115200);
  while (!Serial) {};

  Serial.println(F("RAW data"));
  for (uint16_t cnt = 0; cnt < NUMELEMENTS(data); cnt++)
  {
    if (cnt % 8 == 0)
    {
      Serial.println();
    }
    if (data[cnt] < 0x10)
    {
      Serial.print("0");
    }
    Serial.print(data[cnt], HEX);
    Serial.print(" ");
  }
  Serial.println();

  Serial.println();
  // Interpret the bytes starting at each index as char, int8, int16, int32 and float
  for (uint16_t cnt = 0; cnt < NUMELEMENTS(data); cnt++)
  {
    int8_t i8 = data[cnt];
    int16_t *i16 = (int16_t*)&data[cnt];
    int32_t *i32 = (int32_t*)&data[cnt];
    float *f = (float*)&data[cnt];
    Serial.print(F("index = "));
    Serial.println(cnt);

    Serial.print(F("char = "));
    Serial.print((char)i8);
    Serial.print(F(" ("));
    Serial.print((uint8_t)i8, HEX);
    Serial.println(F(")"));
    
    Serial.print(F("i8 = "));
    Serial.print(i8);
    Serial.print(F(" ("));
    Serial.print((uint8_t)i8, HEX);
    Serial.println(F(")"));
    
    Serial.print(F("i16 = "));
    Serial.print(*i16);
    Serial.print(F(" ("));
    Serial.print((uint16_t)*i16, HEX);
    Serial.println(F(")"));
    
    Serial.print(F("i16 swapped = "));
    Serial.print(htons(*i16));
    Serial.print(" (");
    Serial.print(htons((uint16_t)*i16), HEX);
    Serial.println(F(")"));

    Serial.print(F("i32 = "));
    Serial.print(*i32);
    Serial.print(F(" ("));
    Serial.print((uint32_t)*i32, HEX);
    Serial.println(F(")"));
    
    Serial.print(F("i32 swapped = "));
    Serial.print(htonl(*i32));
    Serial.print(F(" ("));
    Serial.print(htonl((uint32_t)*i32), HEX);
    Serial.println(F(")"));

    Serial.print(F("f = "));
    Serial.print(*f, 6);
    Serial.print(F(" ("));
    Serial.print((uint32_t)*i32, HEX);
    Serial.println(F(")"));
    
    Serial.println("============================");
  }
}


void loop()
{
  // put your main code here, to run repeatedly:
}

Partial output

11:55:14.016 -> RAW data
11:55:14.016 -> 
11:55:14.016 -> 56 35 47 00 00 D4 41 0C 
11:55:14.016 -> F2 FC 44 FF FF FF FF 00 
11:55:14.016 -> E0 49 44 00 00 94 41 0A 
11:55:14.016 -> D7 A3 3C FF FF FF 
11:55:14.016 -> 
11:55:14.016 -> index = 0
11:55:14.016 -> char = V (56)
11:55:14.016 -> i8 = 86 (56)
11:55:14.016 -> i16 = 13654 (3556)
11:55:14.016 -> i16 swapped = 22069 (5635)
11:55:14.016 -> i32 = 4666710 (473556)
11:55:14.016 -> i32 swapped = 1446332160 (56354700)

Not shown is the float output (I ran out of time due to load shedding).

Also time ran out to add a little hardening as it will read some crap at the end of the array when using 2byte or 4byte variables.

Not fully tested !!

Sweet! I will also test and compare with the other approach.

So with a strategy and working code on the table, I could not wait to try it out on a working unit... So I dismantled one today after having recorded the variables displayed on the equipment controller, carefully connected my faithful Arduino Mega to the EEPROM, dumped its content, ran the Python script and searched the output for the variable values I had recorded and.... TADAAA... @jfjlaros you were spot on :star_struck:

I found the variable values stored under <f with an offset of 3 (and the first three bytes are indeed a header with a meaning).

<f: 24.2891845703125 9045.4140625 nan 808.2369384765625 20.299999237060547 0.019999999552965164 nan nan

I don't know what the last two values (20.2999... and 0.01999) are, but the last one (0.01999) is exactly the same value as in my test EEPROM, which was pulled out of another unit.
One thing I am wondering about and will investigate is whether the "nan" in third position could hide other parameters in another format.
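
With the layout known this far, the start of the EEPROM can be read in one go. The sketch below assumes a 3-byte header followed by little-endian 32-bit floats, and simply skips the four bytes that decode as nan since their meaning is still open. The field names are placeholders, and the data is the test dump from my first post, not the working unit.

```python
import struct

# Assumed layout: 3-byte header, 2 floats, 4 unknown bytes, 3 floats.
# '<' also disables padding, so the record is exactly 27 bytes.
RECORD = struct.Struct('<3s2f4x3f')

data = bytes([
    86, 53, 71, 0, 0, 212, 65, 12, 242, 252, 68, 255, 255, 255,
    255, 0, 224, 73, 68, 0, 0, 148, 65, 10, 215, 163, 60, 255,
    255, 255, 255, 255, 255, 255, 255])

# value1..value5 are placeholder names, not the real parameter names.
header, value1, value2, value3, value4, value5 = RECORD.unpack_from(data, 0)

print(header.decode('ascii'))
print(value1, value2, value3, value4, value5)
```

If the skipped bytes turn out to hold something else, for example two 16-bit integers, only the 4x part of the format string would need to change.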

In any case this is already a great success and I'd like to thank you all for your fantastic help!

From the age of the equipment I would expect it is an 8 bit machine with RAM and ROM/EEPROM chips with some peripheral I/O devices.