
Topic: How does EEPROM wear out?

Smajdalf

Hi,
I have been thinking for a long time about how EEPROM wears out and cannot find any reasonable information about it. Here is what I don't know:
EEPROM has a limited number of writes. That part is easy. The first ATmega datasheet I opened states:
Quote
Write/Erase Cycles EEPROM: 100,000
Data Retention: 100 years at 25°C
So if I write it up to 100,000 times, I will read the last value written even after 100 years. But what counts as a write?
Will erasing an already-erased bit wear it? I.e., may I write 0xFF any number of times without any adverse effect on the cell?
Will programming an already-programmed cell wear it? I.e., may I write 0x00 using the "write only" option any number of times without damaging the cell?
How does a worn-out cell behave? Does it return only 1 or only 0, regardless of being erased or programmed? Does it return a random value? Or is there some other strange behavior, such as returning 1 at Vcc < x and 0 at Vcc > x, where x depends on temperature, humidity, and the phase of the moon?
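Questions like these can at least be probed empirically. As a minimal sketch of such a probe: the wear model below is invented purely for illustration (the real failure mode is exactly what this thread is asking about), and the "device" is a plain variable standing in for real EEPROM access.

```cpp
#include <cstdint>

// Hypothetical wear model, invented for illustration only: after
// `endurance` write cycles the cell reverts to reading 0xFF (erased)
// no matter what was written. Real parts may fail quite differently.
struct SimulatedEeprom {
    uint8_t value = 0xFF;
    long writes = 0;
    long endurance;
    explicit SimulatedEeprom(long e) : endurance(e) {}
    void write(uint8_t v) { ++writes; value = v; }
    uint8_t read() const { return writes > endurance ? 0xFF : value; }
};

// Hammer write/verify cycles with one pattern; return the number of
// successful cycles before the first read-back mismatch, or maxCycles
// if no mismatch was ever observed.
long cyclesToFirstFailure(SimulatedEeprom& e, uint8_t pattern, long maxCycles) {
    for (long i = 0; i < maxCycles; ++i) {
        e.write(pattern);
        if (e.read() != pattern) return i;
    }
    return maxCycles;
}
```

Note that under this particular toy model, cycling 0xFF never produces a mismatch at all, which is one concrete version of the "does erasing an erased bit count?" question.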
On 8-bit (i.e. Uno) use "byte" instead of "int" if possible - it is faster and saves resources!

el_supremo

Don't send me technical questions via Private Message.

Smajdalf

This is instructive:
Possibly. I have read other app notes from Microchip on this topic and they did not answer my questions. Neither does this one. For example, they test EEPROMs from different manufacturers for "failures". How do they do it? I doubt they write the EEPROM, wait 100 years to read it back, and repeat if it is OK. How do they know a tested bit has failed? Does reading back the same value as was written guarantee it will stay the same after 100 years, or could it change in 5 minutes? So many questions, so few answers.

Also, they say erasing/writing a whole page is better for the EEPROM because the charge from the pump is spread over a larger area, leading to lower peak voltage and therefore less wear. Does that mean that writing 0x7F into one location repeatedly (erasing between writes) will wear the written bit much faster than cycling 0x00? What about the other 7 bits?

pito

#3
Mar 21, 2017, 08:22 am Last Edit: Mar 21, 2017, 08:54 am by pito
The questions you are asking could be answered by studying physics. It is rocket science, with a lot of research put into it.

The EEPROM cell is a capacitor - see https://en.wikipedia.org/wiki/Leyden_jar .

The dielectric between the capacitor's electrodes is made of extremely pure glass (SiO2). The glass is thin - maybe a few nanometers (a few atoms) thick.

The better the purity and compactness of the glass, the better the electrical isolation (the lower the discharge currents). When the glass develops mechanical cracks, or chemical impurities migrate into it, the leakage current through it increases by many orders of magnitude and the capacitor discharges itself faster - which means the memory cell "wears out".

Cracks and impurities (defects) develop in the insulating glass over time because of heat, radiation, electric fields, chemicals from surrounding gases and contacting materials, etc. The higher the temperature and the stronger the electric fields or radiation, the faster the defects spread through the glass insulator. The degradation is usually exponential.

So in order to simulate 100 years of usage, the scientists put a memory cell at high temperature and under strong radiation, and use higher programming voltages. They test for a few days, and with math and physics they extrapolate the results into a 100-year prediction. So you need not wait 100 years to verify whether something will work or not.
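The temperature part of this accelerated testing is commonly modeled with the Arrhenius relation. As a hedged sketch (the activation energy 0.7 eV below is just a commonly quoted ballpark, not a figure from any specific datasheet):

```cpp
#include <cmath>

// Arrhenius acceleration factor: how much faster thermally activated
// defect chemistry runs at an elevated stress temperature compared to
// the use temperature. eaEv is the activation energy in eV (process
// dependent; values around 0.5-1 eV are typical ballparks).
double accelerationFactor(double eaEv, double tUseK, double tStressK) {
    const double k = 8.617e-5;  // Boltzmann constant in eV/K
    return std::exp((eaEv / k) * (1.0 / tUseK - 1.0 / tStressK));
}
```

With Ea = 0.7 eV, going from 25°C (298 K) to 125°C (398 K) gives an acceleration factor on the order of a thousand, so 100 years of room-temperature retention compresses into a bake of roughly a month.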

cattledog

@pito
Thanks for the explanation of the 100 years.

What is the explanation of the 100K writes?

Smajdalf

@ pito
Still no answer to my questions. What does a worn-out bit look like? Is it unable to hold the written value, returning random numbers? Is it stuck in one state, unable to be changed? If so, which one - 0, 1, or random? Will it fail abruptly - the last write was OK, but on the next write it becomes stuck (a crack develops in the insulation)? Or will its reliability decrease slowly - only 50 years of retention -> next write 49.5 years -> ... -> next write 5 s?
Another unanswered problem - does erasing an erased bit, or programming a programmed bit, do any harm? I.e., when changing 0x01 to 0x00, should I write 0xFE (to save the other bits from excessive writes) or 0x00 (to save the LSB from concentrating all the energy from the charge pump on a single bit)? Does it matter?
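The two strategies in that 0x01 -> 0x00 example can be made concrete. Assuming "write only" semantics where programming can only clear bits (result = old AND written), two different counts matter, and which one wear actually tracks is the open question:

```cpp
#include <bitset>
#include <cstdint>

// Bits that actually change state (go 1 -> 0) when `written` is
// programmed over `oldVal` in write-only (AND) mode.
int bitsChanged(uint8_t oldVal, uint8_t written) {
    return std::bitset<8>(static_cast<uint8_t>(oldVal & ~written)).count();
}

// Bits the charge pump drives: every bit commanded to 0 by the write,
// whether or not it was already 0.
int bitsDriven(uint8_t written) {
    return std::bitset<8>(static_cast<uint8_t>(~written)).count();
}
```

Both writes change exactly one bit (bitsChanged(0x01, 0xFE) == bitsChanged(0x01, 0x00) == 1), but writing 0x00 drives all eight bits while 0xFE drives only the LSB - which is precisely the "spread the pump charge vs. spare the untouched bits" trade-off being asked about.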

pito

#6
Mar 22, 2017, 01:13 pm Last Edit: Mar 22, 2017, 03:26 pm by pito
Guys - the 100 years and the 100k cycles are the same magic - extrapolation from data based on measurement, simulation, physical models and production statistics. The 100k figure is much easier to check, as you can do 100k cycles in real time.

My understanding is that the 100k limit means that after 100k charge/discharge cycles, the cell's ability to store the charge for 100 years drops to, say, half (or 63%, or 80%, of the original value, etc.) because of the wear-out.

At ambient temperature and sea level, wear-out is mostly caused by strong electric fields - the strong programming pulses. These fields/currents may push chemical impurities (atoms) into the bulk SiO2 insulator, or create small cracks/atom displacements in it; that creates ionization centers generating additional current leakage, which discharges the Leyden jar faster.

In microelectronics, things work such that when you shoot 10 atoms of A into a bulk material made of 10 million B atoms, the electrical parameters of the bulk material B may change significantly (by many orders of magnitude).
Also mind that in chip production, the electrical parameters of a chip cut from the middle of the silicon wafer may differ by more than 100% from those cut near the wafer's edges or anywhere else (because a few atoms may change the parameters significantly; chip production is similar to baking a cake, where the cake is cut into maybe thousands of chips).
The cake:

[image: a real wafer map, depicting how the electrical parameters of chips differ across the Si wafer]

It does not necessarily mean the chips' digital/logical function will fail.

That does not necessarily mean that after 100k cycles the cell returns a wrong value. So it does not mean the memory fails after 100k.

Most of the details are subject to an NDA, so you have to ask the vendor directly (vendor's internal know-how).

Mind that chip-makers use rather conservative estimates (because of the spread in production parameters), so I would not be surprised if you managed 10 million writes just fine.

For example, there is an online monitoring page for various SSDs used in heavy applications; I saw results for a Samsung 850 PRO (rated 150 TB) - it actually did 7 PB. And each single flash cell there stores 3 bits (8 charge levels to measure), AFAIK.

A long time back (as FRAM memories started), I had a chat with an engineer from the chipmaker. I asked him why only 10^14 writes - he told me it is actually "unlimited", but they preferred to state that crazy max number of writes in the datasheet, because otherwise people would not trust them :)

Smajdalf

It looks like I am unable to express what I want to know. But I have found half of the information I seek on Wikipedia.
It says some of the electrons injected to/from the floating gate get stuck in the insulation layer on each write. If too many are stuck (too many erases/writes), it becomes impossible to erase the bit (it stays 0).
The charge also leaks away from the floating gate. After some time (the mentioned 100 years), too many electrons have leaked away and the bit erases itself, turning to 1.
The explanation suggests the two failure modes are independent of each other. Since I cannot find any definitive answer, I will try to do some tests. I wonder what I will find.

Smajdalf

It looks like it is not as easy as I expected. I used a 24C02 EEPROM and tried writing to it repeatedly. The test cycle was: write 0xFF, read back 20 times, write 0x7F, read back 20 times, ..., write 0x01 and read back 20 times. After 2M of those cycles (so 8M writes) bit 3 read 1 when it should have read 0. Just once; the next failure of the same bit came 100k writes later. I continued writing to the byte and have nearly 7M cycles so far.
Currently, when I write 0 to the byte, it stays 0 for about a minute. Then it turns to 1. After a bit longer it turns to 0x7F. Strange, since the MSB should be the most worn-out of the bits...
When repeatedly reading the byte again and again, it changes its value after about 200k reads. One bit turns to 0, but after a short moment it is 1 again.
With the very limited evidence I have, it looks like a worn-out bit:
1) When not touched, turns to 1.
2) When repeatedly read, turns to 0.

Writing 1 to the bit wears it more than writing 0 - quite strange, if this external EEPROM works like the others: a write cycle should erase the whole byte to 0xFF and then program only those bits that should be 0. I would say more charge transfer = more harm. Possibly a bug in my testing code???
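For reference, the test cycle described above can be reconstructed roughly like this. The eepromWrite/eepromRead functions here are stand-ins backed by a plain variable; in the real test they would be I2C transactions to the 24C02:

```cpp
#include <cstdint>

// Stand-ins for the real I2C byte write / byte read to the 24C02.
static uint8_t fakeCell = 0xFF;
void eepromWrite(uint8_t v) { fakeCell = v; }
uint8_t eepromRead() { return fakeCell; }

// One test cycle: walk the pattern 0xFF, 0x7F, 0x3F, ..., 0x01,
// reading each value back 20 times; return the number of read-back
// mismatches observed in this cycle.
int runCycle() {
    int mismatches = 0;
    for (uint8_t pattern = 0xFF; pattern != 0; pattern >>= 1) {
        eepromWrite(pattern);
        for (int i = 0; i < 20; ++i)
            if (eepromRead() != pattern) ++mismatches;
    }
    return mismatches;
}
```

Against a healthy (or here, idealized) device every cycle returns zero mismatches; the interesting data starts when the count goes nonzero.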

pito

#9
Today at 10:49 am Last Edit: Today at 11:16 am by pito
Great test! So it seems you got quite close to my estimate of 10 million :)

Reading a memory cell does not work in a "digital" way. All digital chips actually work in an analog way; for simplicity we consider them "digital", with two states, 1 and 0.

There is sense circuitry inside the EEPROM which reads the "voltage" on the charged capacitor.
When the cell is worn out, the leakage current is high and the voltage on the capacitor decreases faster.
The capacitance of the cell (in pF) also decreases.

Imagine there is a read threshold voltage for the cell, say 2 V and above, where the sensor reads a clean "1".
When the voltage on the cell is, for example, 1.5 V +/- 0.5 V, the sensor reads 1 or 0 erratically.
That is similar to centering the input voltage at a BPill pin at Vdd/2 (with 2x 1000k resistors, for example) and then reading the pin: you get a mess.

So you will never get stable readings when the cell is around its threshold voltage. And the threshold voltages may differ across the chip.

Moreover, when the cell is at its threshold voltage it is more sensitive to any change in Vdd, temperature, or radiation. And the cell's behavior may start to depend on its position on the chip (wire lengths/resistances, noise, ...).

In modern high-capacity SSDs the cell is read by an ADC; the ADC distinguishes 8 levels of cell voltage and returns 3 bits (i.e. you read "101"). These kinds of cells are worn out after ~5000 writes (also because they use a much finer production node - e.g. 24 nm - so the glass insulation wears out much faster).



Smajdalf

Well, the information in the last post is "well known". Still no definitive answer to my questions. I would expect real EEs to NEED to know what causes EEPROM to wear and what worn memory looks like.

But I cannot find the info anywhere. It is even more obscure than that: from the communication protocol it is not clear whether the device does a byte write or a page write. For example, Atmel's AT24C32A states

Quote
32-Byte Page Write Mode (Partial Page Writes Allowed)
A similar 24C32A from Microchip says
Quote
2 ms typical write cycle time, byte or page
Both memories are obsolete; the new part from Microchip is the 24LC32A. Although its Byte Write mode is described the same way as in the other datasheets, a note is added (page 8):

Quote
When doing a write of less than 32 bytes the data in the rest of the page is refreshed along with the data bytes being written. This will force the entire page to endure a write cycle, for this reason endurance is specified per page.
I wonder how I can tell whether byte write mode is truly supported or just emulated by an internal read-modify-write of the whole page.
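The practical consequence of that 24LC32A note is worth spelling out: if every write, however small, costs the whole 32-byte page one erase/write cycle, then updating n bytes one at a time consumes n page cycles, while buffering them into page writes consumes far fewer. A small accounting sketch (assuming page-aligned buffered writes):

```cpp
// Page size of the 24LC32A per its datasheet; the accounting below
// assumes the "endurance is specified per page" behavior quoted above.
constexpr int PAGE_SIZE = 32;

// Each single-byte write still cycles the whole page once.
int pageCyclesByteAtATime(int bytesToUpdate) { return bytesToUpdate; }

// Buffering the same bytes into full page writes (assumed aligned):
// one cycle per page touched, rounding up.
int pageCyclesBuffered(int bytesToUpdate) {
    return (bytesToUpdate + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

Updating 32 bytes individually burns 32 page cycles; one buffered page write burns a single cycle - a 32x difference in effective endurance for the same data.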

As a side note: I don't see any great difference between the obsolete and new EEPROMs in their datasheets. What is wrong with the obsolete parts?

CrossRoads

"I would expect real EEs NEED to know what causes the EEPROM to wear and how worn memory looks like."

Nonsense. You just need to know that it will, and plan accordingly by keeping track - either count the number of writes, or estimate how many writes happen over a time period and then estimate end of life from that.
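One way to "keep track" is to store a write counter alongside the data and refuse writes once a safety margin of the rated endurance is used up. A sketch, where the 100,000 rating and the 90% margin are placeholders to adapt to the actual part and application:

```cpp
#include <cstdint>

// A byte whose writes are counted against a rated endurance.
// RATED_ENDURANCE and the 90% safety margin are placeholder values.
struct TrackedByte {
    static constexpr uint32_t RATED_ENDURANCE = 100000;
    uint8_t value = 0xFF;
    uint32_t writes = 0;

    // Refuses the write (returns false) once 90% of the rated
    // endurance has been consumed, so the caller can migrate the
    // data to a fresh location before the cell actually fails.
    bool write(uint8_t v) {
        if (writes >= RATED_ENDURANCE / 10 * 9) return false;
        ++writes;
        value = v;
        return true;
    }
};
```

In a real design the counter itself would live in EEPROM (and wears too), which is why practical schemes usually combine counting with wear leveling - rotating the data across many cells so no single location sees all the writes.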

Obsolete parts usually have stopped selling, so the maker stops producing them; or the process used to make them is no longer run; or smaller, faster technology has surpassed them in price & performance. There are many reasons, almost all economically driven.
Designing & building electrical circuits for over 25 years.  Screw Shield for Mega/Due/Uno,  Bobuino with ATMega1284P, & other '328P & '1284P creations & offerings at  my website.
