Question about the ATmega328P chip's reliability for simple math calculations

Hi All, this question is probably for those who have long-term experience working with AVR chips. I am not talking about Arduino (though that probably won't matter much) but rather about the ATmega328P chip itself, in some light consumer or industrial application where it needs to perform math on sensor readings, such as an RMS from a sensor or ADC, etc. I have no issues with it so far, but I am curious: once it goes into a real-life application and needs to sit for months at a time, constantly reading sensor data at 50k samples per second and performing a rolling RMS calculation on it, how reliable is that? Or, for lack of better words: what are the chances that a couple of bits occasionally get lost, so to speak? Or is that normally not an issue in microcontrollers such as the 328P?

http://www.google.com/search?q=IEEE+754

what are the chances that a couple of bits occasionally get lost, so to speak?

The same chances that all the other processors / libraries that use the IEEE 754 standard have.

If you are integrating (summing), adding very large numbers to very small numbers, or subtracting very small numbers from very large numbers (or vice versa), then you need to understand how those operations are affected.
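To see what that looks like in practice, here is a minimal sketch (mine, not from any post above) showing binary32 absorption: add a value smaller than the float's resolution and it vanishes entirely, and a long running sum of 0.1 drifts away from the exact answer.

```cpp
// Adding a small value to a large float silently drops it, because
// binary32 carries only ~7 significant decimal digits.
void setup() {
  Serial.begin(9600);

  float big   = 16777216.0;  // 2^24: the last integer a float holds exactly
  float small = 1.0;

  float sum = big + small;   // 1.0 is below the float's resolution here
  Serial.println(sum - big); // prints 0.00 -- the addition had no effect

  // A long running sum accumulates rounding error
  float running = 0.0;
  for (int i = 0; i < 1000; i++) running += 0.1;
  Serial.println(running, 4);  // ~99.99-ish, not exactly 100.0000
}

void loop() {}
```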

Thank you, doing some reading now. It seems, though, that any error related to working with very large numbers alongside very small numbers in IEEE 754 is a software-related issue. What about the hardware side of things? As I understand it, all bits inside the processor are manipulated by transistors, so how stable and reliable are those? As the simplest example: if I take, say, 64 bits and simply shift them back and forth for, say, 5 billion cycles, can I be sure that none of the bits was shifted incorrectly?

Keep your math integer and you will be as accurate as can be. There is a big-number class for Arduino, so arbitrary precision is possible.
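For illustration, a minimal windowed-RMS sketch in pure integer math; the names (WINDOW, the A0 input, the 512 midpoint) are my own assumptions, not anything from this thread. The accumulation itself never rounds; only the final sqrt() touches floating point.

```cpp
// Windowed RMS using only integer math. Assumes 10-bit ADC samples (0..1023).
#include <math.h>  // sqrt() used once per result, not per sample

const uint16_t WINDOW = 1024;  // samples per RMS window
uint64_t sumSquares = 0;       // worst case 512^2 * 1024 ~ 2.7e8: ample headroom
uint16_t count = 0;

void setup() {
  Serial.begin(115200);
}

void loop() {
  uint16_t sample = analogRead(A0);          // 0..1023
  int32_t centered = (int32_t)sample - 512;  // remove the DC midpoint
  sumSquares += (uint64_t)((int64_t)centered * centered);
  if (++count == WINDOW) {
    // One float operation per window; the integer accumulation never rounds
    float rms = sqrt((float)(sumSquares / WINDOW));
    Serial.println(rms);
    sumSquares = 0;
    count = 0;
  }
}
```

A true rolling RMS would add a circular buffer so the oldest sample's square can be subtracted as each new one arrives; the integer principle is the same.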

Running for months on end will also give you timing (clock drift) issues. Or will you sync with an NTP server?

But that would mean interrupting your steady flow of numbers to process.

Can you tell more about what you really want to do?

The rate of straight-up errors in integer math (or anything else) is extremely low; probably lower than on your desktop computer, since the process geometry is smaller on a desktop CPU, making it more vulnerable to cosmic rays and the like.

You do need to worry about floating-point math as noted above, but that's not a data-integrity issue, just a limit of the way floating-point values are stored.

And of course you need to make sure your code is free of bugs and that you're not abusing the chip electrically (e.g., a noisy/unstable power supply).

alexmg2:
if I take, say, 64 bits and simply shift them back and forth for, say, 5 billion cycles, can I be sure that none of the bits was shifted incorrectly?

Given the fact that a neutrino can pass freely through 1500 meters of earth and then strike a chlorine atom, turning it into an argon atom with a released electron, the answer to your question is "no".

However, such things are so rare as to generally be considered negligible.

And that is why mathemagicians invented error correcting codes :wink:
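For the curious, here is a toy Hamming(7,4) example (my own sketch, nothing AVR-specific): 4 data bits are padded with 3 parity bits, and any single flipped bit among the 7 can be located and repaired.

```cpp
// Toy Hamming(7,4): 4 data bits -> 7 coded bits; any single flipped bit
// in the coded word can be located and repaired.
uint8_t encode(uint8_t d) {                      // d = 4 data bits (0..15)
  uint8_t d0 = d & 1, d1 = (d >> 1) & 1, d2 = (d >> 2) & 1, d3 = (d >> 3) & 1;
  uint8_t p1 = d0 ^ d1 ^ d3;                     // parity over positions 1,3,5,7
  uint8_t p2 = d0 ^ d2 ^ d3;                     // parity over positions 2,3,6,7
  uint8_t p4 = d1 ^ d2 ^ d3;                     // parity over positions 4,5,6,7
  // bit layout, LSB = position 1:  p1 p2 d0 p4 d1 d2 d3
  return p1 | (p2 << 1) | (d0 << 2) | (p4 << 3) | (d1 << 4) | (d2 << 5) | (d3 << 6);
}

uint8_t bitAt(uint8_t c, uint8_t pos) { return (c >> (pos - 1)) & 1; }

uint8_t decode(uint8_t c) {                      // c = 7 coded bits, possibly damaged
  uint8_t s1 = bitAt(c,1) ^ bitAt(c,3) ^ bitAt(c,5) ^ bitAt(c,7);
  uint8_t s2 = bitAt(c,2) ^ bitAt(c,3) ^ bitAt(c,6) ^ bitAt(c,7);
  uint8_t s4 = bitAt(c,4) ^ bitAt(c,5) ^ bitAt(c,6) ^ bitAt(c,7);
  uint8_t syndrome = s1 | (s2 << 1) | (s4 << 2); // 0 = clean, else bad position
  if (syndrome) c ^= 1 << (syndrome - 1);        // repair the single bad bit
  return bitAt(c,3) | (bitAt(c,5) << 1) | (bitAt(c,6) << 2) | (bitAt(c,7) << 3);
}

void setup() {
  Serial.begin(9600);
  uint8_t coded = encode(0b1011);
  coded ^= (1 << 4);                   // simulate a single "cosmic ray" bit flip
  Serial.println(decode(coded), BIN);  // prints 1011 -- data recovered intact
}

void loop() {}
```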

xainnasir:
If you program it perfectly then there's no chance that it will lose or skip anything.

Well, defective external components or hardware design flaws around it can also cause malfunctions - usually just a reset or hang, though my understanding is that other forms of malfunction are possible, if unlikely.

xainnasir:
I like this microcontroller. :slight_smile:

Yeah, we all do :wink:
What makes it really great, though, is that you can move to a smaller ATtiny or a bigger Mega2560 or something - and even if you're working at the level of registers, everything works the same way.
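As a small illustration (assuming an LED on PB5, i.e. Uno pin 13), the same direct-register idiom compiles for an ATtiny85 or a Mega2560 with only the pin name changed:

```cpp
// Same register-level idiom on any classic AVR: blink via direct port access.
#ifndef F_CPU
#define F_CPU 16000000UL     // needed by util/delay.h outside the Arduino IDE
#endif
#include <avr/io.h>
#include <util/delay.h>

int main(void) {
  DDRB |= _BV(DDB5);         // PB5 as output -- identical idiom on tiny/mega
  for (;;) {
    PINB = _BV(PINB5);       // writing 1 to PINx toggles the pin on AVRs
    _delay_ms(500);
  }
}
```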

I guess that what you're looking for is the MTBF. Using a simple Google search, I found lifetime Arduino - #6 by DVDdoug - Project Guidance - Arduino Forum.

I cannot vouch for the correctness or what exactly was measured; the link in that post no longer seems to work. You can contact Atmel/Microchip support if you really need to know.

From my experience with Intel, Dallas and Microchip microcontrollers, they do live for years; no reason to believe that Atmels would be different.

if I take, say, 64 bits and simply shift them back and forth for, say, 5 billion cycles, can I be sure that none of the bits was shifted incorrectly?

The chance of that is extremely low. These microcontrollers generally work in 8-bit chunks since they're 8-bit CPUs. Let's say for simplicity that the 64-bit operation is actually eight 8-bit shifts. Each shift is one CPU instruction. Let's say two more instructions to accomplish the loop. So 10 instructions.

5,000,000,000 * 10 = 50,000,000,000 instructions executed. AVRs execute one instruction per clock cycle, so that is 50,000,000,000 clock cycles. 50,000,000,000 cycles / 16,000,000 (16 MHz) = 3125 seconds, or about 52 minutes.

It would take only about 52 minutes for an AVR to perform 5 billion shifts of 64 bits of data. The chance of an error in 52 minutes of operation is extremely tiny unless the power supply is poor or the clock is glitchy.
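If you wanted to convince yourself empirically, a crude self-test along those lines is easy to write (my own sketch; compiled C will run slower than the hand count above, since one 64-bit shift expands to many 8-bit operations):

```cpp
// Shuttle a 64-bit pattern back and forth and check that it survives.
void setup() {
  Serial.begin(115200);
  const uint64_t pattern = 0xA5A5A5A5A5A5A5A5ULL;
  uint32_t errors = 0;

  for (uint32_t i = 0; i < 100000UL; i++) {   // scale up as patience allows
    uint64_t v = pattern;
    v <<= 1;                                  // the top bit falls off here...
    v >>= 1;                                  // ...so compare with it masked
    if (v != (pattern & 0x7FFFFFFFFFFFFFFFULL)) errors++;
  }
  Serial.print(F("bit errors: "));
  Serial.println(errors);                     // on healthy hardware: 0
}

void loop() {}
```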

As I understand it, all bits inside the processor are manipulated by transistors, so how stable and reliable are those? As the simplest example: if I take, say, 64 bits and simply shift them back and forth for, say, 5 billion cycles, can I be sure that none of the bits was shifted incorrectly?

If we talk from a scientific point of view, there are doubts about everything, everywhere.
If we talk from an engineering point of view, the chips are made to work within their specifications, and they do.

What is the chance that the upper bits, say b6 - b9, of the ATmega328's ADC reading get corrupted (a 0 becomes a 1 and vice versa)? Essentially 0%. And for the LSB? Almost 50% - the lowest bit flickers with noise.
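The practical fix for a flickering LSB is averaging: trade sample rate for stability. A minimal sketch (the pin and sample count are my own choices):

```cpp
// Average 16 readings into one steadier value.
uint16_t readAveraged(uint8_t pin) {
  uint32_t acc = 0;
  for (uint8_t i = 0; i < 16; i++) acc += analogRead(pin);
  return (uint16_t)(acc / 16);   // still 0..1023, but far less jittery
}

void setup() { Serial.begin(9600); }

void loop() {
  Serial.println(readAveraged(A0));
  delay(100);
}
```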

A bit shift is not associated with electron/hole/charge mobility, so there is no physical displacement involved. The transistorised switches are static elements; shifting is a high-level abstraction of turning one transistor ON and another OFF. Shifting binary 100 left by one bit makes it 1000, which means T3 is ON and T2 - T0 are OFF.

The IEEE-754/binary32 standard gives 23 bits of precision for the significand of a number with a fractional part, but the accuracy is only about 6-7 significant decimal digits. To an engineer, this accuracy is good enough; the scientist is looking for the reasons behind the inaccuracy, how it could be explained, or what else could be done to overcome the limitation. Hence the IEEE-754/binary64 standard, where the significand carries 52 bits and the accuracy is about 15-17 significant decimal digits.
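A one-line demonstration of that ceiling (my own sketch): ask a float to hold 9 significant digits and only about 7 survive. Note that on the ATmega328P, avr-gcc's double is by default the same 32-bit type as float, so the binary32 limit applies to both there.

```cpp
// The ~7 significant digit ceiling of binary32, in one print.
void setup() {
  Serial.begin(9600);
  float f = 123456789.0;   // ask for 9 significant digits...
  Serial.println(f, 0);    // ...prints 123456792: only ~7 are real
}

void loop() {}
```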

The Pentium-I's FPU was unfortunate in that, for certain rare divisions, it was accurate to only a few significant digits rather than the expected 6 (the famous FDIV bug).