
### Topic: Random Seeds and Random Numbers (Read 17664 times)

#60
##### May 03, 2012, 07:05 amLast Edit: May 03, 2012, 07:21 am by Coding Badly Reason: 1
That is one nice thing about Seward's code concept.  His code computes the median of a sample of the readings and uses that median to transform subsequent readings into bits.

That part bothers me.  Truly random data occasionally has long runs of low values and long runs of high values.  If the median is determined during such a run then all the rest of the data is skewed.

I also don't like the fact that he doesn't include some statistical analysis to back his use of the first 10 seconds.  Is that long enough?  What is the probability that a statistical error will occur using 10 seconds?

Quote
So far only rudimentary ones.

Nice.  ENT is an excellent first filter.  If it does not do well with ENT, there is no way it will pass other tests.

#61
##### May 03, 2012, 07:21 am
Terse much?

I am ashamed to admit that I failed my college mind-reading class.  If I knew what you knew, and knew what you wanted to know, I would not have hesitated to provide a more detailed response.  Without knowing either, I have no way to compose a response that answers your questions without wasting your time or mine.

In other words, if you want to know something, instead of posting sarcasm, try posting questions.

#62
##### May 03, 2012, 09:01 am
Quote
I've run across tinkerit's TrueRandom library.  Maybe that will give you better results? ... No.  It has at least two serious flaws. ... It may take a few days to resurrect the memories.  I'll get back to you.

The details are coming back.

The theory of operation of TrueRandom is very similar to this project:
http://web.archive.org/web/20090205182836/http://home.comcast.net/~orb/index.html

The first problem is that ORB relies on the sample-and-hold capacitor being a certain size; the charge / discharge time has to be fairly long relative to the processor's clock.  The sample-and-hold capacitor in AVR processors has a much smaller capacitance, which makes the charge / discharge time too fast for the algorithm to be stable.

The second problem is that ORB relies on being able to charge and discharge the capacitor through resistors.  With TrueRandom, only charging is through a resistor.  Discharging is done straight to ground.  I believe this violates the theory of operation.

The third problem is that generating a random value is unbounded: it is possible the function call to get a random value will never return.  In my testing, it occasionally took too long (several seconds) to return, and on one occasion it never returned.

The fourth problem was revealed in testing.  It became clear that the returned value is highly dependent on the previous state of the processor.

Unfortunately, that site has gone offline and the Wayback Machine did not capture the images.

#### wanderson

#63
##### May 03, 2012, 02:44 pm

That part bothers me.  Truly random data occasionally has long runs of low values and long runs of high values.  If the median is determined during such a run then all the rest of the data is skewed.

Yes, long runs of similar values are indeed part of a truly random stream; however, all physical chaotic (random) sources have bias.  In other words, they have a tendency to produce more 1's than 0's (or vice versa).  This bias doesn't change the underlying randomness of the phenomenon, but it does prevent a generator using that phenomenon from producing uniform sequences of random numbers, which is what we usually want.  The running median he used is simply a method of reducing that bias, and it appears to work, though as I said earlier, both the circuit and the software required tweaking to get the degree of bias removal I wanted to see.  So in essence the median doesn't skew the data; it helps remove the skew that is built into the phenomenon.

I also don't like the fact that he doesn't include some statistical analysis to back his use of the first 10 seconds.  Is that long enough?  What is the probability that a statistical error will occur using 10 seconds?

Well, he published what he wanted to, and we are all free to use it or not, which includes taking it and running the analysis ourselves.  Indeed, that is something that needs to be done with any RNG used for critical applications, since they all have quirks that could affect their applicability to a particular use, and hardware-based ones are particularly prone to that.  And considering the relative slowness of these types of generators, I am not sure how relevant some (if not most) of the available tests are, since they were designed and intended for deterministic random number generators (PRNGs).  For most purposes, a hardware-based random number generator has one simple criterion: is the data stream predictable?  If it is, then it fails, which is why bias removal is so important.  Predictability is what differentiates a true random number generator from a pseudo random number generator, which is inherently predictable.

From what I have generated so far, the 50,000 samples he uses seem to be more than sufficient.  Indeed, when viewed as simply trying to remove bias from the circuit caused by junction changes or possibly beta changes, it is almost certainly sufficient, since we accomplish much the same thing when constructing the circuit and making one measurement with an oscilloscope or meter.

Nice.  ENT is an excellent first filter.  If it does not do well with ENT, there is no way it will pass other tests.

ENT, like the other common tests, is simply a tool.  Too many folks apply those tests without understanding them.  Just because a generator fails (or passes) a particular test doesn't mean the generator is bad (or good) for all purposes, just for some (and those are usually not common ones).  The classic random number generators, coin flips and dice rolls, will fail most of the modern tests as well, even when using "fair" coins and dice.  Part of the reason is that even "fair" coins and dice will show bias when examined over the hundreds of thousands (or millions) of samples these modern tests demand.  I recall a paper where they tested a number of dice (all "fair") and discovered that they tended to become heavily biased over time.
New true random number library available at: http://code.google.com/p/avr-hardware-random-number-generation/

Current version 1.0.1

#### wanderson

#64
##### May 03, 2012, 03:21 pm

Ok, here's my attempt using the watchdog timer jitter.  ... Tested only on my Duemilanove.

Try a few boards with SMD processors.

Here is my result with a Mega R3.  Any ideas as to why the SMD version makes a difference?  One of the sources endolith cited used a Butterfly, which is an SMD device...

wandrson@delphi:~/Development/hardware-rng/WatchDogTimer$ ent rnd.tst1.bin
Entropy = 7.999432 bits per byte.

Optimum compression would reduce the size
of this 413640 byte file by 0 percent.

Chi square distribution for 413640 samples is 326.29, and randomly
would exceed this value 0.17 percent of the times.

Arithmetic mean value of data bytes is 127.5754 (127.5 = random).
Monte Carlo value for Pi is 3.130142153 (error 0.36 percent).
Serial correlation coefficient is 0.001009 (totally uncorrelated = 0.0).

#65
##### May 03, 2012, 08:10 pmLast Edit: May 03, 2012, 08:13 pm by Coding Badly Reason: 1
Quote
Any ideas as to why the SMD version makes a difference?

The watchdog oscillators in all the processors my team tested had a considerable amount of jitter.  On the surface, they all seemed like good candidates for generating random data.

The watchdog jitter in DIP processors seems to have a strong random component.  Collected random data did well on statistical tests.

The watchdog jitter on SMD processors has a similar range but seems to have a much smaller random component.  It appears the jitter is primarily a function of the previous state of the oscillator and, to a smaller extent, the state of the processor.

There were significant variations among the SMD processors.  Some produced good quality data.  Most did not.  In the extreme case, the watchdog oscillator ran in perfect harmony with the processor's clock for long periods of time.

I suspect Atmel is trying, and succeeding, to make the watchdog oscillator more stable.

The bottom line is that there was too much variance for the technique to be generally useful.

#### GoForSmoke

#66
##### May 03, 2012, 09:26 pm
It occurs to me that an RNG written to generate random numbers in ways that are not completely honest, but extremely hard to predict, could crank out data tailored to pass multiple simultaneous tests.  I.e.... cheat.


#### wanderson

#67
##### May 03, 2012, 09:51 pm

It occurs to me that an RNG written to generate random numbers in ways that are not completely honest, but extremely hard to predict, could crank out data tailored to pass multiple simultaneous tests.  I.e.... cheat.

Cheating is certainly possible; however, designing an RNG that can pass the more stringent tests is not easy (even without cheating).  It is important to note that the vast majority (if not all) of computer-based random number generator algorithms are predictable.  Of course some are easier to predict than others, but their underlying deterministic nature makes them predictable.  That is the real advantage of a so-called true random number generator: by utilizing a non-deterministic basis, it is not predictable.  Of course it gets much more complicated than just using a non-deterministic source, because bias destroys the uniform nature of the random stream, introducing the ability to use bias-based probabilities to predict the stream.

This subject has enough depth that entire academic careers can be spent on just aspects of it.

#### endolith

#68
##### May 12, 2012, 04:04 amLast Edit: May 12, 2012, 04:07 am by endolith Reason: 1

I just thought of measuring the internal voltage reference vs. supply voltage.  Right now I don't have much time to try this, but it should pick up quite a lot of noise in the LSBs.  It might be biased, though.

That's the first thing I tried, and it doesn't work, at least on the ATmega328.  The noise is below the level of the LSB, so it can be read many times in a row without changing at all.  Here's the histogram:

Here is my result with a Mega R3.

Thanks!

Quote
and randomly
would exceed this value 0.17 percent of the times.

"almost certainly not random"

It's still better than TrueRandom, at least.

Can you post the raw samples without the XORing and shifting?

```cpp
void loop()
{
  if (sample_waiting) {
    sample_waiting = false;
    Serial.write(result);
  }
}
```

On Duemilanove after several days of running, I didn't see any variation or dropouts or non-randomness:

```
Entropy = 7.999969 bits per byte.

Optimum compression would reduce the size
of this 5489591 byte file by 0 percent.

Chi square distribution for 5489591 samples is 233.95, and randomly
would exceed this value 75.00 percent of the times.

Arithmetic mean value of data bytes is 127.4536 (127.5 = random).
Monte Carlo value for Pi is 3.145767276 (error 0.13 percent).
Serial correlation coefficient is -0.001267 (totally uncorrelated = 0.0).
```

In the  extreme case, the watchdog oscillator ran in perfect harmony with the processor's clock for long periods of time.

I suspect Atmel is trying to and succeeding at making the watchdog oscillator more stable.

Ah.  Relaxation oscillators sync up with each other if the power supply regulation isn't perfect and they're close in frequency (or close to a multiple?).  Maybe that's what it is?  What oscillator were you using for the processor's clock?

#69
##### May 12, 2012, 04:12 am
What oscillator were you using for the processor's clock?

16 MHz crystal, 16 MHz resonator, and the internal 8 MHz oscillator.  There was some combination of 328 boards and 1280 boards.  I recall an SMD Uno being the worst performer.

#### wanderson

#70
##### May 12, 2012, 02:20 pm
This idea intrigued me, so I have been continuing to experiment with the boards I have, including adding a new SMD UNO to the batch.

Curiously, I ran a longer test on that same Mega R3 before and got different results:

```
Entropy = 7.999790 bits per byte.

Optimum compression would reduce the size
of this 1,000,000 byte file by 0 percent.

Chi square distribution for 1,000,000 samples is 291.25, and randomly
would exceed this value 5.89 percent of the time.

Arithmetic mean value of data bytes is 127.4081 (127.5 = random).
Serial correlation coefficient is -0.004496 (totally uncorrelated = 0.0).
```

While not producing as uniform a distribution as I would like, this SMD chip exhibits a large degree of entropy using this WDT-based method when Timer1 is sampled directly, without the Timer1 library.  My last test used the Timer1 library, which further testing has shown significantly reduces the entropy on this chip.

I have gathered raw values (no processing) on this Mega for all three timers (0, 1, and 2) and am doing the same for an SMD Uno and an SMD 32U4 chip, all running Arduino bootloaders and this code:
```cpp
#include <stdint.h>
#include <avr/interrupt.h>
#include <avr/wdt.h>

byte TC0sample, TC2sample;
unsigned int TC1sample;
boolean sample_waiting = false;

void setup()
{
  Serial.begin(115200);
  wdtSetup();
  Serial.println("TC0,TC1,TC2");
}

void loop()
{
  if (sample_waiting) {
    sample_waiting = false;
    Serial.print(TC0sample);
    Serial.print(",");
    Serial.print(TC1sample);
    Serial.print(",");
    Serial.println(TC2sample);
  }
}

// Setup of the watchdog timer.
void wdtSetup()
{
  cli();
  MCUSR = 0;

  /* Start timed sequence */
  WDTCSR |= _BV(WDCE) | _BV(WDE);

  /* Put WDT into interrupt mode */
  /* Set shortest prescaler (time-out) value = 2048 cycles (~16 ms) */
  WDTCSR = _BV(WDIE);

  sei();
}

// Watchdog Timer Interrupt Service Routine
ISR(WDT_vect)
{
  TC0sample = TCNT0;
  TC1sample = TCNT1;
  TC2sample = TCNT2;
  sample_waiting = true;
}
```

I'll make the data available as soon as I arrange a place to share the files--I am collecting at least 80,000,000 samples from each chip for my testing.  I have some other SMD chips that I plan to wire up and test as well.

#### wanderson

#71
##### May 12, 2012, 02:28 pm

Here is my result with a Mega R3. ... and randomly would exceed this value 0.17 percent of the times.

"almost certainly not random"

It's still better than TrueRandom, at least.

Can you post the raw samples without the XORing and shifting?

The chi-square isn't a measure of the entropy in the data stream, but rather an indicator of how closely the histogram of the data matches the theoretical uniform distribution.  Many of the hardware-based generators (and almost all of the pseudo random number generators) I have tested exhibit such behavior.  Even those that typically behave much better will occasionally produce similar chi-square results.

I can't post the raw samples for that test since I didn't record them (I used your sketch), but I have made recordings of raw data on that chip which I will make available later today--as soon as I find a place to host the data.

#### wanderson

#72
##### May 12, 2012, 04:29 pm
I have created a Google Code project page for collecting the data and code associated with hardware-based random number generators on AVRs.  Currently the raw watchdog-timer numbers (no processing) are available from the download area of the project.


#### wanderson

#73
##### May 13, 2012, 03:00 am
I added the file containing the raw WDT-generated timer values for an Uno (SMD version) to the previously listed site.

#### wanderson

#74
##### May 14, 2012, 03:39 pm
I just finished uploading raw WDT-based timer values for a DIP Uno, an SMD Uno, a Mega 2560 R3 (SMD), and a 32U4 (SMD) to the site I listed previously.  All four chips show similar histograms for each of the three timers.  I haven't run any analysis yet, but here is a typical histogram for each of the three timers:
