Problem reading random noise generator

The bumps at 1 and 20 suggest that it is a software problem. The rest of the distribution looks pretty uniform. You just haven't yet found the appropriate algorithm for using the hardware to select truly random numbers. You might take a more careful look at what the author of this page http://holdenc.altervista.org/avalanche/ did to select streams of random 0s and 1s, which would be easy to map into any range you like.

As a first try, I suggested using the % operator, but many people recommend against using this method to reduce the range, as it can introduce bias. See this discussion http://stackoverflow.com/questions/2999075/generate-a-random-number-within-range/2999130#2999130

Edit: I note that in the original posted code, you do not initialize the two distribution arrays to zero. This is not a good idea, as some compilers (I haven't checked what the Arduino version of gcc does) just allocate memory space for an uninitialized variable, and that space almost certainly has some numbers already in it.

I certainly am going to try that, but I want to see first what the output looks like on the scope, and I can't get back into the university lab until it reopens on Monday. In the interim, do you know what might be causing the software issue?

The stackoverflow discussion I linked above suggests that you could try using the following function to limit the range of the hardware random number generator. You will need to modify it slightly (so as not to use the built in RAND_MAX and rand() function), but the general approach should work.

int rand_lim(int limit) {
/* 
  return a random number between 0 and limit inclusive.
 */

    int divisor = RAND_MAX/(limit+1);
    int retval;

    do { 
        retval = rand() / divisor;
    } while (retval > limit);

    return retval;
}

I used it to generate 100,000 integers between 0 and 19 and got the following uniform distribution

0 4976 1 4921 2 4980 3 5104 4 4862 5 4881 6 4970 7 5003 8 5064 9 5018 10 4930 11 5037 12 5205 13 5046 14 4906 15 4902 16 5010 17 5192 18 5005 19 4988

Interesting, I don't know why I didn't see the stackoverflow link earlier, my bad. What does RAND_MAX do? If I'm going to replace it, I'll need to know haha. Also, would you suggest replacing rand() with my noise generator?

Also, initializing the arrays to zero didn't help. Of course, that's probably because I reset the sketch between each run, and (correct me if I'm wrong) the memory is initialized to zero.

RAND_MAX is the maximum number that the built in C/C++ function rand() outputs.

It may be that the Arduino gcc compiler initializes variable memory to zero, but many others don’t. You should get into the habit of initializing ALL variables before using them.

You’re right, I should, thank you. Should I replace rand() with the noise generator?

Don't forget that the noise will have a higher frequency in the noise that you can sample at. Therefor there will be a degree of avraging going on which will limit the readings at the top and bottom end. You need to add a low pass filter to your noise source to take it below the niquis rate.

If that's happening, why do they seem to be grouping at the top and bottom ends?

An observation: most (maybe all) of the circuits I have seen like that use 12V or higher (including the one in your link which calls for 13V). I vaguely recall one author claiming that 12V is the minimum necessary for the circuit to work correctly. But, that could easily be because of the transistors he was using.

If you want continued help with the software I suggest you post the current version of your sketch.

This may give you some troubleshooting ideas... http://forum.arduino.cc/index.php?topic=161682.0

Thank you, Coding Badly, that thread will be very helpful. I'll post my sketch tomorrow when I get back to it. This is giving me a ton of ammunition with which to improve my design!

acegard: If that's happening, why do they seem to be grouping at the top and bottom ends?

Because the signal is moving faster in the middle of the transition, just think of a sin wave the fastest movement is at the zero crossing poi nt. those are just the levels you will miss.

Grumpy_Mike: Don't forget that the noise will have a higher frequency in the noise that you can sample at. Therefor there will be a degree of avraging going on which will limit the readings at the top and bottom end. You need to add a low pass filter to your noise source to take it below the niquis rate.

I tried something similar a while ago and got very similar results...a disproportionate binfull of 0s and 255s in my case....

im sure its a sampling frequency/nyquist issue. in my case I was cleaning up the signal with an lm393...I had biased my zener +NPN amp to be abt 2.5 midpoint and fed that plus a pot.tweaked Vcc/2 to the 393.

even tho my scope is ancient and slow, I could see that if I shifted the vcc midpoint up or down slightly in the wide blur of random fuzz, even tho it was still sort of middle.ish my distribution would vary widely...i did actually get a fairly flat distribution at one point, but I could never repeat it. I concluded I was being thwarted by the slew rate of the 393, or some kind of "beat frequency" between a less than nyquist rate on the max 393 slew rate...

my knowledge and scope were well below required level at the time, so I took the easy solution..I gave up!

ps they probably still are, but at least I can send you some empathy, if not fix your entropy....

Just to reinforce what Coding Badly pointed out, I made a circuit based on that type of design using 2n2222 and I found that a supply of 5V and 9V didn't work well. It needed at least 12V.

Avalanche noise circuits have several issues as random number generators. The first, is that they exhibit a fair degree of bias, regardless with how you capture the noise (ADC or comparator).

My experiments have indicated that two stages of 'whitening' are required to remove the bias from the readings. I use a Von Neumann filter followed by either some form of linear feed back shift register (LFSR). In general I have found that it requires at least a 4:1 whitening ratio to get a reasonably uniform RNG. For LFSR, I have used either a simple XOR'ing of consecutive bytes (which worked for one example circuit with A2D) to a Jenkings one-at-a-time hash on my second implementation which required more whitening due to a different noise source (5V zener source, no 10-15V supply needed).

The second issue with avalanche noise random number generator circuits is that the balance of 1's to 0's will change over time. If your code doesn't accomodate such changes your RNG will stop producing uniform RN when the bias exceeds your whitening technique's ability to cope. As you can see from the pages below, I noticed LARGE changes in the circuits balance point as it aged.

If your interested in my experiments, which include the code I derived from Rob Sewards median calculation approach are available from

https://sites.google.com/site/astudyofentropy/project-definition/avalanche-noise https://code.google.com/p/avr-hardware-random-number-generation/wiki/AvalancheNoise

The author of the HRNG link you posted claims to have gone to some effort to reach his goal, namely

trying to design a circuit that could produce a symmetrical, zero-average noise signal from a reverse biased p-n junction in avalanche breakdown

but it really isn't clear how he achieved and verified the output of the hardware other than with software. Your circuit has no provision for adjustments and can't produce a zero mean signal. Also, he used a digital input to generate "1"s and "0"s -- your use of the ADC throws in an additional complication. So you may be fighting an uphill battle.

I don't mean to be discouraging! These things are fun and challenging.

As an aside, it is interesting to listen to various different transistors in avalanche breakdown. You can do this by coupling the output transistor via a capacitor to the high gain phono input of an audio amplifier. There are a few different processes that lead to breakdown, and these manifest themselves as clicks, pops and hisses of different apparent pitches.

el_supremo: Just to reinforce what Coding Badly pointed out, I made a circuit based on that type of design using 2n2222 and I found that a supply of 5V and 9V didn't work well. It needed at least 12V.

I disagree. my original circuit was stolen from a 12v design. I dropped it to 5v, changed the NPN bias to get 2.5 ish midpoint and enough gain to exceed the upper and lower hystersis points of the 393. it didnt work...until I realised that the key point is the current through the noise source is quite critical. i guess most mfrs try to reruce noise and though zeners are inherently noisy there is a point on their curve where they are most noisy. we are talking nanocurrents here, any help you can get to boost the starting point is a bonus...I rebiased the zener to get the exact datasheet current point for peak noise and it burst into life....

albeit with the nyquist sample rate problem, so im not cel3brating yet, just my 2c that 5v circuits can be made noisy with a bit of care,a datasheet and a calculator

BareMetal:

el_supremo: Just to reinforce what Coding Badly pointed out, I made a circuit based on that type of design using 2n2222 and I found that a supply of 5V and 9V didn't work well. It needed at least 12V.

I disagree. my original circuit was stolen from a 12v design. I dropped it to 5v, changed the NPN bias to get 2.5 ish midpoint and enough gain to exceed the upper and lower hystersis points of the 393. it didnt work...until I realised that the key point is the current through the noise source is quite critical. i guess most mfrs try to reruce noise and though zeners are inherently noisy there is a point on their curve where they are most noisy. we are talking nanocurrents here, any help you can get to boost the starting point is a bonus...I rebiased the zener to get the exact datasheet current point for peak noise and it burst into life....

albeit with the nyquist sample rate problem, so im not cel3brating yet, just my 2c that 5v circuits can be made noisy with a bit of care,a datasheet and a calculator

Avalanche noise is generally an order of magnitude lower for 5V and below sources (typically zeners') when compared to 12V+ sources, even when current is adjusted for. Further the reverse bias transistor used in the OP example doesn't enter the breakdown zone until about 8-9V, though I have found a few examples of the 2N3904 (the transistor typically specified) that entered breakdown at lower voltages, the mean tends to be around 9V, with noise peaking near 12V.

The 'best' current for maximum noise varies with the device; however, the ones I have tested have generally been 5-50 uA.

Further, there have been several people who have tested a variety of such transistors, and the 2N3904 tend to produce the most noise.

jremington: The author of the HRNG link you posted claims to have gone to some effort to reach his goal, namely

trying to design a circuit that could produce a symmetrical, zero-average noise signal from a reverse biased p-n junction in avalanche breakdown

but it really isn't clear how he achieved and verified the output of the hardware other than with software. Your circuit has no provision for adjustments and can't produce a zero mean signal. Also, he used a digital input to generate "1"s and "0"s -- your use of the ADC throws in an additional complication. So you may be fighting an uphill battle.

The original author of the circuit the OP is using believed that a relatively even balance between 0's and 1's produce a non-biased output (it doesn't), but they didn't appear to actually test that asuumption since they used 'whitening' techniques from the start:

This time there is no need to minimize bias inherent in the operation of the generator, if all went well we should get a probability of 1s and 0s very close to 50%. Of course I'm going to assume that the bit-stream still contain bias and correlation, so a de-skewing technique for reducing bias and correlation will be used anyway. On the software side I have basically used the same code I had written for the Chua random generator: at the beginning GPIO4 is configured as input, then one bit is read (and its value is optionally printed), next we wait for a delay and repeat. This delay has been set to 500?s, producing a 2000bit/s stream.

Yes, I see. He ran a simple program to read two independently generated output streams from the HRNG and produce a third stream, as outlined by the following (from http://holdenc.altervista.org/chua/ )

John von Neumann invented a simple algorithm to fix simple bias, and reduce correlation. It considers successive pairs of consecutive, non-overlapping bits from the input stream, taking one of three actions: when two successive bits are equal, they are discarded; a sequence of 1,0 becomes a 1; and a sequence of 0,1 becomes a 0. It thus represents a falling edge with a 1, and a rising edge with a 0. This eliminates simple bias, and is easy to implement. This technique works no matter how the bits have been generated. It cannot assure randomness in its output, however. What it can do is transform a biased random bit stream into an unbiased one. However, significant numbers of bits are discarded. Thanks to the hardware bias reduction, we can affirm that the probability of 1 is p ~ 1/2, and in this case the discard of input pairs is minimum, giving us an output stream that is 1/4 the length of the input on average.