Beta testers needed for a new library that generates true random numbers

I have just published a new library that uses the jitter associated with the watchdog timer and Timer 1 on an Arduino to generate truly random numbers. Preliminary testing indicates that this library generates random sequences with far greater entropy and uniformity than either the randomSeed(analogRead(0)) method or the TrueRandom library, which demonstrably does not generate true random numbers.

I would like folks to download and test the software for defects. While I have tested the software and the algorithms on all current Arduino hardware, UNO (DIP and SMD), MEGA (R3), and the 32u4 used in the Leonardo, I would like to obtain test data from as many different examples of these chips as possible. To that end I would appreciate any and all folks emailing me (at wandrson01 at gmail.com) screen captures of the output of the following sketch, along with the type of Arduino used (UNO, MEGA, LEONARDO, etc.) and whether the chip is an SMD or DIP version. To be statistically significant the samples need to contain at least 25,000 lines of capture (a few hours of run time), but 250,000 (a day and a half) would be even better. I will collect the results from these samples and publish the statistical performance on the library's web site.

While it appears that the library is producing cryptographically useful random numbers, the test data needs to be from a much larger sample to verify that. Here is the test script I need run to collect these samples:

// Generate_Random_Numbers - This sketch makes use of the Entropy library
// to produce a series of random 32-bit integers that are streamed
// to the serial port of the arduino
//
// Copyright 2012 by Walter Anderson
//
// This file is part of Entropy, an Arduino library.
// Entropy is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Entropy is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Entropy.  If not, see <http://www.gnu.org/licenses/>.

#include <Entropy.h>

void setup()
{
  Serial.begin(115200);

  // This routine sets up the watch dog timer with interrupt handler to maintain a
  // pool of real entropy for use in sketches.  This mechanism is relatively slow
  // since it will only produce a little less than two 32-bit random values per 
  // second.
  Entropy.Initialize();

}

void loop()
{
  // Called with no parameter, the random method returns a full 32-bit random
  // value.  With a single integer parameter it returns a random integer in
  // the range: 0 <= random_value < integer parameter
  Serial.println(Entropy.random());
}
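
As a rough planning aid, the run times quoted above follow from the generation rate mentioned in the sketch comments. The snippet below is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes the upper bound of two 32-bit values per second, so real captures will take somewhat longer.

```python
def capture_hours(samples, values_per_second=2.0):
    """Hours needed to collect `samples` values at the given rate.

    values_per_second=2.0 is an assumed upper bound; the library is
    described as producing "a little less than two" values per second.
    """
    return samples / values_per_second / 3600.0


for n in (25_000, 250_000):
    print(f"{n} samples: at least {capture_hours(n):.1f} hours")
```

This lines up with "a few hours" for 25,000 lines and roughly a day and a half for 250,000.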

The library (zip file) is available from the download page of the project's Google Code site (a link to the library is on the home page), as well as an attachment to this post.

The source of the library is also on this site as a git repository and includes the draft of the documentation I have prepared. I welcome any and all comments. Any assistance is appreciated.

Entropy.v0.5.zip (100 KB)

I've got a clone mega 2560 I'll set up soon.
But before I do I might try and find a standalone serial app, don't want to use the IDE serial monitor.

EDIT: Is running now, I have at least 48 hours I can donate to this.

I'm not far into this, so I will ask: do you want the data as binary instead? It seems easier to work with when analysing, not to mention the smaller size.

#include <Entropy.h>

void setup()
{
  Serial.begin(115200);
  Entropy.Initialize();
}

void loop()
{
  // Send each 32-bit value as four raw bytes (least significant byte first on AVR)
  uint32_t u_Data = Entropy.random();
  Serial.write( ( const uint8_t* ) &u_Data, sizeof( u_Data ) );
}

Thank you. I chose the ASCII data format for two reasons. First, it avoids any issues with platform-specific binary storage formats, which shouldn't matter if the data is truly random, but could introduce bias if it is not. Second, I wanted it to be easy for everyone to see what data they were providing me, so it was clear that it wasn't a virus or such.

Feel free to use zip or another archiving program to compress the ASCII file when you email it. Thanks!
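
For anyone who has already captured the binary form, here is a minimal sketch of converting it to one decimal value per line. It assumes the capture consists only of the raw 4-byte values written by the binary sketch above, starts on a value boundary, and is little-endian (which is how avr-gcc stores a uint32_t). The function name is invented for illustration.

```python
import struct


def words_to_lines(raw):
    """Decode little-endian uint32 values into decimal strings, one per value."""
    usable = len(raw) - len(raw) % 4  # drop a trailing partial value, if any
    return [str(v) for (v,) in struct.iter_unpack("<I", raw[:usable])]


# Two complete values followed by one stray byte that gets dropped.
sample = bytes([0x78, 0x56, 0x34, 0x12, 0xFF, 0xFF, 0xFF, 0xFF, 0x01])
print("\n".join(words_to_lines(sample)))
```

If a capture starts in the middle of a value there is no way to recover the alignment from the data itself, which is one practical argument for the ASCII format.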

pYro_65:
I've got a clone mega 2560 I'll set up soon.
But before I do I might try and find a standalone serial app, don't want to use the IDE serial monitor.

EDIT: Is running now, I have at least 48 hours I can donate to this.

I use minicom on my Linux boxes to perform the screen captures. On Windows, older versions have HyperTerminal, which would work; newer versions of Windows can use Bray's Terminal for the same purpose. Bray's is also very useful for other Arduino-related communication.

http://www.smileymicros.com/download/term20040714.zip?&MMN_position=42:42

I have added the initial test files to the Google Code site for the project.

I have generated 1,000,000 bytes of entropy on four different Arduinos so far. Here is a summary of the initial results:

ID  Device           Type  Sample Size  Entropy   Chi square  P-value
1   Arduino Uno R3   DIP   1,000,000    7.999797  281.39      0.1231
3   Arduino Uno R3   DIP   1,000,000    7.999819  251.38      0.5524
3   Arduino Uno      SMD   1,000,000    7.999809  265.27      0.3163
4   Arduino Mega R3  SMD   1,000,000    7.999813  258.51      0.4268

The more of these tests we can run on as many sample Arduinos as possible, the better. This may be a cryptographically useful RNG approach.
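
For readers who want to sanity-check their own captures, the Entropy and Chi square columns can be approximated with a few lines of Python. This is only a sketch of the standard byte-level statistics, not the script used to produce the table. The p-value uses the Wilson-Hilferty normal approximation rather than an exact chi-square distribution, and all function names are made up for illustration.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy in bits per byte (8.0 is the maximum)."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def chi_square(data):
    """Chi-square statistic against a uniform distribution over 256 byte values."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def chi_square_p_value(chi2, dof=255):
    """Approximate upper-tail p-value (Wilson-Hilferty approximation)."""
    z = ((chi2 / dof) ** (1 / 3) - (1 - 2 / (9 * dof))) / math.sqrt(2 / (9 * dof))
    return 0.5 * math.erfc(z / math.sqrt(2))


data = bytes(range(256)) * 4  # perfectly flat toy input, not real capture data
print(byte_entropy(data), chi_square(data))
print(round(chi_square_p_value(281.39), 4))  # statistic taken from the first table row
```

With 255 degrees of freedom the approximation lands very close to the p-values listed in the table, which makes it a convenient cross-check when SciPy is not available.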

Damn, power went out while I was at work, I was hoping it would run to one million. I will sort out the e-mail this afternoon.
202496 samples taken.

I have an attiny85 I'm planning to set up for a project, do you know if it is compatible??

The library/test should work on any AVR mega/tiny with enough memory and a serial port, so yes I believe it will work on the 85.

I too have had more than a few of my early tests come to a conclusion because of power outages... :wink: The data is still useful!

pYro_65:
I have an attiny85 I'm planning to set up for a project, do you know if it is compatible??

avr-g++ -c -g -Os -Wall -fno-exceptions -ffunction-sections -fdata-sections -mmcu=attiny85 -DF_CPU=8000000L -MMD -DUSB_VID=null -DUSB_PID=null -DARDUINO=101 -I/home/wandrson/sketchbook/hardware/tiny/cores/tiny -I/home/wandrson/sketchbook/libraries/Entropy /tmp/build9159983860887415963.tmp/sketch_jun06b.cpp -o /tmp/build9159983860887415963.tmp/sketch_jun06b.cpp.o 
In file included from /home/wandrson/sketchbook/hardware/tiny/cores/tiny/Stream.h:24:0,
                 from /home/wandrson/sketchbook/hardware/tiny/cores/tiny/TinyDebugSerial.h:31,
                 from /home/wandrson/sketchbook/hardware/tiny/cores/tiny/WProgram.h:17,
                 from /home/wandrson/sketchbook/hardware/tiny/cores/tiny/Arduino.h:4,
                 from sketch_jun06b.cpp:3:
/home/wandrson/sketchbook/hardware/tiny/cores/tiny/Print.h:37:0: warning: "BIN" redefined
/usr/lib/gcc/avr/4.5.3/../../../avr/include/avr/iotnx5.h:55:0: note: this is the location of the previous definition
avr-g++ -c -g -Os -Wall -fno-exceptions -ffunction-sections -fdata-sections -mmcu=attiny85 -DF_CPU=8000000L -MMD -DUSB_VID=null -DUSB_PID=null -DARDUINO=101 -I/home/wandrson/sketchbook/hardware/tiny/cores/tiny -I/home/wandrson/sketchbook/libraries/Entropy -I/home/wandrson/sketchbook/libraries/Entropy/utility /home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp -o /tmp/build9159983860887415963.tmp/Entropy/Entropy.cpp.o 
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp: In member function ‘void EntropyClass::Initialize()’:
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp:47:3: error: ‘WDTCSR’ was not declared in this scope
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp: In member function ‘uint32_t EntropyClass::random(uint32_t, uint32_t)’:
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp:152:12: warning: unused variable ‘slice’
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp: In function ‘void __vector_12()’:
/home/wandrson/sketchbook/libraries/Entropy/Entropy.cpp:179:39: error: ‘TCNT1L’ was not declared in this scope

It looks like the initialization code for the library will need some device-specific modifications. The errors I received when I tried to load it onto the one ATtiny85 I have indicate that some of the registers the library uses have different names on the tiny85: WDTCSR -> WDTCR and TCNT1L -> TCNT1.

I will try and get the library modified to take that into account this weekend.

I suggest using timer 0 on the ATtiny85.

For the watchdog control register, Libc provides a processor independent macro. There are other things in wdt.h that may help like wdt_enable...

#include <avr/wdt.h>

_WD_CONTROL_REG |= (1<<_WD_CHANGE_BIT) | (1<<WDE);

http://www.nongnu.org/avr-libc/user-manual/group__avr__watchdog.html

CodingBadly:

Thank you for the information on the device-independent WDT macros. Would you provide some more information on why you would suggest using TMR0 on the ATtiny85? I haven't run any of my raw WDT tests on that chip, but noticed that it does have an 8-bit timer1. TMR0 on the standard Arduinos showed some potential bias problems, which, as we have discussed, are probably due to TMR0 being used for the interrupt that maintains micros, etc. I assumed that would be a similar problem on your ATtiny85 core (I haven't looked at your core code in any detail).

Also, I posted this question in the Programming forum, but I had some difficulty getting my #ifdef to recognize the ATtiny85 was being compiled for... I would appreciate advice on how to address that issue.

Walt

wanderson:
Would you provide some more information on why you would suggest using TMR0 on the ATtiny85?

Standard Core and Tiny Core both configure the millis timer for fast PWM and the other timers for phase-correct PWM (this is probably true for all Arduino cores). Standard Core uses timer 0 for millis. Tiny Core uses timer 1 for millis. Basically, the timer 0 Tiny Core configuration should very closely match the timer 1 Standard Core configuration. I assume that will improve your library's prospects.

I haven't run any of my raw WDT tests on that chip, but noticed that it does have an 8-bit timer1.

It does but it is configured to run like timer 0 on the Standard Core.

Okay, thanks.

Well, with CodingBadly's assistance I have compiled the library and am testing it on an ATtiny85. Specifically, I had a SparkFun AVR Stick (DEV-09147), on which I am running the same test sketch as above. The stick and an FTDI cable were all I needed. In 36-48 hours I should have a sample dataset from the ATtiny, using CodingBadly's core for that machine.

If someone wants to try this with the ATtiny85, I am attaching the one library file that needed to change. The file is also on the google code page, but I haven't updated the zip file there yet.

Entropy.cpp (8.52 KB)

Q&D perl script to capture the data:

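#!/usr/bin/perl
# Copy everything the Arduino sends on the serial port to STDOUT.
$|++;    # unbuffered output, so the log file is always current

use Device::SerialPort;

$PortObj = new Device::SerialPort ("/dev/ttyUSB2")
  || die "Can't open: $^E / $!\n";

$PortObj->user_msg(ON);
$PortObj->databits(8);
$PortObj->baudrate(115200);
$PortObj->parity("none");
$PortObj->stopbits(1);
$PortObj->handshake("rts");
$PortObj->read_const_time(100);    # wait up to 100 ms per read

# Discard the first, possibly partial, line but keep whatever follows
# its newline so the first logged value is not truncated.
do {
  ($count_in, $string_in) = $PortObj->read(100);
} until ($string_in =~ /\n/);
print substr($string_in, index($string_in, "\n") + 1);

# Stream the rest of the data until the script is interrupted.
while(($count_in, $string_in) = $PortObj->read(100)) {
  if ($count_in > 0) {
    print $string_in;
  }
}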

Just direct STDOUT to a file, e.g. "perl script.pl > logfile.txt".

One more test result has been added for the library. While only a small number of devices have been tested so far, none have failed the basic tests.

ID  Device                              Type  Sample Size  Entropy   Chi square  P-value
1   Arduino Uno R3                      DIP   1,000,000    7.999797  281.39      0.1231
3   Arduino Uno R3                      DIP   1,000,000    7.999819  251.38      0.5524
3   Arduino Uno                         SMD   1,000,000    7.999809  265.27      0.3163
4   Arduino Mega R3                     SMD   1,000,000    7.999813  258.51      0.4268
5   Adafruit ATmega32u4 breakout board  SMD   1,000,000    7.999811  261.87      0.3703

While these tests have a lot of data (1,000,000 bytes), tests of only 100,000 bytes (25,000 samples) would be useful as well. Please spend a couple of hours running the test script and send me your results. In order to be confident that the library, and the methodology, are producing truly uniform random numbers we need a lot more samples from a lot more devices. In particular, we need more samples from the same type of chips as already sampled.

Do you have a link to the stuff you are using to calculate the entropy value?
If it is producing true random numbers, can it be made faster by masking blocks of random numbers together; seeing as each block is truly random?

EDIT:
I have attached my capture, it is still a binary file. I have to still convert it to strings for you but you can grab this if you wanted now.
I set up my mega to gather 7kb blocks at a time before sending to the PC.

SerialIn.data (791 KB)

pYro_65:
Do you have a link to the stuff you are using to calculate the entropy value?
If it is producing true random numbers, can it be made faster by masking blocks of random numbers together; seeing as each block is truly random?

I am using a Python script I wrote to perform the calculations used by the ent program by John Walker (see the HotBits Statistical Testing page). My script (attached) performs the same tests, except for the pi calculation, and combines several of the options along with producing a couple of charts of the data. I am not sure where you would obtain blocks of random numbers to mask with those produced by this algorithm, so I can't comment on whether it would be faster. If you need random numbers at faster speeds that are not cryptographically secure but still have useful properties, the best method is to use this library to re-seed the avr-libc random function whenever it has a value available. Here is a sample sketch to illustrate what I mean:

#include <Entropy.h>

void setup()
{
  Entropy.Initialize();
  randomSeed(Entropy.random());
}

void loop()
{
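  // Re-seed the PRNG whenever the entropy pool has a fresh value
  // (available() is the method name used by the released library)
  if (Entropy.available() > 0)
      randomSeed(Entropy.random());
  // Use normal random function for getting random numbers
  // i.e. some_value = random();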
}

pYro_65:
I have attached my capture, it is still a binary file. I have to still convert it to strings for you but you can grab this if you wanted now.
I set up my mega to gather 7kb blocks at a time before sending to the PC.

I am attaching the results of the tests for your data. This is the first sample that raises some concern: the p-value for the chi-square test is only 0.0189. Can you send me the full text of the label on the SMD chip you used for this? If the mechanism is producing truly random numbers, we should occasionally get samples that fall outside the normally acceptable p-values, as George Marsaglia says himself in his DIEHARD series of tests:

NOTE: Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits. Those p-values are obtained by
p=F(X), where F is the assumed distribution of the sample
random variable X---often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst
in the tails. Thus you should not be surprised with
occasional p-values near 0 or 1, such as .0012 or .9983.
When a bit stream really FAILS BIG, you will get p's of 0 or
1 to six or more places. By all means, do not, as a
Statistician might, think that a p < .025 or p> .975 means
that the RNG has "failed the test at the .05 level". Such
p's happen among the hundreds that DIEHARD produces, even
with good RNG's. So keep in mind that " p happens".
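
To put a number on "p happens", here is a small worked example. It assumes the p-values are independent and exactly uniform on [0,1), and it takes six samples as an illustrative count (the five table rows plus this Mega capture). The function name is made up for this sketch.

```python
def chance_of_low_p(threshold, n_samples):
    """Probability that at least one of n independent uniform p-values
    falls below `threshold` purely by chance."""
    return 1.0 - (1.0 - threshold) ** n_samples


# Chance that a good generator yields at least one p-value as low as 0.0189
# somewhere among six samples.
print(round(chance_of_low_p(0.0189, 6), 3))
```

Under those assumptions the chance is a bit over one in ten, so a single result like this is a reason to collect more data rather than evidence of failure.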

SerialIn.data.stats.txt (15.9 KB)

analyze.py (8.7 KB)

I am not sure where you would obtain blocks of random numbers

I was thinking along the lines of using your algorithm to create two sets of random numbers. And masking them together to make a third set of random numbers. Would provide more numbers per minute for something like a random number provider.

It's just a thought, but I would assume (with little analysis, mind you) that two truly random numbers combined to make a third should be no less random than generating the third value from scratch? Maybe...
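
A quick way to explore this idea, assuming "masking" means a bitwise XOR of two values, is to enumerate every pair of small values. The sketch below uses 4-bit values purely to keep the enumeration tiny. It illustrates the arithmetic only and is not a statement about how the library behaves.

```python
from collections import Counter
from itertools import product

BITS = 4
values = range(2 ** BITS)

# Count how often each XOR result occurs over all equally likely (a, b) pairs.
xor_counts = Counter(a ^ b for a, b in product(values, repeat=2))
is_uniform = len(xor_counts) == 2 ** BITS and set(xor_counts.values()) == {2 ** BITS}

# The third value is fully determined by the first two: b can always be
# rebuilt from a and a ^ b, so publishing all three adds no new entropy.
recoverable = all((a ^ (a ^ b)) == b for a, b in product(values, repeat=2))

print(is_uniform, recoverable)
```

So the combined value is indeed no less uniform than its inputs, but it is not independent of them: used alongside the two originals it carries no extra randomness, and used instead of them it gives one value for the price of two.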