Pages: [1] 2 3 ... 5   Go Down
Author Topic: The reliable but not very sexy way to seed random  (Read 7522 times)
0 Members and 1 Guest are viewing this topic.
Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Code:
#include <EEPROM.h>

void reseedRandom( void )
{
  static const uint32_t HappyPrime = 937;
  union
  {
    uint32_t i;
    uint8_t b[4];
  }
  raw;
  int8_t i;
  unsigned int seed;
 
  for ( i=0; i < sizeof(raw.b); ++i )
  {
    raw.b[i] = EEPROM.read( i );
  }

  do
  {
    raw.i += HappyPrime;
    seed = raw.i & 0x7FFFFFFF;
  }
  while ( (seed < 1) || (seed > 2147483646) );

  randomSeed( seed ); 

  for ( i=0; i < sizeof(raw.b); ++i )
  {
    EEPROM.write( i, raw.b[i] );
  }
}

void setup( void )
{
  reseedRandom();
}

Comments?
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 170
Posts: 12483
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Comments?

Good effort, will work in many cases.

Remarks:
- major -
it misses a entropy generator so the seeding is still deterministic, two arduino's (even with different sketches)  will get the same random() sequence and in case of the same sketch they will behave identical I think - unless eeprom is initially different, which normally isn't ...
Just putting 1,2,3,4,5, etc as next seed is equally deterministic; only less obfuscated and easier to program no need for happyPrime...

- major -   
it should contain a test that the value written to EEPROM is not the same as the one read from it at the start to prevent "seed-lock "
 
- minor -
the algorithm uses EEPROM to store the seedGenerator over restarts. This is a good idea, however, it uses the first 4 bytes which are the first to use if someone uses EEPROM in their sketch, might be better to use the last 4 bytes of the EEPROM, these are less likely to be used by a sketch.


Ideas
- readout (part of) the flashmemory (progmem ?) and do a CRC-16 or so => to be used as part of the seed. Then at least two different sketches will get a different seed.

- Every EEPROM might have slightly different timing (I'm thinking about the delay one needs after writing has a reason). Can it be used as source of entropy?


Rob
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Good effort, will work in many cases.

Thanks.

Quote
it misses a entropy generator so the seeding is still deterministic, two arduino's (even with different sketches)  will get the same random() sequence and in case of the same sketch they will behave identical I think - unless eeprom is initially different, which normally isn't ...

Good point.  Some solutions... Initialize the EEPROM to a value from random.org.  Use a different odd prime number for different sketches.

Quote
Just putting 1,2,3,4,5, etc as next seed is equally deterministic; only less obfuscated and easier to program no need for happyPrime...

Ah, but I like using a Happy Prime.   smiley-cool

Quote
it should contain a test that the value written to EEPROM is not the same as the one read from it at the start to prevent "seed-lock "

I don't understand.
  
Quote
the algorithm uses EEPROM to store the seedGenerator over restarts. This is a good idea, however, it uses the first 4 bytes which are the first to use if someone uses EEPROM in their sketch, might be better to use the last 4 bytes of the EEPROM, these are less likely to be used by a sketch.

Agree.

Quote
- readout (part of) the flashmemory (progmem ?) and do a CRC-16 or so => to be used as part of the seed. Then at least two different sketches will get a different seed.

I like the idea but it may add too much start-up delay.  I'll investigate when I have more time.

Quote
- Every EEPROM might have slightly different timing (I'm thinking about the delay one needs after writing has a reason). Can it be used as source of entropy?

While the physical EEPROM may have slightly different timings, I don't think the CPU has a way to measure it.  The datasheet hints that EEPROM reads and writes are driven solely by the processor clock.  I vaguely recall considering EEPROM timing instead of watchdog jitter as an entropy source and rejecting the idea because of datasheet hints.
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 170
Posts: 12483
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
it should contain a test that the value written to EEPROM is not the same as the one read from it at the start to prevent "seed-lock "

I don't understand.

Simplified:
Suppose the algorithm reads the value 13 from EEPROM, and after calculating the seed it should not write 13 to the EEPROM back. Ideally all numbers from 0..maxint should have an equal chance.
A better look at your code made me clear that as long as the value read from EEPROM is positive, this will be guaranteed by the primeness of Happy Prime.


However if the value from EEPROM is negative, the Happy Prime will be added repeatedly until seed becomes > 0.
That means that the seed will allways be between 1 and HappyPrime. This gives a bias, which depends on the size of HappyPrime - values in the range of Maxint/2 might score better (did not analyse this)

I propose to add the following line :     raw.i = raw.i & 0x7FFFFFFF;   just after reading the value from EEPROM.

This changes the sign before the while loop and that means the number will be somewhere between 1 and maxint/2  instead of between 1 and HappyPrime. Bias gone. thats the idea.


Quote
- readout (part of) the flashmemory (progmem ?) and do a CRC-16 or so => to be used as part of the seed. Then at least two different sketches will get a different seed.
I like the idea but it may add too much start-up delay.  I'll investigate when I have more time.
instead of complete flash you could take only 256 (512?) bytes of the places where the sketch starts as that part might have the most entropy. By only using 256 bytes it is substantially faster than the complete 32K memory.


Quote
While the physical EEPROM may have slightly different timings, I don't think the CPU has a way to measure it.  The datasheet hints that EEPROM reads and writes are driven solely by the processor clock.  I vaguely recall considering EEPROM timing instead of watchdog jitter as an entropy source and rejecting the idea because of datasheet hints.
OK
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Ideally all numbers from 0..maxint should have an equal chance ... this will be guaranteed by the primeness of Happy Prime

Exactly.  One and odd primes less than 2^32 - 1 are all candidates for incrementing the raw value.

Quote
However if the value from EEPROM is negative, the Happy Prime will be added repeatedly until seed becomes > 0.

Ah, but it doesn't.  I'll use 0x8000019F (-2147483233) as an example...

Code:
  do
  {
    // raw.i = 0x8000019F (-2147483233)
    raw.i += HappyPrime;
    // raw.i = 0x8000019F + 0x000003A9 = 0x80000548 (-2147482296)
    seed = raw.i & 0x7FFFFFFF;
    // seed = 0x80000548 & 0x7FFFFFFF = 0x00000548 (1352)
  }
  // seed (1352) is greater than or equal to 1 and less than or equal to 2147483646 (0x7FFFFFFE) so the loop breaks
  while ( (seed < 1) || (seed > 2147483646) );

The four values that are rejected are...

raw.i = 0x00000000 (seed = 0)
raw.i = 0x7FFFFFFF (seed = 2147483647)
raw.i = 0x80000000 (seed = 0)
raw.i = 0xFFFFFFFF (seed = 2147483647)

...which is nicely symmetric (two "positive" raw values and two "negative" raw values).  So, the raw sequence is full cycle minus the four rejected values with the 2,147,483,646 seeds each occurring twice before the whole thing repeats.

Quote
instead of complete flash you could take only 256 (512?) bytes of the places where the sketch starts as that part might have the most entropy

I'm thinking loop and setup are good candidates.
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 170
Posts: 12483
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset


Got it , the loop will only be executed once for all but 4 scenario's these latter run twice.
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset


The latest version...

Code:
#include <avr/eeprom.h>
#include <stdlib.h>


void setup( void );

void loop( void );


void reseedRandom( uint32_t* address )
{
  static const uint32_t HappyPrime = 937;
  uint32_t raw;
  unsigned long seed;
  uint32_t* code;
  uint8_t i;
  uint32_t offset;

  offset = 0;

  /* The following adds 1050 - 922 = 128 bytes of code.  Is it worth the 128 bytes? */
/* */
  code = (uint32_t*)(setup);
  i = 0;
  do
  {
    offset ^= pgm_read_dword( code );
    ++code;
    ++i;
  }
  while ( i != 0 );
/* */

  raw = eeprom_read_dword( address );

  do
  {
    raw += HappyPrime;
    seed = (raw + offset) & 0x7FFFFFFF;
  }
  while ( (seed < 1) || (seed > 2147483646) );

  srandom( seed ); 

  eeprom_write_dword( address, raw );
}

inline void reseedRandom( unsigned short address )
{
  reseedRandom( (uint32_t*)(address) );
}


uint32_t reseedRandomSeed EEMEM = 0xFFFFFFFF;


void setup( void )
{
/* */
  reseedRandom( &reseedRandomSeed );
/* */
/*
  reseedRandom( (unsigned short) 0 );
*/
}


void loop( void )
{
}

Eliminating EEPROM.h made it considerably smaller.

The randomSeed declaration is a problem.  The parameter is unsigned int (16 bits) instead of a 32 bit data-type so srandom is called directly.

Including the first 256 bytes of setup in the seed adds 128 bytes of code.  The cost seems too high for the benefit.

Thoughts?
Logged

nr Bundaberg, Australia
Offline Offline
Tesla Member
***
Karma: 121
Posts: 8458
Scattered showers my arse -- Noah, 2348BC.
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Thoughts?
I may have if I knew WTF it was doing smiley

Am I correct in saying that it

XORs 256 bytes of program flash starting at setup()
reads a value from EEPROM
uses that value and the result of the XORing plus HappyPrime to arrive at "seed"
stores the seed value back into EEPROM

______
Rob
« Last Edit: July 29, 2011, 07:17:07 am by Graynomad » Logged

Rob Gray aka the GRAYnomad www.robgray.com

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 170
Posts: 12483
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset


Looks good, I like the improvement but there are still some points

The code fetching the code of the sketch to generate a part of the seed value will produce the same offset every time for the same sketch. But it will probably be very different for every different sketch. (or compiler option/version etc) so it is definitely a big step in the right direction.
Code:
code = (uint32_t*)(setup);
  i = 0;
  do
  {
    offset ^= pgm_read_dword( code );
    ++code;
    ++i;
  }
  while ( i != 0 );

I think it is better to use an additional offset to start at a different place...

Code:
code = (uint32_t*)(setup) + some_8bit_number;
This changes the value for offset if some_8bit_number is different.
The natural option is a byte from the last seed from EEPROM.

just a last thought..
Have you tried to use pgm_read_byte()  4 times iso  pgm_read_dword() ? As it is more elementary it might result in a smaller footprint... (remove long/ dword math)

Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Thoughts?
I may have if I knew WTF it was doing smiley

I'll include more comments (than one) next time.   smiley-red

Quote
Am I correct in saying that it ... XORs 256 bytes of program flash starting at setup()

Yes.  In theory, this gives a different set of seeds for different sketches.

Quote
reads a value from EEPROM

Yes.  The previous "raw" value is read.  The "raw" value is almost the seed.

Quote
uses that value and the result of the XORing plus HappyPrime to arrive at "seed"

Yes.

Quote
stores the seed value back into EEPROM

Almost.  The "raw" value is stored back into EEPROM for next time.

For not knowing what it was doing you did a good job describing what it does.   smiley-wink
Logged

Seattle, WA
Offline Offline
Full Member
***
Karma: 0
Posts: 174
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

You could always stuff an EEPROM with values from hotbits (http://www.fourmilab.ch/hotbits/) as an entropy source; at least the initial seed would be non-deterministic.  Store an offset to the next unused byte in EEPROM and update it each time a byte is fetched.  Having net access (via whatever mechanism) also opens up some interesting RNG possibilities with hotbits, and memory devices are cheap; a flash card could be filled via hotbits and one could eschew the use of an RNG altogether.

For those new to the topic, hotbits is a service ran out of Switzerland which uses a cesium-137 gamma source coupled with a commercial radiation detector to generate random bits.  More detail here if interested: http://www.fourmilab.ch/hotbits/how3.html

No affiliation with the site here, just a math/cryptography geek.
Logged

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

You could always stuff an EEPROM with values from hotbits (http://www.fourmilab.ch/hotbits/) as an entropy source; at least the initial seed would be non-deterministic.

An excellent idea.

For those with an Ethernet Shield, a combination of this from robtillaart and a bit of EEPROM code can be used to automate the initialization.

Quote
Store an offset to the next unused byte in EEPROM and update it each time a byte is fetched.

You lost me.

Quote
Having net access (via whatever mechanism) also opens up some interesting RNG possibilities with hotbits, and memory devices are cheap; a flash card could be filled via hotbits and one could eschew the use of an RNG altogether.

Also both good ideas.

Quote
For those new to the topic, hotbits

http://www.random.org/ is also a good source of entropy and information.
Logged

Seattle, WA
Offline Offline
Full Member
***
Karma: 0
Posts: 174
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Store an offset to the next unused byte in EEPROM and update it each time a byte is fetched.

You lost me.


Sorry, posting while distracted smiley-razz  It's a technique I've used in a couple of projects - to ensure that a given byte from the EEPROM is only used once, the first byte(s) of the EEPROM store an offset to the next unused byte of random data.  When byte(s) are read from EEPROM, the offset is updated accordingly.  On a slow microcontroller like the AVR you don't really have contention issues, so a simple approach like this works fine.

EEPROM is not particularly fast; if speed is a concern then FRAM or flash may be a more appropriate storage medium.  For occasional access like refreshing a random seed, EEPROM is just fine (not to mention much cheaper)
Logged

Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 178
Posts: 12288
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Looks good

Thanks.

Quote
The code fetching the code of the sketch to generate a part of the seed value will produce the same offset every time for the same sketch.  But it will probably be very different for every different sketch. (or compiler option/version etc) so it is definitely a big step in the right direction.

I'm not convinced it is worth the cost.  I'm also not convinced it is necessary.  But I can't yet see far enough down each road to know for certain.

Someone who wants a unique random set for each board or sketch can simply initialize the "raw" value (the value stored in EEPROM) with a truly random value (or a constant).  I'm leaning towards providing that as an example and removing the code / offset part.

Quote
I think it is better to use an additional offset to start at a different place...

Why?

Quote
Have you tried to use pgm_read_byte()  4 times iso  pgm_read_dword() ? As it is more elementary it might result in a smaller footprint... (remove long/ dword math)

That does help (20 bytes trimmed).  Thanks for the suggestion.

The majority of the code added is from random / srandom then the EEPROM stuff.  The reseedRandom does not pull in any 32 bit library code.  The 32 bit operations all get turned into four machine instructions (add, exclusive-or, load, store).  As far as I can tell, the only thing of significant size that can optionally be removed from reseedRandom is the code / offset part.
Logged

Rapa Nui
Offline Offline
Edison Member
*
Karma: 53
Posts: 1991
Pukao hats cleaning services
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

..some atmels(e.g 328)  have a temperature sensor on the chip - what about to use the temp value or the noise it produces? P/
Logged

Pages: [1] 2 3 ... 5   Go Up
Jump to: