Go Down

Topic: [solved] Shuffle an array. (Read 4743 times) previous topic - next topic

SilverCG

Mar 12, 2010, 02:09 am Last Edit: Mar 12, 2010, 05:58 am by silvercg Reason: 1
hey everyone, i'm working on another project which is almost complete. I'm just adding a little style to it.

so say i have an array like:

int list[] = {0,1,0,0,1,0,1,1,0};

so what would be the best way to randomly shuffle the positions
  • - [8]?

    thanks for you time,
    Dave.

raron

I won't claim to know the best way, but some sort of "bubble unsort" algorithm came to mind:

Code: [Select]
int list[] = {0,1,0,0,1,0,1,1,0}

for (int a=0; a<9; a++)
{
 r = random(a,8) // dont remember syntax just now, random from a to 8 included.
 int temp = list[a];
 list[a] = list[r];
 list[r] = temp;
}


remember to seed the randomizer, maybe by time or an unused analog input.

Btw, if the values you are using are just 1 and 0, it would be a little more memory efficient to use an int and bit manipulation instead (since its 9 bits, not 8), but this works too.

SilverCG

thanks buddy that does. the trick.

rFree

That won't shuffle very well at all.

Below should work well, given a decent rand() implementation. It's certainly not written with speed in mind.

Code: [Select]

// generate a value between 0 <= x < n, thus there are n possible outputs
int rand_range(int n)
{
   int r, ul;

   ul = RAND_MAX - RAND_MAX % n;
   while ((r = rand()) >= ul)
       ;
   
   return r % n;
}


void shuffle_swap(size_t index_a, size_t index_b, void *array, size_t size)
{
   char *x, *y, tmp[size];

   if (index_a == index_b)
       return;

   x = array + index_a * size;
   y = array + index_b * size;

   memcpy(tmp, x, size);
   memcpy(x, y, size);
   memcpy(y, tmp, size);
}

// shuffle an array using fisher-yates method, O(n)
void shuffle(void *array, size_t nmemb, size_t size)
{
   int r;
   
   while (nmemb > 1) {                                                                      
       r = rand_range(nmemb--);                                                              
       shuffle_swap(nmemb, r, array, size);
   }
}

raron

#4
Mar 14, 2010, 01:31 am Last Edit: Mar 14, 2010, 01:32 am by raron Reason: 1
Thanks to rFree's post I wanted to look a little closer at this :) Sorry if all this seems a bit long for such a simple thing.
Quote
That won't shuffle very well at all.

I'd say it shuffles well enough, unless you are into scientific random probability statistics stuff. Or some such thing :P OK it had a minor flaw in swapping when not needed, and scanning the array to the end (again swapping with itself). But this is almost not worth mentioning for a first draft. I changed it in the versions below. "Bubble unsort" might be a misnomer too, i just though I remembered a very similar sorting method, but I see now that a bubble sort is different. At any rate I'd say both shufflers use the same principle, and now I know it has an impressive-sounding name: The Fisher[ch8211]Yates shuffle http://en.wikipedia.org/wiki/Fisher[ch8211]Yates_shuffle

So I'd say you posted an almost identical algorithm, with the addition of the rand_range() function, which alters the random output somehow. This is why it is different. Which is better I don't know. Also note;  when I changed the "bubble unsort" to scan the array in the same direction as your fisher-yates (ie backwards), the outputs where exactly identical when I tested them on my PC (code::blocks in ubuntu). However on the Arduino I got different results. Not sure why that is. I had to change the arduino version a bit, but I don't think that's the reason (no guarantee that I didn't make any errors though).

But anyway thanks for some interesting code!


The difference (on the arduino version at least) is all due to the rand_range() function, which doesn't have anything to do with Fisher-Yates as far as I can see. Otherwise the two shuffling methods are really the same. I get the same result when implementing rand_range() (and scanning the array the same way) on the arduino too. Not surprising since they are basically the same.
Code: [Select]

void bubbleUnsort(int *list, int elem)
{
 for (int a=elem-1; a>0; a--)
 {
   //int r = random(a+1);
   int r = rand_range(a+1);
   if (r != a)
   {
     int temp = list[a];
     list[a] = list[r];
     list[r] = temp;
   }
 }
}



// generate a value between 0 <= x < n, thus there are n possible outputs
int rand_range(int n)
{
   int r, ul;
   ul = RAND_MAX - RAND_MAX % n;
   while ((r = random(RAND_MAX+1)) >= ul);
   return r % n;
}




Btw, in case someone is interested, these are the test programs I made for the arduino and code::blocks

On the arduino, functions with arguments like this:
Code: [Select]
void shuffle_swap(size_t index_a, size_t index_b, void *array, size_t size)
became
Code: [Select]
void shuffle_swap(int index_a, int index_b, int *array, int size)
Since I don't know how else to do it. But anyway this doesn't matter for testing shuffling routines.

Also had to add a cast to a char pointer in both versions in the shuffle_swap() function. But I don't belive that matters, as the shuffle_swap() takes care of the array's data type size. (Except possibly in the modified arduino version).


This is the arduino code:
Code: [Select]

// Arduino comparative shuffle test
// Fisher-yates vs "bubble unsort"
// 2010.03.13 raron


int anArray[] = {0,1,2,3,4,5,6,7,8};
int elements = sizeof(anArray) / sizeof(int);

int anotherArray[] = {0,1,2,3,4,5,6,7,8};
int arraySize = sizeof(anotherArray) / sizeof(int);



void setup()
{
 Serial.begin(9600);
 delay(2000);
 int noise = analogRead(2);
 randomSeed(noise);

 /*
 Serial.print("RAND_MAX is ");
 Serial.println(RAND_MAX,DEC);

 Serial.println("---- Original ----: ");
 printArray(anotherArray, arraySize);
 Serial.println("");
 */

 // bubble unsort shuffle
 bubbleUnsort(anotherArray, arraySize);
 Serial.println("---- bubble unsorted ---- ");
 printArray(anotherArray, arraySize);


 Serial.println("");
 Serial.println("");
 randomSeed(noise); // seeding with same nr. again


 // Fisher-yates
 shuffle(anArray, elements, sizeof(int));
 Serial.println("---- fisher-yates shuffled ---- ");
 printArray(anArray, elements);

}


void loop()
{
}




void printArray(int *array, int elem)
{
   for (int i=0; i<elem; i++)
   {
     Serial.print("array[ ");
     Serial.print(i);      Serial.print(" ] = ");
     Serial.println(array[i]);
   }
}


void bubbleUnsort(int *list, int elem)
{
 for (int a=elem-1; a>0; a--)
 {
   int r = random(a+1);
   //int r = rand_range(a+1);
   if (r != a)
   {
     int temp = list[a];
     list[a] = list[r];
     list[r] = temp;
   }
 }
}


// generate a value between 0 <= x < n, thus there are n possible outputs
int rand_range(int n)
{
   int r, ul;
   ul = RAND_MAX - RAND_MAX % n;
   while ((r = random(RAND_MAX+1)) >= ul);
   //r = random(ul);
   return r % n;
}


void shuffle_swap(int index_a, int index_b, int *array, int size)
{
   char *x, *y, tmp[size];

   if (index_a == index_b) return;

   x = (char*)array + index_a * size;
   y = (char*)array + index_b * size;

   memcpy(tmp, x, size);
   memcpy(x, y, size);
   memcpy(y, tmp, size);
}

// shuffle an array using fisher-yates method, O(n)
void shuffle(int *array, int nmemb, int size)
{
   int r;
   
   while (nmemb > 1) {                                                                      
       r = rand_range(nmemb--);                                                              
       shuffle_swap(nmemb, r, array, size);
   }
}


And this is the code::blocks version:
Code: [Select]
// PC comparative shuffle test
// Fisher-yates vs "bubble unsort"
// 2010.03.13 raron

#include <iostream> // cout
#include <cstdlib>  // rand()
#include <ctime>    // time()
#include <string.h> // memcpy

using namespace std;

int anArray[] = {0,1,2,3,4,5,6,7,8};
int elements = sizeof(anArray) / sizeof(int);

int anotherArray[] = {0,1,2,3,4,5,6,7,8};
int arraySize = sizeof(anotherArray) / sizeof(int);


void printArray(int *array, int elem);
void bubbleUnsort(int *list, int elem);
int rand_range(int n);
void shuffle_swap(size_t index_a, size_t index_b, void *array, size_t size);
void shuffle(void *array, size_t nmemb, size_t size);


int main()
{
   int timesec = time(0);
   srand((unsigned)timesec); // random seed based on seconds since midnight


   /*
   cout << "RAND_MAX is " << RAND_MAX << "  with size: " << sizeof(RAND_MAX) << "bytes" << endl;
   cout << "---- Original array ----: " << endl;
   printArray(anotherArray, arraySize);
   cout << endl;
   */

   // bubble unsort
   bubbleUnsort(anotherArray, arraySize);
   cout << "---- Bubble unsorted ---- " << endl;
   printArray(anotherArray, arraySize);

   cout << endl;
   srand((unsigned)timesec);  // seeding with the same nr again

   // fisher-yates
   shuffle(anArray, elements, sizeof(int));
   cout << "---- Fisher-yates shuffled ---- " << endl;
   printArray(anArray, elements);


   return 0;
}



void printArray(int *array, int elem)
{
   for (int i=0; i<elem; i++) cout << "array[ " << i << " ] = " << array[i] << endl;
}


void bubbleUnsort(int *list, int elem)
{
 for (int a=elem-1; a>0; a--)
 {
   int r = rand()%(a+1);
   //int r = rand_range(a+1); // seemingly doesn't matter which one is used on the PC
   if (r != a)
   {
       int temp = list[a];
       list[a] = list[r];
       list[r] = temp;
   }
 }
}


// generate a value between 0 <= x < n, thus there are n possible outputs
int rand_range(int n)
{
   int r, ul;
   ul = RAND_MAX - RAND_MAX % n;
   while ((r = rand()) >= ul); // r = rand()%ul;
   //r = rand()%ul;
   return r % n;
}


void shuffle_swap(size_t index_a, size_t index_b, void *array, size_t size)
{
   char *x, *y, tmp[size];

   if (index_a == index_b) return;

   x = (char*)array + index_a * size; // had to cast array to a char pointer
   y = (char*)array + index_b * size;

   memcpy(tmp, x, size);
   memcpy(x, y, size);
   memcpy(y, tmp, size);
}

// shuffle an array using fisher-yates method, O(n)
void shuffle(void *array, size_t nmemb, size_t size)
{
   int r;

   while (nmemb > 1) {
       r = rand_range(nmemb--);
       shuffle_swap(nmemb, r, array, size);
   }
}


EDIT: Hey, 100th post :)

rFree

Quote
I'd say it shuffles well enough,


Ahh, I owe you an apology. I read something else, code I've seen a few times before....Where people loop and swap between random 2 indices, but now I see only 1 of your indices is random.

I wonder what difference there is. Fisher-yates is well documented and studied, if only that.

Just looking at the arduino-0017 source now, the random(min,max) function introduces modulus bias.

The code was C99, size_t is defined in stddef.h, among other places. You'll notice in C sizeof returns a value of type size_t, strlen, malloc, etc...Also return size_t.

SilverCG

thanks for your input..  I understand it alot better now. I used this code on my clock http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1268523298

and in the past 2 days i've seen some very random locations


rFree

I found your method in a C++ textbook and had closer look at wikipedia where I found the answer we're looking for.

http://en.wikipedia.org/wiki/Shuffling

Quote

Poorly implemented Knuth shuffles

Care needs to be taken in implementing the Knuth shuffle; even slight deviations from the correct algorithm will produce biased shuffles. For example, working your way through the pack swapping each card in turn with a random card from any part of the pack is an algorithm with nn different possible execution paths, yet there are only n! permutations, and a counting argument based on using the pigeonhole principle on assignments of execution paths to outcomes shows that this algorithm cannot produce an unbiased shuffle, unlike the true Knuth shuffle, which has n! execution paths which match up one-to-one with the possible permutations.

raron

#8
Mar 15, 2010, 12:50 pm Last Edit: Mar 15, 2010, 12:51 pm by raron Reason: 1
And here I was thinking the subject ended on the 2nd post above  :D

Quote
Ahh, I owe you an apology


No problem. It just made me dust off code::block a bit and try C++ again, and read up on some random stuff (literally, kind of), not a bad thing!

Anyway, I wasn't entirely clear on the text you (rFree) quoted, so I googled up this, which explained it very well: http://dev.netcetera.org/blog/2007/08/24/good-knuth-bad-knuth/

Quoting from the "good Knuth" perl script from that link (I don't know perl, but I can just about see whats going on there)
Quote
sub shuffle {
   my @array = @_;
   my $length = scalar(@array);

   for (my $i = $length - 1; $i > 0; $i--) {
       my $r = int(rand() * ($i + 1));
       @array[$i, $r] = @array[$r, $i];
   }

   return @array;
}


IMHO I had a pretty good Knuth then.

Still I can't help but feel the name is more impressive-sounding and almost longer than the algorithm itself :P

Go Up