Secure Computer Shuffling

I just re-read an ancient article referenced on reddit.com entitled How We Learned to Cheat at Online Poker: A Study in Computer Security. The article points up that computer shuffling is hard to get secure in this situation, and illustrates three seperate bad security flaws in the algorithm used by a popular computer Poker site in 1999. Unfortunately, it then goes on to suggest a "secure" shuffle that actually is quite insecure…

I've always suspected that the article deliberately left a hole in the algorithm, so that the authors could have another round of cheating at online Poker, but of course I can't prove that—in any case, it is neither here nor there. I won't bore you with the details of the article; you can read it yourself.

What I will do here is briefly outline the algorithms behind the shuffle() function provided by Keithp's and my Nickle programming language, and illustrate why I believe it to be secure. Review is of course welcome.

First, let us assume that we have some source of truly random numbers. What we want to do is produce a permutation of a sequence, chosen in a uniform random way from the set of all possible sequences. The traditional shuffle algorithm, as recorded by Knuth, is captured in the Nickle implementation of shuffle().

  public void shuffle(&poly[*] a) {
    int na = dim(a);
    for (int i = 0; i < na - 1; i++) {
      int j = randint(na - i) + i;
      poly tmp = a[i];
      a[i] = a[j];
      a[j] = tmp;
    }
  }

In essence, the algorithm selects a random element of the array a with index greater than or equal to a given element and exchanges (swaps) the elements at those indices. By doing this randomized exchange with each element in turn, one generates a random permutation.

The proof of this is not terribly hard, actually.

Premise:
Imagine that you have a random permutation of a set of elements (i.e., a uniformly-chosen permutation of the sequence obtained by listing the elements of the set in some canonical order). If you then add element n+1 at a position chosen uniformly from the set of all possible positions, the result is a random permutation of the set including the new element.
Proof Sketch:
Consider the operation of the algorithm when i=0. This randomly swaps the element at index i with a uniformly chosen element of the set underlying the sequence, including i itself. Thus, we have now a randomly chosen length-1 sequence with some element of the array in it at position 0, and we have the set of remaining elements to be inserted starting at position 1. By induction on i, we can see that successive steps generate longer and longer random sequences of elements starting at 0, until eventually all elements are included.

OK, so we've proven that, given an unlimited source of truly random numbers, we can produce a truly random shuffle. The problem is that sources of truly random numbers are expensive.

We'll use a cryptographically secure pseudo-random number generator in the form of the RC4 stream cipher. We'll seed the generator by keying the stream cipher from UNIX's /dev/random, which at least in the Linux implementation provides a limited source of truly random numbers. We'll choose the seed/key with enough bits to prevent a brute force attack on the chosen key, but few enough bits that /dev/random can reasonably provide them.

The resulting shuffle will be provably impossible to reverse engineer without "breaking" RC4. RC4 isn't the world's most secure cipher; if your application is extremely sensitive, you probably want to use a block cipher such as AES and generate a pseudo-random stream by using it in counter mode, but I have never had or even really seen an application that needed that level of security.

Here's the Nickle code for extracting random bytes out of the random number generator. (For added security, you may want/need to open /dev/random rather than /dev/urandom.)


  public void dev_srandom(int nbits) {
    twixt(file f = open("/dev/urandom", "r");
          close(f)) {
      int seed = 0;
      while (nbits >= 8 ) {
        seed = (seed << 8 ) | getb(f);
        nbits -= 8;
      }
      if (nbits > 0)
        seed = (seed << nbits) |
               (getb(f) & ((1 << nbits) - 1));
      srandom(seed);
    }
  }

Here's the Nickle implementation of the RC4 stream cipher. Note how simple and fast it is. (The keying function has been omitted for brevity.)

  public int streambyte() {
    p1 = (p1 + 1) & 0xff;
    p2 = (p2 + state[p1]) & 0xff;
    int tmp = state[p1];
    state[p1] = state[p2];
    state[p2] = tmp;
    int n = (state[p1] + state[p2]) & 0xff;
    return state[n];
  }

Finally, here's the actual PRNG. Note that because of the modulus, it will be non-uniform by about 1 part in 2 billion. This is probably sufficient for most purposes: otherwise, increase the 32 in the source to a large number.

  public int randint(int m) {
    int n = 32 + bit_width(m);
    int q = n // 8;
    int r = n % 8;
    int acc = ARC4::streambyte() >> (8 - r);
    while (q-- > 0)
      acc += ARC4::streambyte() << (8 * q + r);
    return acc % m;
  }

There you have it. A working, secure shuffle for your security-critical situations. Hope you find it useful in your work. Hope, as well, that you don't believe everything you read on the Interweb.