Home > Back-end >  why xorshift random number generator uses the same "amount" of SBR/SBL in all examples?
why xorshift random number generator uses the same "amount" of SBR/SBL in all examples?

Time:03-18

I was going through a book that explained the xorshift algorithm (I know, basic stuff). Then, while searching a little bit more on the Internet, I found that all the basic examples seem to shift the bits right/left the same "amount" (13, 17, 5).

For instance:

struct xorshift32_state {
  uint32_t a;
};

uint32_t xorshiftTransform(struct xorshift32_state *state) {
    uint32_t x = state->a;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    
    return state->a = x;
}

Is there a particular reason why in all examples they use 13, 17 and 5? Yep, I found other examples too, but this one keeps repeating, and I don't know if the numbers choice is trivial or not.

CodePudding user response:

This is actually a lot more subtle and interesting than you might suspect!

The xorshift random number generator has an interesting theoretical backstory. The use of shifts and XORs corresponds to performing a matrix-vector product where both the matrix and the vector are made of 0s and 1s. The specific matrices in question are derived based on the choices of shift sizes and the directions of those shifts.

In order for the RNG to perform well (specifically, to not repeat any outputs until all possible values have been generated), the matrix derived by the shifts must be invertible. Most choices of shifts will not give an invertible matrix, and the author of xorshift ran a computer search to find all possible shift sizes that work. In the paper detailing the xorshift family of RNGs, the author detailed the specific choice of shifts you mentioned and says the following:

It uses one of my favorite choices, [a, b, c] = [13, 17, 5], and will pass almost all tests of randomness, except the binary rank test in Diehard [2]. (A long period xorshift RNG necessarily uses a nonsingular matrix transformation, so every successive n vectors must be linearly independent, while truly random binary vectors will be linearly independent only some 30% of the time.) Although I have only tested a few of them, any one of the 648 choices above is likely to provide a very fast, simple, high quality RNG.

So in a sense, these numbers satisfy the theoretical necessary conditions needed for the math to work out to make this a good RNG, and the author tested it and singled them out in the original paper, which is why I’m guessing they’re so widely used. But perhaps there’s an even better choice using other numbers from the paper that folks just haven’t gotten around to using yet?

  • Related