I recently read somewhere that values with less significant bits tend to be less random than with more significant bits, could someone explain this better? If you can pass me some paper that talks about this and about random numbers I would be very grateful, as long as it doesn't have a lot of complex math
CodePudding user response:
This is a commentary on the often poor quality of random number generator used for the C library rand() function. The numbers it returns is supposed to be perfectly random--that is, if you asked for a million such numbers, you would not me able to find a pattern to predict the next one.
But if rand() uses a poor algorithm, it's possible to find patterns in the numbers it returns, and predict future numbers, or perhaps some bits of future numbers. In particular, the low-order bits of the returned numbers (that is, the bits representing 2^0, 2^1, 2^2, ...) are more predictable than the high-order bits (those representing 2^31, 2^30, ...). It might be possible to say, for example, that we don't know what the next number will be exactly, but there's a 60% chance its low bit is 1, rather than 50%.
The cure is simply to not use the built-in rand() function, but use a trusted library or algorithm to generate random numbers--and never "roll your own" (i.e., write your own RNG).
CodePudding user response:
There are two things at play here.
First is the algorithm. Whether particular bits of a pseudorandom number generator's (PRNG) output are "weaker" than others, depends on the algorithm. For example, many PRNGs that rely on linear recurrences (such as many linear congruential generators) will produce outputs whose bits have shorter periods the less significant they are. This tends to be at its worst when the so-called "modulus" is a power of 2 (or more generally, a power of a prime number). The first paper cited below reviews the theory of linear congruential generators, and the second paper shows a particular phenomenon of such generators.
Steele and Vigna, Computationally easy, spectrally good multipliers for congruential pseudorandom number generators, 2020/2021.
Durst, Using linear congruential generators for parallel random number generation, 1989 Winter Simulation Conference.
Second is the nature of
rand
(andsrand
) in the C language, which is true regardless of the algorithm used by a particularrand
implementation. Perhaps the most serious is the fact thatrand
doesn't guarantee a particular distribution the pseudorandom numbers must follow. For more information see: Why is the use of rand() considered bad?