Home > Enterprise >  A probability theory problem in skiplist's C implement
A probability theory problem in skiplist's C implement

Time:10-31

These days I am looking at skiplist code in Algorithms in C, Parts 1-4, and when insert a new value into skiplist is more complex than I think. During insert, code should ensure that the new value insert into level i with the probability of 1/2^i, and this implement by below code:

static int Rand()
{
    int i, j = 0;
    uint32_t t = rand();
    for (i = 1, j = 2; i < lg_n_max; i   , j  = j)
        if (t > RANDMAX / j)
            break;
    if (i > lg_n)
        lg_n = i;
    return i;
}

I don't know how the Rand function ensure this, can you explain this for me, thank you.

CodePudding user response:

Presumably RANDMAX is intended to be RAND_MAX.

Neglecting rounding issues, half the return values of rand are above RAND_MAX / 2, and therefore half the time, the loop exits with i = 1.

If the loop continues, it updates i to 2 and j to 4. Then half the remaining return values (¾ of the total) are above RAND_MAX / 4, so, one-quarter of the time, the loop exits with i = 2.

Further iterations continue in the same manner, each iteration exiting with a portion of return values that is half the previous, until the lg_n_max limit is reached.

Thus, neglecting rounding issues and the final limit, the routine returns 1 half the time, 2 one-quarter of the time, 3 one-eighth the time, and so on.

lg_n is not defined in the routine. It appears to be a record of the greatest value returned by the routine so far.

CodePudding user response:

Thanks Eric Postpischil very much for his answer, I have understand how to ensure the probability. And I have a more understood answer: The t is a random value between 0 and RANDMAX, and we assume that the loop will run 2 times. In the first loop, value of t is smaller than RANDMAX/2^1, means that value of t fall into the range from 0 to RANDMAX/2 , the probability of this is 1/2. In the second loop, remember the fact that value of t is in the range of (0, RANDMAX/2^i), value of t is smaller that RANDMAX/2^2, means that value of t fall into the range from 0 to RANDMAX/2^2, the probability of this is also 1/2, because the range of (0, RANDMAX/2^2) is only 1/2 of the range in first loop, and the first loop show value of t is in the range of (0, RANDMAX/2^1). And notice that the probability of second loop is conditional probability,it‘s based on the probability of first loop, so the probability of second loop is 1/2*1/2=1/4.
In a short, every loop will bring a * 1/2 to last loop's probability.

  • Related