Why is it appropriate to use `std::uniform_real_distribution`?


I'm trying to write Metropolis Monte Carlo simulation code. Since the simulation will run for a very long time, I'd like to think seriously about the performance of generating random numbers in [0, 1].

So I decided to compare the performance of two methods with the following code:

#include <cfloat>
#include <chrono>
#include <iostream>
#include <random>

int main()
{
    constexpr auto Ntry = 5000000;

    std::mt19937 mt(123);
    std::uniform_real_distribution<double> dist(0.0, std::nextafter(1.0, DBL_MAX));
    double test1, test2;

    // method 1
    auto start1 = std::chrono::system_clock::now();
    for (int i=0; i<Ntry; i++) {
        test1 = dist(mt);
    }
    auto end1 = std::chrono::system_clock::now();
    auto elapsed1 = std::chrono::duration_cast<std::chrono::microseconds>(end1-start1).count();
    std::cout << elapsed1 << std::endl;

    // method 2
    auto start2 = std::chrono::system_clock::now();
    for (int i=0; i<Ntry; i++) {
        test2 = 1.0*mt() / mt.max();
    }
    auto end2 = std::chrono::system_clock::now();
    auto elapsed2 = std::chrono::duration_cast<std::chrono::microseconds>(end2-start2).count();
    std::cout << elapsed2 << std::endl;
}

Then the result is

  • 295489 micro sec for method 1
  • 79884 micro sec for method 2

I understand that there are many posts recommending std::uniform_real_distribution. But performance-wise, it is tempting to use the latter method, as this result shows.

Would you tell me what the point of using std::uniform_real_distribution is? What is the disadvantage of using 1.0*mt() / mt.max()? And for my current purpose, is it acceptable to use 1.0*mt() / mt.max() instead?

Edit:

I compiled this code with g++-11 test.cpp. When I compile with the -O3 flag, the result is qualitatively the same (method 1 is approx. 1.8 times slower). I would like to discuss the advantage of the widely used method. I care about the performance trend, but a specific performance comparison is out of my scope.

CodePudding user response:

Your testing methodology is flawed. You don't use the result that you produce, so a smart optimiser may simply skip producing a result.

Would you tell me what is the point of using std::uniform_real_distribution?

  • The clue is in the name. It produces a uniform distribution.
  • Furthermore, it allows you to specify the minimum and maximum between which you want the distribution to lie.

What is the disadvantage of using 1.0*mt() / mt.max()?

  • You cannot specify a minimum and a maximum.
  • It produces a less uniform distribution.
  • It produces less randomness: each double carries only the 32 bits of a single mt() call.

is it acceptable to use 1.0*mt() / mt.max() instead?

In some use cases, it could be acceptable. In some other cases, it isn't acceptable. In the rest, it won't matter.

CodePudding user response:

You use the standard random library because it is extremely difficult to do numerical calculations correctly and you don't want the burden of proving and maintaining your own random library.

Case in point, your random distribution is wrong. std::mt19937 produces 32-bit integers, yet you're expecting a double, which (usually) has a 53-bit significand. There are values in the range [0, 1] that you will never obtain from 1.0*mt() / mt.max().
