timer for measuring function execution time not working

Time:10-22

I'm writing a CUDA library and I need to compare the performance of the CPU and GPU versions of a function. So I created a simple class called Timer to measure the time required to execute first the GPU function and then the CPU version.

class Timer
{
public:
    Timer()
    {
        _StartTimepoint = std::chrono::steady_clock::now();
    }
 
    ~Timer() {}

    void Stop()
    {

        _stopped = true;
        using namespace std::chrono;
        auto endTimepoint = steady_clock::now();

        auto start = time_point_cast<milliseconds>(_StartTimepoint).time_since_epoch().count();
        auto end = time_point_cast<milliseconds>(endTimepoint).time_since_epoch().count();

        auto _ms = end - start;



        _secs   = _ms   / 1000;
        _ms    -= _secs * 1000;
        _mins   = _secs / 60;
        _secs  -= _mins * 60;
        _hour   = _mins / 60;
        _mins  -= _hour * 60;


    }


    double GetTime(){
        if(_stopped == true)
            return _ms;
        else{
            Stop();
            return _ms;
        }
    }

private:
    std::chrono::time_point< std::chrono::steady_clock> _StartTimepoint;
    double _secs,_ms,_mins,_hour;
    bool _stopped = false;
};

Since I need to check the performance for different values of a parameter m, I run both functions inside a for loop, as you can see:

for (size_t m = MIN_M; m < MAX_M; m += M_STEP){
        m_array[m_cont] = m;
        //simulate
        double time_gpu,time_cpu;

        Timer timer_gpu;
        run_device(prcr_args,seeds,&m_array[m_cont]);
        timer_gpu.Stop();
        time_gpu = timer_gpu.GetTime();

        Timer timer_cpu;
        simulate_host(prcr_args,seeds,&m_array[m_cont]);
        timer_cpu.Stop();
        time_cpu = timer_cpu.GetTime();
        
        double g = time_cpu/time_gpu;
        
        ofs << m  //stream to print the results
            << "," << time_cpu
            << "," << time_gpu 
            << "," << g << "\n";
        m_cont++;
    }

The problem is that the results I obtain are incredibly small and clearly wrong: they are all equal (the execution time should increase with m), even though my code takes a couple of minutes to run.

m,cpu_time,gpu_time,g
10,9.88131e-324,6.90979e-310,1.43004e-14
15,9.88131e-324,6.90979e-310,1.43004e-14
....
90,9.88131e-324,6.90979e-310,1.43004e-14
95,9.88131e-324,6.90979e-310,1.43004e-14
100,9.88131e-324,6.90979e-310,1.43004e-14

My guess is that the CPU doesn't execute the cycle sequentially and therefore starts and stops the clock immediately.

CodePudding user response:

The problems are in this sequence:

    auto _ms = end - start;



    _secs   = _ms   / 1000;
    _ms    -= _secs * 1000;
    _mins   = _secs / 60;
    _secs  -= _mins * 60;
    _hour   = _mins / 60;
    _mins  -= _hour * 60;

  1. As indicated in the comments, the auto there creates a new local variable. Probably not what you want.

  2. It seems that the intent of the arithmetic is to leave only the milliseconds portion in _ms. Since that is the quantity you are returning, if that were true, your GetTime() method would only ever return values from 0 to 999.

  3. Since your member variables are type double, rather than integral types:

      double _secs,_ms,_mins,_hour;
    

    this arithmetic is not doing what you intend:

      _secs   = _ms   / 1000;
      _ms    -= _secs * 1000;
    

    That is not integer division, so _secs keeps its fractional part and the subtraction leaves roughly zero in _ms.
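To make point 3 concrete, here is a small sketch that runs the same two statements once on double and once on an integral type. The helper names split_double and split_integral and the struct SplitResult are made up for illustration and are not part of the original Timer.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of the original Timer.
struct SplitResult {
    double secs;
    double ms;
};

// Same arithmetic as in Stop(), performed on doubles: the division keeps the
// fractional part, so the subtraction cancels everything out of ms.
inline SplitResult split_double(double total_ms)
{
    double ms = total_ms;
    double secs = ms / 1000;   // 2500 -> 2.5, not 2
    ms -= secs * 1000;         // 2500 - 2.5 * 1000 -> 0
    return {secs, ms};
}

// Same arithmetic on an integral type: the division truncates, so ms keeps
// the remainder below one second.
inline SplitResult split_integral(std::int64_t total_ms)
{
    std::int64_t ms = total_ms;
    std::int64_t secs = ms / 1000;  // 2500 -> 2
    ms -= secs * 1000;              // 2500 - 2000 -> 500
    return {static_cast<double>(secs), static_cast<double>(ms)};
}
```

With an input of 2500 ms, the double version gives 2.5 seconds and 0 ms left over, while the integral version gives 2 seconds and 500 ms.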

Taking these ideas into account, a roadmap to fix your Timer class could look something like this:

class Timer
{
public:
    Timer()
    {
        _StartTimepoint = std::chrono::steady_clock::now();
    }

    ~Timer() {}

    void Stop()
    {

        _stopped = true;
        using namespace std::chrono;
        auto endTimepoint = steady_clock::now();

        auto start = time_point_cast<milliseconds>(_StartTimepoint).time_since_epoch().count();
        auto end = time_point_cast<milliseconds>(endTimepoint).time_since_epoch().count();


        _ms = end - start;

        _secs   = _ms   / 1000;
        _ms    -= _secs * 1000;
        _mins   = _secs / 60;
        _secs  -= _mins * 60;
        _hour   = _mins / 60;
        _mins  -= _hour * 60;

    }


    double GetTime(){
        if(!_stopped) Stop();
        return _ms + (((_hour * 60) + _mins) * 60 + _secs) * 1000;
    }


private:
    std::chrono::time_point< std::chrono::steady_clock> _StartTimepoint;
    size_t _secs,_ms,_mins,_hour;
    bool _stopped = false;
};

I haven't tested this thoroughly. None of this has anything to do with CUDA, of course.

CodePudding user response:

You declare a local variable _ms with the same name as your member variable. Inside Stop() the local variable takes precedence over the member variable, so you never actually store a value in the member. You can show this by initializing the member to some value in the class definition: that same value will pop out at the end. The 'auto' has nothing to do with it. Declaring the local explicitly as a double has the same result, since it is still a local that takes precedence over the member.

Because you left the member uninitialized, it holds whatever happened to be on the stack where the Timer was created. This is why each instantiation of the timer has the same value within the loop.

One very small fix:
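The experiment described above can be sketched like this. The class names ShadowedTimer and FixedTimer and the sentinel value -1 are hypothetical; Stop() takes the elapsed time as an argument only to keep the demo deterministic.

```cpp
// Hypothetical demo classes; the sentinel -1.0 is chosen for illustration.
class ShadowedTimer {
public:
    void Stop(long long elapsed_ms)
    {
        auto _ms = elapsed_ms;  // new local variable; the member _ms is untouched
        (void)_ms;              // silence the unused-variable warning
    }
    double GetTime() const { return _ms; }

private:
    double _ms = -1.0;  // sentinel so the missing assignment becomes visible
};

class FixedTimer {
public:
    void Stop(long long elapsed_ms)
    {
        _ms = static_cast<double>(elapsed_ms);  // assigns to the member
    }
    double GetTime() const { return _ms; }

private:
    double _ms = -1.0;
};
```

After calling Stop(1234) on both, ShadowedTimer still reports the sentinel, while FixedTimer reports 1234. Compiling with a shadowing warning enabled (for example GCC's -Wshadow) should also flag the local declaration.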

auto _msB = end - start;
_secs = _msB / 1000;
_ms = _msB - _secs * 1000;

You should also initialize your member variables so that debugging is easier, even if they are going to get overwritten during use.

I would also encourage letting the chrono library do more of the work for you. A very brief snippet would be:

auto duration = endTimepoint - _StartTimepoint;    

auto ms = duration_cast<milliseconds>(duration);

_ms = ms.count();

You can do a duration_cast for each unit you need, which removes the chance of a typo in the conversion factors and avoids losing precision unnecessarily.
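As a rough sketch of that idea, the timer could store the elapsed duration itself and only convert when asked. The class name ChronoTimer and the alias clock_type are made up here, and returning fractional milliseconds as a double is an assumption about what the caller wants.

```cpp
#include <chrono>

// Sketch of a timer that stores a duration instead of raw counts.
// The names ChronoTimer and clock_type are hypothetical.
class ChronoTimer {
public:
    using clock_type = std::chrono::steady_clock;

    ChronoTimer() : _start(clock_type::now()) {}

    // Records the elapsed time once; later calls leave it unchanged.
    void Stop()
    {
        if (!_stopped) {
            _elapsed = clock_type::now() - _start;
            _stopped = true;
        }
    }

    // Total elapsed time in milliseconds, including the fractional part.
    double GetTime()
    {
        Stop();
        return std::chrono::duration<double, std::milli>(_elapsed).count();
    }

private:
    clock_type::time_point _start;
    clock_type::duration _elapsed{};  // value-initialized to zero
    bool _stopped = false;
};
```

Used in the loop from the question, this would be a drop-in replacement: construct it, run the function, then call GetTime(). Converting to duration<double, std::milli> keeps the sub-millisecond part that a cast to milliseconds would truncate.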

I'm not sure why you want differential time numbers as part of the timer, but if you must do that arithmetic you can do it by using the std::ratio on the count values that come out of duration casts.
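If the hours/minutes/seconds breakdown is really needed, one possible way to get it is to let duration_cast peel off each unit in turn; the predefined duration types already carry the right std::ratio periods. The struct TimeParts and the function split_duration below are hypothetical names for illustration.

```cpp
#include <chrono>

// Hypothetical helper types for illustration.
struct TimeParts {
    long long hours;
    long long mins;
    long long secs;
    long long ms;
};

// Splits a millisecond duration into hours, minutes, seconds and the
// remaining milliseconds, without hand-written conversion factors.
inline TimeParts split_duration(std::chrono::milliseconds total)
{
    using namespace std::chrono;
    const auto h = duration_cast<hours>(total);    // truncates to whole hours
    total -= h;
    const auto m = duration_cast<minutes>(total);  // whole minutes of the rest
    total -= m;
    const auto s = duration_cast<seconds>(total);  // whole seconds of the rest
    total -= s;
    return {h.count(), m.count(), s.count(), total.count()};
}
```

For example, 3723004 ms splits into 1 hour, 2 minutes, 3 seconds and 4 ms.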
