nvprof - Warning: No profile data collected

On attempting to use nvprof to profile my program, I receive the following output with no other information:

<program output>
======== Warning: No profile data collected.

The code follows this classic first CUDA program. nvprof has worked on my system before; however, I recently had to reinstall CUDA.

I have tried the suggestions in this post: adding cudaDeviceReset() and cudaProfilerStart()/cudaProfilerStop() to the program, and passing extra profiling flags (nvprof --unified-memory-profiling off). None of them helped.

This NVIDIA developer forum post describes a similar error, but the suggestions there point to using a compiler other than nvcc because of an OpenACC library, which I do not use.

System Specifications

System: Windows 11 x64 using WSL2
CPU: i7 8750H
GPU: GTX 1050 Ti
CUDA Version: 11.8

For completeness, I have included my program code, though I imagine the problem has more to do with my system:

Compiling:

nvcc add.cu -o add_cuda

Profiling:

nvprof ./add_cuda

add.cu:

#include <iostream>
#include <math.h>
#include <cuda_profiler_api.h>

// function to add the elements of two arrays
__global__
void add(int n, float *x, float *y)
{
  for (int i = 0; i < n; i++)
      y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<20; // 1M elements

  cudaProfilerStart();

  // Allocate Unified Memory -- accessible from CPU or GPU
  float *x, *y;
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Run kernel on 1M elements on the GPU
  add<<<1, 1>>>(N, x, y);

  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();

  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);

  // Stop profiling while the context still exists, then reset the device
  cudaProfilerStop();
  cudaDeviceReset();

  return 0;
}

How can I resolve this to get actual profiling information using nvprof?

Answer:

As per the documentation, CUDA on WSL currently has no profiling support. That is why nvprof collects no profile data on your WSL2 setup, even though the program itself runs.
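
If you want to confirm that a program really is running under WSL before suspecting the CUDA install, one option is to look at the kernel release string. The sketch below is a heuristic, not an official API: it assumes the WSL kernel release (read from /proc/sys/kernel/osrelease, the same string that uname -r prints) contains "microsoft", and the helper names looks_like_wsl and read_kernel_release are made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <string>

// Heuristic (an assumption, not an official check): WSL kernels usually
// report a release string containing "microsoft" in some letter case,
// for example "5.15.90.1-microsoft-standard-WSL2".
bool looks_like_wsl(const std::string& kernel_release) {
    std::string lower(kernel_release);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lower.find("microsoft") != std::string::npos;
}

// Reads the kernel release string on Linux; returns an empty string
// if the file cannot be opened (for example on a non-Linux system).
std::string read_kernel_release() {
    std::ifstream in("/proc/sys/kernel/osrelease");
    std::string line;
    std::getline(in, line);
    return line;
}
```

Calling looks_like_wsl(read_kernel_release()) from main and printing the result is enough to tell whether the missing profile data is likely explained by WSL rather than by the program or the CUDA reinstall.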