How to Pass Vector of int into CUDA global function


I'm writing my first CUDA program and am encountering a lot of issues, as my main programming language is not C++.

In my console app I have a vector of int that holds a constant list of numbers. My code should create new vectors and check matches with the original constant vector.

I don't know how to pass / copy pointers of a vector into the GPU device. I get this error message after trying to convert my code from C# into C++ and work with the kernel:

"Error: calling a host function("std::vector<int, ::std::allocator<int>>::vector()") from a global function("MagicSeedCUDA::bigCUDAJob") is not allowed"

This is part of my code:

std::vector<int> selectedList;
FillA1(A1, "0152793281263155465283127699107744880041");
selectedList = A1;
bigCUDAJob<<<640, 640, 640>>>(i, j, selectedList);

__global__ void bigCUDAJob(int i, int j, std::vector<int> selectedList)
{
    std::vector<int> tempList;
    // here comes code that adds numbers to tempList
    // code to find matches between tempList and the
    // parameter selectedList
}

How can I modify my code so that I don't get compiler errors? I can work with an array of int as well.

CodePudding user response:

I don't know how to pass / copy pointers of a vector into the GPU device

First, remind yourself of how to pass memory that's not in an std::vector to a CUDA kernel. (Re)read the vectorAdd example program, part of NVIDIA's CUDA samples.

cudaError_t status;
std::vector<int> selectedList;

// ... etc. ...

int *selectedListOnDevice = NULL;
std::size_t selectedListSizeInBytes = sizeof(int) * selectedList.size();
status = cudaMalloc((void **)&selectedListOnDevice, selectedListSizeInBytes);
if (status != cudaSuccess) { /* handle error */ }
status = cudaMemcpy(selectedListOnDevice, selectedList.data(), selectedListSizeInBytes, cudaMemcpyHostToDevice);
if (status != cudaSuccess) { /* handle error */ }

// ... etc. ...

// eventually:
cudaFree(selectedListOnDevice);

That's using the official CUDA runtime API. If, however, you use my CUDA API wrappers (which you absolutely don't have to), the above becomes:

auto selectedListOnDevice = cuda::memory::make_unique<int[]>(selectedList.size());
cuda::memory::copy(selectedListOnDevice.get(), selectedList.data(), sizeof(int) * selectedList.size());

and you don't need to handle the errors yourself: on error, an exception will be thrown.

Another alternative is to use NVIDIA's thrust library, which offers an std::vector-like class called a "device vector". This allows you to write:

thrust::device_vector<int> selectedListOnDevice = selectedList;

and it should "just work".
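
Note that even a thrust::device_vector cannot be passed to a kernel directly; you still extract a raw device pointer for the kernel parameter. A minimal sketch (the kernel name and its body are illustrative, not from the question):

```cuda
#include <cstddef>
#include <vector>
#include <thrust/device_vector.h>

// Hypothetical kernel: it receives a raw device pointer plus a length,
// never a vector type of any kind.
__global__ void useList(const int* data, std::size_t len)
{
    // ... read data[0] through data[len - 1] here ...
}

void example(const std::vector<int>& selectedList)
{
    // The constructor copies the host data onto the device.
    thrust::device_vector<int> selectedListOnDevice = selectedList;

    // Extract the underlying device pointer for the kernel launch:
    const int* ptr = thrust::raw_pointer_cast(selectedListOnDevice.data());
    useList<<<640, 640>>>(ptr, selectedListOnDevice.size());
}
```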

I get this error message:

Error calling a host function("std::vector<int, ::std::allocator >
::vector()") from a global function("MagicSeedCUDA::bigCUDAJob") is
not allowed 

That issue is covered in Using std::vector in CUDA device code, as @paleonix mentioned. In a nutshell: you just cannot have std::vector appear in your __device__ or __global__ functions, at all, no matter how you try to write it.
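
Concretely, the fix for your kernel is to replace the std::vector parameter with a raw device pointer and an element count (a sketch; the fixed-size bound for tempList is an assumption you would choose to fit your data):

```cuda
#include <cstddef>

// selectedList must point to *device* memory, e.g. allocated with
// cudaMalloc and filled with cudaMemcpy as shown earlier; len is the
// number of elements it holds.
__global__ void bigCUDAJob(int i, int j, const int* selectedList, std::size_t len)
{
    // No std::vector on the device: use a fixed-size local array
    // (or another device allocation passed in as a parameter).
    int tempList[64];
    // ... add numbers to tempList, then compare each entry against
    // selectedList[0] .. selectedList[len - 1] ...
}
```

The launch then passes the pointer obtained from cudaMalloc, e.g. `bigCUDAJob<<<640, 640, 640>>>(i, j, selectedListOnDevice, selectedList.size());`.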

I'm writing my first CUDA program and encounter a lot of issues, as my main programming language is not C++.

Then, regardless of your specific issue with an std::vector, you should take some time to study C++ programming. Alternatively, you could brush up on C programming, as you can write CUDA kernels which are C'ish rather than C++'ish; but C++'ish features are actually quite useful when writing kernels, not just on the host side.
