Mixing cuda and cpp templates and lambdas

Time:10-04

Example code: https://github.com/Saitama10000/Mixing-cuda-and-cpp-templates-and-lambdas

  • I want to have a kernel in a .cu file that takes an extended __host__ __device__ lambda as a parameter and uses it to operate on data.
  • I am using a .cuh file to wrap the kernel execution in a wrapper function.
  • I include the .cuh file in main.cpp and use the wrapper function to do the computations.
  • I need to keep this .cuh/.cu way of organizing the code.
  • I'm using C++20.

The example code doesn't compile. I am supposed to add a template instantiation in the .cu file, but I don't know how. I've tried this:

typedef float(*op)(float);
template std::vector<float> f<op>(std::vector<float> const&, op);

but I still get this compilation error:

In file included from Mixing-cuda-and-cpp-templates-and-lambdas/main.cpp:6:
Mixing-cuda-and-cpp-templates-and-lambdas/kernel.cuh:6:20: error: ‘std::vector<float> f(const std::vector<float>&, FUNC) [with FUNC = main()::<lambda(float)>]’, declared using local type ‘main()::<lambda(float)>’, is used but never defined [-fpermissive]
    6 | std::vector<float> f(std::vector<float> const& a, FUNC func);
      |                    ^
Mixing-cuda-and-cpp-templates-and-lambdas/kernel.cuh:6:20: warning: ‘std::vector<float> f(const std::vector<float>&, FUNC) [with FUNC = main()::<lambda(float)>]’ used but never defined
make[2]: *** [CMakeFiles/main.dir/build.make:82: CMakeFiles/main.dir/main.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:95: CMakeFiles/main.dir/all] Error 2
make: *** [Makefile:103: all] Error 2

CodePudding user response:

There are two problems with your approach.

First, each lambda has its own unique type, even if the parameters and function body are the same.

For example, the following assertion fails:

#include <type_traits>

int main(){
    auto lambda1 = [](){};
    auto lambda2 = [](){};

    static_assert(std::is_same<decltype(lambda1), decltype(lambda2)>::value, "not same");
}

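That means, even if you somehow manage to explicitly instantiate your template with the type of a lambda, it won't be the type of the lambda that you actually pass to your function. Your instantiation with the function pointer type op does not help for the same reason: at the call site FUNC is deduced as the lambda's closure type, not as op. This problem can be solved by using functors instead of lambdas. Define a set of functors that may be used to call the function, and use them for the explicit template instantiations.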
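As a rough sketch of the functor approach, assuming two hypothetical functors Square and AddOne and an HD macro that are not part of the original example: the comments mark which parts would go into kernel.cuh and which into kernel.cu. A plain host loop stands in for the kernel launch so that the sketch stays short and also builds without a CUDA compiler.

```cpp
#include <cstddef>
#include <vector>

// Assumption: HD expands to the CUDA annotations under nvcc and to nothing
// elsewhere, so the same functors can be seen by a plain host compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// --- would live in kernel.cuh: the functor types and the declaration of f ---
struct Square {  // hypothetical functor
    HD float operator()(float x) const { return x * x; }
};

struct AddOne {  // hypothetical functor
    HD float operator()(float x) const { return x + 1.0f; }
};

template <typename FUNC>
std::vector<float> f(std::vector<float> const& a, FUNC func);

// --- would live in kernel.cu: the definition and the explicit instantiations ---
template <typename FUNC>
std::vector<float> f(std::vector<float> const& a, FUNC func) {
    std::vector<float> out(a.size());
    // Host loop standing in for the kernel launch of the original example.
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = func(a[i]);
    }
    return out;
}

// One explicit instantiation per functor type that callers are allowed to use.
template std::vector<float> f<Square>(std::vector<float> const&, Square);
template std::vector<float> f<AddOne>(std::vector<float> const&, AddOne);
```

A caller would then write f(data, Square{}) instead of passing a lambda. Because Square is a named type known to both files, the instantiation in kernel.cu is the one the caller links against.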

Second, you want to pass a __host__ __device__ function. This annotation is a CUDA C++ extension that a standard C++ compiler cannot compile. The file that creates the lambda (main.cpp in your example) therefore has to be compiled with a CUDA compiler instead, which in turn allows you to place your kernel and wrappers in the .cuh file.
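If the lambda has to stay, the header-only layout could look roughly like the sketch below. The names f_inline, doubled and HD are hypothetical, and a host loop again stands in for the kernel launch. The point is only that the full template definition is visible where the lambda is created, so the compiler can instantiate it for that lambda's own type without any explicit instantiation.

```cpp
#include <cstddef>
#include <vector>

// Assumption: HD expands to the CUDA annotations under nvcc and to nothing
// elsewhere, so this sketch also builds with a plain host compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// --- would live in kernel.cuh: declaration AND definition together ---
template <typename FUNC>
std::vector<float> f_inline(std::vector<float> const& a, FUNC func) {
    std::vector<float> out(a.size());
    // Host loop standing in for the kernel launch of the original example.
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = func(a[i]);
    }
    return out;
}

// --- would live in the caller, in a file compiled by the CUDA compiler ---
inline std::vector<float> doubled(std::vector<float> const& a) {
    // The template is instantiated here with this lambda's closure type.
    return f_inline(a, [] HD (float x) { return 2.0f * x; });
}
```

With nvcc, extended lambdas like this one additionally require the --extended-lambda flag.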
