I tried to implement vector addition using the Unified Memory architecture. Here is my code:
#include<stdio.h>
#include<cuda.h>
#define n 10
__global__ void vec_add(float *c, float *a, float *b, int n){
int i;
//Get global thread ID
i = blockDim.x*blockIdx.x + threadIdx.x;
if(i<n){
c[i] = a[i] + b[i];
}
}
int main(int argc, char* argv[]){
int thread_count;
float *a, *b, *c;
thread_count = strtol(argv[1], NULL, 10);
cudaMallocManaged(&c, n*sizeof(float));
cudaMallocManaged(&a, n*sizeof(float));
cudaMallocManaged(&b, n*sizeof(float));
for(int i=0; i<n; i++){
a[i]=1.0;
b[i]=2.0;
}
//Launch Kernel
vec_add<<<1,thread_count>>>(c, a, b, n);
//Synchronize threads
cudaDeviceSynchronize();
for(int i=0; i<n; i++){
printf("%f + %f = %f\n", a[i], b[i], c[i]);
}
cudaFree(c);
cudaFree(a);
cudaFree(b);
return 0;
}
I get this error when I compile the code: expected a ")"
I could not find any problem with the parentheses. How can I fix this error? I would also like a brief description of how to structure a CUDA program that uses unified memory.
Thank you.
CodePudding user response:
Here is a brief description of the problem.
The problem you have is here:
#define n 10
__global__ void vec_add(float *c, float *a, float *b, int n){
You may not know how a C preprocessor macro (#define) works. It creates a textual substitution that the preprocessor performs before the compiler sees the code. So what you are telling the preprocessor to do is to change your kernel definition line like this:
__global__ void vec_add(float *c, float *a, float *b, int 10){
                                                      ^^^^^^
And of course that is not valid C syntax for a function definition. One possible way to fix this is to change the parameter name in the (kernel) function definition to something other than n, perhaps like this:
#define n 10
__global__ void vec_add(float *c, float *a, float *b, int nk){
int i;
//Get global thread ID
i = blockDim.x*blockIdx.x + threadIdx.x;
if(i<nk){
c[i] = a[i] + b[i];
}
}
Even though this happens to be a CUDA kernel definition, the problem would be exactly the same if you wrote an ordinary function definition and used n as one of the function parameters. This is a matter of understanding C, not anything specific or unique to CUDA.
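Another way to avoid the collision, sketched here with an ordinary host function rather than a kernel, is to rename the macro instead of the parameter. The upper-case name N is only an assumed convention, and vec_add_host and demo_last_element are hypothetical names used for illustration.

```cpp
// Upper-case macro name (an assumed convention) so that it cannot
// collide with a parameter or variable called n.
#define N 10

// Ordinary function with the same parameter list as the kernel.
// Because no macro named n exists, n here is a normal identifier.
void vec_add_host(float *c, const float *a, const float *b, int n) {
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: fills two N-element arrays with the same values
// as the question (1.0 and 2.0), adds them, and returns the last sum.
float demo_last_element() {
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) {
        a[i] = 1.0f;
        b[i] = 2.0f;
    }
    vec_add_host(c, a, b, N);
    return c[N - 1];
}
```

The same renaming should work for the kernel: with the macro called N, the kernel can keep int n as its parameter and the launch would pass N as the last argument.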