When using streams in CUDA, is it necessary to perform any synchronization between a memory allocation and the use of that memory by a stream? (Assume cudaMallocAsync is not available, which it is not for me.)
Example:
cudaStream_t stream;
cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking);
... Other code
int *a;
gpuErrchk(cudaMalloc((void **)&a, sizeof(int)));
foo<<<1, 1, 0, stream>>>(a);
gpuErrchk(cudaStreamSynchronize(stream));
gpuErrchk(cudaFree(a));
Is there a chance that the memory allocation has not completed by the time foo is launched in this situation?
Would I have to do a device synchronization after the allocation to be sure?
CodePudding user response:
There is no chance that the memory allocation will not be completed. You don't need an explicit device synchronization after a memory allocation.
When the cudaMalloc call returns, the memory is allocated and usable.
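For illustration, here is a minimal, self-contained sketch of the pattern from the question, with no synchronization between cudaMalloc and the kernel launch. The gpuErrchk macro and the body of foo are assumptions made for this example (the question shows neither), and runFooOnStream is a hypothetical helper name.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Assumed stand-in for the gpuErrchk macro used in the question.
#define gpuErrchk(call)                                              \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                    cudaGetErrorString(err_), __FILE__, __LINE__);   \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Assumed kernel body; the question does not show what foo does.
__global__ void foo(int *a) { *a = 42; }

// Hypothetical helper: allocate, launch on a non-blocking stream, read back.
int runFooOnStream() {
    cudaStream_t stream;
    gpuErrchk(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));

    int *a = nullptr;
    // Once cudaMalloc returns successfully, the allocation is usable.
    gpuErrchk(cudaMalloc((void **)&a, sizeof(int)));

    // No device or stream synchronization is needed before this launch.
    foo<<<1, 1, 0, stream>>>(a);
    gpuErrchk(cudaGetLastError());  // reports launch errors, if any

    int result = 0;
    // Copy is queued on the same stream, so it runs after foo.
    gpuErrchk(cudaMemcpyAsync(&result, a, sizeof(int),
                              cudaMemcpyDeviceToHost, stream));
    // Wait for the kernel and the copy before reading result or freeing.
    gpuErrchk(cudaStreamSynchronize(stream));

    gpuErrchk(cudaFree(a));
    gpuErrchk(cudaStreamDestroy(stream));
    return result;
}

int main() {
    printf("%d\n", runFooOnStream());
    return 0;
}
```

The only synchronization in this sketch is the cudaStreamSynchronize call before the host reads the result and frees the buffer, which matches the order used in the question.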