I am using cv::cuda::remap instead of cv::remap to speed up video undistortion with CUDA. Both versions of the program run without errors, and the camera matrix, the distortion coefficients, and the map1 and map2 produced by cv::initUndistortRectifyMap() are identical in both. However, the undistorted image from the CPU version (cv::remap) is correct, as follows:

but the CUDA version (cv::cuda::remap) produces an incorrect image:

The code snippet for the CPU version is as follows:

cv::cuda::GpuMat gpuMat(m_height, m_width, CV_8UC4, (void *)dpFrame);
cv::Mat mat;
gpuMat.download(mat);
cv::remap(mat, mat, m_map1, m_map2, cv::INTER_LINEAR);
gpuMat.upload(mat);

GPU version:

cv::cuda::GpuMat gpuMat(m_height, m_width, CV_8UC4, (void *)dpFrame);
cv::cuda::remap(gpuMat, gpuMat, m_gpuMap1, m_gpuMap2, cv::INTER_LINEAR);

Here, dpFrame is of type CUdeviceptr; m_map1 and m_map2 are computed by cv::initUndistortRectifyMap; and m_gpuMap1 and m_gpuMap2 are cv::cuda::GpuMat objects obtained by uploading m_map1 and m_map2 to the GPU.

cv::remap and cv::cuda::remap implement the same algorithm, so why are their results different? I tried both OpenCV 4.5.5 and 4.6.0, and the problem occurs with both.

I'm stuck here and don't know how to go forward. Any suggestions are really appreciated. Thanks.

Answer:

OK, I tried your code and got similar results. After a few tests, I ended up with a correct result.

My code simply flips an image with remap. Here is the result of your code on my input:

Code

cv::cuda::GpuMat gpuMat(m_height, m_width, CV_8UC4, (void *)dpFrame);
cv::cuda::remap(gpuMat, gpuMat, m_gpuMap1, m_gpuMap2, cv::INTER_LINEAR);

Input

Output

Then I just declared a second cv::cuda::GpuMat and used it as the output of the remap function. Here is the code:

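    // Wrap the existing device buffer (no copy).
    cv::cuda::GpuMat gpuMat(m_height, m_width, CV_8UC4, (void *)dpFrame);
    // Separate output buffer, so remap does not read and write the same memory.
    cv::cuda::GpuMat gpuMat2;
    cv::cuda::remap(gpuMat, gpuMat2, m_gpuMap1, m_gpuMap2, cv::INTER_LINEAR);

    // Download the result to the host for inspection.
    cv::Mat mat;
    gpuMat2.download(mat);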

New Output

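I don't have a definitive answer as to why. A likely explanation is that remap cannot safely run in place: each output pixel is read from an arbitrary location in the source, and when the input and output share the same buffer, that location may already have been overwritten. On the GPU, it seems safer to use separate GpuMat objects for the input and output of remap.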
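
To illustrate why sharing one buffer between input and output can corrupt a remap, here is a minimal sketch in plain C++ with no OpenCV involved. The names remapRow, flipMap, flipOutOfPlace and flipInPlace are hypothetical helpers made up for this example, and the toy works on a single row of integers with nearest-neighbour lookup only. It runs sequentially, so the corruption is deterministic here; on the GPU many threads run at once, so the exact damage would presumably differ, but the hazard is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy 1-D "remap": dst[i] = src[map[i]] (nearest neighbour, no interpolation).
// dst is allowed to be the same vector as src; that is the case being demonstrated.
void remapRow(const std::vector<int>& src, std::vector<int>& dst,
              const std::vector<std::size_t>& map) {
    assert(dst.size() == map.size());
    for (std::size_t i = 0; i < map.size(); ++i) {
        assert(map[i] < src.size());
        dst[i] = src[map[i]];
    }
}

// Map that mirrors a row of length n: output i reads input n - 1 - i.
std::vector<std::size_t> flipMap(std::size_t n) {
    std::vector<std::size_t> map(n);
    for (std::size_t i = 0; i < n; ++i) {
        map[i] = n - 1 - i;
    }
    return map;
}

// Flip using a separate output buffer: every read sees the original data.
std::vector<int> flipOutOfPlace(const std::vector<int>& row) {
    std::vector<int> out(row.size());
    remapRow(row, out, flipMap(row.size()));
    return out;
}

// Flip with the same buffer as input and output: later reads see
// values that earlier writes have already replaced.
std::vector<int> flipInPlace(std::vector<int> row) {
    remapRow(row, row, flipMap(row.size()));
    return row;
}
```

With the row {1, 2, 3, 4}, flipOutOfPlace returns {4, 3, 2, 1}, while flipInPlace returns {4, 3, 3, 4}: the second half is rebuilt from the already overwritten first half, so the original left side is lost. In the OpenCV code above, the equivalent precaution is to remap into gpuMat2; if the result has to end up back in the dpFrame buffer, copying it back afterwards (for example with gpuMat2.copyTo(gpuMat)) should work, though I have not measured the cost of that extra copy.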