Memory Crash on Intel CPU when running OpenCL code for particular input

Time:03-04

I'm running my OpenCL code on both CPU and GPU. On the CPU, I get a memory crash for one particular input vector; the same code with the same vector runs fine on the GPU. After the kernel executes, any subsequent OpenCL call, or even a printf, fails with an "Access violation" error. While debugging, I found that changing the order of buffer creation (clCreateBuffer) makes the problem go away. When I created the buffers in the order buffer1, buffer2, buffer3, the exception was thrown. When I changed the order to buffer1, buffer3, buffer2, the program ran fine on the CPU with no access violation. Likewise, when I kept the order buffer1, buffer2, buffer3 and additionally created buffer4 (a dummy buffer that the code never uses), the program ran successfully.

Can anyone please explain why this error happens? It is strange that the same code works on the GPU, that other input vectors run correctly on the CPU, and that the error disappears for this particular vector when the buffers are created in a different order.

Thanks.

I found the same issue on Stack Overflow: "Unhandled exception in clEnqueueReadBuffer on Intel CPU".

CodePudding user response:

The issue has been resolved. It was caused by out-of-bounds buffer accesses: the CPU caught them and threw an exception, while the GPU ignored them. For that particular vector, the logic had to be changed slightly so that it no longer accesses the buffers out of bounds. Thanks for all the replies.
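
For anyone landing here with the same symptom: the original kernel was not posted, so the following is only a minimal sketch in plain C of the kind of guard that prevents this class of bug. The function name `gather_checked`, the index array `idx`, and the fallback value `0.0f` are all hypothetical. In an OpenCL kernel the same check would typically use the buffer length passed in as an extra kernel argument.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel body; the real kernel is not shown
 * in this thread. Copies src[idx[i]] into dst[i] for i in [0, n).
 * Any index outside [0, src_len) is rejected instead of being dereferenced.
 * Returns the number of rejected indices. */
size_t gather_checked(const float *src, size_t src_len,
                      const int *idx, size_t n, float *dst)
{
    size_t rejected = 0;
    for (size_t i = 0; i < n; ++i) {
        int k = idx[i];
        if (k < 0 || (size_t)k >= src_len) {
            dst[i] = 0.0f;   /* assumed fallback value for bad indices */
            ++rejected;
            continue;
        }
        dst[i] = src[k];     /* safe: k is within the source buffer */
    }
    return rejected;
}
```

A non-zero return value for a given input vector would point at exactly the kind of input-dependent overrun described above.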

CodePudding user response:

I could imagine that a memory address is computed from the input data and then accessed. For unusual input data, that address could be out of bounds, so you read or write a value somewhere in no man's land. The GPU may tolerate the access violation without raising an error. I would check the code very carefully for a programming error.
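
One way to hunt for such an error is to surround a buffer with guard bytes and check them after the computation has run. The sketch below does this for a plain host allocation in C; all names (`guarded_buf`, `guarded_alloc`, and so on), the 64-byte padding, and the `0xA5` sentinel are assumptions chosen for illustration, not part of any OpenCL API. The same idea might be applied to OpenCL buffers by creating them slightly larger than needed and reading back the padding after the kernel finishes. A stray write that lands in padding would also be consistent with the crash disappearing when the buffers are reordered or a dummy buffer is added, though that is a guess.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 64      /* assumed padding on each side of the payload */
#define GUARD_VALUE 0xA5    /* assumed sentinel byte */

typedef struct {
    unsigned char *raw;     /* start of the allocation, including guards */
    size_t size;            /* usable payload size in bytes */
} guarded_buf;

/* Allocates size payload bytes with a guard region before and after.
 * Returns 0 on success, -1 if the allocation fails. */
int guarded_alloc(guarded_buf *b, size_t size)
{
    b->raw = (unsigned char *)malloc(size + 2 * GUARD_BYTES);
    if (b->raw == NULL) {
        return -1;
    }
    b->size = size;
    /* Fill everything, guards included, with the sentinel value. */
    memset(b->raw, GUARD_VALUE, size + 2 * GUARD_BYTES);
    return 0;
}

/* Pointer to the usable payload, just past the leading guard. */
unsigned char *guarded_data(guarded_buf *b)
{
    return b->raw + GUARD_BYTES;
}

/* Returns 1 if both guard regions still hold the sentinel, 0 otherwise. */
int guarded_intact(const guarded_buf *b)
{
    const unsigned char *tail = b->raw + GUARD_BYTES + b->size;
    for (size_t i = 0; i < GUARD_BYTES; ++i) {
        if (b->raw[i] != GUARD_VALUE || tail[i] != GUARD_VALUE) {
            return 0;
        }
    }
    return 1;
}

void guarded_free(guarded_buf *b)
{
    free(b->raw);
    b->raw = NULL;
    b->size = 0;
}
```

If `guarded_intact` returns 0 after processing the problematic input vector but 1 for the others, the code is writing past the end (or before the start) of that buffer. Note that this only catches writes that land inside the padding, and it does not detect out-of-bounds reads.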
