While trying to run ndimage.convolve on a big numpy.memmap, the following exception occurs:
Exception has occurred: _ArrayMemoryError Unable to allocate 56.0 GiB for an array with shape (3710, 1056, 3838) and data type float32
It seems that convolve creates a regular numpy array for the result, which won't fit into memory.
Could you please tell me if there is a workaround?
Thank you for any input.
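For reference, a minimal sketch of the kind of call that fails; the file name and kernel below are placeholders, only the shape and dtype are taken from the error message:

import numpy as np
from scipy import ndimage

shape = (3710, 1056, 3838)  # shape/dtype from the error message
data = np.memmap("input.dat", dtype=np.float32, mode="r", shape=shape)
kernel = np.ones((3, 3, 3), dtype=np.float32) / kernel_size  # placeholder kernel

# Fails: convolve allocates a regular float32 array (~56 GiB) for the result.
result = ndimage.convolve(data, kernel)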
CodePudding user response:
Scipy and Numpy often create a new array to store the returned output. This temporary array lives in RAM even when the input array is stored on a storage device and accessed with memmap. Many functions (including ndimage.convolve) have an output parameter to control this. However, it does not prevent internal in-RAM temporary arrays from being created (though such arrays are infrequent and usually not huge). If the output parameter is not available, or a big internal temporary is created anyway, there is not much you can do besides writing your own implementation that does not allocate huge in-RAM arrays. C modules, Cython and Numba are well suited for this. Note that implementing efficient convolutions is far from simple when the kernel is not trivial, and there are many research papers addressing this problem.
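As a minimal sketch of the output-parameter approach: a numpy.memmap is an ndarray subclass, so it should be accepted as the output argument, letting the result be written to disk instead of being allocated in RAM (file names, shapes and the kernel below are placeholders; scipy may still use some internal temporaries):

import numpy as np
from scipy import ndimage

shape = (3710, 1056, 3838)
data = np.memmap("input.dat", dtype=np.float32, mode="r", shape=shape)

# Pre-allocate the result on disk instead of in RAM.
result = np.memmap("output.dat", dtype=np.float32, mode="w+", shape=shape)

kernel = np.ones((3, 3, 3), dtype=np.float32) / 27.0

# Write directly into the memmap via the `output` parameter,
# so no ~56 GiB result array is allocated in RAM.
ndimage.convolve(data, kernel, output=result)

result.flush()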
CodePudding user response:
Instead of rolling your own implementation, another approach that might work would be to use dask's wrapped ndfilters with a dask array created from the memmap. That way, you can delegate the chunking and out-of-memory computation to Dask.
I haven't actually done this myself, but I see no reason why it wouldn't work!
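A sketch of what this could look like with dask_image.ndfilters (file names, chunk sizes and the kernel are placeholders; I have not benchmarked this):

import numpy as np
import dask.array as da
from dask_image import ndfilters

shape = (3710, 1056, 3838)
data = np.memmap("input.dat", dtype=np.float32, mode="r", shape=shape)

# Wrap the memmap in a chunked dask array; chunks are read lazily.
darr = da.from_array(data, chunks=(256, 256, 256))

kernel = np.ones((3, 3, 3), dtype=np.float32) / 27.0

# Lazy convolution; nothing is computed yet.
filtered = ndfilters.convolve(darr, kernel)

# Stream the result chunk by chunk into an on-disk memmap.
out = np.memmap("output.dat", dtype=np.float32, mode="w+", shape=shape)
da.store(filtered, out)

This keeps only a few chunks in RAM at a time, at the cost of some overlap recomputation at chunk borders.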