Home > Blockchain >  Operate on Numpy array from C extension without memory copy
Operate on Numpy array from C extension without memory copy

Time:12-22

I'm new to C extensions for NumPy and I'm wondering if the following workflow is possible.

  1. Pre-allocate an array in NumPy
  2. Pass this array to a C extension
  3. Modify array data in-place in C
  4. Use the updated array in Python with standard NumPy functions

In particular, I'd like to do this while ensuring I'm making zero new copies of the data at any step.

I'm familiar with boilerplate on the C side such as PyModuleDef, PyMethodDef, and the PyObject* arguments but a lot of examples I've seen involve coercion to C arrays which to my understanding involves copying and/or casting. I'm also aware of Cython though I don't know if it does similar coercions or copies under the hood. I'm specifically interested in simple indexed get- and set- operations on ndarray with numeric (eg. int32) values.

Could someone provide a minimal working example of creating a NumPy array, modifying it in-place in a C extension, and using the results in Python subsequently?

CodePudding user response:

Cython doesn't create new copies of numpy arrays unless you specifically request it to do so using numpy functions, so it is as efficient as it can be when dealing with numpy arrays, see Working with NumPy

choosing between writing raw C module and using cython depends on the purpose of the module written. if you are writing a module that will only be used by python to do a very small specific task with numpy arrays as fast as possible, then by all means do use cython, as it will automate registering the module correctly as well as handle the memory and prevent common mistakes that people do when writing C code (like memory management problems), as well as automate the compiler includes and allow an overall easier access to complicated functionality (like using numpy iterators).

however if your module is going to be used in other languages and has to be run independently from python and has to be used with python without any overhead, and implements some complex C data structures and requires a lot of C functionality then by all means create your own C extension (or even a dll), and you can pass pointers to numpy arrays from python (using numpy.ctypeslib.as_ctypes_type), or pass the python object itself and return it (but you must make a .pyd/so instead of dll), or even create numpy array on C side and have it managed by python (but you will have to understand the numpy C API).

  • Related