How can I return a multidimensional array from a shared library in C to python as an np.array?-CodePudding

I am currently trying to figure a way to return a mutidimensional array (of doubles) from a shared library in C to python and make it an np.array. My current approach looks like this:

shared library ("utils.c")

#include <stdio.h>

void somefunction(double *inputMatrix, int d1_inputMatrix, int d2_inputMatrix, int h_inputMatrix, int w_inputMatrix, double *kernel, int d1_kernel, int d2_kernel, int h_kernel, int w_kernel, int stride) {
    
    double result[d1_kernel][d2_kernel][d2_inputMatrix][h_inputMatrix-h_kernel 1][w_inputMatrix-w_kernel 1];
    // ---some operation--

    return result;

}

Now, I compile utils.c with: cc -fPIC -shared -o utils.so utils.c

python ("somefile.py")

from ctypes import *
import numpy as np

so_file = "/home/benni/Coding/5.PK/Code/utils.so"
utils = CDLL(so_file)

INT = c_int64
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C")

utils.convolve.argtypes = [ND_POINTER_4, INT, INT, INT, INT, ND_POINTER_4, INT, INT, INT, INT, INT]
a = ... #some np.array  with 4 dimensions
b = ... #some np.array  with 4 dimensions

result = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3], b, b.shape[0], b.shape[1], b.shape[2], b.shape[3], 1)

Now, how do I convert the result of utils.somefunction() to an np.array? I know that, in order to solve my problem, I have to specify utils.convolve.restype. But what do I have to put for restype if I want the return type to be an np.array?

CodePudding user response：

First of all, a scoped C array allocated on the stack (like in somefunction) must never be returned by a function. The space of the stack will be reused by other function like the one of CPython for example. The returned array must be allocated on the heap instead.

Moreover, writing a function working with Numpy arrays using ctypes is pretty cumbersome. As you found out, you need to pass the full shape in parameter. But the thing is you also need to pass the strides for each dimension and for each input arrays in parameter of the function since they may not be contiguous in memory (for example np.transpose change this). That being said, we can assume that the input array is contiguous for sake of performance and sanity. This can be enforced with np.ascontiguousarray. The pointer of the views a and b can be extracted using numpy.ctypeslib.as_ctypes, but hopefully ctype can do that automatically. Furthermore, the returned array is currently a C pointer an not a Numpy array. Thus, you need to create a Numpy array with the right shape and strides from it numpy.ctypeslib.as_array. Because the resulting shape is not known form the caller, you need to retrieve it from the callee function using several integer pointers (one per dimension). In the end, this results in a pretty-big ugly highly-bug-prone code (which will often silently crash if anything goes wrong not to mention the possible memory leaks if you do not pay attention to that). You can use Cython to do most of this work for you.

Assuming you do not want to use Cython or you cannot, here is an example code with ctypes:

import ctypes
import numpy as np

# Example of input
a = np.empty((16, 16, 12, 12), dtype=np.float64)
b = np.empty((8, 8, 4, 4), dtype=np.float64)

# Better than CDLL regarding the Numpy documentation.
# Here the DLL/SO file is found in:
# Windows:  ".\utils.dll"
# Linux:    "./libutils.so"
utils = np.ctypeslib.load_library('utils', '.')

INT = ctypes.c_int64
PINT = ctypes.POINTER(ctypes.c_int64)
PDOUBLE = ctypes.POINTER(ctypes.c_double)
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C_CONTIGUOUS")

utils.somefunction.argtypes = [
    ND_POINTER_4, INT, INT, INT, INT, 
    ND_POINTER_4, INT, INT, INT, INT, 
    PINT, PINT, PINT, PINT, PINT
]
utils.somefunction.restype = PDOUBLE

d1_out, d2_out, d3_out, d4_out, d5_out = INT(), INT(), INT(), INT(), INT()
p_d1_out = ctypes.pointer(d1_out)
p_d2_out = ctypes.pointer(d2_out)
p_d3_out = ctypes.pointer(d3_out)
p_d4_out = ctypes.pointer(d4_out)
p_d5_out = ctypes.pointer(d5_out)
out = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3],
                         b, b.shape[0], b.shape[1], b.shape[2], b.shape[3],
                         p_d1_out, p_d2_out, p_d3_out, p_d4_out, p_d5_out)
d1_out = d1_out.value
d2_out = d2_out.value
d3_out = d3_out.value
d4_out = d4_out.value
d5_out = d5_out.value
result = np.ctypeslib.as_array(out, shape=(d1_out, d2_out, d3_out, d4_out, d5_out))

# Some operations

# WARNING: 
# You should free the memory of the allocated buffer 
# with `free(out)` when you are done with `result` 
# since Numpy does not free it for you: it just creates 
# a view and does not take the ownership.
# Note that the right libc must be used, otherwise the
# call to free will cause an undefined behaviour
# (eg. crash, error message, nothing)

Here is the C code (be aware of the fixed-length types):

/* utils.c */

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

double* somefunction(
        double* inputMatrix, int64_t d1_inputMatrix, int64_t d2_inputMatrix, int64_t h_inputMatrix, int64_t w_inputMatrix, 
        double* kernel, int64_t d1_kernel, int64_t d2_kernel, int64_t h_kernel, int64_t w_kernel,
        int64_t* d1_out, int64_t* d2_out, int64_t* d3_out, int64_t* d4_out, int64_t* d5_out
    )
{
    *d1_out = d1_kernel;
    *d2_out = d2_kernel;
    *d3_out = d2_inputMatrix;
    *d4_out = h_inputMatrix - h_kernel   1;
    *d5_out = w_inputMatrix - w_kernel   1;

    const size_t size = *d1_out * *d2_out * *d3_out * *d4_out * *d5_out;
    double* result = malloc(size * sizeof(double));

    if(result == NULL)
    {
        fprintf(stderr, "Unable to allocate an array of %d bytes", size * sizeof(double));
        return NULL;
    }

    /* Some operation: fill `result` */

    return result;
}

Here is the command used to build the library with GCC:

# On Windows
gcc utils.c -shared -o utils.dll

# On Linux
gcc utils.c -fPIC -shared -o libutils.so

For more information, please read this: