I want to copy a 2D numpy array (matrix) in a C function a get it back in python (and then do some calculation on it in C taking the speed advantage of C) . Therefore I need the C function matrix_copy
to return a 2D array (or, I guess, a pointer to it). I tried with the following code but I get the following output (where one can see the second dimension of the array is lost).
matrix_in.shape:
(300, 200)
matrix_out.shape:
(300,)
How could I change the code (I guess the matrix_copy.c
adding some pointer magic) so I could obtain an exact copy of the matrix_in
in matrix_out
?
Here is the main.py
script:
from ctypes import c_void_p, c_double, c_int, cdll
from numpy.ctypeslib import ndpointer
import numpy as np
import pdb
n = 300
m = 200
matrix_in = np.random.randn(n, m)
lib = cdll.LoadLibrary("matrix_copy.so")
matrix_copy = lib.matrix_copy
matrix_copy.restype = ndpointer(dtype=c_double,
shape=(n,))
matrix_out = matrix_copy(c_void_p(matrix_in.ctypes.data),
c_int(n),
c_int(m))
print("matrix_in.shape:")
print(matrix_in.shape)
print("matrix_out.shape:")
print(matrix_out.shape)
Here is the matrix_copy.c
script:
#include <stdlib.h>
#include <stdio.h>
double * matrix_copy(const double * matrix_in, int n, int m){
double * matrix_out = (double *)malloc(sizeof(double) * (n*m));
int index = 0;
for(int i=0; i< n; i ){
for(int j=0; j<m; j ){
matrix_out[i j] = matrix_in[i j];
//matrix_out[i][j] = matrix_in[i][j];
// some heavy computations not yet implemented
}
}
return matrix_out;
}
which I compile with the command
cc -fPIC -shared -o matrix_copy.so matrix_copy.c
And as a side note, why does the notation matrix_out[i][j] = matrix_in[i][j];
throws me an error on compilation?
matrix_copy.c:10:26: error: subscripted value is not an array, pointer, or vector
matrix_out[i][j] = matrix_in[i][j];
~~~~~~~~~~~~~^~
matrix_copy.c:10:44: error: subscripted value is not an array, pointer, or vector
matrix_out[i][j] = matrix_in[i][j];
CodePudding user response:
The second dimension is 'lost' because you explicitly omit it in the named shape
argument of ndpointer
. Change:
matrix_copy.restype = ndpointer(dtype=c_double, shape=(n,))
to
matrix_copy.restype = ndpointer(dtype=c_double, shape=(n,m), flags='C')
Where flags='C'
additionally notes that the returned data is stored contiguously in row major order.
With regards to matrix_out[i][j] = matrix_in[i][j];
throwing an error, consider that matrix_in
is of type const double *
. matrix_in[i]
would yield a value of type const double
- how do you further index this value (i.e., with [j]
)?
If you want to emulate accessing a 2-dimensional array via a single pointer, you must calculate offsets manually. matrix_out[i j]
is not sufficient, as you must account for the span of each sub array:
matrix_out[i * m j] = matrix_in[i * m j];
Note that in C, size_t
is the generally preferred type to use when dealing with memory sizes or array lengths.
matrix_copy.c
, simplified:
#include <stdlib.h>
double *matrix_copy(const double *matrix_in, size_t n, size_t m)
{
double *matrix_out = malloc(sizeof *matrix_out * n * m);
for (size_t i = 0; i < n; i )
for (size_t j = 0; j < m; j )
matrix_out[i * m j] = matrix_in[i * m j];
return matrix_out;
}
matrix.py
, with more explicit typing:
from ctypes import c_void_p, c_double, c_size_t, cdll, POINTER
from numpy.ctypeslib import ndpointer
import numpy as np
c_double_p = POINTER(c_double)
n = 300
m = 200
matrix_in = np.random.randn(n, m).astype(c_double)
lib = cdll.LoadLibrary("matrix_copy.so")
matrix_copy = lib.matrix_copy
matrix_copy.argtypes = c_double_p, c_size_t, c_size_t
matrix_copy.restype = ndpointer(
dtype=c_double,
shape=(n,m),
flags='C')
matrix_out = matrix_copy(
matrix_in.ctypes.data_as(c_double_p),
c_size_t(n),
c_size_t(m))
print("matrix_in.shape:", matrix_in.shape)
print("matrix_out.shape:", matrix_out.shape)
print("in == out", matrix_in == matrix_out)
CodePudding user response:
The incoming data is a probably single block of memory. You need to create the substructure.
In my C code I have to do the following on data (block) coming in via swig:
void divide2DDoubleArray(double * &block, double ** &subblockdividers, int noofsubblocks, int subblocksize){
/* The starting address of a block of doubles is used to generate
* pointers to subblocks.
*
* block: memory containing the original block of data
* subblockdividers: array of subblock addresses
* noofsubblocks: specify the number of subblocks produced
* subblocksize: specify the size of the subblocks produced
*
* Design by contract: application should make sure the memory
* in block is allocated and initialized properly.
*/
// Build 2D matrix for cols
subblockdividers=new double *[noofsubblocks];
subblockdividers[0]= block;
for (int i=1; i<noofsubblocks; i) {
subblockdividers[i] = &subblockdividers[i-1][subblocksize];
}
}
Now the pointer returned in subblockdividers can be used the way you would like to.
Don't forget to free subblockdividers when your done. (Note: adjustments might be needed to compile this as C code)