How to pass huge 2D numpy array to c function

Time:11-25

Good day,

I am new to C and ctypes in Python.

I am trying to call a C function from Python code.

I keep receiving the following error: "access violation reading 0x0..." when reading the 2D arrays ("St" of shape 10,000 x 521 and "dZ" of shape 10,000 x 520) in the following C function:

#include <math.h>
#include <stdio.h>

double change (double * dZ, double * St, size_t lenTaus, size_t lenSims)
{
    size_t i, j;
    double a, b;
    for (i = 0; i < lenSims; i++) /*Iterate through simulations.*/
    {
        for (j = 0; j < (lenTaus - 1); j++) /*Iterate through taus.*/
        {
            a = St[lenTaus * i + j];
            b = dZ[lenTaus * i + j];
        }
    }
    return 0.0;
}

The variables "lenSims" and "lenTaus" are 10,000 and 521 respectively.

The Python code to call the C function is:

import ctypes
import numpy as np
cCode = ctypes.CDLL("cCode_e.so") ### Load the C code as a shared library.
cCode.change.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.POINTER(ctypes.c_double), ctypes.c_size_t, ctypes.c_size_t] ### Declare the argument types the C function expects.
cCode.change.restype = ctypes.c_double ### Declare the return type of the C function.
St_Python = np.zeros([10000,521])
dZ_Python = np.random.randn(10000,520)
St = St_Python.ctypes.data_as(ctypes.POINTER(ctypes.c_double)) ### Convert a numpy array into a pointer to an array of doubles.
dZ = dZ_Python.ctypes.data_as(ctypes.POINTER(ctypes.c_double)) ### Convert a numpy array into a pointer to an array of doubles.
lenTaus = St_Python.shape[1] ### Find the number of columns in the original array.
lenSims = St_Python.shape[0] ### Find the number of rows in the original array.
out = cCode.change(dZ, St, lenTaus, lenSims) ### Call the C function

If I understand the problem correctly, I am handling memory incorrectly when passing the whole arrays as pointers to the C function, but I do not know how to pass them correctly.

May I ask for your help?

Best regards,

Evgenii

CodePudding user response:

It looks like the problem is caused by a buffer overflow: the C function reads past the end of dZ.

Assuming that the arrays are defined as:

St_Python = np.zeros([10000,521])
dZ_Python = np.random.randn(10000,520)

In the C function, the parameters lenTaus and lenSims are 521 and 10000 respectively. As a result, the final offset at which dZ is accessed is:

lenTaus * i + j = lenTaus * (lenSims - 1) + (lenTaus - 1 - 1)
                = 521 * 9999 + 521 - 1 - 1
                = 5209998

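dZ holds only 10000 * 520 = 5200000 elements, so its last valid offset is 5199999. The final offset is larger than that, so the loop reads past the end of the buffer and invokes undefined behavior. The same index is valid for St, because St really does have 521 columns.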
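
The arithmetic above can be double-checked in plain Python. This is only a small sketch, not part of the original code; the names last_offset and dz_size are illustrative, and the shapes are the ones given in the question.

```python
# Shapes taken from the question.
len_sims, len_taus = 10000, 521      # St is 10000 x 521
dz_rows, dz_cols = 10000, 520        # dZ is 10000 x 520

# Largest flat index the C loop reads: i = lenSims - 1, j = lenTaus - 2.
last_offset = len_taus * (len_sims - 1) + (len_taus - 2)

# Number of doubles actually stored in dZ.
dz_size = dz_rows * dz_cols

print(last_offset)             # 5209998
print(dz_size)                 # 5200000
print(last_offset >= dz_size)  # True: the read goes past the end of dZ
```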

One solution is to compute the offset for dZ from its own row length, which is lenTaus - 1:

            b = dZ[(lenTaus - 1) * i + j];
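
To see why this index stays in bounds, the corrected loop can be emulated in Python on a tiny example. This is a sketch under assumed sizes (3 simulations and 4 taus, so dZ would be 3 x 3); the list visited is a hypothetical helper that records each flat offset the loop would read.

```python
# Tiny stand-in sizes (assumed for illustration): St is 3 x 4, dZ is 3 x 3.
len_sims, len_taus = 3, 4
dz_size = len_sims * (len_taus - 1)

visited = []
for i in range(len_sims):
    for j in range(len_taus - 1):
        # The row stride of dZ is its own column count, lenTaus - 1.
        visited.append((len_taus - 1) * i + j)

# Every element of dZ is read exactly once and no offset is out of range.
print(visited == list(range(dz_size)))  # True
```

With the original stride of lenTaus, the same loop would skip one slot per row and end up past the end of dZ.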