Cython initialize matrix with zeros

Time:06-19

Description

I simply want to create a rows x cols matrix that is filled with 0s. Since I always work with numpy, I thought using np.zeros, as described in the docs, would be the easiest:

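import numpy as np
cimport numpy as np

# np.int was removed in NumPy 1.24; use an explicit fixed-width type instead
DTYPE = np.int64
ctypedef np.int64_t DTYPE_t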

def f1():
    cdef:
        int dim = 40000
        int i, j
        np.ndarray[DTYPE_t, ndim=2] mat = np.zeros([40000, 40000], dtype=DTYPE)
        
    for i in range(dim):
        for j in range(dim):
            mat[i, j] = 1

Then I compared this with a version that uses a plain C array:

def f2():
    cdef:
        int dim = 40000
        int[40000][40000] mat
        int i, j
    
    for i in range(dim):
        for j in range(dim):
            mat[i][j] = 1

The numpy version took 3 secs on my PC, whereas the C version took only 2.4e-5 secs. However, when I return the array from f2(), I noticed it is not zero-filled (here it can't be, of course, since the loop writes to it; but even when I don't fill it, the returned array is not all zeros). How can this be done in Cython? I know in regular C it would be something like: int arr[n][m] = {};.

Question

How can the C array be filled with 0s? (I would go with numpy instead if there is something obviously wrong in my code.)

Answer

You do not want to be writing code like this:

  1. int[40000][40000] mat creates a 6.4 GB array on the stack (assuming 4-byte ints). Typical maximum stack sizes are on the order of a few MB. I have no idea how this isn't crashing your PC.

  2. "However when I return the array from f2() [...]"

    The array you have allocated is completely local to the function. From a C point of view you cannot return it since it ceases to exist after the function has finished. I think Cython may convert it to a (nested) Python list for you. This requires a slow copy element-by-element and is not what you want.
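To put numbers on the first point, here is a quick back-of-the-envelope check in plain Python. It is only a sketch: the 4-byte int matches the assumption stated above, and the 8 MiB stack limit is an assumed figure (a common Linux default); the real limit depends on your OS and settings.

```python
ROWS = COLS = 40_000
INT_BYTES = 4  # assumption: 4-byte C int, as stated in the answer

# Total size of int[40000][40000] in bytes.
array_bytes = ROWS * COLS * INT_BYTES

# Assumption: 8 MiB stack limit; check your own system for the real value.
ASSUMED_STACK_BYTES = 8 * 1024 * 1024

ratio = array_bytes / ASSUMED_STACK_BYTES

print(f"array size: {array_bytes / 1e9:.1f} GB")  # array size: 6.4 GB
print(f"roughly {ratio:.0f}x the assumed stack limit")
```

Whatever the exact stack limit is on your machine, the array is several hundred times too large to live there.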

For what you're doing here you're much better off just using NumPy.


Cython doesn't support a good equivalent of the C arr = {}, so if you do want to initialize sensibly sized, small C arrays you need to use one of:

  1. loops,
  2. memset (which you can cimport from libc.string),
  3. a typed memoryview of the array, followed by memview[:,:] = 0.
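To illustrate what the memset option does to a small C array without needing a Cython build step, here is a sketch that uses Python's standard ctypes module as a stand-in. That substitution is an assumption of this example, not part of the original answer, and the 4 x 5 dimensions and the fill value 7 are hypothetical. In Cython itself the corresponding call would be memset(mat, 0, sizeof(mat)) after from libc.string cimport memset.

```python
import ctypes

ROWS, COLS = 4, 5  # deliberately small, hypothetical dimensions

# Same memory layout as the C declaration `int mat[4][5]`.
Matrix = (ctypes.c_int * COLS) * ROWS
mat = Matrix()

# ctypes zero-fills new arrays, so write junk first to mimic
# an uninitialized C array on the stack.
for i in range(ROWS):
    for j in range(COLS):
        mat[i][j] = 7

# Zero the whole block in one call; the count is in bytes, not elements.
ctypes.memset(mat, 0, ctypes.sizeof(mat))

flat = [mat[i][j] for i in range(ROWS) for j in range(COLS)]
print(all(value == 0 for value in flat))  # True
```

Note that memset works on bytes, so it is only a safe way to set every element to 0 (all-zero bytes), not to an arbitrary value such as 1.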

"The numpy version took 3 secs on my pc whereas the c version only took 2.4e-5 secs."

This kind of difference usually suggests that the C compiler has optimized some code out (by detecting that the result is unused). It is unlikely to be a genuine speed-up.
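A rough sanity check supports this. The sketch below takes the timing quoted in the question at face value and works out the write rate it would imply; the 4-byte int is an assumption.

```python
DIM = 40_000
REPORTED_SECONDS = 2.4e-5  # timing quoted in the question for f2()
INT_BYTES = 4              # assumption: 4-byte C int

# Number of element writes the double loop is supposed to perform.
writes = DIM * DIM

# Rates implied if all of those writes really happened in the reported time.
writes_per_second = writes / REPORTED_SECONDS
bytes_per_second = writes_per_second * INT_BYTES

print(f"{writes:,} writes")
print(f"implied rate: {writes_per_second:.2e} writes/s")
print(f"implied bandwidth: {bytes_per_second / 1e12:.0f} TB/s")
```

That works out to hundreds of terabytes per second, while desktop memory bandwidth is typically on the order of tens of gigabytes per second. The loop almost certainly never ran as written.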
