Improving np.fromfunction performance in terms

Time:10-20

I am trying to create a large array for a high dim with y_shift = np.fromfunction(lambda i,j: (i)>>j, ((2**dim), dim), dtype=np.uint32), for example with dim=32. I have two questions:

1.- How can I make this faster?

2.- How can I avoid the message zsh: killed python3 for dim=32?

EDIT: Alternatively, you can consider using uint8 instead of uint32:

y_shift = np.fromfunction(lambda i,j: (1&(i)>>j), ((2**dim), dim), dtype=np.uint8)

CodePudding user response:

To answer your question:

You get the error zsh: killed python3 because you run out of memory.

If you want to run the code you initially proposed:

dim = 32
y_shift = np.fromfunction(lambda i,j: (i)>>j, ((2**dim), dim), dtype=np.uint32)

You would need more than 500 GB of memory for the result alone: 2**32 rows x 32 columns x 4 bytes per uint32 is 512 GiB.
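The figure above can be checked with a few lines of arithmetic. This is only a sketch; shift_table_nbytes is a hypothetical helper name, and it counts the output array only, not the index arrays that np.fromfunction builds on the way.

```python
import numpy as np

def shift_table_nbytes(dim, dtype=np.uint32):
    """Bytes needed to hold a (2**dim, dim) array of the given dtype."""
    return (2**dim) * dim * np.dtype(dtype).itemsize

# Compare the two dtypes mentioned in the question for dim=32.
for dtype in (np.uint32, np.uint8):
    nbytes = shift_table_nbytes(32, dtype)
    print(f"{np.dtype(dtype).name}: {nbytes / 2**30:.0f} GiB")
```

Even the uint8 variant is far beyond the RAM of a typical machine, so switching dtype alone is unlikely to stop the process from being killed.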

I would recommend looking for alternatives that avoid holding the entire array in memory.
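One possible alternative, assuming the rows can be consumed block by block instead of all at once, is to generate the table in chunks. The sketch below is not from the original answer; shift_chunks and chunk_rows are hypothetical names chosen for illustration.

```python
import numpy as np

def shift_chunks(dim, chunk_rows=1 << 16, dtype=np.uint32):
    """Yield (start, block) pairs where block[r, j] == (start + r) >> j."""
    # Compute in uint64 so row indices up to 2**dim - 1 never overflow.
    shifts = np.arange(dim, dtype=np.uint64)
    total = 1 << dim
    for start in range(0, total, chunk_rows):
        stop = min(start + chunk_rows, total)
        rows = np.arange(start, stop, dtype=np.uint64)[:, None]
        # Broadcasting (n, 1) >> (dim,) gives an (n, dim) block.
        yield start, (rows >> shifts).astype(dtype)

# Check against np.fromfunction for a dim that fits in memory.
dim = 4
full = np.fromfunction(lambda i, j: i >> j, (2**dim, dim), dtype=np.uint32)
rebuilt = np.concatenate([block for _, block in shift_chunks(dim, chunk_rows=5)])
assert np.array_equal(full, rebuilt)
```

Only one block of chunk_rows x dim values is alive at a time, so peak memory is set by chunk_rows rather than by 2**dim. The total work for dim=32 is still 2**32 rows, so this addresses the memory problem, not the running time.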

CodePudding user response:

fromfunction just does 2 things:

args = indices(shape, dtype=dtype)
return function(*args, **kwargs)

It makes the indices array:

In [247]: args = np.indices(((2**4),4))
In [248]: args.shape
Out[248]: (2, 16, 4)

and it passes that array to your function:

In [249]: args[0]>>args[1]
Out[249]: 
array([[ 0,  0,  0,  0],
       [ 1,  0,  0,  0],
       [ 2,  1,  0,  0],
       [ 3,  1,  0,  0],
       ...
       [14,  7,  3,  1],
       [15,  7,  3,  1]])

With dim=32:

In [250]: ((2**32),32)
Out[250]: (4294967296, 32)

the resulting args array will have shape (2, 4294967296, 32). With fromfunction, there's no way around that in terms of speed or memory use.
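For sizes that do fit in memory, the same values can be produced without going through fromfunction, by letting broadcasting combine a column of row numbers with a row of shift amounts. This is a sketch under that assumption, not part of the original answer; the result array is exactly as large as before, only the intermediate (2, 2**dim, dim) index array is skipped.

```python
import numpy as np

dim = 4
rows = np.arange(2**dim, dtype=np.uint32)[:, None]  # shape (2**dim, 1)
cols = np.arange(dim, dtype=np.uint32)              # shape (dim,)

# Broadcasting (2**dim, 1) >> (dim,) yields the full (2**dim, dim) table.
y_shift = rows >> cols

# Bit-extraction variant from the question's edit: shift in uint32 first,
# then cast, because row numbers above 255 do not fit in uint8.
y_bits = (y_shift & 1).astype(np.uint8)

# Same values as the fromfunction call for this small dim.
reference = np.fromfunction(lambda i, j: i >> j, (2**dim, dim), dtype=np.uint32)
assert np.array_equal(y_shift, reference)
```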
