I am trying to create a big array for a high dimension dim with
y_shift = np.fromfunction(lambda i,j: (i)>>j, ((2**dim), dim), dtype=np.uint32)
for example with dim=32. I have two questions:
1.- How can I make this faster?
2.- How can I avoid the message zsh: killed python3 for dim=32?
EDIT: Alternatively, uint8 could be used instead of uint32:
y_shift = np.fromfunction(lambda i,j: (1&(i)>>j), ((2**dim), dim), dtype=np.uint8)
CodePudding user response:
To answer your question:
You get the error zsh: killed python3 because you run out of memory.
If you want to run the code you initially proposed:
dim = 32
y_shift = np.fromfunction(lambda i,j: (i)>>j, ((2**dim), dim), dtype=np.uint32)
you would need more than 500 GB of memory, see here. The result alone has 2**32 * 32 = 2**37 elements of 4 bytes each, which is 512 GiB.
I would recommend thinking of alternatives that avoid holding the entire array in memory.
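One possible alternative, shown here only as a minimal sketch, is to produce the array in row blocks and process each block before moving to the next. The helper name shifted_blocks and the block_rows parameter are hypothetical and not part of NumPy; the sketch assumes the downstream computation can work on one block at a time.

```python
import numpy as np

def shifted_blocks(dim, block_rows=1 << 16):
    """Yield consecutive row blocks of the (2**dim, dim) array i >> j.

    Hypothetical helper: only one block is held in memory at a time.
    """
    shifts = np.arange(dim, dtype=np.uint64)            # column index j
    total = 1 << dim
    for start in range(0, total, block_rows):
        stop = min(start + block_rows, total)
        rows = np.arange(start, stop, dtype=np.uint64)  # row index i
        # Broadcasting (rows, 1) >> (dim,) gives a (rows, dim) block.
        yield (rows[:, None] >> shifts).astype(np.uint32)

# Small demonstration with dim=4 and blocks of 5 rows.
blocks = list(shifted_blocks(4, block_rows=5))
small = np.concatenate(blocks)
```

With dim=32 the generator still has to visit 2**37 values in total, so this only addresses memory use, not the overall amount of work.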
CodePudding user response:
fromfunction just does 2 things:
args = indices(shape, dtype=dtype)
return function(*args, **kwargs)
It makes the indices array:
In [247]: args = np.indices(((2**4),4))
In [248]: args.shape
Out[248]: (2, 16, 4)
and it passes that array to your function
In [249]: args[0]>>args[1]
Out[249]:
array([[ 0,  0,  0,  0],
       [ 1,  0,  0,  0],
       [ 2,  1,  0,  0],
       [ 3,  1,  0,  0],
       ...
       [14,  7,  3,  1],
       [15,  7,  3,  1]])
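For comparison, the same small result can be sketched with plain broadcasting, which skips building the (2, 16, 4) indices array. This is an illustration for dim=4 only; the result still has the full (2**dim, dim) shape, so it does not remove the size of the output itself.

```python
import numpy as np

dim = 4
rows = np.arange(2**dim)[:, None]   # shape (16, 1), plays the role of args[0]
cols = np.arange(dim)               # shape (4,), plays the role of args[1]
y_shift = rows >> cols              # broadcasts to shape (16, 4)
```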
With dim=32:
In [250]: ((2**32),32)
Out[250]: (4294967296, 32)
the resulting args array will have shape (2, 4294967296, 32). There's no way around that in terms of speed or memory use.
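To put numbers on that shape, here is a rough size estimate. The function name fromfunction_bytes is hypothetical, and the sketch assumes the result has the same dtype as the indices array, which is the case for uint32 >> uint32.

```python
import numpy as np

def fromfunction_bytes(dim, dtype=np.uint32):
    """Estimate bytes for the indices array and the result (hypothetical helper)."""
    itemsize = np.dtype(dtype).itemsize
    result_elems = (2**dim) * dim                # shape (2**dim, dim)
    args_bytes = 2 * result_elems * itemsize     # shape (2, 2**dim, dim)
    result_bytes = result_elems * itemsize
    return args_bytes, result_bytes

args_b, result_b = fromfunction_bytes(32)
print(args_b / 2**30, result_b / 2**30)          # sizes in GiB: 1024.0 512.0
```

So for dim=32 with uint32, the indices array alone takes about 1 TiB, on top of 512 GiB for the result.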