What is NumPy's memory management when overwriting an array with a partial view of itself?

Time:12-10

In the second assignment of the code below:

original = np.ones(1000000000, dtype=np.int8).reshape(10, 10000, 10000)  # 1GB in size
original = original[0, :, :]  # 0.1GB in size?

Does NumPy:

  1. copy the first 100000000 elements of original into a new array
  2. deallocate the totality of original
  3. assign the name original to the new 0.1GB array

I ask because original[0, :, :] by itself does not cause any allocation; it is just a (partial) view. I wonder whether the second assignment still keeps the original 1GB in memory or goes through the steps I just enumerated.

CodePudding user response:

Since original[0, :, :] is a view, it holds a reference to the main array (through its base attribute), so the main array's reference count never reaches zero and it cannot be deallocated. You end up with original being a 0.1GB view into a 1GB array that is still in memory.

To release the 1GB array from memory, copy the view so that it becomes a real array that owns its data instead of a view.
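The view/copy distinction can also be inspected directly, without allocating 1GB. The following is a minimal sketch under assumed, scaled-down sizes (a 100 kB buffer instead of 1GB); the variable names are illustrative and not part of the original question.

```python
import numpy as np

# Scaled-down stand-in for the 1GB array (assumed size, for illustration only).
big = np.ones(10 * 100 * 100, dtype=np.int8).reshape(10, 100, 100)
view = big[0, :, :]

# A view borrows its buffer: it does not own data and keeps a reference via .base.
view_owns_data = view.flags.owndata        # expected to be False
view_has_base = view.base is not None      # expected to be True
view_shares = np.shares_memory(view, big)  # expected to be True

# A copy owns a fresh buffer and no longer references the large one.
copied = view.copy()
copy_owns_data = copied.flags.owndata          # expected to be True
copy_shares = np.shares_memory(copied, big)    # expected to be False

print("view nbytes:", view.nbytes, "| buffer kept alive:", view.base.nbytes)
print("copy nbytes:", copied.nbytes, "| copy.base:", copied.base)
```

Note that view.nbytes reports only the size of the slice, which is why the size of the view alone says nothing about how much memory is actually being held.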

Code to check that:

import numpy as np
import os, psutil
process = psutil.Process(os.getpid())
print("before allocation", process.memory_info().rss/2**20)  # memory in MBytes
original = np.ones(1000000000, dtype=np.int8).reshape(10, 10000, 10000)  # 1GB in size
print("after allocation", process.memory_info().rss/2**20)  # memory in MBytes
original = original[0, :, :]  # 0.1GB in size?
print("after slicing", process.memory_info().rss/2**20)  # memory in MBytes
original = original.copy()  # real 0.1GB array; the view and the 1GB buffer are released
print("after copy", process.memory_info().rss/2**20)  # memory in MBytes

Output (memory in MBytes):

before allocation 26.75
after allocation 980.4296875
after slicing 980.4296875
after copy 122.125
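
Process-level RSS depends on the allocator and the operating system, so a complementary check is to watch the lifetime of the owning array itself. This is a minimal sketch, assuming CPython's reference counting and a scaled-down buffer; owner and owner_ref are hypothetical names introduced for the demonstration.

```python
import gc
import weakref

import numpy as np

# Scaled-down stand-in for the 1GB buffer (assumed size, to keep the demo cheap).
owner = np.ones(10 * 100 * 100, dtype=np.int8)
owner_ref = weakref.ref(owner)  # weak reference: does not keep the array alive

original = owner.reshape(10, 100, 100)
del owner                       # from here on, only views refer to the buffer

original = original[0, :, :]    # rebind the name to a partial view
alive_after_slicing = owner_ref() is not None

original = original.copy()      # independent array; the last view is dropped
gc.collect()                    # not required on CPython, harmless otherwise
alive_after_copy = owner_ref() is not None

print("buffer alive after slicing:", alive_after_slicing)
print("buffer alive after copy:", alive_after_copy)
```

On CPython the owning array should still be alive after slicing and gone after the copy, which matches the drop in RSS seen in the measurements above.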