I ran into trouble after creating a class to process raster images. The class includes several methods for checking databases and processing the images.
The usage script is super simple:
from hidroclabc import HidroCLVariable, mod13q1extractor
ndvi = HidroCLVariable('ndvi', some_db)
evi = HidroCLVariable('evi', some_db)
nbr = HidroCLVariable('nbr', some_db)
modext = mod13q1extractor(ndvi,evi,nbr)
modext.run_extraction()
The method run_extraction() is the following:
for scene in scenes_to_process:
    if scene not in self.ndvi.indatabase:
        print(f'Processing scene {scene} for ndvi')
        # select the files that belong to this scene
        r = re.compile('.*' + scene + '.*')
        selected_files = list(filter(r.match, scenes_path))
        start = time.time()
        file_date = datetime.strptime(scene, 'A%Y%j').strftime('%Y-%m-%d')
        # mosaic the tiles and apply the scale factor
        mos = mosaic_raster(selected_files, '250m 16 days NDVI')
        mos = mos * 0.1
        temporal_raster = os.path.join(tempfolder, 'ndvi_' + scene + '.tif')
        result_file = os.path.join(tempfolder, 'ndvi_' + scene + '.csv')
        mos.rio.to_raster(temporal_raster, compress='LZW')
        # extract the weighted mean and append it to the database
        run_WeightedMeanExtraction(temporal_raster, result_file)
        write_line(self.ndvi.database, result_file, self.ndvi.catchment_names, scene, file_date, nrow=1)
        end = time.time()
        time_dif = str(round(end - start))
        currenttime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
        print(f'Time elapsed for {scene}: {time_dif} seconds')
        write_log(hcl.log_veg_o_modis_ndvi_mean, scene, currenttime, time_dif, self.ndvi.database)
        # clean up the temporary files
        os.remove(temporal_raster)
        os.remove(result_file)
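To make the first steps of the loop concrete, here is a minimal sketch of how a scene ID selects its files and is turned into a date. The file names are made up for illustration (they are not from my real data), and I use re.escape here only as a precaution; the rest mirrors the loop above.

```python
import re
from datetime import datetime

# Hypothetical file names, invented for this example
scenes_path = [
    '/data/MOD13Q1.A2021001.h11v12.hdf',
    '/data/MOD13Q1.A2021001.h12v12.hdf',
    '/data/MOD13Q1.A2021017.h11v12.hdf',
]
scene = 'A2021001'

# Keep only the paths that contain the scene ID
r = re.compile('.*' + re.escape(scene) + '.*')
selected_files = list(filter(r.match, scenes_path))

# 'A%Y%j' reads the literal 'A', a four-digit year and the day of the year
file_date = datetime.strptime(scene, 'A%Y%j').strftime('%Y-%m-%d')

print(selected_files)
print(file_date)
```

With these inputs, the two tiles of scene A2021001 are selected and the date comes out as 2021-01-01.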
The method runs several steps to get an observation for a given variable. The code works, but it doesn't release memory. Since it's a loop, the memory used increases with every iteration.
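One way to see this kind of per-iteration growth in numbers is the standard-library tracemalloc module. The sketch below is only an illustration: process_scene is a hypothetical stand-in that deliberately keeps data alive, not my real method, and tracemalloc only sees allocations made through Python's allocators, so memory held by native libraries may not show up.

```python
import tracemalloc

def process_scene(scene, retained):
    # Hypothetical stand-in for the real per-scene work:
    # it keeps about 1 MB alive on every call
    retained.append(bytearray(1_000_000))

def measure(scenes):
    """Return (scene, traced_bytes) after each iteration."""
    tracemalloc.start()
    retained = []
    readings = []
    for scene in scenes:
        process_scene(scene, retained)
        current, _peak = tracemalloc.get_traced_memory()
        readings.append((scene, current))
    tracemalloc.stop()
    return readings

readings = measure(['A2021001', 'A2021017', 'A2021033'])
for scene, current in readings:
    print(f'{scene}: {current / 1e6:.1f} MB traced')
```

If the traced size keeps climbing from one scene to the next, something is still holding on to the data from earlier iterations.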
When I close the terminal window that is running this process, the memory used drops significantly.
This is happening on a Linux server (Ubuntu Server 22 LTS). When I run the same code on my laptop (macOS), the memory usage is quite different, even though the laptop has half of the server's RAM.
This didn't happen with a functional approach; memory usage only blew up once I placed the for loop inside a class.
How can I improve the memory management of my class?
Answer:
Super simple fix: run the garbage collector explicitly at the end of the loop in the class's method:
import gc

# code
for scene in scenes_to_process:
    if scene not in self.ndvi.indatabase:
        # code
        gc.collect()  # force a collection once the scene has been processed
Now the memory usage looks beautiful.
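I haven't confirmed why this helps in my case, but one plausible explanation (an assumption) is that objects caught in reference cycles are not freed by reference counting alone and wait for the cyclic collector, which gc.collect() runs immediately. The sketch below shows that behaviour in CPython with a hypothetical Node class; it is not taken from the raster code.

```python
import gc
import weakref

class Node:
    """Hypothetical object that references itself, forming a cycle."""
    def __init__(self):
        self.payload = bytearray(1_000_000)
        self.me = self  # the cycle: the object keeps itself alive

def cycle_survives_until_collect():
    gc.disable()  # keep automatic collection out of the demo
    try:
        node = Node()
        probe = weakref.ref(node)  # lets us check whether the object still exists
        del node                   # the last outside reference is gone
        alive_before = probe() is not None
        gc.collect()               # the cyclic collector frees the object
        alive_after = probe() is not None
    finally:
        gc.enable()
    return alive_before, alive_after

print(cycle_survives_until_collect())
```

The object is still alive after del and only disappears once gc.collect() has run, which would match the drop in memory seen with the fix above.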