I am trying to write a byte array at the beginning of a file, and at a (much) later point I want to split them again and retrieve the original file. The byte array is just a small JPEG.
# write a byte array at the beginning of a file
def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path ):
    with open( file_path, "rb" ) as f:
        with open( out_file_path, "wb" ) as f2:
            f2.write( byte_array )
            f2.write( f.read( ) )
While the function works, it hogs a lot of memory. It seems like it reads the whole file into memory before doing anything. Some of the files I need to work on are in excess of 40 GB, and the work is done on a small NAS with only 8 GB of RAM.
What would be a memory-conscious way to achieve this?
CodePudding user response:
You can read from the original file in chunks instead of reading the whole thing.
def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    with open( file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f2.write( byte_array )
        # copy the rest of the input file in fixed-size chunks
        while True:
            block = f.read(chunksize)
            if not block:
                break
            f2.write(block)
This reads the input in 10 MB chunks by default, which you can override via the chunksize parameter.
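For reference, the standard library's shutil.copyfileobj implements the same chunked copy, so the loop above can be replaced with a single call. The prepend_byte_array and split_file helpers below are only a hypothetical sketch (the names are mine, not from the question); the split step assumes you know the length of the prepended byte array (len(byte_array)) so it can be skipped with seek().
import shutil

def prepend_byte_array( byte_array, file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    # same as above, but let shutil stream the copy in chunksize-sized blocks
    with open( file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f2.write( byte_array )
        shutil.copyfileobj( f, f2, chunksize )

def split_file( combined_path, header_length, out_file_path, chunksize = 10 * 1024 * 1024 ):
    # hypothetical reverse operation: skip the prepended header_length bytes
    # (i.e. len(byte_array)) and stream the rest out to recover the original file
    with open( combined_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f.seek( header_length )
        shutil.copyfileobj( f, f2, chunksize )
Because both helpers stream the data, peak memory use stays around one chunk regardless of file size, which should be comfortable on an 8 GB NAS.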