I'm trying to read a 3 GB file that is compressed as json.gz. I found that even its first 10 lines cannot be read with 12 GB of RAM, because
for f in s_file:
    print(f)
tries to read the data into memory. I'm running this on Google Colab.
This is the code with which I have successfully read part of the file:
import gzip

size_to_read = 1024000

def gunzip_shutil(source_filepath, dest_filepath):
    # Decompress and print only the first size_to_read bytes
    with gzip.open(source_filepath, 'rb') as s_file:
        print(s_file.read(size_to_read).decode())

gunzip_shutil(f"{default_path_download}/{name}", "")
What I want is to read bytes 0 to 8192, then continue reading from 8192, so that I can go through the file chunk by chunk without overloading memory.
Something like:
print(f.read(range(4096, 8192)))
Thanks for your help!
CodePudding user response:
You can try something like this:
test.py:
N = 8192

def main():
    buf = bytearray(N)
    with open("10G.bin", "rb") as f:
        while f.readinto(buf):
            mview = memoryview(buf)
            print(mview[32:40].tobytes())
            break  # tmp

if __name__ == "__main__":
    main()
The memoryview object just references a particular slice of the buffer without copying bytes; only the tobytes() call copies the 8 bytes that are printed.
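The same chunked idea can be applied directly to the compressed file from the question. The following is a minimal sketch, not a definitive implementation: the helper names iter_gzip_chunks and read_gzip_range are hypothetical, and it assumes the file is a regular gzip stream that gzip.open can read.

```python
import gzip
from typing import Iterator


def iter_gzip_chunks(path: str, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield successive decompressed chunks of at most chunk_size bytes."""
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


def read_gzip_range(path: str, start: int, end: int) -> bytes:
    """Return the decompressed bytes in the half-open range [start, end)."""
    with gzip.open(path, "rb") as f:
        # A gzip stream has no random access: seeking forward works by
        # decompressing and discarding everything before `start`.
        f.seek(start)
        return f.read(end - start)
```

Looping over iter_gzip_chunks(path) keeps only one chunk in memory at a time, and read_gzip_range(path, 4096, 8192) corresponds to the f.read(range(4096, 8192)) idea from the question. Note that a chunk boundary can fall in the middle of a multi-byte UTF-8 character, so calling .decode() on a single chunk may fail; decoding with errors="replace" is one possible workaround for inspection.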