I am trying to decompress multiple .gz files into a single .txt file. All of these files contain JSON data.
I tried the following code:
from glob import glob
import gzip
import shutil

for fname in glob('.../2020-04/*gz'):
    with gzip.open(fname, 'rb') as f_in:
        with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
But the decompressed.txt file only has the last .gz file's data.
CodePudding user response:
Just shuffle f_out to the outside, so you open it once before iterating over the input files and keep that one handle open.
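from glob import glob
import gzip
import shutil

# Open the output once; 'wb' truncates it a single time, before the loop.
with open('.../datafiles/202004_twitter/decompressed.txt', 'wb') as f_out:
    for fname in glob('.../2020-04/*gz'):
        with gzip.open(fname, 'rb') as f_in:
            # Each decompressed file is written right after the previous one.
            shutil.copyfileobj(f_in, f_out)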
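One thing to watch: glob() makes no promise about the order of the files it returns, so the records may end up in a different order than the filenames suggest. If that matters, here is a rough sketch of the same idea wrapped in a function with sorted input. The name concat_gz and its parameters are made up for illustration, and sorting by filename is only an assumption about the order you want.

```python
import gzip
import shutil
from glob import glob


def concat_gz(pattern, out_path):
    """Decompress every file matching pattern into a single output file.

    concat_gz is a hypothetical helper name, not part of any library.
    """
    # glob() gives no ordering guarantee, so sort for a reproducible result.
    fnames = sorted(glob(pattern))
    # Open the output once; 'wb' truncates it a single time, before the loop.
    with open(out_path, 'wb') as f_out:
        for fname in fnames:
            with gzip.open(fname, 'rb') as f_in:
                shutil.copyfileobj(f_in, f_out)
    # Return the processed filenames so the caller can check the order.
    return fnames
```

You would call it with your own paths, e.g. concat_gz('.../2020-04/*gz', '.../datafiles/202004_twitter/decompressed.txt').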
CodePudding user response:
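Use "ab" mode instead of "wb". a opens the file in append mode, so each write lands at the end of the existing data. w truncates the file every time it is opened, which is why only the last .gz file's data survives. (Combining the two, as in "wba", is not a valid mode and raises a ValueError.) Keep in mind that an append-mode file is never cleared, so delete the old decompressed.txt before rerunning or you will get duplicated data.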
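A minimal sketch of the append-mode variant, keeping the original loop structure. The helper name append_gz is hypothetical, and removing the old output up front is my own assumption about what you want on a rerun.

```python
import gzip
import os
import shutil
from glob import glob


def append_gz(pattern, out_path):
    """Append the decompressed contents of each matching file to out_path.

    append_gz is a hypothetical helper name, not part of any library.
    """
    # 'ab' never truncates, so remove stale output to avoid duplicated data.
    if os.path.exists(out_path):
        os.remove(out_path)
    for fname in sorted(glob(pattern)):
        with gzip.open(fname, 'rb') as f_in:
            # Reopened on every iteration; 'ab' writes at the end of the file.
            with open(out_path, 'ab') as f_out:
                shutil.copyfileobj(f_in, f_out)
```

This reopens the output once per input file, so opening it a single time outside the loop is usually the simpler choice.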