Iterating over ZipFile to get list of file names and sizes

Time:05-03

I am trying to iterate through a folder which contains n subfolders, each of which has a subfolder with TIFF files in it. Using the zipfile module, I've tried the following:

from zipfile import ZipFile

# raw string so that '\t' in the path is not read as a tab character
path = r'D:\Project\I20\top'
with ZipFile(path, 'r') as zipObj:
    listOfiles = zipObj.infolist()
    for elem in listOfiles:
        print(elem.filename, ' : ', elem.file_size, ' : ')

I am getting the following error when I try to do this:

Traceback (most recent call last):
  File "D:\Test\algo\checksize.py", line 30, in <module>
    with ZipFile(path, 'r') as zipObj:
  File "C:\Users\manaT\AppData\Local\Programs\Python\Python39\lib\zipfile.py", line 1239, in __init__
    self.fp = io.open(file, filemode)
PermissionError: [Errno 13] Permission denied: 'D:\\Project\\I20\\top'

I have tried running Atom as administrator but that doesn't work. I have tried changing the drive's properties to allow full access to authenticated users.


The folder's properties still show read-only, and every time I change that, it reverts to read-only.

Is there a fix for this? If there is another method that will allow me to loop through the files in the folders within the zip files and store their names and sizes in a dictionary that would help as well.

CodePudding user response:

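The ZipFile class expects the path of a .zip file as its argument, not a directory. On Windows, trying to open a directory as a file raises PermissionError, which is why changing the folder permissions does not help. To get the list of .zip files in a folder, you can use glob() or rglob() on the directory, then iterate over the file entries in each zip file.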

from pathlib import Path
from zipfile import ZipFile

zips = {} # dictionary of zip files and sizes
path = Path(r'D:\Project\I20\top')
for file in path.glob('*.zip'):
    with ZipFile(file, 'r') as zipObj:
        for entry in zipObj.infolist():
            print(entry.filename, ' : ', entry.file_size, ' : ')
            # store filename and size in dictionary
            zips[entry.filename] = entry.file_size

If you want to recursively find .zip files in the sub-folders of a target folder, replace glob() with rglob().

If a zip file includes directory entries, add if not entry.filename.endswith('/'): before printing the entry and/or adding it to the dictionary, so that directory entries are ignored.
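Putting those two suggestions together, here is a minimal sketch rather than a definitive implementation. The function name collect_zip_sizes and its recursive parameter are made up for illustration. It keys the result by zip file path, so that entries with the same name in different archives do not overwrite each other, and it uses ZipInfo.is_dir() in place of the endswith('/') check.

```python
from pathlib import Path
from zipfile import ZipFile


def collect_zip_sizes(root, recursive=True):
    """Map each .zip file under root to a {entry name: uncompressed size} dict.

    Hypothetical helper: the name and the 'recursive' flag are illustrative only.
    """
    root = Path(root)
    # rglob() descends into sub-folders, glob() only looks at root itself
    finder = root.rglob if recursive else root.glob
    sizes = {}
    for zip_path in sorted(finder('*.zip')):
        with ZipFile(zip_path, 'r') as zip_obj:
            sizes[str(zip_path)] = {
                entry.filename: entry.file_size
                for entry in zip_obj.infolist()
                if not entry.is_dir()  # skip directory entries (names ending in '/')
            }
    return sizes


# Example call, assuming the folder from the question exists:
# sizes = collect_zip_sizes(r'D:\Project\I20\top')
```

The inner dictionaries hold uncompressed sizes in bytes (file_size); use compress_size instead if the size inside the archive is what matters.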

CodePudding user response:

You can't open a directory with ZipFile; it only opens zip files. You need to read the list of files in the zip file:

from zipfile import ZipFile

# zip_path must point to a .zip file, not to a folder
with ZipFile(zip_path, 'r') as f:
    files = f.infolist()
filenames = [file.filename for file in files]

You will now have a list of strings representing filenames. You can now manipulate these strings as if they were filenames and figure out what's in what directory.
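One way to work out what's in what directory is to group the names by their parent folder. This is only a sketch: group_by_directory is a hypothetical helper, and it relies on the fact that entry names inside a zip archive use forward slashes, so PurePosixPath can split them regardless of the operating system.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(filenames):
    """Group zip entry names by their parent directory inside the archive.

    Hypothetical helper, shown for illustration only.
    """
    groups = defaultdict(list)
    for name in filenames:
        if name.endswith('/'):
            continue  # directory entry, not a file
        entry = PurePosixPath(name)
        # str(entry.parent) is '.' for entries at the top level of the archive
        groups[str(entry.parent)].append(entry.name)
    return dict(groups)
```

Calling group_by_directory(filenames) on the list built above returns a dictionary whose keys are folder paths inside the archive and whose values are the file names in each folder.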
