Python: iterating over zip files to get list of file names and sizes

Time:05-02

I am trying to iterate through a folder which contains n subfolders, each of which has a subfolder with TIFF files in it. Using the zipfile module, I've tried the following:

path = r'D:\Project\I20\top'
with ZipFile(path, 'r') as zipObj:
    listOfiles = zipObj.infolist()
    for elem in listOfiles:
        print(elem.filename, ' : ', elem.file_size, ' : ')

I am getting the following error when I try to do this:

Traceback (most recent call last):
  File "D:\Test\algo\checksize.py", line 30, in <module>
    with ZipFile(path, 'r') as zipObj:
  File "C:\Users\manaT\AppData\Local\Programs\Python\Python39\lib\zipfile.py", line 1239, in __init__
    self.fp = io.open(file, filemode)
PermissionError: [Errno 13] Permission denied: 'D:\\Project\\I20\\top'

I have tried running Atom as administrator but that doesn't work. I have tried changing the drive's properties to allow full access to authenticated users.


The folder properties still show read-only, and every time I change the setting, it reverts to read-only.

Is there a fix for this? If there is another method that will allow me to loop through the files in the folders within the zip files and store their names and sizes in a dictionary that would help as well.

CodePudding user response:

If you want a list of the .zip files in a folder, you can use glob() or rglob() on the directory. Note that the ZipFile class expects the path of a .zip file as its argument, not a directory. Once a zip file is open, you can iterate over its entries.

from pathlib import Path
from zipfile import ZipFile

path = Path(r'D:\Project\I20\top')
for file in path.glob('*.zip'):
    with ZipFile(file, 'r') as zipObj:
        for entry in zipObj.infolist():
            print(entry.filename, ' : ', entry.file_size, ' : ')

If you want to find .zip files recursively in the target folder, replace glob() with rglob().
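
The question also asks for the names and sizes to be stored in a dictionary. A minimal sketch of that, assuming one inner dictionary per zip file is acceptable; the function name collect_zip_sizes and its recursive flag are made up for illustration and are not part of the zipfile module:

```python
from pathlib import Path
from zipfile import ZipFile


def collect_zip_sizes(folder, recursive=False):
    """Map each .zip path to a {entry name: uncompressed size} dict."""
    folder = Path(folder)
    # rglob() also descends into subfolders; glob() only looks at the top level
    finder = folder.rglob if recursive else folder.glob
    sizes = {}
    for zip_path in sorted(finder('*.zip')):
        with ZipFile(zip_path, 'r') as zip_obj:
            sizes[str(zip_path)] = {
                entry.filename: entry.file_size
                for entry in zip_obj.infolist()
                if not entry.is_dir()  # skip directory entries
            }
    return sizes
```

Calling collect_zip_sizes(r'D:\Project\I20\top', recursive=True) should then give a nested dictionary keyed first by zip file path and then by entry name. file_size is the uncompressed size; use compress_size instead if the size inside the archive is what matters.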

CodePudding user response:

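ZipFile cannot open a directory; it can only open a zip file. Passing it a folder path is what raises the PermissionError on Windows, so changing the folder permissions will not help. Open each zip file and read its list of entries instead: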

from zipfile import ZipFile

# zip_path must point to a .zip file, not to a directory
with ZipFile(zip_path, 'r') as zf:
    files = zf.infolist()
filenames = [info.filename for info in files]

You will now have a list of strings representing filenames. You can now manipulate these strings as if they were filenames and figure out what's in what directory.
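
One way to work out what is in which directory is to split each entry name into its folder part and its file part. The following is only a sketch: group_by_directory is a hypothetical helper, and it assumes the entries are passed in as (name, size) pairs. Entry names inside a zip archive use forward slashes, so PurePosixPath is used regardless of the operating system.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(entries):
    """Group (name, size) pairs by the directory part of each name."""
    grouped = defaultdict(dict)
    for name, size in entries:
        if name.endswith('/'):  # directory entry, not a file
            continue
        member = PurePosixPath(name)
        # Files at the top level of the archive end up under the key '.'
        grouped[str(member.parent)][member.name] = size
    return dict(grouped)
```

The pairs can be built from an open archive with [(info.filename, info.file_size) for info in zf.infolist()].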
