Only archive certain folders using python

Time:07-09

Using the following folder structure (every element is a folder):

├─folderA
│  ├─a1
│  ├─a2
│  └─a3

Is there any way to zip folders a1 and a2 into a folderA.zip (including all file contents and subfolders) like so:

├─folderA.zip
│  ├─a1
│  ├─a2

Thank you

CodePudding user response:

Slightly modified answer from here: Python zip multiple directories into one zip file

import os
import zipfile


def zipdir(path, ziph):
    # ziph is zipfile handle
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file),
                       os.path.relpath(os.path.join(root, file),
                                       os.path.join(path, '..')))


def zipit(zip_dir, sub_dir_list):
    zipf = zipfile.ZipFile(zip_dir + '.zip', 'w', zipfile.ZIP_DEFLATED)
    for sub_dir in sub_dir_list:
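        # Resolve each sub directory against zip_dir so the script works from any working directory
        zipdir(os.path.join(zip_dir, sub_dir), zipf)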
    zipf.close()

zip_dir = '/folderA'
sub_dir_list = ['a1', 'a2']
zipit(zip_dir, sub_dir_list)

This creates a /folderA.zip including the specified sub directories.
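
A quick way to check the result is to build a throwaway copy of the tree and list the archive's entries. The sketch below is not the answer's code but a variant under assumptions: the helper name zip_subfolders and the sample file names are made up for illustration, and it uses pathlib instead of os.walk.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_subfolders(parent, names, zip_path):
    """Zip only the sub-folders of `parent` listed in `names` (hypothetical helper)."""
    parent = Path(parent)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for name in names:
            # rglob("*") yields every file and folder below parent/name
            for item in sorted((parent / name).rglob("*")):
                if item.is_file():
                    # Store the path relative to parent, e.g. "a1/f11.txt"
                    zipf.write(item, item.relative_to(parent).as_posix())
    return zip_path


def demo():
    """Build a sample folderA with a1, a2, a3 and archive only a1 and a2."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "folderA"
        for rel in ("a1/f11.txt", "a1/a11/f111.txt", "a2/a21/f21.txt", "a3/f31.txt"):
            target = root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("sample")
        archive = zip_subfolders(root, ["a1", "a2"], Path(tmp) / "folderA.zip")
        with zipfile.ZipFile(archive) as zipf:
            return sorted(zipf.namelist())


if __name__ == "__main__":
    print(demo())
```

Like the zipdir function above, this only writes files, so a sub folder that contains no files at all would not show up in the archive.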

CodePudding user response:

Here's a different approach that uses patterns for filtering items to be archived.

Listing:

code00.py:

#!/usr/bin/env python

import glob
import os
import sys
import zipfile as zf


def archive(src_dir, pattern="**", arc_name=None):
    items = 0
    files = 0
    if arc_name:
        if not arc_name.endswith(".zip"):
            arc_name += ".zip"
    else:
        arc_name = os.path.basename(src_dir) + ".zip"
    if pattern != "**":
        pattern = os.path.join(pattern, "**")
    with zf.ZipFile(arc_name, mode="w", compression=zf.ZIP_DEFLATED) as zipf:
        for f in glob.iglob(os.path.join(src_dir, pattern), recursive=True):
            items += 1
            if os.path.isdir(f):
                continue
            zipf.write(os.path.normpath(f))
            files += 1
    return items, files


def main(*argv):
    dir_name = "folderA"
    patterns = (
        "**",
        "a1*",
        "a[12]*",
    )
    for idx, pat in enumerate(patterns):
        print("Pattern {:d} (\"{:s}\"): {:d} items out of which {:d} files".format(idx, pat, *archive(dir_name, pat, dir_name + str(idx))))


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.")
    sys.exit(rc)

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q072900280]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]> tree /a /f
Folder PATH listing for volume SSD0-WORK
Volume serial number is AE9E-72AC
E:.
|   code00.py
|
\---FolderA
    +---a1
    |   |   f11.txt
    |   |
    |   \---a11
    |           f111.txt
    |
    +---a2
    |   \---a21
    |       \---a211
    |               f2111.tzt
    |
    \---a3
            f31.txt


[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.09_test0\Scripts\python.exe" ./code00.py
Python 3.9.9 (tags/v3.9.9:ccb0e6a, Nov 15 2021, 18:08:50) [MSC v.1929 64 bit (AMD64)] 064bit on win32

Pattern 0 ("**"): 11 items out of which 4 files
Pattern 1 ("a1*"): 4 items out of which 2 files
Pattern 2 ("a[12]*"): 8 items out of which 3 files

Done.

[prompt]> dir /b
code00.py
FolderA
folderA0.zip
folderA1.zip
folderA2.zip
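
One thing to be aware of: the listing calls zipf.write() with the file path only, so each entry is stored with the folderA prefix (for example folderA/a1/f11.txt) rather than at the archive root as the question asks. A minimal sketch of one way to change that, assuming a hypothetical variant named archive_relative and made-up sample files, is to pass an arcname computed relative to the source folder:

```python
import glob
import os
import tempfile
import zipfile


def archive_relative(src_dir, pattern, arc_name):
    """Archive files below folders matching `pattern`, named relative to src_dir.

    Hypothetical variant of the archive() function in the listing.
    """
    stored = []
    # Escape src_dir so only `pattern` is treated as a glob; the trailing "**"
    # makes glob visit everything below each matching folder
    full_pattern = os.path.join(glob.escape(src_dir), pattern, "**")
    with zipfile.ZipFile(arc_name, mode="w", compression=zipfile.ZIP_DEFLATED) as zipf:
        for path in glob.iglob(full_pattern, recursive=True):
            if os.path.isdir(path):
                continue
            rel = os.path.relpath(path, src_dir)
            # arcname controls the name stored inside the archive
            zipf.write(path, arcname=rel)
            stored.append(rel.replace(os.sep, "/"))
    return sorted(stored)


def demo():
    """Create a sample folderA and archive only a1 and a2 without the prefix."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "folderA")
        for rel in ("a1/f11.txt", "a1/a11/f111.txt", "a2/a21/f21.txt", "a3/f31.txt"):
            target = os.path.join(src, *rel.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "w") as fh:
                fh.write("sample")
        zip_path = os.path.join(tmp, "folderA.zip")
        archive_relative(src, "a[12]*", zip_path)
        with zipfile.ZipFile(zip_path) as zipf:
            return sorted(zipf.namelist())


if __name__ == "__main__":
    print(demo())
```

The pattern "a[12]*" is the same one used as Pattern 2 in the listing; only the stored names differ.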

Didn't check how symlinks are handled, but I guess that ZipFile dereferences them.
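
That guess can be checked with a small experiment. The sketch below assumes a platform where os.symlink is permitted (on Windows this may need extra privileges, in which case the function returns None), and the function name and file names are made up for illustration. It only covers a symlink to a file, not a symlink to a folder.

```python
import os
import tempfile
import zipfile


def symlink_is_dereferenced():
    """Report whether ZipFile.write() stores the target's content for a file symlink.

    Returns None when symlinks cannot be created on this platform.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "real.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as fh:
            fh.write("payload")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError, AttributeError):
            return None
        zip_path = os.path.join(tmp, "links.zip")
        with zipfile.ZipFile(zip_path, "w") as zipf:
            zipf.write(link, arcname="link.txt")
        with zipfile.ZipFile(zip_path) as zipf:
            # If the link was followed, the entry holds the target's bytes
            return zipf.read("link.txt") == b"payload"


if __name__ == "__main__":
    print(symlink_is_dereferenced())
```

Since write() opens the given path like a regular file, a True result is what I would expect, but it is worth confirming on your own platform and Python version.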