python module ZipFile get base folder using regex

Time:02-06

Assume the zip file "acme_example.zip" contains the following files and folders:

acme/one.txt
acme/one1.txt
acme/one2.txt
acme/one3.txt
acme/one4.txt
__MACOSX
.DS_Store

I am using the script below:

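    import re
    from pathlib import Path
    from zipfile import ZipFile

    def get_base_folder():  # wrapper name is illustrative
        output_var = []
        skip_st = '__MACOSX'
        with ZipFile('acme_example.zip', 'r') as ZipObj:
            listfFiles = ZipObj.namelist()
            for elm in listfFiles:
                # First path component of each archive entry
                p = Path(elm).parts[0]
                if p not in output_var:
                    output_var.append(p)
            # Strip the unwanted name from the concatenated result
            return re.sub(skip_st, '', ''.join(str(item) for item in output_var))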

The script above excludes "__MACOSX", but is there a way to also exclude ".DS_Store" so that only "acme" is returned as the folder name?

Answer:

Since you already iterate over the names, it is better to exclude the unwanted ones at that point. Also, because the items are already strings, you can simplify the join:

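    from pathlib import Path
    from zipfile import ZipFile

    def get_base_folder():  # wrapper name is illustrative
        output_var = []
        skip_st = ['__MACOSX', '.DS_Store']
        with ZipFile('acme_example.zip', 'r') as ZipObj:
            listfFiles = ZipObj.namelist()
            for elm in listfFiles:
                p = Path(elm).parts[0]
                # Skip unwanted names while collecting
                if p not in output_var and p not in skip_st:
                    output_var.append(p)
        return ''.join(output_var)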
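
To try the filter-while-iterating approach without a real archive, here is a minimal self-contained sketch. It builds a small zip in memory that mirrors the entries from the question. The helper name `top_level_names`, the `__MACOSX/._one.txt` entry, and the use of `io.BytesIO` are assumptions made for illustration, not part of the original code.

```python
import io
from pathlib import PurePosixPath
from zipfile import ZipFile

# Names to ignore, taken from the answer above
SKIP = {'__MACOSX', '.DS_Store'}

def top_level_names(zip_source, skip=SKIP):
    """Return the unique first path components of the archive entries,
    in order of appearance, leaving out anything listed in `skip`."""
    names = []
    with ZipFile(zip_source, 'r') as zip_obj:
        for entry in zip_obj.namelist():
            # Zip entry names use forward slashes, so parse them as POSIX paths
            parts = PurePosixPath(entry).parts
            if not parts:
                continue
            top = parts[0]
            if top not in skip and top not in names:
                names.append(top)
    return names

# Build a throwaway archive in memory (hypothetical sample entries)
buffer = io.BytesIO()
with ZipFile(buffer, 'w') as zip_obj:
    for name in ['acme/one.txt', 'acme/one1.txt', '__MACOSX/._one.txt', '.DS_Store']:
        zip_obj.writestr(name, '')

print(top_level_names(buffer))
```

This sketch returns a list rather than a joined string, because `''.join(...)` would run several folder names together if the archive had more than one top-level folder.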

For reference, here is how you could filter at the end instead:

  • with a list

    skip_st = ['__MACOSX', '.DS_Store']
    # ...
    return ''.join(item for item in output_var if item not in skip_st)
    
  • with a pattern

    skip_st = r'__MACOSX|\.DS_Store'  # escape the dot so it matches a literal "."
    # ...
    return re.sub(skip_st, '', ''.join(output_var))
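
One caveat with the pattern variant: `re.sub` on the joined string removes the pattern wherever it occurs, so a folder whose name merely contains `__MACOSX` would be altered too. A possible alternative, sketched here under the assumption that whole names should be dropped, is to build the pattern with `re.escape` and test each name with `fullmatch`. The helper name `drop_skipped` is hypothetical.

```python
import re

SKIP_NAMES = ['__MACOSX', '.DS_Store']

# re.escape makes the "." in ".DS_Store" match literally
SKIP_PATTERN = re.compile('|'.join(re.escape(name) for name in SKIP_NAMES))

def drop_skipped(names):
    """Keep only the names that are not an exact match for a skipped name."""
    return [name for name in names if not SKIP_PATTERN.fullmatch(name)]

# 'xDS_Store' is kept: the escaped dot no longer matches an arbitrary character
print(drop_skipped(['acme', '__MACOSX', '.DS_Store', 'xDS_Store']))
```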
    