Recursively unzip all subdirectories while retaining file structure

Time:11-27

I'm new to bash scripting, and I'm finding it hard to solve this one.

I have a parent folder containing a mixture of sub directories and zipped sub directories.

Within those sub directories are also more nested zip files.

Not only are there .zip files, but also .rar and .7z files which also contain nested zips/rars/7zs.

I want to unzip, unrar and un7z all my nested subdirectories recursively until the parent folder no longer contains any .zip, .rar or .7z files (the archives should be removed once they have been extracted). There could be thousands of subdirectories at different nesting depths, and an archive may contain either a folder or a single file.

However, I want to retain my folder structure, so the contents of each archive must end up in the same place where the archive was.

I have tried this script that works for unzipping, but it does not retain the file structure.

#!/bin/bash

while [ "`find . -type f -name '*.zip' | wc -l`" -gt 0 ]

do
    find . -type f -name "*.zip" -exec unzip -- '{}' \; -exec rm -- '{}' \;
done

                             

I want for example:

Folder 'a' contains a zipped folder 'b.zip', which in turn contains a zipped text file 'pear.zip' (pear.txt zipped to pear.zip), i.e. a/b.zip(/pear.zip).

I would like folder 'a' to contain 'b', and 'b' to contain pear.txt: 'a/b/pear.txt'.

The script above puts both 'b' (which ends up empty) and pear.txt into folder 'a', where the script is executed, which is not what I want: I get 'a/b' and 'a/pear.txt'.

Answer:

You could try this:

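#!/bin/bash

while :; do
    # Collect every archive under the current directory (NUL-delimited, so
    # unusual file names are safe). The parentheses make -type f and -print0
    # apply to both patterns, not only to the last one.
    mapfile -td '' archives \
    < <(find . -type f \( -name '*.zip' -o -name '*.7z' \) -print0)

    # Stop once no archives are left.
    [[ ${#archives[@]} -eq 0 ]] && break

    # Extract each archive into the directory that contains it.
    for i in "${archives[@]}"; do
        case $i in
            *.zip) unzip -d "$(dirname "$i")" -- "$i";;
            *.7z)  7z x "-o$(dirname "$i")" -- "$i";;
        esac
    done

    # Remove the archives handled in this pass, even if extraction failed.
    rm -f -- "${archives[@]}" || break
done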
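Since the loop above deletes archives as it goes, it may be worth previewing what it would touch first. The following is only a sketch: preview_archives is a made-up helper name, and it assumes the same find expression as the script. It prints each archive together with the directory it would be extracted into, and changes nothing on disk.

```bash
#!/bin/bash

# preview_archives ROOT (hypothetical helper, not part of the script above):
# print "archive -> target directory" for every .zip/.7z file under ROOT,
# without extracting or deleting anything.
preview_archives() {
    local root=${1:-.} archive
    while IFS= read -r -d '' archive; do
        printf '%s -> %s\n' "$archive" "$(dirname "$archive")"
    done < <(find "$root" -type f \( -name '*.zip' -o -name '*.7z' \) -print0)
}
```

Calling preview_archives . from the parent folder lists the first pass only; nested archives show up once their containers have been extracted.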
  • find lists every archive. Each one is extracted into the directory that contains it, and the archives are then removed. This repeats until no archives are found.

  • You can add an equivalent unrar command (I'm not familiar with it).

  • Add -o -name '*.rar' to the find expression, and another case to case. If there's no option to specify a target directory with unrar, you could use a subshell so the working directory of the loop doesn't change: (cd "$(dirname "$i")" && unrar x "$(basename "$i")").

  • There are some issues with this script. In particular, if extraction fails, the archive is still removed; otherwise it would be found again on every pass and cause an infinite loop. You can use unzip ... || exit 1 to exit if extraction fails, and deal with that manually.

  • It's possible to avoid both the removal and the infinite loop by keeping track of the archives that weren't removed, but hopefully that's not necessary.

  • I couldn't test this properly. YMMV.
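The last few points (adding .rar, and keeping archives that fail without looping forever) could be sketched roughly as below. This is an illustration under assumptions, not a tested solution: extract_one and extract_all are made-up function names, it needs bash 4.4 or newer for mapfile -d and associative arrays, and it assumes unrar x takes the destination directory as a trailing argument ending in a slash. Failed archives are remembered and skipped on later passes, so the loop ends once a pass finds nothing new to try.

```bash
#!/bin/bash

# extract_one ARCHIVE (hypothetical helper): extract next to the archive and
# return the extractor's exit status.
extract_one() {
    local archive=$1 dir
    dir=$(dirname "$archive")
    case $archive in
        *.zip) unzip -d "$dir" "$archive" ;;
        *.7z)  7z x "-o$dir" -- "$archive" ;;
        # Assumption: unrar accepts the target directory as "dir/" at the end.
        *.rar) unrar x "$archive" "$dir/" ;;
        *)     return 1 ;;
    esac
}

# extract_all ROOT (hypothetical helper): repeat passes over ROOT until every
# archive has either been extracted and removed, or recorded as failed.
extract_all() {
    local root=${1:-.} archive tried
    local -a archives
    local -A failed=()      # archives that could not be extracted (kept on disk)

    while :; do
        mapfile -td '' archives < <(
            find "$root" -type f \
                \( -name '*.zip' -o -name '*.7z' -o -name '*.rar' \) -print0
        )

        tried=0
        for archive in "${archives[@]}"; do
            # Skip archives that already failed on an earlier pass.
            [[ -n ${failed[$archive]} ]] && continue
            tried=1
            if extract_one "$archive"; then
                rm -f -- "$archive"
            else
                failed[$archive]=1
                echo "extraction failed, keeping: $archive" >&2
            fi
        done

        # Nothing new was attempted in this pass, so we are done.
        (( tried )) || break
    done

    # Succeed only if no archive failed.
    (( ${#failed[@]} == 0 ))
}
```

Running extract_all . from the parent folder would then leave only the archives that could not be extracted, with a message on stderr for each one.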
