Bash delete directories containing certain files

Time:12-06

Given a directory structure like

parent
  - child1
    - file1
    - file2
    - file3
  - child2
    - file1
    - file3
  - child3
    - file1
    - file2
    - file3
    - file4

What command will delete all child directories of parent that contain some or all of file1, file2, and file3, but no other files? In the example, child1 and child2 should be deleted, but child3 should remain because it also contains file4.

Please post both a dry-run version and an actual version of the command, so that I can first check which folders would be deleted.

CodePudding user response:

You would probably need a function that deletes a child directory only if it does not contain any of a given set of files that you want to check for.

#!/bin/bash
delete_dir() {
    local subdir=$1
    local string_of_files=$2

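    # Convert the comma-separated list of files into an array.
    # Prefixing read with IFS keeps the change local to that one command.
    local files_to_keep
    IFS=',' read -ra files_to_keep <<< "$string_of_files"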

    local list_of_files=()
    if [ -d "$subdir" ]; then
        # Collect the base names of everything inside the directory
        for i in "$subdir"/*; do list_of_files+=("$(basename "$i")"); done

        local list_of_matched_files=()
        for i in "${list_of_files[@]}"; do
            if [[ " ${files_to_keep[@]} " =~ " $i " ]]; then
               list_of_matched_files+=("$i")
            fi
        done

        if [ "${#list_of_matched_files[@]}" -eq 0 ]; then
            echo "deleting $subdir"
            # Dry run: uncomment the next line to actually delete
            #rm -r -- "$subdir"
        else
            echo "Not deleting $subdir, since it contains files you want to keep!!"
        fi
    else
        echo "directory $subdir not found"
    fi
    }

# Example 1: a single function call
delete_dir child1 file4

# Example 2: for your case, you can loop through the subdirectories like this
# (a glob is safer than parsing the output of ls):
for dir in parent/child*/; do
    delete_dir "$dir" file4
done

Example output:

$ ./test.sh
Not deleting child1/, since it contains files you want to keep!!
Not deleting child2/, since it contains files you want to keep!!
deleting child3/

You'd be better off using Python for such operations if you are at liberty to choose, since it lets you make the code much simpler and more modular.
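
When splitting a comma-separated list the way delete_dir does, it can help to keep the IFS change from leaking into the rest of the script. A minimal sketch, assuming a hypothetical helper named split_csv (not part of the answer): prefixing the assignment to read limits it to that single command.

```bash
#!/bin/bash
# split_csv is a hypothetical helper used only for illustration.
split_csv() {
    local -a parts
    # IFS=',' applies to this read only; the shell's own IFS is untouched
    IFS=',' read -ra parts <<< "$1"
    printf '%s\n' "${parts[@]}"
}

# Prints one list element per line
split_csv "file1,file2,file4"
```

With a bare `IFS=','` on its own line, every later unquoted expansion in the script would also be split on commas, which is easy to overlook.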

CodePudding user response:

Assuming the parent directory contains only child folders, this one-liner, run from inside the parent directory, lists the folders that contain only files on the marked-for-deletion list. The -x option makes grep match whole names only, and an empty folder also passes the test:

for child in *; do if [ "$(ls -1A "$child" | grep -Evx 'file1|file2|file3' | wc -l)" -eq 0 ]; then echo "would delete $child"; fi; done

This one-liner would delete them:

for child in *; do if [ "$(ls -1A "$child" | grep -Evx 'file1|file2|file3' | wc -l)" -eq 0 ]; then rm -rf -- "$child"; fi; done
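
One detail worth checking in a filter like this is whether the pattern matches whole names or substrings. A small sketch, assuming a hypothetical extra file named file10 (not part of the question) and two made-up helper names: without -x, grep treats file10 as a match for file1 and filters it out, so a folder containing it would look deletable. With -x, only exact names are filtered.

```bash
#!/bin/bash
# count_extra_loose / count_extra_exact are hypothetical helper names.
# Each reads file names on stdin and counts those NOT on the deletion list.
# grep exits 1 when it selects nothing; "|| true" keeps that from being
# treated as an error under "set -e" or "pipefail".
count_extra_loose() { { grep -Ev  'file1|file2|file3' || true; } | wc -l; }   # substring match
count_extra_exact() { { grep -Evx 'file1|file2|file3' || true; } | wc -l; }   # whole-line match

printf '%s\n' file1 file10 | count_extra_loose   # file10 is not counted as extra
printf '%s\n' file1 file10 | count_extra_exact   # file10 is counted as extra
```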

CodePudding user response:

I think this is what you want. It loops through the files under the parent directory and deletes each one whose path doesn't contain file4. Note that it removes individual files rather than whole directories.

This shows the files that would be deleted:

while IFS= read -r -d $'\0' f; do
  if [[ "$f" =~ 'file4' ]]; then
    continue;
  else
    echo "rm $f"
  fi
done < <(find parent -type f -print0)

This deletes them:

while IFS= read -r -d $'\0' f; do
  if [[ "$f" =~ 'file4' ]]; then
    continue;
  else
    rm "$f"
  fi
done < <(find parent -type f -print0)
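
Because the test in these loops applies a regular expression to the whole path, it also skips files whose directory name merely contains file4. A hedged sketch of a stricter check that compares only the final path component (is_protected is a hypothetical name, and the second path below is made up for illustration):

```bash
#!/bin/bash
# is_protected is a hypothetical helper: it succeeds only when the file's own
# name (the last path component) is exactly "file4".
is_protected() {
    local path=$1
    [[ ${path##*/} == "file4" ]]
}

# Dry run over two sample paths; nothing is touched on disk
for f in parent/child3/file4 parent/file4_old/file1; do
    if is_protected "$f"; then
        echo "keep $f"
    else
        echo "rm $f"
    fi
done
```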

CodePudding user response:

You can use compgen -G and globs to test the child directories.
The following pure bash code handles any sub-directory of parent that contains only file1, file2, or file3. As written, the echo makes it a dry run that prints the rm commands; drop the echo to actually delete:

#!/bin/bash
shopt -s extglob
for child in parent/*/
do
    # child must have at least one file
    compgen -G "$child/*" > /dev/null || continue

    # child doesn't have any other file than file1, file2 and file3
    compgen -G "$child/!(file1|file2|file3)" > /dev/null || echo rm -rf "$child"
done
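
Since the question asks for both a dry run and a real run, one way to package the same idea is a single function with a switch. This is only a sketch under assumptions: prune_children and its --delete flag are hypothetical names, hidden files count as "other" files, hidden child directories are examined too (because dotglob is enabled), and symlinked directories are skipped.

```bash
#!/bin/bash
# prune_children PARENT [--delete]
# Hypothetical helper: lists (default) or removes (--delete) every child
# directory of PARENT that is non-empty and holds only file1, file2, file3.
# The body runs in a subshell so the shopt changes do not leak out.
prune_children() (
    shopt -s nullglob dotglob
    parent=$1 mode=${2:-}
    for child in "$parent"/*/; do
        [ -L "${child%/}" ] && continue    # skip symlinks to directories
        count=0 extra=0
        for entry in "$child"*; do
            count=$((count + 1))
            case ${entry##*/} in
                file1|file2|file3) ;;      # allowed name
                *) extra=1 ;;              # anything else blocks deletion
            esac
        done
        if [ "$count" -gt 0 ] && [ "$extra" -eq 0 ]; then
            if [ "$mode" = "--delete" ]; then
                rm -rf -- "$child"
            else
                echo "would delete $child"
            fi
        fi
    done
)

# Dry run: only prints what would be removed
prune_children parent
```

Once the dry-run output looks right, calling `prune_children parent --delete` performs the removal with the same selection logic.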