I have two directories with files that end in two different extensions:
Folder A (1348 FILES)
file.profile
file1.profile
file2.profile
file3.profile #<-- odd one out
Folder B (1204 FILES)
file.dssp
file1.dssp
file2.dssp
Some files in folder A have no counterpart in folder B and should be removed. For example, file3.profile would be deleted because folder B has no file3.dssp. I just want to keep the files whose names, excluding the extension, are common to both folders, so that I end up with 1204 files in each.
I saw some bash one-liners using diff, but they do not cover this case, where the files to remove are those with no corresponding file in the other folder.
CodePudding user response:
Here is a way to do it:
- For both the A and B directories, list the files in each directory without their extensions.
- Compare both lists and show only the names that appear in A but not in B.
Code:
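#!/bin/bash
>a.list
>b.list
for file in A/*
do
    basename "${file%.*}" >>a.list
done
for file in B/*
do
    basename "${file%.*}" >>b.list
done
# comm requires sorted input; stripping the extension can change the glob order
sort -o a.list a.list
sort -o b.list b.list
comm -23 a.list b.list
# cleanup
rm -f a.list b.list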
- "${file%.*}" removes the extension
- basename removes the path
- comm -23 ... shows only the lines that appear only in a.list
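The script above only prints the unmatched base names. As a minimal sketch of how that list could drive the actual cleanup, the following assumes the directories are named A and B, the extensions are .profile and .dssp as in the question, and no file name contains a newline. The helper name list_orphans is hypothetical, and the temporary lists go into a mktemp directory instead of the current directory.

```shell
#!/bin/bash
# Sketch only: list_orphans is a hypothetical helper, not part of the answer above.
# Prints base names that have NAME.profile in $1 but no NAME.dssp in $2.
list_orphans() {
    local dir_a="$1" dir_b="$2" tmp f
    tmp=$(mktemp -d) || return 1
    for f in "$dir_a"/*.profile; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$f" ] && basename "$f" .profile
    done | LC_ALL=C sort >"$tmp/a.list"
    for f in "$dir_b"/*.dssp; do
        [ -e "$f" ] && basename "$f" .dssp
    done | LC_ALL=C sort >"$tmp/b.list"
    # -23 hides lines unique to b.list and lines common to both lists.
    LC_ALL=C comm -23 "$tmp/a.list" "$tmp/b.list"
    rm -rf "$tmp"
}

# Dry run: print the rm commands; drop "echo" once the list looks right.
if [ -d A ] && [ -d B ]; then
    list_orphans A B | while IFS= read -r name; do
        echo rm -- "A/$name.profile"
    done
fi
```

Sorting and comparing under the same LC_ALL=C setting keeps comm from complaining about input that is ordered differently from what it expects.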
CodePudding user response:
With find:
find 'folder A' -type f -name '*.fasta.profile' -exec sh -c \
'! [ -f "folder B/$(basename -s .fasta.profile "$1").dssp" ]' _ {} \; -print
Replace -print with -delete once you are convinced that it does what you want.
Or, maybe a bit faster:
find 'folder A' -type f -name '*.fasta.profile' -exec sh -c \
'for f in "$@"; do [ -f "folder B/$(basename -s .fasta.profile "$f").dssp" ] || echo rm "$f"; done' _ {} +
Remove echo once you are convinced that it does what you want.
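For systems where basename does not support -s, the batch form can be sketched with parameter expansion inside the inline script instead. This is only an illustration under stated assumptions: it uses the plain .profile suffix from the question rather than .fasta.profile, and the wrapper name print_unmatched and its two-argument layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: print an rm command for each *.profile under $1
# that has no matching .dssp file directly inside $2.
# Usage: print_unmatched 'folder A' 'folder B'
print_unmatched() {
    find "$1" -type f -name '*.profile' -exec sh -c '
        dir_b=$1; shift
        for f in "$@"; do
            name=${f##*/}            # strip the directory part
            name=${name%.profile}    # strip the extension
            [ -f "$dir_b/$name.dssp" ] || echo rm "$f"
        done' _ "$2" {} +
}
```

Passing folder B as the first argument of the inline script avoids hard-coding its name inside the quoted code, and the trailing + lets find hand many files to a single sh invocation.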
CodePudding user response:
Try this Shellcheck-clean Bash program:
#! /bin/bash -p
folder_a=PATH_TO_FOLDER_A
folder_b=PATH_TO_FOLDER_B
shopt -s nullglob
for ppath in "$folder_a"/*.profile; do
    pfile=${ppath##*/}
    dfile=${pfile%.profile}.dssp
    dpath=$folder_b/$dfile
    [[ -f $dpath ]] || echo rm -v -- "$ppath"
done
- It currently just prints what it would do. Remove the echo once you are sure that it will do what you want.
- shopt -s nullglob makes globs expand to nothing when nothing matches (otherwise they expand to the glob pattern itself, which is almost never useful in programs).
- See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about the string manipulation mechanisms used (e.g. ${ppath##*/}).
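To make the string manipulation concrete, here is a small sketch of the two expansions applied to one illustrative path. The function name dssp_name_for and the sample path are hypothetical and only serve the demonstration; the file does not need to exist.

```shell
#!/bin/bash
# Hypothetical helper: map a .profile path to the .dssp file name expected in folder B.
dssp_name_for() {
    local ppath="$1" pfile
    pfile=${ppath##*/}                    # remove the longest prefix matching */ (the directories)
    printf '%s\n' "${pfile%.profile}.dssp"  # remove the .profile suffix, then append .dssp
}

dssp_name_for 'folder A/file3.profile'    # prints: file3.dssp
```

Both expansions are done by the shell itself, so no external basename process is started for each file.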