This question pertains to the situation where:
- An image was uploaded, say mypicture.jpg
- WordPress created multiple copies of it at different resolutions, like mypicture-300x500.jpg and mypicture-600x1000.jpg
- You delete only the original image
In this scenario, the files remaining on the filesystem are mypicture-300x500.jpg and mypicture-600x1000.jpg.
How can you script finding these "dangling" images, whose original is missing, and deleting them?
CodePudding user response:
You could use find to locate all lower resolution pictures with the -regex test:
find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg'
This is much better than trying to parse the output of ls, which is meant for humans, not for automation. A safer (and simpler) bash script could thus be:
#!/usr/bin/env bash
while IFS= read -r -d '' f; do
  [[ "$f" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ] &&
  echo rm -f "$f"
done < <(find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -print0)
(Remove the echo once you are convinced that it works as expected.)
Note: we use the -print0 action and the empty read delimiter (-d '') to separate the file names with the NUL character instead of the newline character. This is preferable because it works as expected even if you have unusual file names (e.g., with spaces or newlines).
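To make the difference concrete, here is a minimal sketch that counts the paths reported by find, once read line by line and once NUL-delimited. The function names count_by_line and count_by_nul are made up for this illustration and are not part of the script above.

```shell
#!/usr/bin/env bash
# Count the files under directory $1, reading find's output line by line.
count_by_line() {
    local n=0 f
    while IFS= read -r f; do n=$((n + 1)); done < <(find "$1" -type f -print)
    echo "$n"
}

# Same count, but with NUL-delimited output (-print0) and read -d ''.
count_by_nul() {
    local n=0 f
    while IFS= read -r -d '' f; do n=$((n + 1)); done < <(find "$1" -type f -print0)
    echo "$n"
}
```

For a directory holding a single file whose name contains a newline, count_by_line reports 2 because the name is split in two, while count_by_nul reports 1.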
Note: as we test the file name inside the loop, we could simply search for all files (find . -type f -print0). But I suspect that performance would suffer if you have a large number of files, so keeping the -regex test is probably better.
Bash loops are OK, but they tend to become really slow as the number of iterations increases. So, let's incorporate our simple bash script in a single find command with the -exec action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
Note: bash -c takes a script to execute as first argument, then the positional parameters to pass to the script, starting with $0. This is why we pass _ (my favourite for don't care), followed by {} (the current file path).
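A small sketch of how the arguments after the script map to positional parameters. The wrapper function show_params is hypothetical and exists only to make the example easy to run.

```shell
#!/usr/bin/env bash
# "_" fills $0; the arguments that follow become $1, $2, ... inside the inline script.
show_params() {
    bash -c 'echo "0=$0 1=$1 2=$2"' _ "$@"
}

show_params first second   # prints: 0=_ 1=first 2=second
```

Without the _ placeholder, the first argument would land in $0 and the file path would not be available as $1.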
Note: -print is the default find action, but here it is needed explicitly because -exec is one of the find actions that inhibit the default behaviour.
This will print a list of files. Check that it is correct and, once you are satisfied, add the -delete action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
See man find and man bash for more explanations.
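If starting one bash process per file turns out to be slow, a possible variant (a sketch, not part of the commands above) keeps the -regex test so that only candidate files reach the script, and uses the {} + form of -exec so that each bash invocation receives many paths at once. The function name list_dangling is hypothetical, the pattern is additionally anchored with $ inside bash, and the -regex syntax assumes GNU find.

```shell
#!/usr/bin/env bash
# Print the resized images under directory $1 whose original is missing.
list_dangling() {
    find "$1" -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -exec bash -c '
        for f; do    # loops over the positional parameters, i.e. the batch of paths
            if [[ "$f" =~ (.*)-[0-9]+x[0-9]+\.jpg$ ]] &&
               ! [ -f "${BASH_REMATCH[1]}.jpg" ]; then
                printf "%s\n" "$f"
            fi
        done' _ {} +
}
```

Calling list_dangling . only lists the candidates. Once the list looks right, the printf line could be replaced with rm -f -- "$f".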
Demo:
$ touch mypicture.jpg mypicture-300x500.jpg mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
$ rm -f mypicture.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
  ! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ ls *.jpg
ls: cannot access '*.jpg': No such file or directory
One last note: if, by accident, one of your full resolution pictures matches the regular expression for lower resolution pictures (e.g., if you have a full resolution picture named balloon-1x1.jpg), it will be deleted. This is unfortunate but, according to your specifications, there is no easy way to distinguish it from an orphan lower resolution picture. Be careful...
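One way to limit the damage from such a false positive is to move the candidates aside instead of deleting them, so that a mistake stays reversible. The following is only a sketch under assumptions: the function name quarantine_dangling and both directory arguments are hypothetical, the destination is expected to lie outside the source tree, the source path is given without a trailing slash, and the -regex syntax assumes GNU find.

```shell
#!/usr/bin/env bash
# Move resized images under $1 whose original is missing into $2,
# keeping their relative paths, instead of deleting them.
quarantine_dangling() {
    local src=$1 dest=$2 f rel
    while IFS= read -r -d '' f; do
        if [[ "$f" =~ (.*)-[0-9]+x[0-9]+\.jpg$ ]] &&
           ! [ -f "${BASH_REMATCH[1]}.jpg" ]; then
            rel=${f#"$src"/}                          # path relative to the source tree
            mkdir -p -- "$dest/$(dirname -- "$rel")"  # recreate the subdirectory
            mv -- "$f" "$dest/$rel"
        fi
    done < <(find "$src" -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -print0)
}
```

After reviewing the quarantine directory, anything moved by mistake can be copied back, and the rest can be removed.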
CodePudding user response:
I've written a Bash script that attempts to find the original filename (e.g. mypicture.jpg) by stripping the WordPress resolution suffix from the resized name (e.g. mypicture-300x500.jpg) and, if the original is not found, deletes the "dangling" image (e.g. rm -f mypicture-300x500.jpg).
#!/bin/bash
# Delete resized images (name-WIDTHxHEIGHT.ext) whose original (name.ext) is missing.
shopt -s nullglob                        # empty directories produce no loop iterations
pattern='^[0-9]+x[0-9]+$'
while IFS= read -r -d '' directory
do
    for path in "$directory"/*           # glob instead of parsing the output of ls
    do
        [[ -f "$path" ]] || continue     # skip subdirectories
        image=${path##*/}
        echo "The current filename is $image"
        resolution=${image##*-}          # text after the last "-", e.g. 300x500.jpg
        echo "The resolution is $resolution"
        extension=${resolution##*.}      # text after the last ".", e.g. jpg
        echo "The extension is $extension"
        resolutiononly=${resolution%."$extension"}
        echo "The resolution only is $resolutiononly"
        if [[ $image == *-* && $resolutiononly =~ $pattern ]]; then
            echo "The pattern matches"
            originalfilename=${image%-"$resolution"}.$extension
            echo "The original filename is $originalfilename"
            # Look for the original next to the resized copy, not in the current directory.
            if [[ -f "$directory/$originalfilename" ]]; then
                echo "The file exists $originalfilename"
            else
                rm -f -- "$directory/$image"
            fi
        fi                               # files that do not match are skipped, not a reason to stop
    done
done < <(find . -type d -print0)