I am looking to automatically clean directories that contain original photos along with lower-resolution copies of each photo.
I have the following structure for file names:
original_image.jpg
original_image-1024x768.jpg
original_image-800x600.jpg
original_image-640x480.jpg
Is there a way, using a Windows script (cmd, not PowerShell), to look through the files in a directory and delete any file that has the same name as an original followed by a dash, a group of digits, an x, another group of digits, and then the same extension as the original file?
CodePudding user response:
Ooh - not that easy! Be careful!
@ECHO OFF
SETLOCAL enabledelayedexpansion
rem The following setting for the source directory is a name that I use for testing.
rem It deliberately includes a space to make sure that the process works with such names.
rem It will need to be changed to suit your situation.
SET "sourcedir=u:\your files"
FOR /f "delims=" %%b IN (
'dir /b /a-d "%sourcedir%\*" ^|findstr /i /v /r ".*-[0-9]*x[0-9]*.*" '
) DO (
SET "filter=%%~nb-[0-9]*x[0-9]*\%%~xb"
SET "filter=!filter: =\ !"
FOR /f "delims=" %%e IN (
'dir /b /a-d "%sourcedir%\%%~nb*%%~xb" ^|findstr /i /r "!filter!" '
) DO ECHO DEL "%sourcedir%\%%e"
)
GOTO :EOF
Always verify against a test directory before applying to real data.
The required DEL commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO DEL to DEL to actually delete the files.
The outer loop (%%b) processes a dir listing of the directory: /a-d omits directory names and /b gives the basic form (names only, with no headers, footers or details). The list is passed to a findstr command to filter the filename pattern required. The pipe | must be escaped by a caret (^) to tell cmd that the pipe belongs to the single-quoted command to be executed, not to the for.
The findstr filter is /i (case-insensitive) and /r (using a regular expression). The /v option outputs those lines that do not match the filter. The regular expression is .* (any number of any character), - (a literal dash), [0-9]* (any number of numeric characters, including none), x (a literal "x"), [0-9]* (any numerics again) and .* (any characters).
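One thing worth checking before relying on that pattern: because * allows zero repetitions, [0-9]* also matches no digits at all, so a name such as my-xray.jpg (a hypothetical file, not one from the question) would be treated as a resized copy. A minimal sketch, run at the prompt, comparing the original pattern with a stricter variant that demands at least one digit on each side of the x. The stricter pattern is an assumption of mine, not part of the script above:

```batch
rem Original pattern: prints my-xray.jpg, because "-x" satisfies -[0-9]*x[0-9]*
echo my-xray.jpg| findstr /i /r ".*-[0-9]*x[0-9]*.*"

rem Stricter variant (an assumption, not part of the script above):
rem [0-9][0-9]* requires at least one digit, so this prints nothing
echo my-xray.jpg| findstr /i /r ".*-[0-9][0-9]*x[0-9][0-9]*"

rem A genuine resized name still matches and is printed
echo original_image-800x600.jpg| findstr /i /r ".*-[0-9][0-9]*x[0-9][0-9]*"
```

If your directories can contain names like that, the same substitution could be made in both findstr filters of the script, but test it against your own file names first.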
The delims= causes the filenames to be delivered literally to %%b by setting no delimiters and hence just one token. See for /? from the prompt or endless examples on SO for documentation.
The next step is to set up the filter for the next findstr. This is %%~nb, the name part of the filename in %%b, followed by the size pattern -[0-9]*x[0-9]* and then %%~xb, the extension part (including the dot). The \ escapes the dot contributed by %%~xb, making it a literal dot instead of a single-character match.
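To see what those modifiers deliver, here is a small sketch that can be saved as a separate batch file. The sample name "original image.jpg" is only an illustration; the file does not need to exist for the modifiers to work.

```batch
@ECHO OFF
rem Show the pieces of a sample filename that the filter is assembled from.
rem "original image.jpg" is only a sample name; no such file is required.
FOR %%b IN ("original image.jpg") DO (
  ECHO name part: [%%~nb]
  ECHO extension: [%%~xb]
  ECHO filter   : [%%~nb-[0-9]*x[0-9]*\%%~xb]
)
```

This should show the name part as original image, the extension as .jpg, and the assembled filter as original image-[0-9]*x[0-9]*\.jpg, each between square brackets.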
The next step replaces each space with a backslash followed by a space ("\ "). See set /? from the prompt or endless examples on SO for documentation.
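The substitution can be tried in isolation. This is a minimal sketch using a hard-coded sample filter string rather than one built from a real file name:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
rem Sample filter as the script would build it for "original image.jpg"
SET "filter=original image-[0-9]*x[0-9]*\.jpg"
rem !var:old=new! replaces every occurrence of old (a space) with new (backslash-space)
SET "filter=!filter: =\ !"
ECHO !filter!
```

The line echoed should be original\ image-[0-9]*x[0-9]*\.jpg, with the single space now preceded by a backslash.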
Finally, execute another dir, but this time look for files matching the pattern of the filename and the extension of %%b, separated by anything, and filter the result using the string established in filter.
delayedexpansion is required since filter is being changed within a code block (a parenthesised sequence of lines), so !var! retrieves the current value of the variable, whereas %var% is the original value (from when the block was encountered).
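The difference is easy to demonstrate with a short sketch, saved as a separate batch file. The variable name var and its values are just placeholders:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "var=before"
FOR %%i IN (1) DO (
  rem The whole parenthesised block is parsed once, so %var% is already
  rem replaced by "before" when the block is read, before this SET runs.
  SET "var=after"
  ECHO percent form: %var%
  ECHO bang form   : !var!
)
```

The percent form reports before and the bang form reports after, which is why the script reads filter as !filter! inside the loop.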
Why the extra complexity?
Suppose the file list includes
original_image.jpg
original_image-1024x768.jpg
original_image-800x600.jpg
"original image.jpg"
"original image-1024x768.jpg"
"original image-800x600.jpg"
A space in a findstr search string causes an OR of the strings before and after the space, so without the space-escaping step the wrong file would be caught. The %%e dir selects those files matching "original image*.jpg", and this includes "original image.jpg". The regex constructed would be "original image-[0-9]*x[0-9]*\.jpg", which matches "original", and therefore "original image.jpg" would be selected for deletion.
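The OR behaviour can be confirmed at the prompt with a sketch like the one below. The second command uses findstr's /c: option, which treats the whole quoted string as one search string; that is an alternative I am adding for comparison, not the technique the script above uses (it escapes the spaces instead).

```batch
rem Unescaped space: findstr looks for "original" OR "image-[0-9]*x[0-9]*\.jpg",
rem so the original file's name is printed even though it has no size suffix
echo original image.jpg| findstr /i /r "original image-[0-9]*x[0-9]*\.jpg"

rem /c: keeps the string whole, so the original file's name is not printed
echo original image.jpg| findstr /i /r /c:"original image-[0-9]*x[0-9]*\.jpg"
```

Either way, the aim is the same: the original file must never appear in the list that is handed to DEL.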