Delete smaller resolutions of an image based on file name

Time:08-30

I am looking to automatically clean directories that contain original photos and smaller resolutions of a single photo.

I have the following structure for file names

original_image.jpg
original_image-1024x768.jpg
original_image-800x600.jpg
original_image-640x480.jpg

Is there a way, using a Windows script (cmd, not PowerShell), to look through the files in a directory and delete any file that has the same name followed by a dash, a group of digits, an x, another group of digits, then the same extension as the original file?

CodePudding user response:

OOh - not that easy! Be careful!

@ECHO OFF
SETLOCAL enabledelayedexpansion
rem The following setting for the source directory is a name
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.

SET "sourcedir=u:\your files"

FOR /f "delims=" %%b IN (
 'dir /b /a-d "%sourcedir%\*" ^|findstr /i /v /r ".*-[0-9]*x[0-9]*.*" '
) DO (
 SET "filter=%%~nb-[0-9]*x[0-9]*\%%~xb"
 SET "filter=!filter: =\ !"
 FOR /f "delims=" %%e IN (
  'dir /b /a-d "%sourcedir%\%%~nb*%%~xb" ^|findstr /i /r "!filter!" '
 ) DO ECHO DEL "%sourcedir%\%%e"
)

GOTO :EOF

Always verify against a test directory before applying to real data.

The required DEL commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO DEL to DEL to actually delete the files.

The outer loop (%%b) processes a dir listing of the directory: /a-d excludes directory names and /b gives the bare form (names only, with no headers, footers or details).

The list is passed to a findstr command to filter on the required filename pattern. The pipe | must be escaped with a caret (^) to tell cmd that the pipe belongs to the single-quoted command to be executed, not to the for.

The findstr filter uses /i (case-insensitive) and /r (regular expression). The /v option outputs only those lines that do not match the filter. The regular expression reads: .* any number of any characters, - a literal dash, [0-9]* any number of digits (including none), x a literal "x", [0-9]* any number of digits again, and .* any characters. Because [0-9]* also matches zero digits, a name such as my-xray.jpg matches as well and would not be treated as an original.
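To see what the first filter lets through, the same findstr options can be run against a few hard-coded names instead of a real dir listing. This is only a sketch for experimenting: the three names are taken from the question, and nothing is read from or deleted on disk.

```batch
@ECHO OFF
rem Feed three sample names to the filter used in the outer loop.
rem /v keeps only the lines that do NOT look like "name-<digits>x<digits>".
(
 ECHO original_image.jpg
 ECHO original_image-1024x768.jpg
 ECHO original_image-800x600.jpg
)|findstr /i /v /r ".*-[0-9]*x[0-9]*.*"
```

Only original_image.jpg should be listed, since it is the only name without the dash-digits-x-digits sequence.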

The delims= causes the filenames to be delivered literally to %%b by setting no delimiters and hence just one token. See for /? from the prompt or endless examples on SO for documentation.

The next step sets up the filter for the second findstr. It is built from %%~nb, the name part of the filename in %%b, and %%~xb, the extension part (including the dot). The \ escapes the dot contributed by %%~xb, making it a literal dot instead of a match for any single character.
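The ~n and ~x modifiers can be tried on their own. The following sketch uses a quoted sample name (an assumption for illustration; the file does not need to exist) and simply echoes the two parts.

```batch
@ECHO OFF
rem %%~nb is the name without extension, %%~xb is the extension with its dot.
FOR %%b IN ("original image.jpg") DO (
 ECHO name part: %%~nb
 ECHO extension: %%~xb
)
```

This should report "original image" as the name part and ".jpg" as the extension, which is why the filter only needs to add a backslash in front of %%~xb.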

The next step replaces each space with "\ " (a backslash followed by a space). See set /? from the prompt or endless examples on SO for documentation.
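The substitution can be checked in isolation. In this sketch the filter value is hard-coded to what the script would build for "original image.jpg"; the script itself builds it from %%~nb and %%~xb.

```batch
@ECHO OFF
SETLOCAL enabledelayedexpansion
rem Hard-coded sample value, standing in for the one built inside the loop.
SET "filter=original image-[0-9]*x[0-9]*\.jpg"
rem Replace every space with backslash-space, as the script does.
SET "filter=!filter: =\ !"
ECHO !filter!
```

The echoed value should be original\ image-[0-9]*x[0-9]*\.jpg, with a backslash inserted before the space.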

Finally, another dir is executed, this time looking for files whose names start with the name part of %%b and end with its extension, with anything in between. Its output is filtered by findstr using the string established in filter.

delayedexpansion is required since filter is being changed within a code block (a parenthesised sequence of lines), so !var! retrieves the current value of the variable, whereas %var% yields the original value (the one it had when the block was encountered).
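The difference between the two forms is easy to reproduce with a throwaway variable. The name var and its values are made up for this sketch.

```batch
@ECHO OFF
SETLOCAL enabledelayedexpansion
SET "var=before"
(
 rem The whole block is parsed first, so %var% is already fixed as "before".
 SET "var=after"
 ECHO percent form: %var%
 ECHO bang form: !var!
)
```

The percent form should print "before" and the bang form "after", which is why the script reads filter as !filter! inside the loop.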

Why the extra complexity?

Suppose the file list includes

original_image.jpg
original_image-1024x768.jpg
original_image-800x600.jpg
"original image.jpg"
"original image-1024x768.jpg"
"original image-800x600.jpg"

A space in a findstr search string acts as an OR between the strings before and after it. The inner (%%e) dir selects the files matching "original image*.jpg", and this includes "original image.jpg". Without the extra step, the regex constructed would be "original image-[0-9]*x[0-9]*\.jpg", which findstr reads as "original" OR "image-[0-9]*x[0-9]*\.jpg". Since "original image.jpg" contains "original", it would be selected for deletion.
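The OR behaviour can be observed without touching any files by piping a single name into findstr. The second command is not part of the answer above: it uses findstr's /c: option, which treats the whole quoted argument as one search string, and is shown only as an alternative way to keep the space literal.

```batch
@ECHO OFF
rem Unescaped space: two search strings, "original" OR "image-...", so this matches.
ECHO original image.jpg|findstr /i /r "original image-[0-9]*x[0-9]*\.jpg"
rem With /c: the argument is a single regex, so the plain name no longer matches.
ECHO original image.jpg|findstr /i /r /c:"original image-[0-9]*x[0-9]*\.jpg"
```

The first command should echo the name back and the second should print nothing. As with the main script, try this on sample names before relying on it.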
