Create a directory based on the start of a file name

Time:11-08

I want to sort a massive collection of files (eBooks) into directories based on the text at the start of each file name (the author's name). I've tried coding a batch file, but I'm missing something fundamental.

Almost all of the files are named like this:

"Authors name - Book title.extension"
"Authors name - [Book series] - Book title.extension"

I wish to pass each file through a script (preferably a batch file) that identifies the author's name part (basically, it looks for the first " - ") and then creates a directory based on the trimmed starting string.

I've crafted a FOR & IF command based on research, but part of it doesn't work:

SET str=%1%

FOR /L %%a in (1,1,50) do (

    if "!str:~%%a,3!"==" - " (
        set DName=!str:~0,%%a-1!
        MD DName
        MOVE %1 DName
        EXIT
    )
)

I don't fully understand the string truncation command "STRING:~STARTING POSITIONWITHINSTRING,STRINGLENGTH". I keep finding examples of people using it like it just works, but I'm failing at finding documentation explaining how it's meant to operate.

Can what I'm asking be done as a BATCH file? What am I doing wrong?

CodePudding user response:

Use a little trick to get rid of the delimiter string and the title. (Attention: it does not work with the recommended syntax set "var=value", so take extreme care not to add any stray spaces to the set folder=... command.)

@echo off
setlocal
set "file=%~1"
SET "file=Authors name - Book title.extension"
set folder=%file: - =&REM %
ECHO md "%folder%" 2>nul
ECHO move "%file%" "%folder%\"

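Remove the second SET line (it is there just for demonstration) and, after verifying that the script works as intended, the ECHO in front of both the md and move commands.

This works even with author names containing -, like Karl-Heinz Writer, because it splits on the string <space><dash><space>, which a for /F loop isn't able to do (its delims option treats every listed character as a separate delimiter).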
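To see why the trick works, it helps to look at the line after the substitution has been applied. The following is a minimal sketch, assuming a hypothetical sample file name that is not from the question; it only prints the resulting folder name and does not create or move anything.

```batch
@echo off
setlocal
rem Hypothetical sample name, used only for this demonstration.
set "file=Karl-Heinz Writer - [Some series] - Some title.epub"
rem Every " - " is replaced before the line is executed, so cmd sees:
rem   set folder=Karl-Heinz Writer&REM [Some series]&REM Some title.epub
rem The first & ends the SET command; everything after REM is a comment.
set folder=%file: - =&REM %
echo [%folder%]
endlocal
```

This is expected to print [Karl-Heinz Writer]. Because the substituted text is executed as part of the command line, a name containing characters such as & in the author part would probably be misinterpreted, so the approach is best tried with the ECHO prefixes first.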

CodePudding user response:

The task can be done with a batch file with the following code:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
for /F "eol=| delims=" %%I in ('dir "* - *" /A-D /B 2^>nul') do for /F "eol=| delims=-" %%J in ("%%I") do (
    if not exist "%%~nxJ" md "%%~nxJ"
    if exist "%%~nxJ" move "%%I" "%%~nxJ\"
)
endlocal

The command DIR searches

  • in the current directory
  • just for files because of option /A-D (attribute not directory)
  • whose file names match the wildcard pattern * - *
  • and outputs the found file names in bare format because of option /B, which means just the file name and extension without path.

It is possible that no file name is found matching the criteria which would result in an error message written to handle STDERR. This possible error message is suppressed by redirecting it to device NUL.

Read the Microsoft documentation about Using command redirection operators for an explanation of 2>nul. The redirection operator > must be escaped with the caret character ^ on the FOR command line so that it is interpreted as a literal character when the Windows command interpreter processes this command line before executing the command FOR. FOR then executes the embedded dir command line in a separate command process started in the background with cmd.exe /c and the command line within ' appended as additional arguments.

The output of DIR in background command process is captured by FOR and processed line by line.

FOR with option /F skips empty lines, which do not occur here at all. By default, FOR splits each captured line into substrings using normal space and horizontal tab as string delimiters and assigns just the first space/tab-separated string to the specified loop variable I. This line splitting behavior is not wanted for the file names output by DIR, as the file names definitely contain spaces and the entire file name is needed for further processing. For that reason the option delims= is used to define an empty list of string delimiters, which turns off line splitting so that the file name with extension is assigned to the loop variable I.

FOR with option /F would also look at the first character of the first substring and ignore the line if it starts with a semicolon, which is the default end of line character. It is very unlikely that an author name starts with ;, but the option eol=| is used nevertheless to redefine the end of line character as a vertical bar, which no file name can ever contain.

The second for /F splits the file name string using - as string delimiter and assigns just the first substring to the specified loop variable J. That works for your file names as long as the author names do not contain - themselves. So for each file name J holds Authors name followed by a trailing space, which is the space left of the hyphen.
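The effect of the delims and eol options described above can be tried without touching any files by letting for /F process a quoted string instead of command output. This is a small sketch with hypothetical sample strings, not part of the solution itself.

```batch
@echo off
rem The sample strings are hypothetical and used only for demonstration.
rem Default options: split on space/tab and keep only the first substring.
for /F %%I in ("Authors name - Book title.ext") do echo Default delims: [%%I]
rem An empty delimiter list keeps the entire string.
for /F "delims=" %%I in ("Authors name - Book title.ext") do echo Empty delims:   [%%I]
rem A string starting with a semicolon is ignored with the default eol.
for /F "delims=" %%I in (";Odd name - Book title.ext") do echo Default eol:    [%%I]
rem With the vertical bar as eol character the same string is processed.
for /F "eol=| delims=" %%I in (";Odd name - Book title.ext") do echo Custom eol:     [%%I]
```

The first loop should output only [Authors], the second the entire string, the third nothing at all, and the fourth the entire string including the leading semicolon.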

The Microsoft documentation about Naming Files, Paths, and Namespaces explains how Windows handles file/folder names with trailing spaces/dots by removing them. For that reason %%~nxJ is used to reference the string up to the first - with trailing space(s) and dot(s) removed, regardless of whether the subdirectory named after the author already exists.

The first IF condition checks whether the subdirectory named after the author does not exist yet, in which case it is created with the command MD.

The second IF condition verifies that the subdirectory named after the author exists now. If that is the case, as expected, the file is moved into the subdirectory with the command MOVE.
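The difference between the raw substring and the reference with the modifier ~nx can be made visible with a short test. This is a sketch using the example file name pattern from the question; the square brackets are only there to show where the string ends.

```batch
@echo off
rem Split at the hyphen and show the first substring with and without ~nx.
for /F "eol=| delims=-" %%J in ("Authors name - Book title.ext") do (
    echo Raw substring: [%%J]
    echo With ~nx:      [%%~nxJ]
)
```

Going by the explanation above, the first line should show a space before the closing bracket and the second line should not.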

A variant of the above is:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
for /F "eol=| delims=" %%I in ('dir "* - *" /A-D /B 2^>nul') do for /F "eol=| tokens=1* delims=-" %%J in ("%%I") do (
    if not exist "%%~nxJ" md "%%~nxJ"
    if exist "%%~nxJ" for /F "eol=| tokens=*" %%L in ("%%K") do move "%%I" "%%~nxJ\%%L"
)
endlocal

The difference is that every file is moved into the subdirectory with a new name: everything up to the first hyphen and the spaces directly after it are removed from the file name, so the files in the subdirectories no longer have the author's name in their file names.

The main advantage of these command lines, which use neither environment variables nor delayed variable expansion, is that the code also works for file names containing one or more exclamation marks.
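What the three loop variables of the variant hold can be checked with a series file name of the pattern given in the question. This sketch only prints the values; nothing is created or moved.

```batch
@echo off
rem tokens=1* assigns the part before the first hyphen to J
rem and the unsplit remainder after it to K.
for /F "eol=| tokens=1* delims=-" %%J in ("Authors name - [Book series] - Book title.ext") do (
    echo J=[%%J]
    echo K=[%%K]
    rem tokens=* with the default delimiters drops the leading space of K.
    for /F "eol=| tokens=*" %%L in ("%%K") do echo L=[%%L]
)
```

J should be the author's name with a trailing space, K the remainder with a leading space, and L the new file name [Book series] - Book title.ext. Hyphens after the first one are kept because the remainder is not split again.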
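For comparison, the approach from the question with an environment variable, substring expansion and delayed expansion could look like the following. This is a minimal sketch with a fixed sample string in place of the script argument, and it only prints the directory name it finds.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Sample name for demonstration; a real script would assign the first argument.
set "str=Authors name - Book title.ext"
set "DName="
rem The substring syntax is variable:~offset,length with a zero-based offset.
rem The search string starts at offset a, so the author's name has length a.
for /L %%a in (1,1,50) do if not defined DName if "!str:~%%a,3!"==" - " set "DName=!str:~0,%%a!"
echo [!DName!]
endlocal
```

This should print [Authors name]. Three things differ from the code in the question: delayed expansion is enabled explicitly, the length is given as a plain number because no arithmetic such as %%a-1 is evaluated inside the substring expression, and the variable is referenced as !DName! rather than by its bare name. With delayed expansion enabled, an exclamation mark in a file name would be lost, which is the limitation the for /F solution avoids.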

To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.

  • dir /?
  • echo /?
  • endlocal /?
  • for /?
  • if /?
  • md /?
  • move /?
  • setlocal /?