Selecting x unique random files from a folder and its subfolders with a batch file

Time:08-08

I've got some batch code that selects three random files from a folder (and its subfolders), but it's possible for it to end up selecting the same file more than once. I'd like it to always select unique files.

It creates a temporary text file with all the options, so I've tried to get it to remove the selected line and subtract one from the total file count each time one is selected, so that it's removed from the pool of options.

Here's the whole thing:

@echo off
setlocal

:: Create numbered list of files in a temporary file
set "tempFile=%temp%\%~nx0_fileList_%time::=.%.txt"
set "tempFileTwo=%temp%\%~nx0_fileList2_%time::=.%.txt"
pushd %1
dir /b /s /a-d *.mov *.mp4 | findstr /n "^" >"%tempFile%"
popd

:: Count the files
for /f %%N in ('type "%tempFile%" ^| find /c /v ""') do set cnt=%%N

for /l %%N in (1 1 3) do call :openRandomFile

:: Delete the temp files
del "%tempFile%"
del "%tempFileTwo%"

exit /b

:openRandomFile
set /a "randomNum=(%random% %% cnt) + 1"
for /f "tokens=1* delims=:" %%A in (
  'findstr "^%randomNum%:" "%tempFile%"'
) do (
start "" "%%B"
echo Selection: %%B
set /a cnt -= 1
findstr /l /v "%%B" "%tempFile%" > "%tempFileTwo%"
copy /y "%tempFileTwo%" "%tempFile%"
)
exit /b

The part that doesn't work is:

findstr /l /v "%%B" "%tempFile%" > "%tempFileTwo%"
copy /y "%tempFileTwo%" "%tempFile%"

I was hoping my findstr would find the selected file and remove it, and copy everything else over, but it apparently matches everything and copies a totally blank file. Not sure what I'm doing wrong.

CodePudding user response:

There's a good chance that

findstr /l /v /C:"%%B" "%tempFile%" > "%tempFileTwo%"

may restore sanity.

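If %%B contains spaces, findstr treats each space-separated word in "%%B" as a separate search string, so any line containing any one of those words counts as a match (and is dropped by /v). The /c: switch instructs findstr to treat the quoted string as a single string to match, not a list of substrings.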
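To see the difference, here is a small sketch that runs both forms against a throwaway list. The file name, the paths in it and the variable names demo and target are made up for illustration; the paths do not have to exist.

```batch
@echo off
setlocal
rem Hypothetical sample list; the paths are made up and do not need to exist.
set "demo=%temp%\findstr_demo_%random%.txt"
> "%demo%" (
    echo 1:C:\My Videos\holiday clip one.mov
    echo 2:C:\My Videos\holiday clip two.mov
    echo 3:C:\My Videos\holiday clip three.mp4
)
set "target=C:\My Videos\holiday clip one.mov"

echo --- without /c: ---
rem The search string is split at spaces into C:\My, Videos\holiday, clip, one.mov.
rem Every line contains "clip", so /v is expected to drop all of them.
findstr /l /v "%target%" "%demo%"

echo --- with /c: ---
rem The whole path is one literal search string, so only line 1 should be dropped.
findstr /l /v /c:"%target%" "%demo%"

del "%demo%"
endlocal
```

When all files sit under a common folder whose name contains a space, the first form matches every line, which would explain the blank file in the question.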

An easier way to achieve your objective would be:

In your main routine, before the pushd

:: remove variables starting $
FOR  /F "delims==" %%b In ('set $ 2^>Nul') DO SET "%%b="

where $ can be any string you like, such as #opened. This line clears every variable whose name starts with $.

And in :openRandomFile, after setting randomNum,

if defined $%randomNum% goto openRandomFile
set "$%randomNum%=Y"

which establishes a flag $%randomnum% to prevent the number from being processed more than once.

With this in place, the code that rewrites the temp files can be removed, along with the decrement of cnt.
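Put together, the flag approach might look roughly like this. It is only a sketch built from the question's script, not a tested drop-in: the cap on the number of picks (num) is my addition, there so the retry loop cannot spin forever when fewer than three files exist, and the temp file name uses %random% instead of %time% purely for brevity.

```batch
@echo off
setlocal EnableExtensions

rem Clear any leftover flag variables whose names start with $
for /f "delims==" %%b in ('set $ 2^>nul') do set "%%b="

rem Create a numbered list of files in a temporary file, as in the question
set "tempFile=%temp%\%~nx0_fileList_%random%.txt"
pushd "%~1" || exit /b 1
dir /b /s /a-d *.mov *.mp4 2>nul | findstr /n "^" >"%tempFile%"
popd

rem Count the files
set "cnt=0"
for /f %%N in ('type "%tempFile%" ^| find /c /v ""') do set "cnt=%%N"

rem Assumption: never ask for more picks than there are files
set "num=3"
if %cnt% lss 3 set "num=%cnt%"
for /l %%N in (1 1 %num%) do call :openRandomFile

del "%tempFile%"
exit /b

:openRandomFile
set /a "randomNum=(%random% %% cnt) + 1"
rem Retry when this line number has been picked before
if defined $%randomNum% goto openRandomFile
set "$%randomNum%=Y"
for /f "tokens=1* delims=:" %%A in (
  'findstr "^%randomNum%:" "%tempFile%"'
) do (
    echo Selection: %%B
    start "" "%%B"
)
exit /b
```

Because the list file is never modified, the line numbers stay valid and cnt stays constant, so every retry draws from the full range and the flags alone guarantee uniqueness.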

CodePudding user response:

%RANDOM% may return the same number more than once, which we have to account for. The following adaptation of your code retries the random selection whenever an already chosen file comes up:

@echo off
rem // Explicitly preset configuration:
setlocal EnableExtensions DisableDelayedExpansion

:: Create numbered list of files in a temporary file
set "tempFile=%temp%\%~nx0_fileList_%time::=.%.txt"
set "tempFileTwo=%temp%\%~nx0_fileList2_%time::=.%.txt"
rem // Improve quotation and regard failure if `%1` does not point to a directory:
pushd "%~1" && (
    dir /b /s /a-d *.mov *.mp4 | findstr /n "^" > "%tempFile%"
    popd
) || del "%tempFile%" 2> nul

:: Count the files
set "cnt=0" & for /f %%N in ('type "%tempFile%" ^| find /c /v ""') do set "cnt=%%N"

rem // Regard case when there are less than three files:
set "num=3" & if %cnt% lss 3 set "num=%cnt%"
for /l %%N in (1 1 %num%) do call :openRandomFile

:: Delete the temp files
del "%tempFile%"
del "%tempFileTwo%"

endlocal
exit /b

:openRandomFile
set /a "randomNum=(%random% %% cnt) + 1"
(for /f "tokens=1* delims=:" %%A in (
    'findstr "^%randomNum%:" "%tempFile%"'
) do (
    echo Selection: %%B
    start "" "%%B"
    set /a "cnt -= 1"
    rem /* Ensure to match the whole line, including the colon-separated count prefix,
    rem    and escape `findstr` meta-characters to prevent wrong matches: */
    set "LINE=%%A:%%B"
    setlocal EnableDelayedExpansion
    set "LINE=!LINE:\=\\!" & set "LINE=!LINE:.=\.!"
    set "LINE=!LINE:[=\[!" & set "LINE=!LINE:]=\]!"
    set "LINE=!LINE:^=\^!" & set "LINE=!LINE:$=\$!"
    findstr /V /X /I /C:"!LINE!" "!tempFile!" > "!tempFileTwo!"
    endlocal
    copy /y "%tempFileTwo%" "%tempFile%"
)) || (
    rem /* Detect if `for /F` loop did not iterate, meaning that an already selected
    rem    file would have become chosen once again, hence retry random selection: */
    goto :openRandomFile
)
exit /b

However, I would choose another strategy to get files without duplicates: build a list of files with each one preceded by a random number, sort that list by those numbers, and then take the first few entries from the list.

Here is a possible implementation, using a temporary file to build the list:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

set "_TMPF=%TEMP%\%~nx0_%RANDOM%.lst"
rem // Change into target directory:
pushd "%~1" && (
    rem // Write to temporary file:
    > "%_TMPF%" (
        rem // Build list of matching files with a random number preceded:
        for /F "delims=" %%F in ('dir /S /B /A:-D-H-S "*.mov" "*.mp4"') do (
            set "FILE=%%F"
            setlocal EnableDelayedExpansion
            echo/!RANDOM!:!FILE!
            endlocal
        )
    )
    rem // Sort list of files by preceding random numbers:
    sort "%_TMPF%" /O "%_TMPF%"
    rem // Read from temporary file:
    < "%_TMPF%" (
        rem // Fetch the first three items from the list of files:
        setlocal EnableDelayedExpansion
        for /L %%I in (1,1,3) do (
            set /P FILE="" && (
                echo Selection: !FILE:*:=!
                start "" "!FILE:*:=!"
            )
        )
        endlocal
    )
    rem // Return from target directory:
    popd
    rem // Clean up temporary file:
    del "%_TMPF%" 2> nul
)

endlocal
exit /B

This is another possible approach, using variables like $FILES[] that constitute a pseudo-array:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Change into target directory:
pushd "%~1" && (
    rem // Clean up (pseudo-)array variable:
    for /F "delims==" %%V in ('set "$FILES[" 2^> nul') do set "%%V="
    rem /* Build list of matching files in array with random number as indices
    rem    plus a counter, just to avoid duplicate variable/element names: */
    set /A "INDEX=0"
    for /F "delims=" %%F in ('dir /S /B /A:-D-H-S "*.mov" "*.mp4"') do (
        setlocal EnableDelayedExpansion & for %%R in (!RANDOM!_!INDEX!) do (
            endlocal & set "$FILES[%%R]=%%F" & set /A "INDEX+=1"
        )
    )
    rem /* Fetch the first three elements of the array of files, which becomes
    rem    implicitly sorted by the indices by the `set` command: */
    set /A "INDEX=0"
    for /F "tokens=1* delims=:=" %%E in ('set "$FILES[" 2^> nul') do (
        setlocal EnableDelayedExpansion
        if !INDEX! lss 3 endlocal & (
            echo Selection: %%F
            start "" "%%F"
        ) else endlocal
        set /A "INDEX+=1"
    )
    rem // Return from target directory:
    popd
)

endlocal
exit /B