I am writing a batch script that needs to parse text out of a .csv file, and I have run into a roadblock:
I have a for-loop set up to grab data from each line (this works fine), but one of the values I need spans multiple lines. For example (I placed what I want to be treated as a single entry in parentheses for context):
(data I need,flag_for_which_process_to_run,dontcare,"data I need
data continued
data continued
this could continue for any number of lines",dontcare,dontcare,dontcare,dontcare)
(repeat)
Is there any way to get a batch script to parse this out without breaking the for loop? If it's helpful, the data in %%d is enclosed in double quotes. The code is below; the section I am referring to is the second if inside the for loop.
SETLOCAL EnableDelayedExpansion
for /f "tokens=1,2,3,4 delims=," %%a in (sample.csv) do (
    REM Skip if %%b is not flag1
    if "%%b"=="flag1" (
        .
        .
        .
    )
    REM Skip if %%b is not otherflag
    if "%%b"=="otherflag" (
        REM Set the %%a variable
        set device=%%a
        echo "%%d"> output\tmp\temp.txt
    )
)
Answer:
Assuming that the first three tokens/values are unquoted (so they cannot themselves contain quotation marks or commas) and that the CSV file contains no escape or backspace characters, the following script should extract the values you are interested in. It expects the CSV file as its first command-line argument and simply echoes the values:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1" & rem // (CSV file; `%~1` is first command line argument)
rem // Get carriage-return character:
for /F %%C in ('copy /Z "%~0" nul') do set "_CR=%%C"
rem // Get line-feed character:
(set ^"_LF=^
%= blank line =%
^")
rem // Get escape and back-space characters:
for /F "tokens=1,2" %%E in ('prompt $E$S$H ^& for %%Z in ^(.^) do rem/') do set "_ESC=%%E" & set "_BS=%%F"
set "CONT="
rem // Read CSV file line by line:
for /F usebackq^ delims^=^ eol^= %%L in ("%_FILE%") do (
    rem // Branch for normal lines:
    if not defined CONT (
        rem // Get relevant tokens/values:
        for /F "tokens=1-3* delims=, eol=," %%A in ("%%L") do (
            set "DEVICE=%%A" & set "FLAG=%%B" & set "LINE=%%D"
            if not "%%D"=="%%~D" (
                rem // Fourth token begins with a `"`, hence remove it and enter branch for continued lines then:
                for /F delims^=^ eol^= %%E in ("%%D"^") do set "LINE=%%~E"
                set "DATA=" & set "CONT=#"
            ) else (
                rem // Fourth token does not begin with a '"', hence it cannot be continued:
                for /F "delims=, eol=," %%E in ("%%D") do (
                    rem // Do something with the data, like echoing:
                    echo/
                    echo FLAG=%%B
                    echo DEVICE=%%A
                    echo DATA=%%E
                )
            )
        )
    ) else set "LINE=%%L"
    rem // Branch for continued lines:
    if defined CONT (
        setlocal EnableDelayedExpansion
        rem // Temporarily replace escaped (doubled) `"` with back-space character:
        set "LINE=!LINE:""=%_BS%!"
        rem // Collect continued data with line-breaks replaced by escape characters:
        for /F delims^=^"^ eol^=^" %%D in ("!DATA!%_ESC%!LINE!") do endlocal & set "DATA=%%D"
        setlocal EnableDelayedExpansion
        if not "!LINE!"=="!LINE:"=!^" (
            rem /* There is a single `"` (plus a `,`), which is taken as the end of the continued fourth token;
            rem    hence replacing back line-breaks and (unescaped) `"`: */
            set "DATA=!DATA:*%_ESC%=!" & set "DATA=!DATA:%_BS%="!^"
            for %%E in ("!_CR!!_LF!") do set "DATA=!DATA:%_ESC%=%%~E!"
            rem // Do something with the data, like echoing:
            echo/
            echo FLAG=!FLAG!
            echo DEVICE=!DEVICE!
            echo DATA=!DATA!
            endlocal
            set "CONT="
        ) else endlocal
    )
)
endlocal
exit /B
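
The script above only echoes the parsed values. As a rough sketch of how the echo lines in the continued-lines branch might be adapted to the question's goal (writing the data to output\tmp\temp.txt when the flag is otherflag), something like the following could work. The values assigned to FLAG, DEVICE and DATA here are hypothetical stand-ins for what the parsing loop would provide, and the target path is simply taken from the question:

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem // Hypothetical stand-ins for the values produced by the parsing loop:
set "FLAG=otherflag"
set "DEVICE=device01"
set "DATA=data I need"
rem // Create the target directory (path taken from the question) if it is missing:
if not exist "output\tmp\" md "output\tmp"
rem // Only handle entries that carry the flag of interest:
if "!FLAG!"=="otherflag" (
    rem // `echo(` avoids the "ECHO is off." message in case the value is empty:
    > "output\tmp\temp.txt" echo(!DATA!
)
endlocal
exit /B
```

Delayed expansion (`!DATA!`) is used so that special characters in the value are not re-interpreted by the parser. Note that `>` overwrites the file for every matching entry, just as in the question's code; `>>` would append instead.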