How to parse multi-line value from a csv in batch

Time:12-13

I am writing a batch script that needs to parse text out of a .csv file, and I have run into a roadblock:

I have a for loop set up to grab data from each line (this works fine), but one of the values I need spans multiple lines. For example (I have placed what I want to be treated as a single entry in parentheses for context):

(data I need,flag_for_which_process_to_run,dontcare,"data I need
data continued
data continued
this could continue for any number of lines",dontcare,dontcare,dontcare,dontcare)
(repeat)

Is there any way to get a batch script to parse this out without breaking the for loop? If it's helpful, the data in %%d is enclosed in double quotes. The code is below; the section I am referring to is the second if inside the for loop.

SETLOCAL EnableDelayedExpansion

for /f "tokens=1,2,3,4 delims=," %%a in (sample.csv) do ( 
    REM Skip if %%b is not flag1
    if "%%b"=="flag1" (
        .
        .
        .
    )
    REM Skip if %%b is not otherflag
    if "%%b"=="otherflag" (


        REM Set the %%a variable
        set device=%%a
        echo "%%d"> output\tmp\temp.txt
        
    )
)

CodePudding user response:

Assuming that the first three tokens/values are unquoted (so they cannot contain quotation marks or commas themselves) and that the CSV file contains no escape or backspace characters (the script uses both as temporary placeholders), the following script should extract the values you are interested in. It expects the CSV file as a command-line argument and simply echoes the values:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_FILE=%~1" & rem // (CSV file; `%~1` is first command line argument)
rem // Get carriage-return character:
for /F %%C in ('copy /Z "%~0" nul') do set "_CR=%%C"
rem // Get line-feed character:
(set ^"_LF=^
%= blank line =%
^")
rem // Get escape and back-space characters:
for /F "tokens=1,2" %%E in ('prompt $E$S$H ^& for %%Z in ^(.^) do rem/') do set "_ESC=%%E" & set "_BS=%%F"

set "CONT="
rem // Read CSV file line by line:
for /F usebackq^ delims^=^ eol^= %%L in ("%_FILE%") do (
    rem // Branch for normal lines:
    if not defined CONT (
        rem // Get relevant tokens/values:
        for /F "tokens=1-3* delims=, eol=," %%A in ("%%L") do (
            set "DEVICE=%%A" & set "FLAG=%%B" & set "LINE=%%D"
            if not "%%D"=="%%~D" (
                rem // Fourth token begins with a `"`, hence remove it and enter branch for continued lines then:
                for /F delims^=^ eol^= %%E in ("%%D"^") do set "LINE=%%~E"
                set "DATA=" & set "CONT=#"
            ) else (
                rem // Fourth token does not begin with a '"', hence it cannot be continued:
                for /F "delims=, eol=," %%E in ("%%D") do (
                    rem // Do something with the data, like echoing:
                    echo/
                    echo FLAG=%%B
                    echo DEVICE=%%A
                    echo DATA=%%E
                )
            )
        )
    ) else set "LINE=%%L"
    rem // Branch for continued lines:
    if defined CONT (
        setlocal EnableDelayedExpansion
        rem // Temporarily replace escaped (doubled) `"` with back-space character:
        set "LINE=!LINE:""=%_BS%!"
        rem // Collect continued data with line-breaks replaced by escape characters:
        for /F delims^=^"^ eol^=^" %%D in ("!DATA!%_ESC%!LINE!") do endlocal & set "DATA=%%D"
        setlocal EnableDelayedExpansion
        if not "!LINE!"=="!LINE:"=!^" (
            rem /* There is a single `"` (plus a `,`), which is taken as the end of the continued fourth token;
            rem    hence replacing back line-breaks and (unescaped) `"`: */
            set "DATA=!DATA:*%_ESC%=!" & set "DATA=!DATA:%_BS%="!^"
            for %%E in ("!_CR!!_LF!") do set "DATA=!DATA:%_ESC%=%%~E!"
            rem // Do something with the data, like echoing:
            echo/
            echo FLAG=!FLAG!
            echo DEVICE=!DEVICE!
            echo DATA=!DATA!
            endlocal
            set "CONT="
        ) else endlocal
    )
)

endlocal
exit /B
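
As a usage sketch, the script could be called from a small wrapper like the one below. It assumes the script above was saved as parse-csv.bat in the same folder as the wrapper (a hypothetical name, not from the question) and reuses the sample.csv and output\tmp\temp.txt paths from the question. Note that this captures everything the script echoes, for all flags, in one file:

```batch
@echo off
setlocal EnableExtensions

rem // Assumption: the parser script above is saved as parse-csv.bat next to this wrapper.
set "_PARSER=%~dp0parse-csv.bat"

rem // Create the output folder from the question if it does not exist yet:
if not exist "output\tmp\" md "output\tmp"

rem // Run the parser on sample.csv and redirect its echoed output into a file:
call "%_PARSER%" "sample.csv" > "output\tmp\temp.txt"

endlocal
exit /B
```

To keep only entries with a particular flag, the echo lines in both branches of the parser would need to be wrapped in an additional if comparison on the flag value.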