I found a bug in Batch. How can I circumvent this bug?

Time:08-05

CMD is misinterpreting code on the false side of an if statement, resulting in a crash.

Here is some test code, which fails should the end user enter y or Y:

@Echo Off

Set "var="
Set "input="

:YorN
Set /P "input=Leave var empty? [Y(crash)|N]"
(Set input) 2>NUL | %SystemRoot%\System32\findstr.exe /I /L /X "input=Y input=N" 1>NUL
If ErrorLevel 1 GoTo YorN
 
If /I "%input%" == "n" Set "var=content1;content2;"

If Not "%var%" == "" (
    For /F "Tokens=1,2 Delims=;" %%G In ("%var:~0,-1%") Do If Not "%%G" == "" Echo "%%G" "%%H"
) Else (
    Echo As per your choosing, var is empty. Because of the if  statement the "for" command didn't get interpreted and CMD didn't crash. You will not see this message.
)

Pause
Exit /B

This version, however, with only one minor line-break change, works as intended.

@Echo Off

Set "var="
Set "input="

:YorN
Set /P "input=Leave var empty? [Y(crash)|N]"
(Set input) 2>NUL | %SystemRoot%\System32\findstr.exe /I /L /X "input=Y input=N" 1>NUL
If ErrorLevel 1 GoTo YorN
 
If /I "%input%" == "n" Set "var=content1;content2;"

If Not "%var%" == "" (
    For /F "Tokens=1,2 Delims=;" %%G In ("%var:~0,-1%"
    ) Do If Not "%%G" == "" Echo "%%G" "%%H"
) Else (
    Echo As per your choosing, var is empty. Because of the if  statement the "for" command didn't get interpreted and CMD didn't crash. You will see this message.
)

Pause
Exit /B

Could somebody please explain to me what is causing this issue, or confirm that this is a bug in cmd.exe?

CodePudding user response:

There is no bug, but the behavior is not obvious.

A minimal example shows the problem.

@echo off

set "var="
set "other=content"
echo First char of var is "%var:~0,1%" my other var=%other%

You get:

First char of var is "~0,1other

If you add any text to var it works as expected.
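
The contrast can be reproduced with a small sketch that extends the minimal example above. The value abc is an arbitrary assumption chosen only for illustration.

```batch
@echo off

set "var="
set "other=content"
rem var is undefined, so the rest of the line gets mangled
echo First char of var is "%var:~0,1%" my other var=%other%

rem abc is an arbitrary value chosen for this illustration
set "var=abc"
rem var is defined now, so the sub-string expansion works
echo First char of var is "%var:~0,1%" my other var=%other%
```

The first echo should print the mangled line shown above, while the second should print First char of var is "a" my other var=content.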

The variable var is undefined, not empty! The problem is the expansion rule for undefined variables: the parser aborts the variable expansion if it finds a colon in the expression while the variable is undefined.
In this case the parser ignores (and removes) the variable expansion after reading %var:.
But then the parser looks at the remainder of the line, ~0,1%" my other var=%other%.
It splits the line into:

  1. ~0,1 -- Normal text
  2. %" my other var=% -- This is a percent expansion of the variable with the name " my other var=, this variable is undefined, cmd.exe removes the complete part
  3. other -- Normal text
  4. % -- The trailing opening percent is removed, because there is no other percent sign

In your more complex example, the line feed appears to fix the problem, because the part If Not "%%G" ... is on a separate line, so the first percent sign of %%G is no longer taken as the closing percent sign of the expression %") Do If Not "%.
In the failing case it gets worse: the closing parenthesis of the FOR block is removed, so the FOR command scans the rest of the file for a closing parenthesis, and the result is complete nonsense.

For better understanding you could read: SO: Percent Expansion Rules from @dbenham

CodePudding user response:

What you have found (referring to revision 13 of your question) is something that I consider a bug (or at least a terrible design flaw) – but the problem is neither the if nor the for statement, it is the sub-string syntax:

If you follow the percent expansion rules very carefully, you may notice that sub-string expansion (%VAR:~[integer][,[integer]]%) or sub-string substitution (%VAR:[*]search=[replace]%) behaves oddly when the variable VAR is not defined. Here is an excerpt of that post containing the most relevant sections:

Phase 1) Percent Expansion Starting from left, scan each character for % or <LF>. If found then

  • 1.05 (truncate line at <LF>)
  • If the character is <LF> then
    • Drop (ignore) the remainder of the line from the <LF> onward
    • Goto Phase 2.0
  • Else the character must be %, so proceed to 1.1
  • 1.1 (escape %) skipped if command line mode
  • […]
  • 1.2 (expand argument) skipped if command line mode
  • […]
  • 1.3 (expand variable)
  • […]
  • Else if command extensions are enabled then
    […]
    • If next character is % then
      […]
    • Else if next character is : then
      • If VAR is undefined then
        • If batch mode then
          Remove %VAR: and continue scan.
        • […]
      • […]
  • 1.4 (strip %)
    • […]

Applying this to your code portions, we can conclude the following:

  • First code portion:

    If var is not defined, %var:~0,-1% becomes parsed to ~0,-1%, because of Remove %VAR: and continue scan in the above excerpt of Phase 1, leaving behind the remaining command line ~0,-1%") Do If Not "%%G" == "" Echo "%%G" "%%H", which is interpreted as:

    • the literal string ~0,-1,
    • the (undefined) variable %") Do If Not "% (which is stripped),
    • the (undefined) variable %G" == "" Echo "% (which is stripped),
    • the (undefined) variable %G" "% (which is stripped),
    • and the remainder %H" (from which the single %-sign is removed),

    constituting the command line For /F "Tokens=1,2 Delims=;" %G In ("~0,-1H" after Phase 1. The next line ) Else ( provides the expected closing parenthesis, but Do is expected after it rather than Else. That is why the specific error message Else was unexpected at this time. arises.

    Hence the %-sign in the fragment ~0,-1% is considered as an opening one for another (yet undefined) variable, impairing the whole remainder of the command line.

  • Second code portion:

    If var is not defined, %var:~0,-1% also becomes parsed to ~0,-1%, because of Remove %VAR: and continue scan in the above excerpt of Phase 1, but leaving behind the remaining command line ~0,-1%", resulting in just the literal string ~0,-1" (with the %-sign stripped).

    In this situation, however, a line-break (<LF>) follows, so Phase 1 ends because of Goto Phase 2.0, and parsing starts afresh with Phase 1 on the next line.

The key to all this is the statement and continue scan, meaning that from that point on, the next detected %-sign is recognised as an opening one.
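
Because cmd.exe only parses a line (or a parenthesised block) when execution reaches it, one possible way around the problem is to move the sub-string expansion into a subroutine that is called only when var is defined. The following is a minimal sketch under that assumption, not a definitive fix. The label :ShowTokens is a hypothetical name, and the prompt from the question is replaced by a plain Set for brevity.

```batch
@Echo Off
SetLocal EnableExtensions

Set "var="
Rem To test the defined case, replace the line above with:
Rem Set "var=content1;content2;"

Rem This block contains no sub-string expansion, so it is parsed
Rem safely whether var is defined or not.
If Defined var (
    Call :ShowTokens
) Else (
    Echo As per your choosing, var is empty.
)

Exit /B

:ShowTokens
Rem This line is only parsed when the subroutine is called,
Rem which happens only while var is defined.
For /F "Tokens=1,2 Delims=;" %%G In ("%var:~0,-1%") Do If Not "%%G" == "" Echo "%%G" "%%H"
Exit /B
```

With var left undefined, the For line is never parsed and only the Else branch runs. The Rem lines deliberately contain no percent signs, since percent expansion also applies to comment lines.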
