I am seeing an annoying trailing double quote when passing in a quoted directory in Windows to Python.
This is my Python test program, print_args.py:
import sys
print(sys.argv)
This is what I get when I run it from the command line. Note that the quoted directory is the standard format generated by tab completion in the Windows shell. The double quotes are needed because of the spaces in the path.
>py print_args.py -test "C:\Documents and Settings\"
['print_args.py', '-test', 'C:\\Documents and Settings"']
The trailing backslash has been replaced with a double quote, presumably because Python is reading the backslash and quote together as an escaped double quote, rather than matching the quote to the leading one.
If instead of passing the parameter to Python, I pass it to a batch script which just echoes it, then I get the trailing backslash as expected.
So Python seems to be parsing the argument as a string literal rather than a string input. Is that correct?
Can anyone illuminate?
Edited to add:
Further reading suggests to me that Python is doing some significant parsing of the Windows command line arguments to construct sys.argv. I think Windows passes the entire command line string, in this case mostly unchanged, to Python, which then uses its own internal logic to break it into the strings in the sys.argv list. This processing must allow escaped double quotes as a special case. I would be pleased to see some documentation or the code...
CodePudding user response:
You need to double the trailing backslash. When the command line is split into arguments, a backslash immediately before a double quote acts as an escape character, so \" passes a literal " instead of closing the quoted string. Backslashes elsewhere in the path do not need doubling. The following works fine for me:
PS> py print_args.py -test "C:\Documents and Settings\\"
Output:
['print_args.py', '-test', 'C:\\Documents and Settings\\']
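If you are building the command line from another Python script, you can let the standard library do the quoting rather than doubling backslashes by hand. This is only a sketch: subprocess.list2cmdline exists in CPython but is not part of the documented public API, so treat relying on it as an assumption.

```python
import subprocess

# The argument we want the child process to receive, trailing backslash included.
args = ["py", "print_args.py", "-test", "C:\\Documents and Settings\\"]

# list2cmdline quotes arguments following the Microsoft C runtime rules,
# so the backslash before the closing quote is doubled automatically.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
# py print_args.py -test "C:\Documents and Settings\\"
```

Only the backslash that sits directly before the closing quote is doubled; the one after C: is left alone.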
CodePudding user response:
Command-line processing on Windows is not completely standardised, but Python, like many other programs, relies on the Microsoft C runtime behaviour. This is specified, for example, in Microsoft's documentation on parsing C command-line arguments. It says:
A string surrounded by double quote marks is interpreted as a single argument, which may contain white-space characters. [...] If the command line ends before a closing double quote mark is found, then all the characters read so far are output as the last argument.
A double quote mark preceded by a backslash (\") is interpreted as a literal double quote mark (").
The second of these two rules prevents the second double quote being read as terminating the argument - instead a double quote is appended. Then the first rule allows the argument to end without a terminating double quote.
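To make the interaction of the two rules concrete, here is a simplified Python sketch of that splitting logic. It is not the C runtime's actual code: the function name split_windows_args is made up for illustration, and the sketch ignores the special treatment of the program name and of doubled double quotes inside a quoted string.

```python
def split_windows_args(cmdline):
    """Simplified sketch of the Microsoft C runtime argument rules."""
    args = []
    current = []
    in_quotes = False
    have_arg = False  # lets an empty "" still count as an argument
    i = 0
    n = len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Count the whole run of backslashes.
            start = i
            while i < n and cmdline[i] == "\\":
                i += 1
            count = i - start
            if i < n and cmdline[i] == '"':
                # Before a quote, each pair of backslashes yields one backslash.
                current.append("\\" * (count // 2))
                if count % 2:
                    current.append('"')  # odd count: the quote is literal
                else:
                    in_quotes = not in_quotes  # even count: the quote delimits
                i += 1
            else:
                # Not followed by a quote: backslashes are taken literally.
                current.append("\\" * count)
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    # The command line may end inside an open quote; keep what was read so far.
    if have_arg:
        args.append("".join(current))
    return args


print(split_windows_args(r'print_args.py -test "C:\Documents and Settings\"'))
# ['print_args.py', '-test', 'C:\\Documents and Settings"']
print(split_windows_args(r'print_args.py -test "C:\Documents and Settings\\"'))
# ['print_args.py', '-test', 'C:\\Documents and Settings\\']
```

The first call reproduces the output from the question; the second shows that doubling only the final backslash lets the closing quote terminate the argument.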
Note that this section also says
The command line parsing rules used by Microsoft C/C++ code are Microsoft-specific.
This is even more confusing when using PowerShell.
PS> py print_args.py -test 'C:\Documents and Settings\'
['print_args.py', '-test', 'C:\\Documents and Settings"']
Here PowerShell's own parsing preserves the final backslash and drops the single quotes. It then adds double quotes (because of the spaces in the path) before passing the command line to the C runtime, which parses it according to the rules above and treats the closing double quote added by PowerShell as escaped.
However, this all does conform to the documented behaviour and is not a bug.
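If you want to check what the shell actually handed to the process before the C runtime split it, one option is to ask Windows for the raw command line. The sketch below assumes the Win32 GetCommandLineW function called through ctypes; the helper name raw_command_line is made up, and it simply returns None on other platforms.

```python
import sys


def raw_command_line():
    """Return the unparsed command line on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    import ctypes

    get_command_line = ctypes.windll.kernel32.GetCommandLineW
    get_command_line.restype = ctypes.c_wchar_p  # returns a wide string pointer
    return get_command_line()


print(raw_command_line())  # the string as received, before argument splitting
print(sys.argv)            # the same input after argument splitting
```

Comparing the two printed lines for the PowerShell and cmd invocations shows where the quotes were added and where the backslash was consumed.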