Array not recognized by powershell parser when other operators are involved

Time:07-30

Assigning an array looks like this:

PS> $x = "a", "b"
PS> $x
a
b

Now, I wanted to prepend a 'root string' ("r") to each element, so I did this (I actually used a variable, but for the sake of simplicity let's just use a string here):

PS> $x = "r" + "a" , "r" + "b"
PS> $x
ra rb

Looking at the output, I didn't get the array I expected, but a single string containing a space (I checked: it's ASCII character 32, so a space, not a tab or another character).
That is, the comma seems to be interpreted as a string join operator, which I couldn't find any reference to. Even worse, I get the feeling that I don't understand how the parser works here. I had a look at about_Parsing, but what I found there does not seem to apply to this case:

Commas (,) introduce lists passed as arrays, except when the command to be called is a native application, in which case they are interpreted as part of the expandable string. Initial, consecutive or trailing commas are not supported.

The first obvious fix that I came up with is the following:

PS> $x = ("r" + "a") , ("r" + "b")
PS> $x
ra
rb

Maybe there are others, and I am especially interested in the ones that reveal how the parser actually works. What I would most like to fix is my knowledge of the parsing rules.

CodePudding user response:

To flesh out the helpful comments on the answer:

tl;dr

  • Due to operator precedence, your command is parsed as "r" + ("a" , "r") + "b", causing array "a", "r" to be implicitly stringified to verbatim a r, resulting in two string concatenation operations yielding a single string with verbatim content ra rb.

  • Using (...) is indeed the correct way to override operator precedence.


"r" + "a" , "r" + "b"

  • is an expression involving operators.

    • Expressions are parsed in expression mode, which contrasts with argument mode; the latter applies to commands, i.e. named units of functionality that are called with shell-typical syntax (whitespace-separated arguments, quotes around simple strings optional). Arguments (parameter values) in argument mode are parsed differently from operands in expression mode, as explained in the conceptual about_Parsing help topic. Your quote about , relates to argument mode, not expression mode.
  • The conceptual about_Operator_Precedence help topic describes the relative precedence among operators, from which you can glean that ,, the array constructor operator, has higher precedence than the + operator.

Therefore, your expression is parsed as follows (using (...), the grouping operator, to make the implicit rules explicit):

"r" + ("a" , "r") + "b"
  • + is polymorphic in PowerShell, and with a [string] instance as the LHS the RHS is coerced to a string too.

  • Therefore, array "a" , "r" is stringified, which uses PowerShell's custom array stringification, namely joining the (potentially stringified) array elements with a space.[1]

    • That is, the array stringifies to a string with verbatim content a r.
    • As an aside: The same stringification is applied in the context of string interpolation via expandable (double-quoted) strings ("..."); that is, "$("a", "r")" also yields verbatim a r

Therefore, the above is equivalent to:

"r" + "a r" + "b"

which yields verbatim ra rb.
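
The evaluation steps described above can be retraced one at a time. The following is a minimal sketch, assuming $OFS has not been changed in the session; the variable names ($inner, $step, $manual, $original) are made up for illustration only.

```powershell
# Step 1: the comma operator binds tighter than +, so this array is built first.
$inner = "a", "r"

# Step 2: with a [string] on the left, + stringifies the array
# (elements joined with a space by default) and concatenates.
$step = "r" + $inner

# Step 3: the remaining concatenation appends "b".
$manual = $step + "b"

# The original, unparenthesized expression, for comparison.
$original = "r" + "a", "r" + "b"

$manual -eq $original       # True
$original.GetType().Name    # String
```

Both routes produce the same single string, which suggests the comma is not acting as a join operator; the space comes from the array being converted to a string.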

(...) is indeed the appropriate way to ensure the desired precedence:

("r" + "a"), ("r" + "b")  # -> array 'ra', 'rb'

[1] Space is the default separator character. Technically, you can override it via the $OFS preference variable, though that is rarely used in practice.
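
To illustrate the footnote, here is a small sketch of overriding the separator for string interpolation. It assumes $OFS is not already defined in the session; $items, $default and $custom are illustrative names. The override is made inside a script block so that it stays in a child scope and does not leak into the rest of the session.

```powershell
$items = "a", "r"

# Default behavior: with $OFS undefined, elements are joined with a single space.
$default = "$items"

# Override $OFS only inside a child scope.
$custom = & {
    $OFS = "-"
    "$items"
}

$default    # a r
$custom     # a-r
```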

CodePudding user response:

Another way to do it. The type of the first term controls what type of operation the plus performs. The first term here is an empty array. If you want the plus to do both kinds of operations, there's no getting around extra parentheses to change the operator precedence.

@() + 'ra' + 'rb'

ra
rb

Or more commonly:

'ra','rb' + 'rc'

ra
rb
rc
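
As a rough side-by-side sketch of how the type of the first term decides what + does (the variable names are illustrative only):

```powershell
# Array on the left: + appends each operand and returns a new array.
$asArray = @() + 'ra' + 'rb'

# String on the left: + concatenates everything into one string.
$asString = '' + 'ra' + 'rb'

# The comma binds tighter than +, so this is ('ra','rb') + 'rc'.
$appended = 'ra','rb' + 'rc'

$asArray.Count     # 2
$asString          # rarb
$appended.Count    # 3
```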