I came across some weird behavior in jq
involving a variable binding on the left-hand side of a pipe.
For context, this question was inspired by the jq
manual: under Scoping
(https://stedolan.github.io/jq/manual/#Advancedfeatures) it gives the example filter ... | .*3 as $times_three | [. + $times_three] | ...
I believe the correct version is ... | (.*3) as $times_three | [. + $times_three] | ...
First (https://jqplay.org/s/ffMPsqmsmt)
filter:
. * 3 as $times_three | .
input:
3
output:
9
Second (https://jqplay.org/s/yOFcjRAMLL)
filter:
. * 4 as $times_four | .
input:
3
output:
9
What is happening here?
But (https://jqplay.org/s/IKrTNZjKI8)
filter:
(. * 3) as $times_three | .
input:
3
output:
3
And (https://jqplay.org/s/8zoq2-HN1G)
filter:
(. * 4) as $times_four | .
input:
3
output:
3
So if parentheses are used when the variable is declared, as in (. * 3)
or (. * 4),
then the filter behaves predictably.
But if parentheses are not used, as in . * 3
or . * 4,
then strangely the output is 9 for both.
Can you explain?
CodePudding user response:
Contrary to what the examples in the Scoping section assume, . * 4 as $times_four | .
is equivalent to . * ( 4 as $times_four | . )
and therefore squares its input.
You might expect
. * 4 as $times_four | .
to be equivalent to
( . * 4 ) as $times_four | .
And as you point out, some examples even suggest this is the case. However, the first snippet is actually equivalent to the following:
. * ( 4 as $times_four | . )
And since … as $x | .
produces its context [1], that's the same as
. * ( . | . )
or
. * .
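The input 3 makes the effect easy to miss, because 3 * 3 and 3 squared coincide. A minimal sketch with input 5 separates the two readings; it assumes a local jq binary and uses -n so that the input is supplied inside the filter itself:

```shell
# Without parentheses the binding swallows the rest of the pipe,
# so the input is multiplied by itself.
jq -n '5 | . * 4 as $times_four | .'            # 25
jq -n '5 | . * (4 as $times_four | .)'          # 25, the explicit equivalent

# The variable itself holds 4, not 5 * 4.
jq -n '5 | . * 4 as $times_four | $times_four'  # 20, i.e. 5 * 4

# With parentheses the product is bound and the input passes through.
jq -n '5 | (. * 4) as $times_four | .'          # 5
```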
jq's operator precedence is inconsistent and/or quirky.
"def" | "abc" + "def" | length
means "def" | ( "abc" + "def" ) | length
, but "def" | "abc" + "def" as $x | length
means "def" | "abc" + ( "def" as $x | length )
.
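The difference between the two groupings can be checked directly. This is a sketch assuming a local jq binary; the second and third commands should fail at runtime, since they end up adding a string to the number 3 (the length of the input "def"):

```shell
# Plain pipe: the concatenation is evaluated first, then measured.
jq -n '"def" | "abc" + "def" | length'                  # 6

# With a binding, everything after `as $x |` belongs to the right
# operand of +, so this tries to compute "abc" + 3.
jq -n '"def" | "abc" + "def" as $x | length' 2>/dev/null \
  || echo "type error, as expected"

# The explicit grouping fails the same way.
jq -n '"def" | "abc" + ("def" as $x | length)' 2>/dev/null \
  || echo "type error, as expected"
```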
This behaviour suggests that as
isn't a binary operator of the form X as $Y
, as one might expect, but a ternary operator of the form X as $Y | Z
.
And, in fact, this is how it's documented:
Variable / Symbolic Binding Operator:
... as $identifier | ...
This leads to surprises, especially since it binds a lot more tightly than expected, which apparently also caught out whoever authored the examples in the Scoping section.
[1] It might produce it multiple times, e.g. with
.[] as $x
.
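To illustrate the footnote, here is a small sketch (assuming a local jq binary; -c only makes the output compact). The body after the pipe runs once per value produced by the left side of as, each time with the original input as its context:

```shell
# The context [1,2] is emitted once per element bound to $x.
jq -c -n '[1, 2] | .[] as $x | .'
# [1,2]
# [1,2]

# Emitting the variable instead shows the individual bindings.
jq -c -n '[1, 2] | .[] as $x | $x'
# 1
# 2
```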
CodePudding user response:
Indeed, there seems to be a mistake in the manual. The Scoping section contrasts the (faulty) examples
... | .*3 as $times_three | [. + $times_three] | ... # faulty!
and
... | (.*3 as $times_three | [. + $times_three]) | ... # faulty!
While the overall statement about scoping stays valid, both examples are missing additional parentheses around .*3
. Thus, they should actually read
... | (.*3) as $times_three | [. + $times_three] | ...
and
... | ((.*3) as $times_three | [. + $times_three]) | ...
respectively.
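As a quick check of the corrected form, here is a sketch that replaces the elided "..." parts with a literal input of 3 (that substitution is my own, not the manual's, and a local jq binary is assumed). As far as I can tell, the uncorrected form does not even run for a numeric input, because it ends up multiplying a number by an array:

```shell
# Corrected: $times_three is 9, so the result is [3 + 9].
jq -c -n '3 | (.*3) as $times_three | [. + $times_three]'   # [12]

# Uncorrected: parsed as 3 * (3 as $times_three | [. + $times_three]),
# i.e. 3 * [6], which jq rejects.
jq -c -n '3 | .*3 as $times_three | [. + $times_three]' 2>/dev/null \
  || echo "type error, as expected"
```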
From the manual under section Variable / Symbolic Binding Operator:
The expression
exp as $x | ...
means: for each value of expression exp
, run the rest of the pipeline with the entire original input, and with $x
set to that value. Thus as
functions as something of a foreach loop.
This means that a variable assignment takes the one expression left of as
and assigns its evaluation to the defined variable right of as
(and this happens as many times as exp
produces an output). But, as everything in jq is a filter, the assignment itself also is, and as such it needs to have an output itself. If you look closely, the full title of that section
Variable / Symbolic Binding Operator:
... as $identifier | ...
also features a pipe symbol next to it, which indicates that it belongs to the assignment's structure. Try just running . as $x
. You will get an error because the | ...
part is missing. Thus, to simply keep the input context as is (apart from maybe duplicating it as many times as the expression left of as
produced an output), a complete assignment would rather look like … as $x | .
, or, if the input context is what you wanted to capture in the variable, . as $x | .
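The points in this paragraph can be tried out as follows. This is only a sketch, assuming a local jq binary, with the literal input 3 and the values (1, 2) picked purely for illustration:

```shell
# A binding without the trailing `| ...` part does not compile.
jq -n '3 | . as $x' 2>/dev/null || echo "syntax error, as expected"

# A complete binding that simply passes the input through.
jq -n '3 | . as $x | .'             # 3

# The body runs once per value of the left-hand expression,
# each time with the original input 3 as its context.
jq -n '3 | (1, 2) as $x | . + $x'
# 4
# 5
```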
That said, let's clarify what happens with your examples by putting explicit parentheses around the assignments:
3 | . * 3 as $times_three | .
3 | . * (3 as $times_three | .)
3 | . * . # with $times_three set to 3
3 * 3 # with $times_three set to 3
9 # with $times_three set to 3

3 | . * 4 as $times_four | .
3 | . * (4 as $times_four | .)
3 | . * . # with $times_four set to 4
3 * 3 # with $times_four set to 4
9 # with $times_four set to 4

3 | (. * 3) as $times_three | .
3 | ((. * 3) as $times_three | .)
3 | ((3 * 3) as $times_three | .)
3 | (9 as $times_three | .)
3 | . # with $times_three set to 9
3 # with $times_three set to 9

3 | (. * 4) as $times_four | .
3 | ((. * 4) as $times_four | .)
3 | ((3 * 4) as $times_four | .)
3 | (12 as $times_four | .)
3 | . # with $times_four set to 12
3 # with $times_four set to 12