Can you help me understand how R interprets square brackets with forms such as y[i:j - k]?
Dummy data:
y <- c(1, 2, 3, 5, 7, 8)
Here's what I do understand:
y[i] is the ith element of vector y.
y[i:j] is the ith to jth elements (inclusive) of vector y.
y[-i] is vector y without the ith element.
And so on.
However, what I don't understand is what happens when you start mixing these options, and I haven't found a good resource for explaining it.
For example:
y[1-1:4]
[1] 5 7 8
So y[1-1:4] returns the vector without the first three elements. But why?
and
y[1-4]
[1] 1 2 5 7 8
So y[1-4] returns the vector without the third element. Is that because 1-4 = -3 and R is interpreting it the same as y[-3]? If so, that doesn't seem consistent with my previous example, where y[1-1:4] would presumably be interpreted as y[0:4], but that isn't the case.
and
y[1:1+2-1]
[1] 2
Why does this return the second element? I encountered this while I was trying to code something along the lines of y[i:i+j - k], and it took me a while to figure out that I should write y[i:(i+j - k)] so that the parentheses capture the whole of the right-hand side of the colon. But I still can't figure out what logic R was following when I didn't have those parentheses.
Thanks!
CodePudding user response:
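It helps to look more closely at operator precedence and at the integer sequences you use for subsetting. The expression inside [] is evaluated first, and only its result is used for subsetting. The : operator binds more tightly than binary -, so 1-1:4 is parsed as 1 - (1:4). In other words, - is a function with two arguments (1 and 1:4), the sequence 1:4 is built before the subtraction, and so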
> 1-1:4
[1] 0 -1 -2 -3
Negative indices in [] mean exclusion of the corresponding elements. There is no element at index 0: subsetting with 0 alone returns an empty vector of the same type (numeric(0) here), and a 0 mixed in with negative indices is simply ignored. We thus expect y[1-1:4], i.e. y[c(0, -1, -2, -3)], to drop the first three elements of y and return the remainder.
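To make this concrete, here is a small sketch using the y from the question. The variable names (idx, res_mixed, res_explicit, res_zero) are only illustrative. It shows the index vector being built first, and then how [] treats the zero and the negative values.

```r
y <- c(1, 2, 3, 5, 7, 8)

# ':' binds more tightly than binary '-', so this is 1 - (1:4)
idx <- 1 - 1:4
print(idx)          # 0 -1 -2 -3

# The zero is ignored; the negative values drop elements 1 to 3
res_mixed <- y[idx]
print(res_mixed)    # 5 7 8

# Same result with the negative indices spelled out
res_explicit <- y[-(1:3)]
print(res_explicit) # 5 7 8

# A zero on its own selects nothing
res_zero <- y[0]
print(res_zero)     # numeric(0)
```

Printing the index expression on its own, as done with idx here, is usually the quickest way to see what [] actually receives.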
As you correctly write, y[1-4] is y[-3], i.e. omission of the third element.
Similarly, in 1:1+2-1 the sequence 1:1 is evaluated first and gives the one-element vector 1; the rest is simple arithmetic (1 + 2 - 1 = 2), so y[1:1+2-1] is y[2].
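The same rule explains why parentheses around the right-hand side of the colon change the result. The sketch below uses made-up values for i, j and k (they are assumptions for illustration, not taken from the question) and compares the two forms.

```r
y <- c(1, 2, 3, 5, 7, 8)

# Hypothetical values, chosen only for illustration
i <- 2
j <- 3
k <- 1

# Without parentheses: (i:i) + j - k, a single value shifted by j - k
no_parens <- i:i + j - k
print(no_parens)      # 4

# With parentheses: the sequence from i up to i + j - k
with_parens <- i:(i + j - k)
print(with_parens)    # 2 3 4

print(y[no_parens])   # 5
print(y[with_parens]) # 2 3 5
```

When in doubt, wrapping both ends of the colon in parentheses costs nothing and makes the intended range explicit.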
For more on operator precedence, see Hadley's excellent book.