Unexpected behavior of .I in dt[.I,]

Time:05-25

From the data.table help, I understood .I to be more or less equivalent to seq_len(nrow(dt)). But it doesn't seem to behave that way when used in the i argument of a data.table, i.e. dt[i, ...]. In the following example I am trying to flag (set the value of a column for) all rows after the 10th:

library(data.table)
data(iris)
setDT(iris) # convert iris to a data.table by reference
iris[               .I   > 10L, newCol:='10 !'] # doesn't work
iris[seq_len(nrow(iris)) > 10L, newCol:='10 !'] # this works

Why doesn't .I work here?

CodePudding user response:

From the documentation:

.SD, .BY, .N, .I, .GRP, and .NGRP are read-only symbols for use in j. .N can be used in i as well. .I can be used in by as well. See the vignettes, Details and Examples here and in data.table. .EACHI is a symbol passed to by; i.e. by=.EACHI.
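To make the quoted rule concrete, here is a minimal sketch of the two uses it does allow: .I inside j, and .N inside i. The names dt, idx and last are only illustrative, and as.data.table() is used so that the built-in iris is left untouched.

```r
library(data.table)

# Work on a copy so the built-in iris data.frame is not modified
dt <- as.data.table(iris)

# .I in j: row numbers are available, here the first and last row of each group
idx <- dt[, .(first_row = .I[1L], last_row = .I[.N]), by = Species]

# .N in i: the number of rows, so this selects the last row
last <- dt[.N]
```

Here idx holds one row per species with the row numbers that bound each group, and last is the final row of the table.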

You could achieve the desired behavior by storing .I in a temporary column and chaining:

data(iris)
setDT(iris)
iris[, .i := .I][.i > 10L, newCol := '10 !'][, .i := NULL]

Be careful not to name a column .I in a data.table, because trying to drop it afterwards will result in an error; hence iris[, .I := .I][, .I := NULL] won't work.

CodePudding user response:

Ben373 provides a nice snippet from the documentation, and a working solution. In addition, in this case one could also do:

iris[, newCol := fifelse(.I > 10L, '10 !', NA_character_)]
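Another option, sketched here on the assumption that .N may appear inside an expression in i (the documentation quoted above only says that .N can be used in i), is to build the row index from .N instead of .I. The name dt is illustrative.

```r
library(data.table)

# Work on a copy so the built-in iris data.frame is not modified
dt <- as.data.table(iris)

# seq_len(.N) gives 1..nrow(dt), so the logical vector is TRUE from row 11 on;
# rows 1 to 10 are not selected and keep NA in the new column
dt[seq_len(.N) > 10L, newCol := '10 !']
```

This avoids both the temporary column and the explicit nrow(dt) call, at the cost of relying on .N being resolved in i.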