Is it possible to get the indices of the unselected rows of a data frame in R?

I want the indices of the unselected rows when using sample() in R. Consider the following case.

df <- data.frame(id = c(1,1,2,2,3,3),
                 v1 = c(2,2,9,4,7,1),
                 v2 = c(3,5,8,5,8,5))
ss  <- ceiling(0.5*nrow(df)) #size
set.seed(123)
rid <- sample(seq_len(nrow(df)), size = ss, replace = FALSE)

Now rows 3, 6, and 2 are randomly selected. Is there a way to get the indices of the unselected rows (1, 4, 5)?

Thanks!

CodePudding user response：

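You can use negative indexing, df[-rid, ], to drop the sampled rows. The row names of the result then give the unselected indices (as character strings):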

df <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  v1 = c(2, 2, 9, 4, 7, 1),
  v2 = c(3, 5, 8, 5, 8, 5)
)
ss <- ceiling(0.5 * nrow(df)) # size
set.seed(123)
rid <- sample(seq_len(nrow(df)), size = ss, replace = FALSE)

rid
#> [1] 3 6 2
df
#>   id v1 v2
#> 1  1  2  3
#> 2  1  2  5
#> 3  2  9  8
#> 4  2  4  5
#> 5  3  7  8
#> 6  3  1  5

df[rid,]
#>   id v1 v2
#> 3  2  9  8
#> 6  3  1  5
#> 2  1  2  5
df[-rid, ]
#>   id v1 v2
#> 1  1  2  3
#> 4  2  4  5
#> 5  3  7  8
rownames(df[-rid, ])
#> [1] "1" "4" "5"

Created on 2021-11-05 by the reprex package (v2.0.1)
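Note that rownames() returns character strings and only matches the row positions when the data frame has default row names. If you want the unselected positions as integers, one option is to compute them from the index vector itself. The following is a minimal sketch using the same df and rid as in the question; the names unselected and unselected_alt are just illustrative.

```r
# Same setup as in the question.
df <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  v1 = c(2, 2, 9, 4, 7, 1),
  v2 = c(3, 5, 8, 5, 8, 5)
)
ss <- ceiling(0.5 * nrow(df)) # sample size
set.seed(123)
rid <- sample(seq_len(nrow(df)), size = ss, replace = FALSE)

# Integer positions of the rows that were not sampled.
unselected <- setdiff(seq_len(nrow(df)), rid)

# Equivalent here: drop the sampled positions from the full index vector.
unselected_alt <- seq_len(nrow(df))[-rid]

# Either vector can be used to subset the data frame directly.
df[unselected, ]
```

setdiff() is probably the safer of the two: if rid is ever empty, negative indexing with an empty vector returns no elements rather than all of them, whereas setdiff() still returns every row position.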