The following sequence:
1 2 3 1 2 3 1 2 3
can be generated with a loop
x <- c(); for (i in 1:3) x <- c(x, 1:3)
or (preferably) without a loop
x <- rep(1:3, 3)
Now I want to remove i from the sequence in the i-th iteration, to get:
2 3 1 3 1 2
This is easy to achieve by modifying the loop, but how can I do it without a loop?
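For reference, the modified loop might look like the sketch below. It is only one possible way to write it, not necessarily the exact loop intended above; it drops the value i from 1:3 in the i-th iteration by negative indexing.

```r
# Loop-based version: in iteration i, append 1:3 with the value i removed.
x <- c()
for (i in 1:3) x <- c(x, (1:3)[-i])
x
# [1] 2 3 1 3 1 2
```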
CodePudding user response:
Here is an idea:
x <- rep(1:3, 3)
x[x != rep(1:3, each = 3)]
# [1] 2 3 1 3 1 2
CodePudding user response:
I thought about this before. I did:
n <- 3
x <- rep(1:n, n)[-seq.int(1, n * n, by = n + 1)]
#[1] 2 3 1 3 1 2
Nothing will be faster than this for big n (unless, of course, we code the entire loop in C/C++).
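To see why the negative index works, it may help to print the positions that get dropped. The sketch below uses an illustrative variable name, `drop_pos`, that is not part of the answer: position 1 holds the 1 of the first repetition, and each step of n + 1 lands on the value i inside the i-th repetition.

```r
n <- 3
# Positions of the value i within the i-th repetition of 1:n
drop_pos <- seq.int(1, n * n, by = n + 1)
drop_pos
# [1] 1 5 9
rep(1:n, n)[-drop_pos]
# [1] 2 3 1 3 1 2
```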
interpretation
It is the same as dropping the diagonal elements from the following matrix:
matrix(rep(1:n, n), n)
#     [,1] [,2] [,3]
#[1,]    1    1    1
#[2,]    2    2    2
#[3,]    3    3    3
which leaves
matrix(x, ncol = n)
#     [,1] [,2] [,3]
#[1,]    2    1    1
#[2,]    3    3    2
Column i essentially gives the indices of the training data for the i-th fold of a leave-one-out cross-validation.
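A small sketch of both points, under stated assumptions: the data vector `y` and the use of `mean()` as a stand-in "model" are made up purely for illustration. The first part drops the diagonal with `row(m) != col(m)`; the second uses column i of the reshaped result as the training indices for fold i.

```r
n <- 3
m <- matrix(rep(1:n, n), n)      # every column holds 1:n
x <- m[row(m) != col(m)]         # drop the diagonal, reading column by column
x
# [1] 2 3 1 3 1 2

train <- matrix(x, ncol = n)     # column i: all indices except i

# Hypothetical data, for illustration only
y <- c(10, 20, 30)
# "Fit" on the training indices of each fold; mean() stands in for a real model
loo_means <- sapply(1:n, function(i) mean(y[train[, i]]))
loo_means
# [1] 25 20 15
```

The matrix form is convenient to read, but as noted in the benchmark comments below, building `row(m)` and `col(m)` takes more memory than the plain negative index.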
benchmark
f1 <- function (n) rep.int(1:n, n)[-seq.int(1, n * n, n + 1)]
## Darren Tsai's method
## the logic is also dropping diagonal elements from a matrix
## try mat[row(mat) != col(mat)] for a square matrix `mat`
## but this takes more memory
f2 <- function (n) {
  z <- 1:n
  x <- rep.int(z, n)
  x[x != rep(z, each = n)]
}
n <- 1000
library(microbenchmark)
microbenchmark("Li" = f1(n), "Tsai" = f2(n))
#Unit: milliseconds
# expr      min       lq     mean   median       uq       max
#   Li 15.14039 15.18756 19.06687 16.78678 20.44281  52.86618
# Tsai 61.45718 62.56886 66.01448 62.86677 65.42081 107.46628
CodePudding user response:
A fast solution using sequence:
n <- 3L
sequence(rep(n, n - 1L), 1:n) %% n + 1L
#> [1] 2 3 1 3 1 2
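Breaking the one-liner into steps may make the trick easier to follow. This is only an illustrative decomposition with an intermediate variable `s` that is not in the answer, and it assumes R >= 4.0.0, where `sequence()` accepts a second argument giving the start of each run. The idea is that the target is n - 1 runs of length n, each a cyclic shift of 1:n.

```r
n <- 3L
# n - 1 runs of length n; the runs start at 1, 2, ..., n - 1
# (only the first n - 1 start values of 1:n are used)
s <- sequence(rep(n, n - 1L), 1:n)
s
# [1] 1 2 3 2 3 4
s %% n          # wrap around: values now lie in 0, ..., n - 1
# [1] 1 2 0 2 0 1
s %% n + 1L     # shift back into 1, ..., n
# [1] 2 3 1 3 1 2
```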
Benchmark (based on Zheyuan Li's):
f1 <- function (n) rep.int(1:n, n)[-seq.int(1, n * n, n + 1)]
f2 <- function (n) {
  z <- 1:n
  x <- rep.int(z, n)
  x[x != rep(z, each = n)]
}
f3 <- function(n) sequence(rep(n, n - 1L), 1:n) %% n + 1L
n <- 1000L
microbenchmark::microbenchmark(Li = f1(n),
                               Tsai = f2(n),
                               Blood = f3(n),
                               check = "equal")
#> Unit: milliseconds
#>   expr     min      lq      mean   median      uq     max neval
#>     Li  6.5381  6.7037  8.545134  8.53075  9.0220 32.4803   100
#>   Tsai 11.7392 12.0008 14.599620 14.00125 14.4217 43.2622   100
#>  Blood  3.0204  3.0617  3.514819  3.09375  3.2193  7.8742   100