I want to subtract a vector (S_0) from every row of a matrix (S_t). Unfortunately, my for loop takes a very long time to run because there are 1 million rows.
n <- 1000000
X_t <- data.frame(matrix(0, nrow = n, ncol = 10))
# Subtract S_0 from S_t one row at a time (slow for large n)
for (i in seq_len(n)) {
  X_t[i, ] <- S_t[i, ] - S_0
}
S_0 is a vector of length 10.
S_t is a data frame of dimension n x 10 containing values from prior calculations.
My first idea was to transform S_0 into a matrix of dimension n x 10 (all rows would then be identical). Maybe it is faster to subtract a matrix from a matrix? Unfortunately, I could not find out how to do this efficiently without using another for loop.
Furthermore, I tried this:
data.frame(matrix(S_0, nrow = n, ncol = 10))
but the output was not what I expected as the order of the numbers was mixed up within every row.
CodePudding user response:
You can use col to expand the vector so that each element lines up with its column, which also keeps the type of S_t:
X_t <- S_t - S_0[col(S_t)]
S_0 <- 1:10
S_t <- data.frame(matrix(0, nrow = 5, ncol = 10))
X_t <- S_t - S_0[col(S_t)]
X_t
# X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
#1 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
#2 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
#3 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
#4 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
#5 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
str(X_t)
#'data.frame': 5 obs. of 10 variables:
# $ X1 : num -1 -1 -1 -1 -1
# $ X2 : num -2 -2 -2 -2 -2
# $ X3 : num -3 -3 -3 -3 -3
# $ X4 : num -4 -4 -4 -4 -4
# $ X5 : num -5 -5 -5 -5 -5
# $ X6 : num -6 -6 -6 -6 -6
# $ X7 : num -7 -7 -7 -7 -7
# $ X8 : num -8 -8 -8 -8 -8
# $ X9 : num -9 -9 -9 -9 -9
# $ X10: num -10 -10 -10 -10 -10
# The same expression works when S_t is a matrix; the result is then a matrix
S_t <- matrix(0, nrow = 5, ncol = 10)
X_t <- S_t - S_0[col(S_t)]
str(X_t)
# num [1:5, 1:10] -1 -1 -1 -1 -1 -2 -2 -2 -2 -2 ...
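To see why indexing with col works, it may help to look at the intermediate values on a tiny example. The sketch below uses made-up numbers (a 2 x 3 matrix and a length-3 vector, neither from the question) and only base R:

```r
# Assumed toy data: S_t is filled with 1..6 column by column
S_0 <- c(10, 20, 30)
S_t <- matrix(1:6, nrow = 2, ncol = 3)

# col() returns, for every cell, the index of the column it sits in
idx <- col(S_t)
idx
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    1    2    3

# Indexing S_0 with it repeats each element once per row,
# in the same column-major order in which S_t is stored
S_0[idx]
# [1] 10 10 20 20 30 30

# The element-wise subtraction therefore removes S_0 from every row
X_t <- S_t - S_0[idx]
X_t
#      [,1] [,2] [,3]
# [1,]   -9  -17  -25
# [2,]   -8  -16  -24
```

The expanded vector has as many elements as S_t has cells, which is probably part of why this approach allocates more memory than the double transpose.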
Another option is sweep, which also keeps the type:
sweep(S_t, 2, S_0)
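The sweep call can also be written with its arguments named, which makes the intent easier to read. This is just a small sketch on the 5 x 10 example data used above; FUN = "-" is sweep's default and is only spelled out here for clarity:

```r
S_0 <- 1:10
S_t <- matrix(0, nrow = 5, ncol = 10)

# MARGIN = 2 lines S_0 up with the columns, so each row has S_0 subtracted
X_t <- sweep(S_t, MARGIN = 2, STATS = S_0, FUN = "-")

X_t[1, ]
# [1]  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10
```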
CodePudding user response:
You can use t twice:
S_t <- data.frame(matrix(0, nrow = 1000000, ncol = 10))
S_0 <- 1:10
X_t <- t(t(S_t) - S_0)
# > head(X_t)
# X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# [1,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
# [2,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
# [3,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
# [4,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
# [5,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
# [6,] -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
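The double transpose works because R recycles a vector down the columns of a matrix. After the first t, each column holds one original row, so S_0 is subtracted from each of them. Note that the result is a matrix rather than a data frame, as the [1,] row labels above show. A minimal sketch on small assumed data, converting back with as.data.frame if a data frame is needed:

```r
S_0 <- 1:10
S_t <- data.frame(matrix(0, nrow = 5, ncol = 10))

# t(S_t) is a 10 x 5 matrix; S_0 (length 10) is recycled down each column,
# so every original row has S_0 subtracted from it
X_mat <- t(t(S_t) - S_0)
is.matrix(X_mat)      # TRUE: t() coerces the data frame to a matrix

# Convert back if later code expects a data frame
X_t <- as.data.frame(X_mat)
is.data.frame(X_t)    # TRUE
```

Whether the conversion back eats into the speed advantage would need to be measured on the real data.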
Benchmark: using t twice is the fastest of the three
bench::mark(t(t(S_t) - S_0),
S_t - S_0[col(S_t)],
sweep(S_t, 2, S_0),
check = FALSE, iterations = 10)
# expression min median itr/s…¹ mem_a…² gc/se…³ n_itr
#1 t(t(S_t) - S_0) 211ms 321ms 3.10 229MB 2.17 10
#2 S_t - S_0[col(S_t)] 691ms 874ms 1.13 509MB 1.82 10
#3 sweep(S_t, 2, S_0) 638ms 735ms 1.34 548MB 2.54 10
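Since the benchmark runs with check = FALSE (presumably because the approaches return different types when S_t is a data frame), it may be worth confirming separately that they agree on the values. The sketch below does that on a small matrix with base R only. It also includes the matrix-minus-matrix idea from the question: matrix() fills column by column by default, which is what mixed up the numbers, and byrow = TRUE avoids that. The names S_0_mat and res_* are illustrative only.

```r
S_0 <- 1:10
n <- 5   # small n, for illustration only
S_t <- matrix(0, nrow = n, ncol = 10)

# byrow = TRUE places one copy of S_0 in every row
S_0_mat <- matrix(S_0, nrow = n, ncol = length(S_0), byrow = TRUE)

res_byrow <- S_t - S_0_mat
res_t     <- t(t(S_t) - S_0)
res_col   <- S_t - S_0[col(S_t)]
res_sweep <- sweep(S_t, 2, S_0)

# All four approaches should produce the same values
stopifnot(
  all.equal(res_byrow, res_t),
  all.equal(res_byrow, res_col),
  all.equal(res_byrow, res_sweep)
)
```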