Let the following be the dataset:
I need to create new columns by multiplying every a column with every b column, naming each new column after the pair it came from: a1_b1, a1_b2, ..., a1_b4, a2_b1, a2_b2, and so on, as shown in the figure.
I am using R for data analysis. Although the example shows only two columns by two columns, the real data is 1600 by 25, so I need a solution that scales.
CodePudding user response:
This might be fast enough:
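set.seed(42)
DF <- data.frame(a1 = sample(1:10),
                 a2 = sample(1:10),
                 b1 = sample(1:10),
                 b2 = sample(1:10))
# column positions of the "a" and "b" variables
a <- grep("a", names(DF))
b <- grep("b", names(DF))
# every (a, b) pair of column positions
combs <- expand.grid(a, b)
# multiply the two columns of each pair; mapply returns one matrix column per pair
# (the \(...) lambda syntax needs R >= 4.1)
res <- do.call(mapply, c(list(FUN = \(...) do.call(`*`, DF[, c(...)])), combs))
colnames(res) <- paste(names(DF)[combs[[1]]], names(DF)[combs[[2]]], sep = "_")
cbind(DF, res)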
#   a1 a2 b1 b2 a1_b1 a2_b1 a1_b2 a2_b2
#1   1  8  9  3     9    72     3    24
#2   5  7 10  1    50    70     5     7
#3  10  4  3  2    30    12    20     8
#4   8  1  4  6    32     4    48     6
#5   2  5  5 10    10    25    20    50
#6   4 10  6  8    24    60    32    80
#7   6  2  1  4     6     2    24     8
#8   9  6  2  5    18    12    45    30
#9   7  9  8  7    56    72    49    63
#10  3  3  7  9    21    21    27    27
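If the per-pair function calls turn out to be slow at the larger size mentioned in the question, one possible variation is to repeat the columns by indexing and multiply two matrices element-wise in a single step. This is only a sketch of that idea, not a benchmarked solution; the names A, B and out are illustrative, and it assumes the a and b columns can be picked out by the prefixes "^a" and "^b".

```r
set.seed(42)
DF <- data.frame(a1 = sample(1:10),
                 a2 = sample(1:10),
                 b1 = sample(1:10),
                 b2 = sample(1:10))

# column names of the two groups (assumed to be identifiable by prefix)
a <- grep("^a", names(DF), value = TRUE)
b <- grep("^b", names(DF), value = TRUE)

# every (a, b) pair of column names; the a index varies fastest
combs <- expand.grid(a = a, b = b, stringsAsFactors = FALSE)

A <- as.matrix(DF[a])
B <- as.matrix(DF[b])

# repeat columns so that column k of each matrix belongs to pair k,
# then multiply element-wise in one vectorized operation
res <- A[, combs$a, drop = FALSE] * B[, combs$b, drop = FALSE]
colnames(res) <- paste(combs$a, combs$b, sep = "_")

out <- cbind(DF, res)
```

Note that the result has one column per pair, so its width grows as the product of the two group sizes; memory use should be checked before running this on the full data.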
CodePudding user response:
The operation in the question is a transposed Khatri-Rao product. We use the Matrix package, which comes with R, so it does not have to be installed. Using the input in the Note at the end, pick out the a and b portions, transpose them, apply KhatriRao, and transpose back. The result is a sparse matrix (class "dgCMatrix"). Use as.matrix to convert it to a dense matrix, as shown, or as.data.frame(as.matrix(...)) to convert it to a data.frame.
library(Matrix)
rownames(dat) <- 1:nrow(dat)
ix <- grep("a", colnames(dat))
as.matrix(t(KhatriRao(t(dat[, -ix]), t(dat[, ix]), make.dimnames = TRUE)))
giving:
  a1:b1  a2:b1 a1:b2   a2:b2
1   101  838.3 108.3  898.89
2   204 1050.6 220.6 1136.09
3   309 1957.0 357.0 2261.00
4   416 1664.0 464.0 1856.00
5   525 1638.0 578.0 1803.36
6   749 2118.6 838.6 2372.04
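To make it concrete what this product computes, here is a small base R sketch (no Matrix package) that builds the same values row by row: for each row, it takes the outer product of the a values with the b values and flattens it, which is the Kronecker product of the two row vectors. The names A, B and rowwise are illustrative, and the input is rebuilt here from the built-in BOD data so that the block runs on its own.

```r
# same input as in the Note: BOD plus a copy shifted by 100
dat <- setNames(cbind(BOD, BOD + 100), c("a1", "a2", "b1", "b2"))

A <- as.matrix(dat[, c("a1", "a2")])
B <- as.matrix(dat[, c("b1", "b2")])

# for row i, outer(A[i, ], B[i, ]) holds every a * b product;
# as.vector flattens it column by column: a1*b1, a2*b1, a1*b2, a2*b2
rowwise <- t(vapply(
  seq_len(nrow(dat)),
  function(i) as.vector(outer(A[i, ], B[i, ])),
  numeric(ncol(A) * ncol(B))
))

# names in the same order as the flattened products
colnames(rowwise) <- as.vector(outer(colnames(A), colnames(B), paste, sep = "_"))
rowwise
```

This loop over rows is for illustration only; it is not meant to be faster than the KhatriRao call above.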
Note
dat <- setNames(cbind(BOD, BOD + 100), c("a1", "a2", "b1", "b2"))
dat
giving
  a1   a2  b1    b2
1  1  8.3 101 108.3
2  2 10.3 102 110.3
3  3 19.0 103 119.0
4  4 16.0 104 116.0
5  5 15.6 105 115.6
6  7 19.8 107 119.8