Multiplying a selection of columns by another selection in R

Time:11-03

I'm trying to create new columns in a data frame by multiplying each column in one selection by each column in another. For example:

df <- as.data.frame(matrix(rep(1:6, 3), nrow = 3,
                           dimnames = list(NULL, letters[1:6])))
df

 A data.frame: 3 × 6 
 a  b   c   d   e   f 
 
 1  4   1   4   1   4 
 2  5   2   5   2   5 
 3  6   3   6   3   6

df <- df %>% mutate(df$a:df$c * df$d:df$f)
df

 A data.frame: 3 × 15 
 a  b   c   d   e   f   a*d a*e a*f b*d b*e b*f c*d c*e c*f
 
 1  4   1   4   1   4     4   1   4  16   4  16   4   1   4
 2  5   2   5   2   5    10   4  10  25  10  25  10   4  10
 3  6   3   6   3   6    18   9  18  36  18  36  18   9  18

I want an easy, general way to create product columns and add them to the dataset. In the example I try to multiply columns a, b and c by columns d, e and f and add every possible combination to the data frame. The syntax above obviously doesn't work, so I'm looking for the easiest way to accomplish this.

Answer:

While not simple, this is easy enough.

# the data
df <- as.data.frame(matrix(rep(1:6, 3), nrow = 3,
                           dimnames = list(NULL, letters[1:6])))

library(dplyr)
library(rlang)

# set the column groups you want to multiply
cols1 <- c("a", "b", "c")
cols2 <- c("d", "e", "f")

# create the multiplication expressions.
col_mult <- set_names(c(outer(cols1, cols2, paste, sep = "*")))
col_expr <- parse_exprs(col_mult)

# use the !!! operator to execute them all
df %>% 
  mutate(!!!col_expr)

Which gives the following:

  a b c d e f a*d b*d c*d a*e b*e c*e a*f b*f c*f
1 1 4 1 4 1 4   4  16   4   1   4   1   4  16   4
2 2 5 2 5 2 5  10  25  10   4  10   4  10  25  10
3 3 6 3 6 3 6  18  36  18   9  18   9  18  36  18
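
To make the intermediate objects less opaque, here is a minimal sketch of what each step produces, using the same df, cols1 and cols2. The name res is only illustrative. One thing worth noting: the new column names contain `*`, so they are non-syntactic and need backticks or `[[ ]]` when referenced.

```r
library(dplyr)
library(rlang)

df <- as.data.frame(matrix(rep(1:6, 3), nrow = 3,
                           dimnames = list(NULL, letters[1:6])))
cols1 <- c("a", "b", "c")
cols2 <- c("d", "e", "f")

# outer() builds a 3 x 3 character matrix of pairings; c() flattens it
# column by column, so the first selection varies fastest
col_mult <- c(outer(cols1, cols2, paste, sep = "*"))
col_mult
# "a*d" "b*d" "c*d" "a*e" "b*e" "c*e" "a*f" "b*f" "c*f"

# set_names() without a second argument names each string after itself;
# parse_exprs() then turns every string into an unevaluated call
col_expr <- parse_exprs(set_names(col_mult))

# `res` is an illustrative name; !!! splices the named calls into mutate()
res <- df %>% mutate(!!!col_expr)

# names such as a*d are non-syntactic, so use backticks or [[ ]]
res$`a*d`
res[["c*f"]]
```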

If you do this a lot, or in more complex cases, you could go all out and wrap it in a function so that you can use the tidyselect helpers. Again, not the simplest thing, but it would fit the bill of a "general way".

mutate_product <- function(df, cols_x, cols_y) {
  .cols_x <- names(tidyselect::eval_select(enexpr(cols_x), df))
  .cols_y <- names(tidyselect::eval_select(enexpr(cols_y), df))
  
  col_mult <- set_names(c(outer(.cols_x, .cols_y, paste, sep = "*")))
  col_expr <- parse_exprs(col_mult)
  
  mutate(df, !!!col_expr)
}

df %>% 
  mutate_product(a:c, starts_with("d"))
#   a b c d e f a*d b*d c*d
# 1 1 4 1 4 1 4   4  16   4
# 2 2 5 2 5 2 5  10  25  10
# 3 3 6 3 6 3 6  18  36  18
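
If parsing strings feels fragile, for instance with column names that are not valid R names, one possible variant builds the calls from symbols instead. This is only a sketch and not part of the function above: mutate_product2 and its name_sep argument are hypothetical names introduced here, and the "_x_" separator is an arbitrary choice that happens to give syntactic column names.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of mutate_product(); `name_sep` is an assumed extra
# argument that controls how the new column names are joined
mutate_product2 <- function(df, cols_x, cols_y, name_sep = "_x_") {
  .cols_x <- names(tidyselect::eval_select(enexpr(cols_x), df))
  .cols_y <- names(tidyselect::eval_select(enexpr(cols_y), df))

  # every pairing of one x column with one y column (x varies fastest)
  pairs <- expand.grid(x = .cols_x, y = .cols_y, stringsAsFactors = FALSE)

  # build calls such as a * d from symbols, so no string parsing is needed
  col_expr <- Map(function(x, y) call("*", sym(x), sym(y)), pairs$x, pairs$y)
  names(col_expr) <- paste(pairs$x, pairs$y, sep = name_sep)

  mutate(df, !!!col_expr)
}

df <- as.data.frame(matrix(rep(1:6, 3), nrow = 3,
                           dimnames = list(NULL, letters[1:6])))

# adds a_x_e, b_x_e, a_x_f and b_x_f
res <- mutate_product2(df, a:b, c(e, f))
res$a_x_e
```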

Answer:

You could do this in base R with:

cbind(df, setNames(outer(1:3, 4:6, function(x, y) df[x] * df[y]),
                   as.vector(outer(letters[1:3], letters[4:6], paste, sep = " * "))))

#>   a b c d e f a * d b * d c * d a * e b * e c * e a * f b * f c * f
#> 1 1 4 1 4 1 4     4    16     4     1     4     1     4    16     4
#> 2 2 5 2 5 2 5    10    25    10     4    10     4    10    25    10
#> 3 3 6 3 6 3 6    18    36    18     9    18     9    18    36    18

Data

df <- setNames(as.data.frame(matrix(1:6, ncol = 6, nrow = 3)), letters[1:6])

df
#>   a b c d e f
#> 1 1 4 1 4 1 4
#> 2 2 5 2 5 2 5
#> 3 3 6 3 6 3 6
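
For readers who find the outer() call hard to follow: outer() expands its two inputs into all pairings and calls the function once on those expanded vectors. The sketch below spells that out with expand.grid(), assuming the same df; idx and prods are illustrative names, not part of the answer above.

```r
df <- setNames(as.data.frame(matrix(1:6, ncol = 6, nrow = 3)), letters[1:6])

# all (left, right) column-index pairs; the first column varies fastest
idx <- expand.grid(x = 1:3, y = 4:6)

# df[idx$x] and df[idx$y] are both 9-column data frames, so `*`
# multiplies them column by column
prods <- df[idx$x] * df[idx$y]

# label each product with the names of the two source columns
names(prods) <- paste(names(df)[idx$x], names(df)[idx$y], sep = " * ")

cbind(df, prods)
```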

Answer:

Let's try

p <- data.frame(
  lapply(
    df[4:6],
    function(x) x * df[1:3]
  )
)

cbind(
  df,
  setNames(p, gsub("(\\w+)\\.(\\w+)", "\\2*\\1", names(p)))
)

which gives

  a b c d e f a*d b*d c*d a*e b*e c*e a*f b*f c*f
1 1 4 1 4 1 4   4  16   4   1   4   1   4  16   4
2 2 5 2 5 2 5  10  25  10   4  10   4  10  25  10
3 3 6 3 6 3 6  18  36  18   9  18   9  18  36  18
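
The renaming step relies on how data.frame() names nested columns. Here is a small sketch of that intermediate state, assuming the question's df; new_names is an illustrative name, and sub() with anchors is used here in place of gsub() since each name contains a single match.

```r
df <- as.data.frame(matrix(rep(1:6, 3), nrow = 3,
                           dimnames = list(NULL, letters[1:6])))

# each element of the list is itself a 3-column data frame (a, b, c),
# computed by multiplying one of d, e, f into columns a to c
p <- data.frame(lapply(df[4:6], function(x) x * df[1:3]))

# data.frame() prefixes each inner column with the outer list name
names(p)
# "d.a" "d.b" "d.c" "e.a" "e.b" "e.c" "f.a" "f.b" "f.c"

# swap the two parts around the dot and join them with "*"
new_names <- sub("^(\\w+)\\.(\\w+)$", "\\2*\\1", names(p))
new_names
```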