tidyverse solution for multiplying columns by a vector

Time:04-01

I looked for solutions here: Multiply columns in a data frame by a vector and here: What is the right way to multiply data frame by vector?, but neither really works for what I need.

What I want is a more or less clean tidyverse way to multiply columns by a vector and then add the results as new columns to the existing data frame. Taking the data example from the first link:

c1 <- c(1,2,3)
c2 <- c(4,5,6)
c3 <- c(7,8,9)
d1 <- data.frame(c1,c2,c3)

  c1 c2 c3
1  1  4  7
2  2  5  8
3  3  6  9

v1 <- c(1,2,3)

My desired result would be:

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27

I tried:

library(tidyverse)
d1 |>
  mutate(pro = sweep(across(everything()), 2, v1, "*"))

The problem here is that the new columns end up as a data frame nested within my data frame, and I'm struggling to turn this data frame-in-data frame into regular columns. I assume I could first use setNames on the inner data frame and then unnest it, but I'm wondering whether there's a more direct way: looping over each column with across and feeding it the first/second/third element of v1?

(I know I could probably also first create a standalone data frame with the three new multiplied columns, give them a unique name and then bind_cols on both, d1 and the df with the products.)
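
For reference, that fallback might look roughly like the sketch below. The intermediate name products is made up for illustration, and sweep() is used only because it already appears in the attempt above.

```r
library(dplyr)

d1 <- data.frame(c1 = c(1, 2, 3), c2 = c(4, 5, 6), c3 = c(7, 8, 9))
v1 <- c(1, 2, 3)

# Multiply column j by v1[j]; margin 2 means "sweep across columns".
products <- sweep(d1, 2, v1, "*")

# Give the product columns unique names before binding.
names(products) <- paste0("pro_", names(d1))

result <- bind_cols(d1, products)
```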

CodePudding user response:

This is perhaps ridiculous, but you could use

library(dplyr)

d1 %>% 
  mutate(across(everything(), 
                ~.x * v1[which(names(d1) == cur_column())],
                .names = "pro_{.col}"))

which returns

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27
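
A possible variant, assuming you are free to give v1 names that match the columns of d1 (that naming is my assumption, not part of the question): the lookup by position can then become a lookup by name. This is only a sketch of the same across idea.

```r
library(dplyr)

d1 <- data.frame(c1 = c(1, 2, 3), c2 = c(4, 5, 6), c3 = c(7, 8, 9))

# Assumption: v1 carries the names of the columns it belongs to.
v1 <- c(c1 = 1, c2 = 2, c3 = 3)

result <- d1 %>%
  mutate(across(everything(),
                ~ .x * v1[[cur_column()]],   # pick the multiplier by column name
                .names = "pro_{.col}"))
```

With named multipliers the result no longer depends on v1 being in the same order as the columns.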

CodePudding user response:

Just for fun, I did a bit more trial and error after seeing some of your solutions. Since I've started putting myself through the pain of using base R's native pipe, which doesn't yet offer a "." placeholder for the piped value, I had to fiddle around with it a bit more and wrap everything in an anonymous function:

library(tidyverse)
d1 |> 
  (\(x)(bind_cols(x, x |>
                       map2_dfc(v1, `*`) |> 
                       rename_with(.cols = everything(),
                                   .fn   = ~paste0("pro_", .)))))()

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27

CodePudding user response:

If it is by row, then one option is c_across

library(dplyr)
library(stringr)
library(tibble)
new <- as_tibble(setNames(as.list(v1), names(d1)))
d1 %>% 
  rowwise %>% 
  mutate(c_across(everything()) * new) %>%
  rename_with(~ str_c("pro_", .x), everything()) %>%
  bind_cols(d1, .)

-output

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27

Or another option is map2

library(purrr)
map2_dfc(d1, v1,  `*`) %>%
   rename_with(~ str_c("pro_", .x), everything()) %>%
   bind_cols(d1, .)

-output

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27

Also, with the OP's approach, the result is a data.frame column. It can be unpacked:

library(tidyr)
# Note: cur_data() is deprecated as of dplyr 1.1.0; pick(everything()) is its replacement.
d1 |> 
    mutate(pro = sweep(cur_data(), 2, v1, `*`)) |> 
    unpack(pro, names_sep = "_")

-output

# A tibble: 3 × 6
     c1    c2    c3 pro_c1 pro_c2 pro_c3
  <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>
1     1     4     7      1      8     21
2     2     5     8      2     10     24
3     3     6     9      3     12     27

EDIT: Based on @deschen comments with names_sep

CodePudding user response:

Here is another way you may want to use. Here are some notes on how it works:

  • accumulate2 takes a 3-argument function. The first argument (..1) is the accumulated value, which starts at .init = d1 and is not used inside the function here; the second (..2) is the current element of v1; the third (..3) is the current element of seq_len(length(d1)), which I use to select the column to multiply.
  • Because we set an initial value with .init, d1 becomes the first element of the output list (accumulate2 returns a list). Each iteration then takes the next pair of values from the two vectors as ..2 and ..3 and returns the matching product column.
  • This goes on until all vector values are used, and in the end I bind all the elements together to form a data frame.

library(dplyr)
library(purrr)

# Index the names by position (..3); .y is an alias for ..2, i.e. the v1 value.
v1 %>%
  accumulate2(seq_len(length(d1)), ~ set_names(..2 * d1[..3], paste0("pro_", names(d1)[..3])), .init = d1) %>%
  exec(cbind, !!!.)

  c1 c2 c3 pro_c1 pro_c2 pro_c3
1  1  4  7      1      8     21
2  2  5  8      2     10     24
3  3  6  9      3     12     27
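
To see which value lands in which argument, here is a toy call that only records what each step receives. The names steps, acc, x and i are made up for illustration; the vectors are arbitrary.

```r
library(purrr)

steps <- accumulate2(
  c(10, 20, 30),   # .x: each element arrives as the 2nd argument
  1:3,             # .y: each element arrives as the 3rd argument
  function(acc, x, i) paste0(acc, " | x=", x, ",i=", i),
  .init = "start"  # becomes the first element of the result
)

# One element for .init plus one per element of .x
length(steps)
steps[[2]]
```

The first element is the .init value itself, and every later element is built from the previous one plus the next pair taken from the two vectors.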

CodePudding user response:

Here is a dplyr-ized version of the usual apply(., 1, fun) paradigm (note that the product columns keep their original names):

d1 %>% apply(1, "*", v1) %>% t %>% cbind(d1, .)

  c1 c2 c3 c1 c2 c3
1  1  4  7  1  8 21
2  2  5  8  2 10 24
3  3  6  9  3 12 27
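
If the pro_ prefix from the question is wanted, one way (a sketch only; the intermediate name prod_mat is made up) is to rename the columns of the transposed matrix before binding:

```r
d1 <- data.frame(c1 = c(1, 2, 3), c2 = c(4, 5, 6), c3 = c(7, 8, 9))
v1 <- c(1, 2, 3)

# apply() over rows returns one column per row, so transpose back.
prod_mat <- t(apply(d1, 1, "*", v1))

# Rename the product columns so they do not clash with the originals.
colnames(prod_mat) <- paste0("pro_", names(d1))

out <- cbind(d1, prod_mat)
```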