I'm working on a survey research project in which we need to multiply one group of columns against another group of columns. I can do this easily in base R, but I can't figure out how to do it within a tidy, pipe-based workflow. I found many solutions for multiplying a set of columns against one column, but not for multiple columns. Any help is greatly appreciated.
The example below demonstrates how I solve the problem in base R.
library(tidyverse)
df <- tibble(
a1 = c(1, 2, 3, 4, 5),
a2 = c(5, 4, 3, 2, 1),
a3 = c(1, 5, 2, 4, 3),
b1 = c(1, 1, 2, 1, 1),
b2 = c(3, 3, 5, 4, 1),
b3 = c(2, 1, 4, 2, 1)
)
new_df <- df[, c("a1", "a2", "a3")] * df[, c("b1", "b2", "b3")]
names(new_df) <- c("c1", "c2", "c3")
Created on 2022-06-14 by the reprex package (v2.0.1)
CodePudding user response:
Not everything needs to be tidy. The base R solution that you have works perfectly fine, is neat and scalable.
The same can be achieved with tidyverse
but it would not be neat. Here's a way with across
.
library(dplyr)
df %>%
mutate(across(starts_with('a'), .names = '{sub("a", "c", col)}') *
across(starts_with('b'))) %>%
select(starts_with('c'))
# c1 c2 c3
# <dbl> <dbl> <dbl>
#1 1 15 2
#2 2 12 5
#3 6 15 8
#4 4 8 8
#5 5 1 3
The base R code can also be written as -
df %>% select(starts_with('a')) * df %>% select(starts_with('b'))
CodePudding user response:
You could stack the two groups of columns pairwisely, multiply them together, and then pivot the long data to wide.
library(tidyverse)
df %>%
rowid_to_column("id") %>%
pivot_longer(-id, names_to = c(".value", "set"), names_pattern = "(.)(.)") %>%
mutate(c = a * b) %>%
pivot_wider(id, names_from = set, values_from = c, names_prefix = "c")
# # A tibble: 5 × 4
# id c1 c2 c3
# <int> <dbl> <dbl> <dbl>
# 1 1 1 15 2
# 2 2 2 12 5
# 3 3 6 15 8
# 4 4 4 8 8
# 5 5 5 1 3
CodePudding user response:
Here is a mostly tidyverse
option, except for using split.default
. We can split into a list of dataframes based on the ending, then we can use reduce
to perform the calculation for each dataframe, then return as 1 single dataframe (and finally add the c
prefix to the column names).
library(tidyverse)
df %>%
split.default(., sub("\\D ", "", names(.))) %>%
map_df(., ~ reduce(.x, .f = `*`)) %>%
rename_with( ~ paste0("c", .x))
Output
c1 c2 c3
<dbl> <dbl> <dbl>
1 1 15 2
2 2 12 5
3 6 15 8
4 4 8 8
5 5 1 3
CodePudding user response:
I agree with @Ronak Shah that sometimes it's easier with base R functions but here is the pipe solution you are asking:
df <- tibble(
a1 = c(1, 2, 3, 4, 5),
a2 = c(5, 4, 3, 2, 1),
a3 = c(1, 5, 2, 4, 3),
b1 = c(1, 1, 2, 1, 1),
b2 = c(3, 3, 5, 4, 1),
b3 = c(2, 1, 4, 2, 1)
)
data.frame(df) %>%
mutate(c = select(cur_data(), a1:a3)*select(cur_data(), b1:b3)) %>%
invoke(.f = data.frame) %>%
rename_with(~str_remove(.,".a"))
The outputs looks like this:
a1 a2 a3 b1 b2 b3 c1 c2 c3
1 1 5 1 1 3 2 1 15 2
2 2 4 5 1 3 1 2 12 5
3 3 3 2 2 5 4 6 15 8
4 4 2 4 1 4 2 4 8 8
5 5 1 3 1 1 1 5 1 3