Sample data:
sampdat <- data.frame(grp = rep(c("a", "b", "c"), c(2, 3, 5)),
                      x1 = seq(0, .9, 0.1), x2 = seq(.3, .75, 0.05),
                      y1 = 1:10, y2 = 11:20)
I would like to get the following data, but I have 100 variables to which I'd like to apply a function of two variables:
myfun <- function(x, y) {
  # Return the element-wise product of the two columns
  x * y
}
needdat <- sampdat %>% mutate(z1=x1*y1, z2=x2*y2)
What is the most efficient way to do this using dplyr's across() with mutate() or summarise()?
Thanks in advance for your suggestions/solutions!
Best, SaM
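Answer:
An easier approach is to multiply the results of two across() calls. The columns are paired by position, so the x and y columns must be selected in matching order: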
library(dplyr)
library(stringr)
sampdat %>%
  mutate(across(starts_with('x'),
                .names = "{str_replace(.col, 'x', 'z')}") *
           across(starts_with('y')))
Output:
   grp  x1   x2 y1 y2  z1   z2
1    a 0.0 0.30  1 11 0.0  3.3
2    a 0.1 0.35  2 12 0.2  4.2
3    b 0.2 0.40  3 13 0.6  5.2
4    b 0.3 0.45  4 14 1.2  6.3
5    b 0.4 0.50  5 15 2.0  7.5
6    c 0.5 0.55  6 16 3.0  8.8
7    c 0.6 0.60  7 17 4.2 10.2
8    c 0.7 0.65  8 18 5.6 11.7
9    c 0.8 0.70  9 19 7.2 13.3
10   c 0.9 0.75 10 20 9.0 15.0
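Since the two across() results are multiplied element-wise, a mismatch in column order would silently pair the wrong variables. The following is a minimal sketch, assuming the x and y columns share the same numeric suffixes, that checks the pairing before multiplying. The names x_cols, y_cols and res are introduced here for illustration only.

```r
library(dplyr)

sampdat <- data.frame(grp = rep(c("a", "b", "c"), c(2, 3, 5)),
                      x1 = seq(0, .9, 0.1), x2 = seq(.3, .75, 0.05),
                      y1 = 1:10, y2 = 11:20)

# Illustrative helper names: collect the x and y column names in data order
x_cols <- names(select(sampdat, starts_with("x")))
y_cols <- names(select(sampdat, starts_with("y")))

# Stop early if the suffixes do not line up (e.g. x1, x2 vs. y2, y1)
stopifnot(identical(sub("^x", "", x_cols), sub("^y", "", y_cols)))

# identity is passed explicitly so that across() has a function argument
res <- sampdat %>%
  mutate(across(all_of(x_cols), identity,
                .names = "{sub('^x', 'z', .col)}") *
           across(all_of(y_cols), identity))
```

With 100 variable pairs, a check like this is cheap and catches a reordered column before it produces plausible-looking but wrong products.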
Or with across2() from the dplyover package, which applies a two-argument function to pairs of columns:
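library(dplyover)
sampdat %>%
  # Note: "z{xcol}" prefixes the x column name, so the new columns
  # are named zx1 and zx2 rather than z1 and z2
  mutate(across2(starts_with('x'), starts_with('y'),
                 ~ .x * .y, .names = "z{xcol}"))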
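Neither snippet above calls the question's myfun directly. If the two-argument function is more involved than a product, one option is to map it over the column pairs. This is a minimal base R sketch, not part of the original answer, and it assumes the x and y columns appear in matching order. The names x_cols, y_cols and z_cols are illustrative.

```r
sampdat <- data.frame(grp = rep(c("a", "b", "c"), c(2, 3, 5)),
                      x1 = seq(0, .9, 0.1), x2 = seq(.3, .75, 0.05),
                      y1 = 1:10, y2 = 11:20)

# The question's function, written so that it returns its result visibly
myfun <- function(x, y) x * y

# Illustrative names: the paired input columns and the output column names
x_cols <- grep("^x", names(sampdat), value = TRUE)
y_cols <- grep("^y", names(sampdat), value = TRUE)
z_cols <- sub("^x", "z", x_cols)

# Map() walks the two column lists in parallel and applies myfun to each pair
needdat <- sampdat
needdat[z_cols] <- Map(myfun, sampdat[x_cols], sampdat[y_cols])
```

Because myfun is passed as an ordinary argument, swapping in a different two-variable function only requires changing its definition.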