Imagine the following data:
data <- tribble(
~a1, ~a2, ~b1, ~b2, ~c1, ~c2,
32, 32, 50, 12, 12, 50,
48, 20, 55, 43, 10, 42
)
For i = {1, 2} I want to compute deltai = (ai - ci) / ((ai bi) * ci ai)
.
(I am explicitly using random numbers and a random function; no solution can be found by recognising and exploiting some pattern.)
The straightforward way would be to do
data <- data %>%
mutate(
delta1 = (a1 - c1) / ((a1 b1) * c1 a1),
delta2 = (a2 - c2) / ((a2 b2) * c2 a2)
)
but it introduces a lot of repetition.
I could do
delta <- function(a, b, c) {
return((a - c) / ((a b) * c a))
}
data <- data %>%
mutate(
delta1 = delta(a1, b1, c1),
delta2 = delta(a2, b2, c2)
)
which makes it possible to easily change the delta()
function later on, but this still looks like a lot of repetition.
My question: is there a way to compute delta1
and delta2
with one line of mutate?
You might think the amount of repetition is OK, but I might need to compute several others terms like gammai
or alphai
. Duplicating lines doesn't feel like a good solution.
I thought I could solve the issue by doing
for (i in c(1, 2)) {
data <- data %>%
mutate("delta{i}" := delta(paste0('a', i), paste0('b', i), paste0('c', i)))
}
but I got
Error in `mutate()`:
! Problem while computing `delta1 = delta(paste0("a", i), paste0("b", i), paste0("c", i))`.
Caused by error in `a - c`:
! non-numeric argument to binary operator
Run `rlang::last_error()` to see where the error occurred.
and it somewhat feels wrong to loop over mutates.
I have seen solutions in Mutate multiple / consecutive columns (with dplyr or base R), How can I mutate multiple variables using dplyr? or Mutating multiple columns in a data frame using dplyr, but the solutions are much less readable than copying and pasting the line and living with the duplication.
Ideally, I am hoping to find a smart use of across
that would allow me to write something like mutate("delta{i}" := delta(a{i}, b{i}, c{i}))
.
CodePudding user response:
With glue
You can leverage glue
functions. This is probably the nicest and most flexible way:
library(glue)
cols <- c("1", "2")
exprs <- glue("(a{cols} - c{cols}) / ((a{cols} b{cols}) * c{cols} a{cols})")
names(exprs) <- glue("delta{cols}")
data |>
mutate(!!!rlang::parse_exprs(exprs))
# A tibble: 2 × 8
a1 a2 b1 b2 c1 c2 delta1 delta2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 32 32 50 12 12 50 0.0197 -0.00806
2 48 20 55 43 10 42 0.0353 -0.00825
With across
If you want to make it with across
, you can use a bunch of them like so:
library(dplyr)
data %>%
mutate((across(starts_with("a"), .names = "delta{sub('a', '', .col)}") -
across(starts_with("c"))) /
((across(starts_with("a")) across(starts_with("b"))) *
across(starts_with("c")) across(starts_with("a"))))
# A tibble: 2 × 8
a1 a2 b1 b2 c1 c2 delta1 delta2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 32 32 50 12 12 50 0.0197 -0.00806
2 48 20 55 43 10 42 0.0353 -0.00825
By pivoting
But, you should maybe rather go with pivoting to long and back to wide:
library(dplyr)
library(tidyr)
data %>%
mutate(rown = row_number()) %>%
pivot_longer(-rown,
names_to = c(".value", "number"),
names_pattern = "([a-z])(\\d)") %>%
group_by(rown) %>%
mutate(delta = (a - c) / ((a b) * c a)) %>%
pivot_wider(names_from = number,
values_from = a:delta,
names_sep = "")
# A tibble: 2 × 9
# Groups: rown [2]
rown a1 a2 b1 b2 c1 c2 delta1 delta2
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 32 32 50 12 12 50 0.0197 -0.00806
2 2 48 20 55 43 10 42 0.0353 -0.00825