Let's say I have a tibble like the following:
tib <- tribble(~id, ~var1_countrya, ~var2_countrya, ~var1_countryb, ~var2_countryb,
1, 1, 2, 1, 2,
2, 5, 3, 1, 3,
3, 6, 3, 1, 3
)
I'd like to create new columns delta_countrya = var2_countrya - var1_countrya
and delta_countryb = var2_countryb - var2_countryb
by selecting the pairs of variables using a regex and dplyr pipes. The final tibble should look like this
id var1_countrya var2_countrya delta_countrya var1_countryb var2_countryb delta_countryb
1 1 2 1 1 2 1
2 5 3 -2 1 3 2
3 6 3 -3 1 3 2
I've struggled to even set up the mutate clause in a way it creates multiple new variables, let alone finding the pairs of variables I'm looking for and taking the difference.
CodePudding user response:
With your example, you could convert your columns to variables, split the columns so the var and the country are separate, then re-combine so the different vars are in columns, and finally calculate your delta
library(dplyr)
tib <- tribble(~id, ~var1_countrya, ~var2_countrya, ~var1_countryb, ~var2_countryb,
1, 1, 2, 1, 2,
2, 5, 3, 1, 3,
3, 6, 3, 1, 3
) # your code had an error, need to replace ')' with comma
tib2 = tib %>%
pivot_longer(-id, names_to = "var") %>%
separate(var, c("Var", "Country")) %>%
pivot_wider(names_from = Var) %>%
mutate(Delta = var2-var1)
tib2
# # A tibble: 6 x 5
# id Country var1 var2 Delta
# <dbl> <chr> <dbl> <dbl> <dbl>
# 1 1 countrya 1 2 1
# 2 1 countryb 1 2 1
# 3 2 countrya 5 3 -2
# 4 2 countryb 1 3 2
# 5 3 countrya 6 3 -3
# 6 3 countryb 1 3 2
CodePudding user response:
base
tib[paste0("delta_country", letters[1:2])] <- tib[paste0("var2_country", letters[1:2])] - tib[paste0("var1_country", letters[1:2])]
# id var1_countrya var2_countrya var1_countryb var2_countryb delta_countrya delta_countryb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 1 2 1 2 1 1
# 2 2 5 3 1 3 -2 2
# 3 3 6 3 1 3 -3 2
tidyverse
library(tidyverse)
tib[paste0("delta_country", letters[1:2])] <- map2(tib %>% select(starts_with("var2_country")), tib %>% select(starts_with("var1_country")),
~.x - .y)
tib
# # id var1_countrya var2_countrya var1_countryb var2_countryb delta_countrya delta_countryb
# # <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# # 1 1 1 2 1 2 1 1
# # 2 2 5 3 1 3 -2 2
# # 3 3 6 3 1 3 -3 2
CodePudding user response:
We can also take advantage of base::Reduce
here:
Reduce(function(x, y) {
cbind(x, x[, y] - x[, y - 1])
}, seq(1, ncol(tib), 2)[-1], init = tib) |>
setNames(c(names(tib), paste0("delta_country", unique(gsub(".*([[:alpha:]]$)", "\\1", names(tib)[-1])))))
id var1_countrya var2_countrya var1_countryb var2_countryb delta_countrya delta_countryb
1 1 1 2 1 2 1 1
2 2 5 3 1 3 -2 2
3 3 6 3 1 3 -3 2