I want to use dplyr's mutate and subtract a variable from another one, that I have to find based on two criteria.
Here's an example of what I want to do:
# Generating the data
mode_of_travel <- c("car", "car", "plane", "plane", "train", "train")
variant <- c("slow","fast","slow","fast","slow","fast")
speed <- c(5, 7, 10, 14, 6, 7)
df <- data.frame(mode_of_travel, variant, speed)
# Data result
mode_of_travel variant speed
1 car slow 5
2 car fast 7
3 plane slow 10
4 plane fast 14
5 train slow 6
6 train fast 7
and now I want to find the difference in speed between every mode and variant and the corresponding "slow" variant:
# Computing the speed difference between the slow and the fast variant
df %>% mutate(speed_difference = speed - case_when(mode_of_travel == "car" ~ 7,
mode_of_travel == "plane" ~ 5,
mode_of_travel == "train" ~ 4))
so the output looks like this:
mode_of_travel variant speed speed_difference
1 car slow 5 0
2 car fast 7 2
3 plane slow 10 0
4 plane fast 14 4
5 train slow 6 0
6 train fast 7 1
But of course I don't want to manually do this via the "case_when" function and typing in the value. How can this be done properly?
Thanks :)
CodePudding user response:
I would describe this as within each mode_of_travel
group you want to subtract the "slow" variant
speed from the current row's speed:
df %>%
group_by(mode_of_travel) %>%
mutate(speed_difference = speed - speed[variant == "slow"]) %>%
ungroup()
# # A tibble: 6 × 4
# mode_of_travel variant speed speed_difference
# <chr> <chr> <dbl> <dbl>
# 1 car slow 5 0
# 2 car fast 7 2
# 3 plane slow 10 0
# 4 plane fast 14 4
# 5 train slow 6 0
# 6 train fast 7 1
This code assumes there is exactly one "slow" variant within each mode of travel group.