I'm working with a data frame that has multiple groups. I want to find the maximum value of a group in one column and then reference the value of another column in that same row in order to populate a new column.
Example:
item group val1 val2
<chr> <dbl> <dbl> <dbl>
1 a 1 1 1
2 b 1 2 3
3 c 1 3 2
4 d 2 1 0
5 e 2 2 1
For each group, I want to take the max Val1 and populate a new column with that row's Val2. So the example would become:
item group val1 val2 val3
<chr> <dbl> <dbl> <dbl> <dbl>
1 a 1 1 1 2
2 b 1 2 3 2
3 c 1 3 2 2
4 d 2 1 0 1
5 e 2 2 1 1
Finding the max of a group is easy but I can't figure out how to reference another value in the row and populate using that instead. Been googling all day. This is what I have tried that returns a column of NAs instead of val2 values:
library(tidyverse)
item <- c("a", "b", "c", "d", "e")
group <- c("1", "1", "1", "2", "2")
val1 <- c("1", "2", "3", "1", "2")
val2 <- c("1", "3", "2", "0", "1")
data <- tibble(item = item,
group = group,
val1 = val1,
val2 = val2)
data %>%
group_by(group) %>%
mutate(val3 = max(val1)["val2"]) -> data
Thanks in advance !
CodePudding user response:
You may try
data %>%
group_by(group) %>%
mutate(val3 = val2[which.max(val1)])
item group val1 val2 val3
<chr> <chr> <chr> <chr> <chr>
1 a 1 1 1 2
2 b 1 2 3 2
3 c 1 3 2 2
4 d 2 1 0 1
5 e 2 2 1 1
CodePudding user response:
Untested, but you could give this a try:
data %>%
group_by(group) %>%
mutate(var3, data[data$val1 == max(val1)]$val2) -> data