I have a dataframe with game, category, if it's solo play or not, and with rating. I add a new column that contains the best game in terms of max rating in each category within each solo/double grouping.
category <- c("Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle")
solo <- c("solo","solo","solo","double","double","double","solo","solo","solo","double","double","double","solo","solo","solo","double","double","double")
game <- c("Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3")
rating <- c(8,7,6,5,3,3,2,1,10,3,4,5,6,3,2,1,3,1)
df <- as.data.frame(rating)
df$game <- game
df$solo <- solo
df$category <- category
df <- df %>%
group_by(category, solo) %>%
mutate(best_game = max(rating, na.rm = TRUE)) %>%
ungroup
My question is, instead of having the highest rating (best_game) for each solo/double and category group, how can I add a column that has the game corresponding to the highest rating in best_game?
Example Output. Expected output is what I'd like to happen
rating game solo category best_game **(expected_output)**
5 Game1 double Party 5 Game1
3 Game1 double Party 5 Game1
CodePudding user response:
Just include a statment to subset the game
variable with rating == max(rating)
df <- df %>%
group_by(category, solo) %>%
mutate(best_game = game[rating == max(rating, na.rm = TRUE)]) %>%
ungroup
df
# A tibble: 18 × 5
rating game solo category best_game
<dbl> <chr> <chr> <chr> <chr>
1 8 Game1 solo Party Game1
2 7 Game2 solo Adventure Game2
3 6 Game3 solo Puzzle Game3
4 5 Game1 double Party Game1
5 3 Game2 double Adventure Game2
6 3 Game3 double Puzzle Game3
7 2 Game1 solo Party Game1
8 1 Game2 solo Adventure Game2
9 10 Game3 solo Puzzle Game3
10 3 Game1 double Party Game1
11 4 Game2 double Adventure Game2
12 5 Game3 double Puzzle Game3
13 6 Game1 solo Party Game1
14 3 Game2 solo Adventure Game2
15 2 Game3 solo Puzzle Game3
16 1 Game1 double Party Game1
17 3 Game2 double Adventure Game2
18 1 Game3 double Puzzle Game3
CodePudding user response:
This works, but should be updated with what you want to do if more than one game has the max rating?
inner_join(
df,
df %>%
group_by(category, solo) %>%
filter(max(rating, na.rm = TRUE)==rating) %>%
rename(best_game=game) %>%
select(-rating),
by=c("category","solo")
)
Output
rating game solo category best_game
1 8 Game1 solo Party Game1
2 7 Game2 solo Adventure Game2
3 6 Game3 solo Puzzle Game3
4 5 Game1 double Party Game1
5 3 Game2 double Adventure Game2
6 3 Game3 double Puzzle Game3
7 2 Game1 solo Party Game1
8 1 Game2 solo Adventure Game2
9 10 Game3 solo Puzzle Game3
10 3 Game1 double Party Game1
11 4 Game2 double Adventure Game2
12 5 Game3 double Puzzle Game3
13 6 Game1 solo Party Game1
14 3 Game2 solo Adventure Game2
15 2 Game3 solo Puzzle Game3
16 1 Game1 double Party Game1
17 3 Game2 double Adventure Game2
18 1 Game3 double Puzzle Game3