Home > OS >  How to create a new column based off a number in an existing column that matches with another column
How to create a new column based off a number in an existing column that matches with another column

Time:02-08

I have a dataframe with game, category, if it's solo play or not, and with rating. I add a new column that contains the best game in terms of max rating in each category within each solo/double grouping.

category <- c("Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle","Party","Adventure","Puzzle")
solo <- c("solo","solo","solo","double","double","double","solo","solo","solo","double","double","double","solo","solo","solo","double","double","double")
game <- c("Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3","Game1","Game2","Game3")
rating <- c(8,7,6,5,3,3,2,1,10,3,4,5,6,3,2,1,3,1) 
df <- as.data.frame(rating)
df$game <- game
df$solo <- solo
df$category <- category

df <- df  %>% 
  group_by(category, solo) %>% 
  mutate(best_game = max(rating, na.rm = TRUE)) %>%
  ungroup

My question is, instead of having the highest rating (best_game) for each solo/double and category group, how can I add a column that has the game corresponding to the highest rating in best_game?

Example Output. Expected output is what I'd like to happen

rating    game    solo   category  best_game  **(expected_output)**
    5       Game1   double  Party   5            Game1
    3       Game1   double  Party   5            Game1

CodePudding user response:

Just include a statment to subset the game variable with rating == max(rating)

df <- df  %>% 
        group_by(category, solo) %>% 
        mutate(best_game = game[rating == max(rating, na.rm = TRUE)]) %>%
        ungroup

df

# A tibble: 18 × 5
   rating game  solo   category  best_game
    <dbl> <chr> <chr>  <chr>     <chr>    
 1      8 Game1 solo   Party     Game1    
 2      7 Game2 solo   Adventure Game2    
 3      6 Game3 solo   Puzzle    Game3    
 4      5 Game1 double Party     Game1    
 5      3 Game2 double Adventure Game2    
 6      3 Game3 double Puzzle    Game3    
 7      2 Game1 solo   Party     Game1    
 8      1 Game2 solo   Adventure Game2    
 9     10 Game3 solo   Puzzle    Game3    
10      3 Game1 double Party     Game1    
11      4 Game2 double Adventure Game2    
12      5 Game3 double Puzzle    Game3    
13      6 Game1 solo   Party     Game1    
14      3 Game2 solo   Adventure Game2    
15      2 Game3 solo   Puzzle    Game3    
16      1 Game1 double Party     Game1    
17      3 Game2 double Adventure Game2    
18      1 Game3 double Puzzle    Game3 

CodePudding user response:

This works, but should be updated with what you want to do if more than one game has the max rating?

inner_join(
  df, 
  df %>% 
    group_by(category, solo) %>% 
    filter(max(rating, na.rm = TRUE)==rating) %>% 
    rename(best_game=game) %>% 
    select(-rating),
  by=c("category","solo")
)

Output

   rating  game   solo  category best_game
1       8 Game1   solo     Party     Game1
2       7 Game2   solo Adventure     Game2
3       6 Game3   solo    Puzzle     Game3
4       5 Game1 double     Party     Game1
5       3 Game2 double Adventure     Game2
6       3 Game3 double    Puzzle     Game3
7       2 Game1   solo     Party     Game1
8       1 Game2   solo Adventure     Game2
9      10 Game3   solo    Puzzle     Game3
10      3 Game1 double     Party     Game1
11      4 Game2 double Adventure     Game2
12      5 Game3 double    Puzzle     Game3
13      6 Game1   solo     Party     Game1
14      3 Game2   solo Adventure     Game2
15      2 Game3   solo    Puzzle     Game3
16      1 Game1 double     Party     Game1
17      3 Game2 double Adventure     Game2
18      1 Game3 double    Puzzle     Game3
  •  Tags:  
  • Related