I'm trying to create a new column in my data frame that is the result of which of the three different columns has the highest value.
The data looks like this:
head(primary_game_fb, n = 20)
pitcher game_date ff_num si_num fc_num
<dbl> <date> <dbl> <dbl> <dbl>
1 405395 2022-05-15 0 0 0
2 424144 2022-04-07 1 8 0
3 424144 2022-04-08 1 2 0
4 424144 2022-04-13 0 2 0
5 424144 2022-04-17 0 10 0
6 424144 2022-04-19 0 4 0
7 424144 2022-04-22 0 11 0
8 424144 2022-04-24 1 6 0
9 425794 2022-04-07 6 28 15
10 425794 2022-04-14 8 22 20
11 425794 2022-04-19 3 25 27
12 425794 2022-04-24 7 28 20
13 425794 2022-04-29 10 26 21
14 425794 2022-05-04 8 28 24
15 425794 2022-05-15 6 31 17
16 425794 2022-05-20 10 25 19
17 425794 2022-05-26 14 25 27
18 425794 2022-05-31 25 30 21
19 425794 2022-06-05 11 30 25
20 425794 2022-06-11 6 37 34
My attempts so far have been:
primary_game_fb <- primary_game_fb%>%
dplyr::mutate(primary_fb = case_when(ff_num > si_num & fc_num ~ "FF",
si_num > ff_num & fc_num ~ "SI",
fc_num > ff_num & si_num ~ "FC",
TRUE ~ NA_real_))
which returns the error:
Error in `dplyr::mutate()`:
! Problem while computing `primary_fb = case_when(...)`.
Caused by error in `` names(message) <- `*vtmp*` ``:
! 'names' attribute [1] must be the same length as the vector [0]
and
primary_game_fb <- primary_game_fb%>%
dplyr::mutate(primary_fb = ifelse(ff_num > si_num & fc_num, "FF",
ifelse(si_num > ff_num & fc_num, "SI",
ifelse(fc_num > ff_num & si_num, "FC", NA))))
which doesn't return an error, but doesn't do what I want it to:
head(primary_game_fb, n = 20)
# A tibble: 20 x 6
pitcher game_date ff_num si_num fc_num primary_fb
<dbl> <date> <dbl> <dbl> <dbl> <chr>
1 405395 2022-05-15 0 0 0 NA
2 424144 2022-04-07 1 8 0 NA
3 424144 2022-04-08 1 2 0 NA
4 424144 2022-04-13 0 2 0 NA
5 424144 2022-04-17 0 10 0 NA
6 424144 2022-04-19 0 4 0 NA
7 424144 2022-04-22 0 11 0 NA
8 424144 2022-04-24 1 6 0 NA
9 425794 2022-04-07 6 28 15 SI
10 425794 2022-04-14 8 22 20 SI
11 425794 2022-04-19 3 25 27 SI
12 425794 2022-04-24 7 28 20 SI
13 425794 2022-04-29 10 26 21 SI
14 425794 2022-05-04 8 28 24 SI
15 425794 2022-05-15 6 31 17 SI
16 425794 2022-05-20 10 25 19 SI
17 425794 2022-05-26 14 25 27 SI
18 425794 2022-05-31 25 30 21 SI
19 425794 2022-06-05 11 30 25 SI
20 425794 2022-06-11 6 37 34 SI
The second solution works on some rows, but not all, such as the 5th row, where primary_fb should be equal to "SI" because it has more observations that num_ff or num_fc. What am I doing wrong?
CodePudding user response:
We may need to change the logical expressions as
library(dplyr)
primary_game_fb%>%
dplyr::mutate(primary_fb = ifelse(ff_num > si_num & ff_num > fc_num,
"FF",
ifelse(si_num > ff_num & si_num > fc_num, "SI",
ifelse(fc_num > ff_num & fc_num > si_num, "FC", NA))))