I have the current data frame below:
ID | grade_all | highest |
---|---|---|
1 | pass | 1 |
1 | fail | 0 |
1 | fail | 0 |
2 | pass | 0 |
2 | fail | 1 |
3 | fail | 1 |
3 | pass | 0 |
and I want this:
ID | grade_all | highest | final_grade |
---|---|---|---|
1 | pass | 1 | pass |
1 | fail | 0 | pass |
1 | fail | 0 | pass |
2 | pass | 0 | fail |
2 | fail | 1 | fail |
3 | fail | 1 | fail |
3 | pass | 0 | fail |
I want the row where highest is 1 to take precedence, and its grade_all value to be copied into every row for the same student (same ID). I thought this would work, but it's giving me an error:
df <- df %>%
  group_by(ID) %>%
  mutate(final_grade = grade_all[highest == 1])
CodePudding user response:
With the additional info you gave, I interpret your question as follows: if highest == 1, take that row's grade_all value and apply it to all members of the group in the new column final_grade:
library(dplyr)
library(zoo)
df %>%
  group_by(ID) %>%
  mutate(final_grade = ifelse(highest == 1, grade_all, NA_character_),
         final_grade = zoo::na.locf(final_grade))
ID grade_all highest final_grade
<int> <chr> <int> <chr>
1 1 pass 1 pass
2 1 fail 0 pass
3 1 fail 0 pass
4 2 pass 0 fail
5 2 fail 1 fail
6 3 fail 1 fail
7 3 pass 0 fail
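One caveat, as far as I can tell: na.locf only carries values forward, and with its default na.rm = TRUE it drops leading NAs. The sample works because the shortened result for ID 2 has length 1 and gets recycled. A group such as (NA, "pass", NA) would likely fail. Below is a minimal sketch of the same idea that fills in both directions instead. It assumes tidyr is available and rebuilds the sample data by hand; the name `result` is just for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the question's table
df <- tibble(
  ID        = c(1L, 1L, 1L, 2L, 2L, 3L, 3L),
  grade_all = c("pass", "fail", "fail", "pass", "fail", "fail", "pass"),
  highest   = c(1L, 0L, 0L, 0L, 1L, 1L, 0L)
)

result <- df %>%
  group_by(ID) %>%
  # Keep the grade only on the row flagged with highest == 1
  mutate(final_grade = if_else(highest == 1, grade_all, NA_character_)) %>%
  # Spread that grade to the rows below and above it within each ID
  fill(final_grade, .direction = "downup") %>%
  ungroup()

print(result)
```

Filling "downup" means the position of the flagged row inside the group no longer matters.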
CodePudding user response:
What about this?
df %>%
  group_by(ID) %>%
  mutate(final_grade = grade_all[highest > 0]) %>%
  ungroup()
# A tibble: 7 x 4
ID grade_all highest final_grade
<int> <chr> <int> <chr>
1 1 pass 1 pass
2 1 fail 0 pass
3 1 fail 0 pass
4 2 pass 0 fail
5 2 fail 1 fail
6 3 fail 1 fail
7 3 pass 0 fail
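This relies on every ID having exactly one row with highest > 0. If a group has none, the subset has length zero and mutate should complain about the size, which may be the error in the question. If a group has several, the sizes will not match either. A hedged sketch that takes the first match and falls back to NA is shown below. The data frame `df2` with an extra ID 4 that has no flagged row is made up for illustration.

```r
library(dplyr)

# Hypothetical data: ID 4 has no row with highest == 1
df2 <- tibble(
  ID        = c(1L, 1L, 2L, 2L, 4L, 4L),
  grade_all = c("pass", "fail", "pass", "fail", "pass", "fail"),
  highest   = c(1L, 0L, 0L, 1L, 0L, 0L)
)

result2 <- df2 %>%
  group_by(ID) %>%
  # [1] picks the first matching grade; indexing an empty vector with [1] yields NA
  mutate(final_grade = grade_all[highest == 1][1]) %>%
  ungroup()

print(result2)
```

Whether NA is the right fallback for students without a flagged row depends on your data, so treat that part as an assumption.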
CodePudding user response:
Taking advantage of logical TRUE/FALSE being coerced to 1/0 in arithmetic, test per ID whether the row with 'highest' equal to 1 is a 'pass', and index c('fail', 'pass') accordingly:
df %>%
  group_by(ID) %>%
  mutate(final_grade = c('fail', 'pass')[any(highest == 1 & grade_all == 'pass') + 1])
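In case the indexing trick is unclear, here is a small standalone sketch in base R. The variable names are only illustrative. Adding 1 to a logical gives 1 for FALSE and 2 for TRUE, which are the positions of 'fail' and 'pass' in the lookup vector.

```r
labels <- c("fail", "pass")

# Logical values are coerced to 0/1 when used in arithmetic
idx_false <- FALSE + 1   # position 1
idx_true  <- TRUE + 1    # position 2

single_false <- labels[idx_false]
single_true  <- labels[idx_true]

# The same lookup works on a whole logical vector at once
picked <- labels[c(TRUE, FALSE, TRUE) + 1]

print(single_false)
print(single_true)
print(picked)
```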