How to create a new column with the same values of another column based on a condition in R?

Time:05-01

I have the current data frame below:

ID grade_all highest
1 pass 1
1 fail 0
1 fail 0
2 pass 0
2 fail 1
3 fail 1
3 pass 0

and I want this:

ID grade_all highest final_grade
1 pass 1 pass
1 fail 0 pass
1 fail 0 pass
2 pass 0 fail
2 fail 1 fail
3 fail 1 fail
3 pass 0 fail

I want the row where highest is 1 to take precedence, and its grade_all value to be copied into every row for the same student (same ID). I thought this would work, but it's giving me an error:

df <- df %>%
  group_by(ID) %>%
  mutate(final_grade = grade_all[highest == 1])

CodePudding user response:

With the additional information you gave, I interpret your question as follows: if highest == 1, take the value of grade_all and apply it to every row of the group in the new column final_grade:

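library(dplyr)
library(zoo)

df %>% 
  group_by(ID) %>% 
  # keep grade_all only on the row flagged highest == 1, NA elsewhere
  mutate(final_grade = ifelse(highest == 1, grade_all, NA_character_),
         # fill forward, then backward; na.rm = FALSE keeps each group's length intact
         final_grade = na.locf(na.locf(final_grade, na.rm = FALSE),
                               fromLast = TRUE, na.rm = FALSE))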


     ID grade_all highest final_grade
  <int> <chr>       <int> <chr>      
1     1 pass            1 pass       
2     1 fail            0 pass       
3     1 fail            0 pass       
4     2 pass            0 fail       
5     2 fail            1 fail       
6     3 fail            1 fail       
7     3 pass            0 fail   
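To make the idea above easy to try, here is a minimal sketch that rebuilds the data frame from the question and fills the flagged grade across each group. It swaps in `tidyr::fill()` with `.direction = "downup"` instead of zoo; that substitution, the column types, and the name `result` are assumptions for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Example data from the question (integer/character types are assumed)
df <- tibble(
  ID        = c(1L, 1L, 1L, 2L, 2L, 3L, 3L),
  grade_all = c("pass", "fail", "fail", "pass", "fail", "fail", "pass"),
  highest   = c(1L, 0L, 0L, 0L, 1L, 1L, 0L)
)

result <- df %>%
  group_by(ID) %>%
  # keep the grade only on the flagged row, NA everywhere else
  mutate(final_grade = if_else(highest == 1, grade_all, NA_character_)) %>%
  # spread the single non-missing value down and then up within each ID
  fill(final_grade, .direction = "downup") %>%
  ungroup()

print(result)
```

Filling in both directions matters because the flagged row is not always the first row of its group (see ID 2).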

CodePudding user response:

What about this?

df %>%
  group_by(ID) %>%
  mutate(final_grade = grade_all[highest > 0]) %>%
  ungroup()
# A tibble: 7 x 4
     ID grade_all highest final_grade
  <int> <chr>       <int> <chr>
1     1 pass            1 pass
2     1 fail            0 pass
3     1 fail            0 pass
4     2 pass            0 fail
5     2 fail            1 fail
6     3 fail            1 fail
7     3 pass            0 fail
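Subsetting like this assumes every ID has exactly one flagged row. If a group might have none, the subset is empty and `mutate()` is expected to fail with a size error. A hedged sketch of one way to guard against that is to take the first match with `[1]`, which yields NA for an empty subset. The data frame `df2` and the name `safe` below are made up for illustration.

```r
library(dplyr)

# Hypothetical data: ID 2 has no row with highest == 1
df2 <- tibble(
  ID        = c(1L, 1L, 2L, 2L, 3L),
  grade_all = c("pass", "fail", "fail", "pass", "fail"),
  highest   = c(1L, 0L, 0L, 0L, 1L)
)

safe <- df2 %>%
  group_by(ID) %>%
  # [1] picks the first flagged grade; an empty subset gives NA instead of an error
  mutate(final_grade = grade_all[highest == 1][1]) %>%
  ungroup()

print(safe)
```

With more than one flagged row in a group, this keeps the first one; whether that is the right tie-break depends on the data.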

CodePudding user response:

Taking advantage of TRUE being coerced to 1 and FALSE to 0 in arithmetic, test per ID whether the row with highest == 1 has a grade_all of 'pass', and index c('fail', 'pass') accordingly:

df %>%
    group_by(ID) %>%
    mutate(final_grade = c('fail','pass')[any(highest == 1 & grade_all == 'pass') + 1])
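The indexing trick works because adding 1 to a logical value gives position 1 for FALSE and position 2 for TRUE. A small base R sketch of just that step, with variable names chosen here for illustration:

```r
labels <- c("fail", "pass")

# Logical values are coerced to 0/1 in arithmetic
idx_false <- FALSE + 1   # position 1
idx_true  <- TRUE + 1    # position 2

picked_false <- labels[idx_false]
picked_true  <- labels[idx_true]

# any() collapses a whole group to one TRUE/FALSE before indexing
flags <- c(0L, 1L, 0L)
picked_any <- labels[any(flags == 1L) + 1]

print(c(picked_false, picked_true, picked_any))
```

Because `any()` returns a single value, the result is one label per group, which `mutate()` then recycles to every row of that group.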