Consider these data:
library(dplyr)
d <- tibble(student.status = c(0, 1, NA, 0, 1, 1),
            student.school.hs = c(NA, 1, NA, NA, NA, NA),
            student.school.alths = c(NA, NA, NA, NA, NA, 1),
            student.school.allNA = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE))
  student.status student.school.hs student.school.alt… student.school.…
           <dbl>             <dbl>               <dbl> <lgl>
1              0                NA                  NA TRUE
2              1                 1                  NA FALSE
3             NA                NA                  NA TRUE
4              0                NA                  NA TRUE
5              1                NA                  NA TRUE
6              1                NA                   1 FALSE
I want to assign 0 to the NA values in the student.school.* columns when student.status == 1 and the student.school.* columns are not all NA.
If all of the student.school.* columns are NA and student.status == 1, then leave them NA.
If student.status == 0, then all the student.school.* columns should stay NA.
The final data should look like:
  student.status student.school.hs student.school.alt… student.school.…
           <dbl>             <dbl>               <dbl> <lgl>
1              0                NA                  NA TRUE
2              1                 1                   0 FALSE
3             NA                NA                  NA TRUE
4              0                NA                  NA TRUE
5              1                NA                  NA TRUE
6              1                 0                   1 FALSE
CodePudding user response:
Perhaps this helps: loop across the columns whose names start with the prefix 'student.school' (starts_with), while removing the logical column from the selection with -where(is.logical), since student.school.allNA has the same prefix but a different column type. Then use case_when to change a value to 0 when it is NA, student.school.allNA is FALSE (negated with !), and student.status is 1.
library(dplyr)
d <- d %>%
  mutate(across(c(starts_with('student.school'), -where(is.logical)),
                ~ case_when(student.status %in% 1 & !student.school.allNA & is.na(.x) ~ 0,
                            TRUE ~ .x)))
-output
> d
# A tibble: 6 × 4
  student.status student.school.hs student.school.alths student.school.allNA
           <dbl>             <dbl>                <dbl> <lgl>
1              0                NA                   NA TRUE
2              1                 1                    0 FALSE
3             NA                NA                   NA TRUE
4              0                NA                   NA TRUE
5              1                NA                   NA TRUE
6              1                 0                    1 FALSE
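
If the helper column student.school.allNA were not already in the data, the same row-wise check could probably be computed on the fly. The following is only a minimal sketch, assuming the two school columns are named as in the question; school_cols, any_school, and res are illustrative names that do not appear in the original post.

```r
library(dplyr)

# Same data as in the question, but without the precomputed allNA flag
d <- tibble(student.status = c(0, 1, NA, 0, 1, 1),
            student.school.hs = c(NA, 1, NA, NA, NA, NA),
            student.school.alths = c(NA, NA, NA, NA, NA, 1))

# Illustrative: list the school columns explicitly
school_cols <- c("student.school.hs", "student.school.alths")

# TRUE for rows where at least one school column is not NA
any_school <- rowSums(!is.na(d[school_cols])) > 0

# Fill NA with 0 only when status is 1 and the row has some school value
res <- d %>%
  mutate(across(all_of(school_cols),
                ~ case_when(student.status %in% 1 & any_school & is.na(.x) ~ 0,
                            TRUE ~ .x)))
```

Using %in% rather than == keeps the condition FALSE (instead of NA) for rows where student.status is missing, so those rows are left unchanged. Note that any_school is computed outside mutate, so this sketch assumes ungrouped data.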