I am working with survey data where I need to identify whether responses were not applicable or simply skipped.
x1
andx2
are my primary response questions. x3 is only filled out when someone indicates ifx1
andx2
are not applicable.- I want row 1 and row 2 to equal 0 as
x1
andx2
were not applicable, and I want row 4 to equal 1 asx1
andx2
was correctly reported as "None of the above." - Row 3 is the truly missing observation as neither
x1
,x2
, orx3
were answered.
My code below works, but is there a simplier way where I can just refer to my list
variable rather than writing a lengthy AND statement using x1
and x2
? I have simplified this example as my full dataset has 10 x
variables.
library(dplyr)
df3 <- tibble(x1 = c("Yes", "No", NA, NA), x2 = c("Yes", "Yes", NA, NA), x3 = c(NA, NA, NA, "None of the above"))
list <- c("x1", "x2")
df3 %>%
mutate(
x3 = coalesce(as.integer(x3 == "None of the above"), 0),
x3 = case_when(
x3 == 1 ~ 1,
is.na(x1) & is.na(x2) ~ NA_real_,
TRUE ~ 0))
CodePudding user response:
A minimal revision is replacing is.na(x1) & is.na(x2)
with if_all(..., is.na)
, i.e.
df3 %>%
mutate(
x3 = coalesce(as.integer(x3 == "None of the above"), 0),
x3 = case_when(
x3 == 1 ~ 1,
if_all(all_of(list), is.na) ~ NA_real_,
TRUE ~ 0))
With if_all()
, you don't even need to refer to the list
variable, just pass a simple tidy-select helper instead, e.g.
if_all(x1:x2, is.na)
(This wayx1
&x2
must be consecutive columns)if_all(starts_with("x"), is.na)
if_all(num_range("x", 1:2), is.na)
CodePudding user response:
Here is one idea. We can use all_of
, select
to select the columns based on the target
vector (I changed list
to target
as list
is a function in R), then use is.na
and rowSums
to evaluate if all columns are NA
. We can combine this strategy with case_when
to create the x3
column.
library(dplyr)
target <- c("x1", "x2")
df3 %>%
mutate(
x3 = case_when(
rowSums(is.na(select(., all_of(target)))) == 0 ~0L,
rowSums(is.na(select(., all_of(target)))) != 0 & !is.na(x3) ~1L,
TRUE ~NA_integer_
),
)
# # A tibble: 4 × 3
# x1 x2 x3
# <chr> <chr> <int>
# 1 Yes Yes 0
# 2 No Yes 0
# 3 NA NA NA
# 4 NA NA 1