I'm trying to generate a new variable using multiple conditions that are evaluated against factor variables.
So, let's say I have this data.frame of factor variables:
x<-c("1", "2", "1","NA", "1", "2", "NA", "1", "2", "2", "NA" )
y<-c("1","NA", "2", "1", "1", "NA", "2", "1", "2", "1", "1" )
z<-c("1", "2", "3", "4", "1", "2", "3", "4", "1", "2", "3")
w<- c("01", "02", "03", "04","05", "06", "07", "01", "02", "03", "04")
df<-data.frame(x,y,z,w)
df$x<-as.factor(df$x)
df$y<-as.factor(df$y)
df$z<-as.factor(df$z)
df$w<-as.factor(df$w)
str(df)
I need to add a new column v to my data frame that takes the values 1, 0 or NA under the following conditions:
Takes value 1 if: x = "1", y = "1", z = "1" or "2", and w = "01" to "06".
Takes value 0 if it fails at least one of those conditions.
Takes value NA if any of x, y, z, or w is NA.
I tried using a pipe (%>%) along with mutate and case_when, but have been unable to make it work.
So my desired result would be a new column v in df which would look like this:
[1] 1 NA 0 NA 1 NA NA 0 0 0 NA
CodePudding user response:
Here I also use mutate with case_when. Since the NA in your dataset is the character "NA" (a literal string), we cannot use a function like is.na() to identify it. I would recommend changing it to a "real" NA (by removing the double quotes in your input).
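If editing the input by hand is not practical, the literal "NA" strings can also be converted after the data frame is built. The following is only a sketch in base R, assuming the same data as in the question; the anonymous helper function is mine and not part of any package.

```r
# Same data as in the question, with "NA" stored as a literal string.
x <- c("1", "2", "1", "NA", "1", "2", "NA", "1", "2", "2", "NA")
y <- c("1", "NA", "2", "1", "1", "NA", "2", "1", "2", "1", "1")
z <- c("1", "2", "3", "4", "1", "2", "3", "4", "1", "2", "3")
w <- c("01", "02", "03", "04", "05", "06", "07", "01", "02", "03", "04")
df <- data.frame(x, y, z, w, stringsAsFactors = TRUE)

# Replace the string "NA" with a real missing value in every column,
# then rebuild the factor so "NA" is no longer one of its levels.
df[] <- lapply(df, function(col) {
  col <- as.character(col)
  col[col == "NA"] <- NA
  factor(col)
})

# is.na() now detects the missing entries.
which(is.na(df$x))
levels(df$x)
```

After this step, is.na() can be used in place of the == "NA" comparisons.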
As I've pointed out in the comment, I'm not sure why the eighth entry is "1" when the corresponding z is not "1" or "2".
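library(dplyr)
df %>% mutate(v = case_when(
  # 1 when all four conditions hold
  x == "1" & y == "1" & z %in% c("1", "2") & w %in% paste0("0", 1:6) ~ "1",
  # NA when any column holds the literal string "NA"
  x == "NA" | y == "NA" | z == "NA" | w == "NA" ~ NA_character_,
  TRUE ~ "0"))  # 0 otherwise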
    x  y z  w    v
1   1  1 1 01    1
2   2 NA 2 02 <NA>
3   1  2 3 03    0
4  NA  1 4 04 <NA>
5   1  1 1 05    1
6   2 NA 2 06 <NA>
7  NA  2 3 07 <NA>
8   1  1 4 01    0
9   2  2 1 02    0
10  2  1 2 03    0
11 NA  1 3 04 <NA>
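For comparison, here is a rough base R sketch of the same rule when the data holds real NA values instead of the string "NA". It assumes the input was entered with unquoted NA, and it returns v as an integer rather than a character column; the names any_na and meets are just illustrative.

```r
# Question data, entered with real NA values (no quotes around NA).
df <- data.frame(
  x = c("1", "2", "1", NA, "1", "2", NA, "1", "2", "2", NA),
  y = c("1", NA, "2", "1", "1", NA, "2", "1", "2", "1", "1"),
  z = c("1", "2", "3", "4", "1", "2", "3", "4", "1", "2", "3"),
  w = c("01", "02", "03", "04", "05", "06", "07", "01", "02", "03", "04"),
  stringsAsFactors = TRUE
)

# TRUE for rows where at least one of x, y, z, w is missing.
any_na <- !complete.cases(df[c("x", "y", "z", "w")])

# TRUE where all four conditions hold; sprintf() builds "01" to "06".
meets <- df$x == "1" & df$y == "1" &
  df$z %in% c("1", "2") & df$w %in% sprintf("%02d", 1:6)

# NA wins over everything else, then 1 or 0 depending on the conditions.
df$v <- ifelse(any_na, NA_integer_, as.integer(meets))
df$v
```

The missing-value check is applied first on purpose: with real NA values, a comparison such as x == "1" yields NA or FALSE depending on the other terms, so relying on it alone would turn some incomplete rows into 0.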