Home > database >  dyplr using Mutate and Case_when with NA values
dyplr using Mutate and Case_when with NA values

Time:08-19

I'm using the mutate and case_when function to create a new variable with the result dependent on whether certain conditions are met.

The following code works until it gets to the NA values.

Running the code is successful, as in it does not produce any error or warning messages, and when I inspect the dataset it has worked where the conditions == 1 or 2, but it has not worked for the NA values.

I have tried putting NA in "" but this doesn't help

  mutate(Variable1 = case_when
         (VariableA == 1 & VariableB ==1 ~ result.y,
           VariableA == 1 & VariableB ==2 ~ result.y,
           VariableA == 2 & VariableB ==1 ~ result.y,
           VariableA == 2 & VariableB ==2 ~ result.x,
           VariableA == 2 & VariableB ==NA ~ result.x,
           VariableA == NA & VariableB ==1 ~ result.y,
           VariableA == NA & VariableB ==2 ~ result.y,
           VariableA == NA & VariableB ==NA ~ result.x,
          ))

VariableA and VariableB have values of 1, 2 and NA result.y and result.x are variables already in the dataset

First time posting, and fairly new to using R so apologies if I've not included all the information required. Thanks in advance for any suggestions

Sample of Data

VariableA   VariableB   result.x        result.y
<dbl lbl>   <dbl lbl>   <chr>           <chr>
1               2      valid postcode   valid postcode
1               2      valid postcode   valid postcode
1               NA     valid postcode   NA
1               NA     partial postcode NA
1               2      valid postcode   valid postcode
1               1      valid postcode   valid postcode
NA             NA      valid postcode   NA
2               2      partial postcode NA
2              NA      valid postcode   NA
2               1      valid postcode   valid postcode

Output from

> dput(head(Test2))
structure(list(VariableA = structure(c(NA, NA, 2, NA, 2, 2), label = "[VariableA]--Have you moved in to your current house since June of this year?", format.spss = "F1.0", labels = c(Yes = 1, 
No = 2), class = c("haven_labelled", "vctrs_vctr", "double")), 
    VariableB = structure(c(NA, NA, 2, NA, 2, NA), label = "[VariableB]-- Have you moved into your current house since January 2021?", format.spss = "F1.0", labels = c(Yes = 1, 
    No = 2), class = c("haven_labelled", "vctrs_vctr", "double"
    )), result.x = structure(c("vaild postcode", "vaild postcode", 
    "vaild postcode", "vaild postcode", "vaild postcode", "vaild postcode"
    ), label = "[category]--", format.spss = "A16", display_width = 16L), 
    result.y = c(NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

CodePudding user response:

According to the example provided, VariableA and VariableB are of class dbl lbl which cannot be used in a logical with numeric values.

You need to extract the dbl information form VariableA and VariableB with zap_labels() from haven package. Additionally, it is best practice to use is.na() as recommended in comments.

install.packages("haven")
library(haven)

Test2 %>% zap_labels() %>% mutate(Variable1 = case_when
       (VariableA == 1 & VariableB ==1 ~ result.y,
         VariableA == 1 & VariableB ==2 ~ result.y,
         VariableA == 2 & VariableB ==1 ~ result.y,
         VariableA == 2 & VariableB ==2 ~ result.x,
         VariableA == 2 & is.na(VariableB) ~ result.x,
         is.na(VariableA) & VariableB ==1 ~ result.y,
         is.na(VariableA) & VariableB ==2 ~ result.y,
         is.na(VariableA) & is.na(VariableB) ~ result.x,
       ))

# A tibble: 6 × 5
  VariableA VariableB result.x       result.y Variable1     
      <dbl>     <dbl> <chr>          <chr>    <chr>         
1        NA        NA vaild postcode NA       vaild postcode
2        NA        NA vaild postcode NA       vaild postcode
3         2         2 vaild postcode NA       vaild postcode
4        NA        NA vaild postcode NA       vaild postcode
5         2         2 vaild postcode NA       vaild postcode
6         2        NA vaild postcode NA       vaild postcode
  • Related