Error in `case_when` condition, but the error message isn't helpful in figuring out what's

Time:11-24

I don't know what's going on here in this seemingly very basic recoding example I have:

library(dplyr)
df = data.frame(hcat = 1:5,
                Q12  = 41:45)

df |> 
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ Q12))

This looks like a standard case_when condition I've used hundreds of times, but for a reason I don't understand, it errors:

<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `Q12_test = case_when(...)`.
Caused by error in `case_when()`:

---
Backtrace:
 1. dplyr::mutate(...)
 6. dplyr::case_when(...)

What am I missing here?

Note: I also played around with changing some parts of the code (e.g. taking out the additional hcat condition or the third condition), but nothing worked.

Update: OK, the culprit seems to be the "catch-all" condition at the bottom, i.e. TRUE ~ Q12. If I take it out, it works. The question now is how I can keep it in, because I don't want to recode these rows to NA; I just want to keep the original Q12 value.

Update 2: OK, the following code works, but I really don't know why I need to wrap Q12 in as.numeric:

df |> 
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ as.numeric(Q12)))

CodePudding user response:

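It's because the right-hand sides mix numeric (double) and integer values: the literals 40, 41 and 42 are numeric, while Q12 is an integer column.

case_when is strict about data types here, so every right-hand side must have the same type.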

> str(df$Q12)
 int [1:5] 41 42 43 44 45
> str(42)
 num 42
> str(42L)
 int 42
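
The same check explains why wrapping Q12 in as.numeric() in the question's second update works. A minimal base R sketch, assuming q12 (a name used only for this illustration) stands in for the Q12 column created by 41:45:

```r
# The colon operator yields integers, like the Q12 column in the question
q12 <- 41:45

typeof(q12)              # "integer"
typeof(40)               # "double": a bare number literal is numeric
typeof(40L)              # "integer": the L suffix makes an integer literal
typeof(as.numeric(q12))  # "double": now matches the bare literals 40, 41, 42

# Same value, different storage type
identical(40, 40L)       # FALSE
40 == 40L                # TRUE
```

So the values compare equal, but the storage types differ, and that difference is what the type check trips over.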

You can make the result integer by adding the L suffix to the literals:

df %>% 
  mutate(Q12_test = case_when((hcat <= 3 & Q12 == 41) ~ 40L,
                              (hcat == 5 & Q12 == 42) ~ 41L,
                              (hcat == 5 & Q12 == 43) ~ 42L,
                              TRUE ~ Q12))

Or convert Q12 to numeric first:

df %>%
  mutate(Q12 = as.numeric(Q12)) %>%
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ Q12))