I don't know what's going on here in this seemingly very basic recoding example I have:
library(dplyr)
df = data.frame(hcat = 1:5,
                Q12 = 41:45)
df |>
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ Q12))
This looks like a standard case_when condition I've used hundreds of times, but for a reason I don't understand, it errors:
<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `Q12_test = case_when(...)`.
Caused by error in `case_when()`:
---
Backtrace:
1. dplyr::mutate(...)
6. dplyr::case_when(...)
What am I missing here?
Note: I also tried changing parts of the code (e.g. taking out the additional hcat condition or the third condition), but nothing worked.
Update: ok, the culprit seems to be the "catch all" condition at the bottom, i.e. TRUE ~ Q12. If I take it out, it works. The question now is how I can keep it, because I don't want to recode those rows to NA; I just want to keep the original Q12 value.
Update 2: ok, the following code works, but I really don't know why I need to wrap Q12 in as.numeric:
df |>
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ as.numeric(Q12)))
CodePudding user response:
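This happens because the right-hand sides mix numeric (double) and integer values. case_when is strict about data types, so every right-hand side must have the same type. Here Q12 is an integer column, while a literal such as 40 is a double: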
> str(df$Q12)
int [1:5] 41 42 43 44 45
> str(42)
num 42
> str(42L)
int 42
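As a small sketch of the same check, the snippet below uses base R's typeof() on the question's data frame to show which storage types meet inside the case_when call. The variable names q12_type, double_type and integer_type are made up for this illustration.

```r
# Base R only: compare the storage types involved in the original case_when() call.
df <- data.frame(hcat = 1:5, Q12 = 41:45)

q12_type     <- typeof(df$Q12)  # column created with 41:45
double_type  <- typeof(40)      # bare numeric literal, as in the failing code
integer_type <- typeof(40L)     # same literal with the L suffix

print(c(Q12 = q12_type, plain_40 = double_type, suffixed_40L = integer_type))
```

The column and the L-suffixed literal share a type, while the bare literal does not, which is the mismatch described above.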
You can make the results integers by adding the L suffix to the literals, like this:
df %>%
  mutate(Q12_test = case_when((hcat <= 3 & Q12 == 41) ~ 40L,
                              (hcat == 5 & Q12 == 42) ~ 41L,
                              (hcat == 5 & Q12 == 43) ~ 42L,
                              TRUE ~ Q12))
Or convert Q12 to numeric first:
df %>%
  mutate(Q12 = as.numeric(Q12)) %>%
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ Q12))
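The two fixes produce the same values but different column types, which may matter if Q12_test is later compared or joined with other integer columns. The sketch below runs the integer-literal version next to the as.numeric() version from the question; res_int and res_dbl are names invented for this illustration, and it assumes dplyr is installed and R is at least 4.1 (for the |> pipe).

```r
library(dplyr)

df <- data.frame(hcat = 1:5, Q12 = 41:45)

# Fix 1: integer literals, so every right-hand side is an integer.
res_int <- df |>
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40L,
                              hcat == 5 & Q12 == 42 ~ 41L,
                              hcat == 5 & Q12 == 43 ~ 42L,
                              TRUE ~ Q12))

# Fix 2: convert the fallback to double, so every right-hand side is a double.
res_dbl <- df |>
  mutate(Q12_test = case_when(hcat <= 3 & Q12 == 41 ~ 40,
                              hcat == 5 & Q12 == 42 ~ 41,
                              hcat == 5 & Q12 == 43 ~ 42,
                              TRUE ~ as.numeric(Q12)))

# Same values, different storage types.
print(typeof(res_int$Q12_test))
print(typeof(res_dbl$Q12_test))
print(all(res_int$Q12_test == res_dbl$Q12_test))
```

With this sample data only the first row matches a recoding condition (hcat is 1 and Q12 is 41); the hcat == 5 row has Q12 equal to 45, so it falls through to the catch-all and keeps its original value.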