I am trying to set up a dummy variable and can't figure out how to code it for ages over and under 26 without including age 26 itself.
For example, this sets ages under 26 to 1 and ages 26 and older to 0:
data$treated = ifelse(data$age <26, 1, 0)
I want it to cover only ages over or under 26, leaving exactly 26 out.
Thanks.
CodePudding user response:
If you want to stick to base R:
data$treated = ifelse(data$age < 26, 1,
               ifelse(data$age > 26, 0,
                      NA))
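As a quick sanity check of the nested ifelse() approach, here is a minimal sketch on a small made-up data frame (the ages are an assumption for illustration, not the asker's data). It only confirms that age 26 ends up as NA while the other ages get 1 or 0.

```r
# Hypothetical sample data; the ages are chosen so that 26 is present.
data <- data.frame(age = c(24, 25, 26, 27, 28))

# Under 26 -> 1, over 26 -> 0, exactly 26 -> NA.
data$treated <- ifelse(data$age < 26, 1,
                ifelse(data$age > 26, 0, NA))

# Count each value, including the NA for age 26.
table(data$treated, useNA = "ifany")
```

Passing useNA = "ifany" matters here because table() drops NA values by default, so the row for age 26 would otherwise be silently omitted from the count.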
A dplyr solution:
library(dplyr)
data <- data %>% mutate(treated = case_when(age < 26 ~ 1,
                                            age > 26 ~ 0))
> data
# A tibble: 9 x 2
    age treated
  <int>   <dbl>
1    22       1
2    23       1
3    24       1
4    25       1
5    26      NA
6    27       0
7    28       0
8    29       0
9    30       0
Data
data <- structure(list(age = 22:30), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -9L))
CodePudding user response:
If age is not 26, use the result of age < 26, and otherwise NA. A few ways to do that are shown below. The first is the most straightforward.
To simplify debugging, avoid overwriting data frames so that each variable always keeps its original value. Instead, create a new data frame.
No packages are used.
# unary + converts TRUE/FALSE to 1/0
transform(data, treated = ifelse(age != 26, +(age < 26), NA))
transform(data, treated = (age < 26) + ifelse(age != 26, 0, NA))
transform(data, treated = +(replace(age, age == 26, NA) < 26))
# a bit tricky so not really recommended: NA^0 is 1 and NA^1 is NA
transform(data, treated = (age < 26) * NA^(age == 26))
within(data, {
  treated <- +(age < 26)
  treated[age == 26] <- NA
})
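To illustrate the advice about not overwriting the original data frame, here is a minimal sketch that stores the result under a new name. The name data2 is hypothetical, and the ages 22 to 30 mirror the sample data posted in the other answer.

```r
# Sample ages 22..30, mirroring the data shown in the other answer.
data <- data.frame(age = 22:30)

# Keep `data` untouched and put the result in a new data frame.
# `data2` is a hypothetical name; unary + turns TRUE/FALSE into 1/0.
data2 <- transform(data, treated = ifelse(age != 26, +(age < 26), NA))

# The original still has only the age column; the new one adds treated.
names(data)
names(data2)
```

If something looks wrong later, data can still be inspected in its original form, which is the point of assigning to a new object.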