My DF has two dummy columns:
Invested | ProgramParticipant |
---|---|
1 | 0 |
0 | 0 |
1 | 1 |
1 | 0 |
0 | 1 |
0 | 0 |
The goal is to create a third dummy column that considers only the rows that did invest, and assigns a 1 if they participated in the program and a 0 if they did not. If the row did not invest, an NA should be assigned. Ideally, it would look like this:
Invested | ProgramParticipant | Invested&Participated |
---|---|---|
1 | 0 | 0 |
0 | 0 | NA |
1 | 1 | 1 |
1 | 0 | 0 |
0 | 1 | NA |
0 | 0 | NA |
I tried working with standard ifelse statements, and with dplyr's case_when, but I cannot seem to get the NA assignments right.
CodePudding user response:
You should take a look at case_when to avoid the confusion of nested ifelse calls:
library(dplyr)
df %>%
  mutate(InvestedParticipated =
           case_when(Invested == 1 & ProgramParticipant == 1 ~ 1,
                     Invested == 1 & ProgramParticipant == 0 ~ 0,
                     Invested == 0 ~ NA_real_))
output
Invested ProgramParticipant InvestedParticipated
1 1 0 0
2 0 0 NA
3 1 1 1
4 1 0 0
5 0 1 NA
6 0 0 NA
case_when sets cases that match none of the conditions to NA, so you could even drop the last condition:
df %>%
  mutate(InvestedParticipated =
           case_when(Invested == 1 & ProgramParticipant == 1 ~ 1,
                     Invested == 1 & ProgramParticipant == 0 ~ 0))
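As a quick sanity check of that claim, here is a minimal sketch that compares the explicit and the shortened variants on the question's data. The names `explicit` and `implicit` are only illustrative, and the data frame is rebuilt inline so the snippet runs on its own:

```r
library(dplyr)

# Same values as the question's data, rebuilt here so the snippet is self-contained
df <- data.frame(Invested = c(1L, 0L, 1L, 1L, 0L, 0L),
                 ProgramParticipant = c(0L, 0L, 1L, 0L, 1L, 0L))

# Variant 1: the NA case is spelled out
explicit <- case_when(df$Invested == 1 & df$ProgramParticipant == 1 ~ 1,
                      df$Invested == 1 & df$ProgramParticipant == 0 ~ 0,
                      df$Invested == 0 ~ NA_real_)

# Variant 2: rows with Invested == 0 match no condition and fall through to NA
implicit <- case_when(df$Invested == 1 & df$ProgramParticipant == 1 ~ 1,
                      df$Invested == 1 & df$ProgramParticipant == 0 ~ 0)

# Compare the two results element by element, including type
identical(explicit, implicit)
```

The shortened form is more compact, but spelling out the NA case makes the intent visible to whoever reads the code later.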
CodePudding user response:
Does this work?
library(dplyr)
df %>%
  mutate(InvestednParticipated = if_else(Invested == 1 & ProgramParticipant == 1, 1,
                                         if_else(Invested == 0, NA_real_, 0)))
Invested ProgramParticipant InvestednParticipated
1 1 0 0
2 0 0 NA
3 1 1 1
4 1 0 0
5 0 1 NA
6 0 0 NA
CodePudding user response:
If I understand you correctly, you want this:
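# with() lets the column names be used directly, without the df$ prefix
df$InvestedParticipated <- with(df,
  ifelse(Invested == 1 & ProgramParticipant == 1, 1,
         ifelse(Invested == 1 & ProgramParticipant == 0, 0, NA)))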
CodePudding user response:
There is no need to use a nested ifelse(). A single ifelse() is enough.
library(dplyr)
df %>%
  # For invested rows, 1 + ProgramParticipant - 1 is just ProgramParticipant
  mutate(res = ifelse(Invested == 0, NA, Invested + ProgramParticipant - 1))
# Invested ProgramParticipant res
# 1 1 0 0
# 2 0 0 NA
# 3 1 1 1
# 4 1 0 0
# 5 0 1 NA
# 6 0 0 NA
Data
df <- structure(list(Invested = c(1L, 0L, 1L, 1L, 0L, 0L), ProgramParticipant = c(0L,
0L, 1L, 0L, 1L, 0L)), class = "data.frame", row.names = c(NA, -6L))
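Since the wanted value for invested rows is simply ProgramParticipant itself, the same result can presumably also be had in base R without dplyr. A minimal sketch, assuming the column name InvestedParticipated (the question's heading contains an &, which is awkward in a column name) and rebuilding the data inline:

```r
# Same values as the data above, rebuilt so the snippet is self-contained
df <- data.frame(Invested = c(1L, 0L, 1L, 1L, 0L, 0L),
                 ProgramParticipant = c(0L, 0L, 1L, 0L, 1L, 0L))

# Keep ProgramParticipant where Invested == 1, and NA everywhere else
df$InvestedParticipated <- ifelse(df$Invested == 1, df$ProgramParticipant, NA)

df
```

This relies on both columns being 0/1 dummies. If ProgramParticipant could hold other values, the explicit conditions from the answers above would be the safer choice.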