I want to write a `case_when()` expression in a dplyr pipeline, but I am struggling to combine multiple conditions within it.
For example, if I have the following data frame:
| id | purchases |
|---|---|
| a | need |
| a | want |
| a | none |
| b | want |
| b | need |
| c | need |
| c | need |
| c | want |
| d | none |
| d | none |
I want to summarize the data so that, with `none` observations left out of consideration, an id gets `yes` in a new column when its first 2 observations are both `need`. If there is no `need` or `want` at all for a given id, the value should be `none`; otherwise it should be `no`.
The output should be the following:

| id | output |
|---|---|
| a | no |
| b | no |
| c | yes |
| d | none |
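A reproducible version of the example data might look like the sketch below. The name `actions` is taken from the attempt that follows; storing both columns as character vectors is an assumption.

```r
# Example data from the tables above; character columns are an assumption.
actions <- data.frame(
  id = c("a", "a", "a", "b", "b", "c", "c", "c", "d", "d"),
  purchases = c("need", "want", "none", "want", "need",
                "need", "need", "want", "none", "none"),
  stringsAsFactors = FALSE
)
```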
My code:
actions %>%
  group_by(id) %>%
  arrange(id) %>%
  summarise(output = case_when(first(purchases) == "need" & nth(purchases, 2) == "need" ~ "yes", TRUE ~ "no"))
I know the code is a bit messy, as I don't know how to add the second condition: ignoring `none` observations when deciding between `yes` and `no`.
CodePudding user response:
I've tried to place your logic in a small function `f()`, which can then be applied to `purchases`, by `id`:
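f <- function(p) {
  # "yes" when the first two values match and are "need" or "want"
  if (p[1] == p[2] && p[1] %in% c("need", "want")) return("yes")
  # otherwise "none" when every value is "none", else "no"
  if (all(p == "none")) "none" else "no"
}
# df is the data frame from the question (called actions there)
df %>% group_by(id) %>% summarize(output = f(purchases))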
Output
id output
<chr> <chr>
1 a no
2 b no
3 c yes
4 d none
The function checks whether the first and second values of `purchases` are equal and are either `need` or `want`; if so, it returns "yes". Otherwise, if all values of `purchases` are "none", it returns "none"; else it returns "no".
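Note that `f()` also returns "yes" for two leading `want` values and does not skip `none` rows. If the question is read as "drop `none` first, then require the first two remaining values to be `need`", a variant could look like the sketch below. This reading is an assumption, `classify_purchases` is a hypothetical name, and the data frame is rebuilt here only so the snippet runs on its own (base R is used for the check so no extra packages are needed).

```r
# Example data from the question; character columns are an assumption.
actions <- data.frame(
  id = c("a", "a", "a", "b", "b", "c", "c", "c", "d", "d"),
  purchases = c("need", "want", "none", "want", "need",
                "need", "need", "want", "none", "none"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assumes p contains no NA values.
classify_purchases <- function(p) {
  kept <- p[p != "none"]                      # ignore "none" observations
  if (length(kept) == 0) return("none")       # no need or want at all
  if (length(kept) >= 2 && all(kept[1:2] == "need")) return("yes")
  "no"
}

# Apply per id with base R.
result <- sapply(split(actions$purchases, actions$id), classify_purchases)
result
# a: "no", b: "no", c: "yes", d: "none"

# In the dplyr pipeline the same helper would be used as:
# actions %>% group_by(id) %>% summarise(output = classify_purchases(purchases))
```

Groups with fewer than two non-`none` rows fall through to "no" here; whether that is the desired behaviour is not stated in the question.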
CodePudding user response:
Try this using case_when
actions %>%
  group_by(id) %>%
  summarise(output = case_when(
    isTRUE(intersect(purchases[[1]], purchases[[2]]) == "none") ~ "none",
    isTRUE(intersect(purchases[[1]], purchases[[2]]) %in% c("need", "want")) ~ "yes",
    TRUE ~ "no"
  ))
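The reason the `isTRUE(intersect(...))` pattern works may not be obvious, so here is a small base R sketch of the pieces. The variable names `both_need` and `mixed` are illustrative only.

```r
# intersect() keeps the value only when both inputs agree.
both_need <- intersect("need", "need")   # "need"
mixed     <- intersect("need", "want")   # character(0)

# Comparing an empty vector yields logical(0), which isTRUE() treats as FALSE,
# so differing first and second values never match either condition.
isTRUE(both_need %in% c("need", "want")) # TRUE
isTRUE(mixed %in% c("need", "want"))     # FALSE
isTRUE(mixed == "none")                  # FALSE
```

Like `f()` in the other answer, this approach compares only the first two rows of each group and treats two matching `want` values as "yes".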