Background
I've got this dataframe df:
df <- data.frame(ID = c("a", "a", "a", "b", "c", "c", "c", "c"),
                 event = c("red", "black", "blue", "white", "orange", "red", "gray", "green"),
                 stringsAsFactors = FALSE)
It's got some people in it (ID) and a description of an event. I'd like to make a new variable, condition, that indicates 1 or 0 based on whether any of the cells for a given ID contain either "red" or "blue".
The Problem
I can get this to work, but only for the matching row. What I'd like is that if any of a person's cells in event contain "red" or "blue", all their cells in condition should be marked 1. In other words, I'd like this:
ID event condition
a red 1
a black 1
a blue 1
b white 0
c orange 1
c red 1
c gray 1
c green 1
What I've tried
So far, I've used this code to get this result:
library(dplyr)
df <- df %>%
  mutate(condition = ifelse(event %in% c("red", "blue"), 1, 0))
ID event condition
a red 1
a black 0
a blue 1
b white 0
c orange 0
c red 1
c gray 0
c green 0
In other words, the rows that match are marked 1, but I'd like all rows for an ID with any matching row to be marked 1.
CodePudding user response:
We need any() wrapped around the logical vector from %in%, evaluated within each ID group. In addition, the arguments to %in% can be reversed. (The OP's code returns 1 only on the rows where event matches 'red' or 'blue', leaving the others 0.)
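library(dplyr)
df %>%
  group_by(ID) %>%
  # any() is TRUE for the whole group if at least one target value occurs in event;
  # as.integer() turns that TRUE/FALSE into 1/0
  mutate(condition = as.integer(any(c('red', 'blue') %in% event))) %>%
  ungroup()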
-output
# A tibble: 8 × 3
ID event condition
<chr> <chr> <int>
1 a red 1
2 a black 1
3 a blue 1
4 b white 0
5 c orange 1
6 c red 1
7 c gray 1
8 c green 1
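The same per-ID any() idea can also be sketched in base R, without dplyr. This is only an illustrative variant, not part of the answer above: it assumes the df from the question, and the helper name hit is made up for the example. ave() applies a function within each group and spreads the result back over that group's rows.

```r
# Example data from the question
df <- data.frame(
  ID = c("a", "a", "a", "b", "c", "c", "c", "c"),
  event = c("red", "black", "blue", "white", "orange", "red", "gray", "green"),
  stringsAsFactors = FALSE
)

# Row-wise match: TRUE only where event itself is "red" or "blue"
# ('hit' is a hypothetical helper name used for this sketch)
hit <- df$event %in% c("red", "blue")

# ave() runs any() within each ID and recycles the single result
# over all rows of that ID; as.integer() converts TRUE/FALSE to 1/0
df$condition <- as.integer(ave(hit, df$ID, FUN = any))

df$condition
# 1 1 1 0 1 1 1 1
```

The row-wise vector hit is what the OP's attempt produces; the grouping step is what carries a match over to the rest of that ID's rows.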
CodePudding user response:
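Here is an alternative approach using str_detect() from stringr. As with %in%, the row-wise matches need to be wrapped in any() so that the result applies to every row of the ID:
library(dplyr)
library(stringr)
df %>%
  group_by(ID) %>%
  # paste() builds the pattern "red|blue"; any() collapses the row-wise matches per ID
  mutate(condition = if_else(any(str_detect(event, paste(c("red", "blue"), collapse = "|"))), 1, 0))
ID event condition
<chr> <chr> <dbl>
1 a red 1
2 a black 1
3 a blue 1
4 b white 0
5 c orange 1
6 c red 1
7 c gray 1
8 c green 1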
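One caveat when choosing between the two matching styles: a regex pattern such as "red|blue" matches substrings, while %in% compares whole values. The sketch below illustrates the difference with base R's grepl(), which behaves like str_detect() for this simple pattern. The value "darkred" and the variable names are made up for the example and do not appear in the question's data.

```r
# Hypothetical event values; "darkred" is invented to show the difference
events <- c("red", "darkred", "blue", "gray")
targets <- c("red", "blue")

# Exact matching: the whole value must equal one of the targets
exact <- events %in% targets
# TRUE FALSE TRUE FALSE

# Regex matching: "red|blue" also matches inside longer strings
partial <- grepl(paste(targets, collapse = "|"), events)
# TRUE TRUE TRUE FALSE

# Anchoring the pattern with ^ and $ restores whole-value matching
anchored <- grepl(paste0("^(", paste(targets, collapse = "|"), ")$"), events)
# TRUE FALSE TRUE FALSE
```

For the data in the question both styles give the same answer, so the choice only matters if event can contain longer strings that include "red" or "blue".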