I have the following dataset:
ID = c('A','A','B','B','B','C','C','D','D','D')
B = c(1,1,1,1,2,1,2,1,2,1)
Condition1 = c(1,0,1,0,1,1,0,0,1,1)
Condition2 = c(0,1,0,1,0,0,1,1,0,0)
data2 <- data.frame(ID,B,Condition1,Condition2)
   ID B Condition1 Condition2
1   A 1          1          0
2   A 1          0          1
3   B 1          1          0
4   B 1          0          1
5   B 2          1          0
6   C 1          1          0
7   C 2          0          1
8   D 1          0          1
9   D 2          1          0
10  D 1          1          0
I want to get the IDs that meet a condition based on B, Condition1, and Condition2, namely:
B[Condition1 ==1] != B[Condition2 ==1]
The desired output is the subset of rows whose ID satisfies the criterion above. In this case only C satisfies it:
ID B Condition1 Condition2
 C 1          1          0
 C 2          0          1
I tried:
data2 %>% group_by(ID) %>%
  filter((B[Condition1 == 1]) != (B[Condition2 == 1]))
But this only works when no ID has additional rows. For example, with the data below it runs without error (no ID satisfies the criterion):
  ID B Condition1 Condition2
1  A 1          1          0
2  A 1          0          1
3  B 1          1          0
4  B 1          0          1
But if ID 'B' has an additional row,
  ID B Condition1 Condition2
1  A 1          1          0
2  A 1          0          1
3  B 1          1          0
4  B 1          0          1
5  B 2          1          0
it raises an error:
Error in `filter()`:
! Problem while computing `..1 = (B[Condition1 == 1]) != (B[Condition2 == 1])`.
x Input `..1` must be of size 3 or 1, not size 2.
ℹ The error occurred in group 2: ID = "B".
How do I write the condition statement to fix this problem? Thanks!
CodePudding user response:
We may need to wrap the condition in all(), and instead of != use %in% with !, as the two subsets can differ in length:
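library(dplyr)
data2 %>%
  group_by(ID) %>%
  # keep a group only if none of its B values where Condition1 == 1
  # appear among its B values where Condition2 == 1
  filter(all(!B[Condition1 == 1] %in% B[Condition2 == 1])) %>%
  ungroup()
Output: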
# A tibble: 2 × 4
  ID        B Condition1 Condition2
  <chr> <dbl>      <dbl>      <dbl>
1 C         1          1          0
2 C         2          0          1
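To see why the original `!=` comparison fails and to cross-check the result without dplyr, here is a minimal base R sketch. It rebuilds the data from the question; `keep_group()` is a hypothetical helper introduced only for illustration, not part of any package.

```r
# Rebuild the data exactly as given in the question.
ID <- c('A','A','B','B','B','C','C','D','D','D')
B <- c(1,1,1,1,2,1,2,1,2,1)
Condition1 <- c(1,0,1,0,1,1,0,0,1,1)
Condition2 <- c(0,1,0,1,0,0,1,1,0,0)
data2 <- data.frame(ID, B, Condition1, Condition2)

# For ID "B" the two subsets have different lengths, so `!=` returns
# a logical vector of length 2 while the group has 3 rows. filter()
# needs length 1 or the group size, hence the error.
grp_b <- data2[data2$ID == "B", ]
lhs <- grp_b$B[grp_b$Condition1 == 1]   # two values
rhs <- grp_b$B[grp_b$Condition2 == 1]   # one value
length(lhs != rhs)

# Hypothetical helper: collapse the comparison to one TRUE/FALSE per group.
keep_group <- function(d) {
  all(!d$B[d$Condition1 == 1] %in% d$B[d$Condition2 == 1])
}

# Apply it per ID and keep the rows of the IDs that pass.
keep <- sapply(split(data2, data2$ID), keep_group)
result <- data2[data2$ID %in% names(keep)[keep], ]
print(result)
```

One edge case to be aware of: all() on an empty logical vector returns TRUE, so an ID with no Condition1 == 1 rows would be kept by this condition. If that is not wanted, an extra check such as any(Condition1 == 1) can be added to the condition.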