Hi I have a dataset such as below:
df <-
read.table(textConnection("ID Condition1 Condition2
A 1 0
A 1 1
A 0 1
A 1 0
A 0 0
A 0 1
B 1 1
B 0 1
C 1 0
C 0 0
D 1 0
D 0 1
D 1 1
D 0 0
"), header=TRUE)
I am looking to group by ID and filter to keep rows based on below conditions:
- If a value of '1' is in column 'Condition1' then the row is retained
- If a value of '1' is in column 'Condition2' then that row is also kept if it is immediately below a row that has a '1' in Condition1 above it.
Rows may have any number of combinations of 1s and 0s in the two columns. So for the below example, expected output would be:
ID | Condition1 | Condition2 |
---|---|---|
A | 1 | 0 |
A | 1 | 1 |
A | 0 | 1 |
A | 1 | 0 |
B | 1 | 1 |
B | 0 | 1 |
C | 1 | 0 |
D | 1 | 0 |
D | 0 | 1 |
D | 1 | 1 |
CodePudding user response:
You could use filter
and lag
:
library(dplyr)
df |>
filter(Condition1 == 1 | (Condition2 == 1 & lag(Condition1 == 1)))
Output:
ID Condition1 Condition2
1 A 1 0
2 A 1 1
3 A 0 1
4 A 1 0
5 B 1 1
6 B 0 1
7 C 1 0
8 D 1 0
9 D 0 1
10 D 1 1
CodePudding user response:
You can try:
i <- df$Condition1 == 1 #If a value of '1' is in column 'Condition1'
j <- c(FALSE, #First row
df$Condition2[-1] == 1 #If a value of '1' is in column 'Condition2'
& i[-length(i)] #if it is immediately below a row that has a '1' in Condition1 above it
& df$ID[-length(i)] == df$ID[-1]) #group by ID
df[i | j,] #Combine both i and j with or to subset
# ID Condition1 Condition2
#1 A 1 0
#2 A 1 1
#3 A 0 1
#4 A 1 0
#7 B 1 1
#8 B 0 1
#9 C 1 0
#11 D 1 0
#12 D 0 1
#13 D 1 1