I have a data frame that looks something like this:
x | y | z |
---|---|---|
23 | 1 | 1 |
23 | 4 | 2 |
23 | 56 | 1 |
23 | 59 | 2 |
15 | 89 | 1 |
15 | 12 | 1 |
15 | 15 | 2 |
17 | 18 | 1 |
17 | 21 | 2 |
78 | 11 | 1 |
78 | 38 | 1 |
78 | 41 | 2 |
The data follows a pattern in columns y and z. For each value of x, I want to keep only the rows where column z forms a consecutive pair: a 1 followed by a 2 in the next row. Simply put, I need to remove every row that has a 1 in column z when that 1 is not followed by a 2 in the next row.
The final output should look like this:
x | y | z |
---|---|---|
23 | 1 | 1 |
23 | 4 | 2 |
23 | 56 | 1 |
23 | 59 | 2 |
15 | 12 | 1 |
15 | 15 | 2 |
17 | 18 | 1 |
17 | 21 | 2 |
78 | 38 | 1 |
78 | 41 | 2 |
CodePudding user response:
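library(tidyverse)

df <- data.frame(x = c(23,23,23,23,15,15,15,17,17,78,78,78),
                 y = c(1,4,56,59,89,12,15,18,21,11,38,41),
                 z = c(1,2,1,2,1,1,2,1,2,1,1,2))

df %>%
  group_by(x) %>%  # compare rows only within the same x, as the question asks
  # drop every 1 that is not followed by a 2 in the next row
  filter(!(z == 1 & lead(z) != 2))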
CodePudding user response:
You can do this:
library(dplyr)

df %>%
  group_by(x) %>%
  filter((z == 1 & lead(z) == 2) | (z == 2 & lag(z) == 1))
# A tibble: 10 × 3
# Groups:   x [4]
       x     y     z
   <int> <int> <int>
 1    23     1     1
 2    23     4     2
 3    23    56     1
 4    23    59     2
 5    15    12     1
 6    15    15     2
 7    17    18     1
 8    17    21     2
 9    78    38     1
10    78    41     2
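
If you would rather not depend on dplyr, the same pairing rule can be sketched in base R. This is only a sketch, assuming the rows are already in the order shown in the question; `in_pair` is a hypothetical helper name, not a function from any package. It flags each row that is a 1 followed by a 2, or a 2 preceded by a 1, within its own x group:

```r
df <- data.frame(x = c(23, 23, 23, 23, 15, 15, 15, 17, 17, 78, 78, 78),
                 y = c(1, 4, 56, 59, 89, 12, 15, 18, 21, 11, 38, 41),
                 z = c(1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 1, 2))

# Hypothetical helper: TRUE for every element of z that belongs to a
# consecutive (1, 2) pair. z holds the values of a single x group.
in_pair <- function(z) {
  n <- length(z)
  nxt <- c(z[-1], NA)   # value in the next row, NA after the last row
  prv <- c(NA, z[-n])   # value in the previous row, NA before the first row
  keep <- (z == 1 & nxt == 2) | (z == 2 & prv == 1)
  !is.na(keep) & keep   # treat comparisons against NA as "not in a pair"
}

# ave() applies in_pair() per x group and returns the flags in row order.
# It stores them as 0/1 because df$z is numeric, hence as.logical().
keep <- as.logical(ave(df$z, df$x, FUN = in_pair))
result <- df[keep, ]
result
```

On the sample data this keeps 10 of the 12 rows and drops the two unpaired 1s (the rows with y = 89 and y = 11). The result keeps the original row names; `rownames(result) <- NULL` resets them if that matters.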