In this type of data:
df <- data.frame(
  Sequ = c(1, 1, 2, 2, 2, 3, 3, 3),
  G = c("A", "B", "*", "B", "A", "A", "*", "B")
)
I need to remove all rows of a Sequ group if the first G value in that group is "*". I can do it like so, but was wondering if there's a more direct and more elegant way in dplyr:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  mutate(check = ifelse(first(G) == "*", 1, 0)) %>%
  filter(check != 1)
# A tibble: 5 × 3
# Groups: Sequ [2]
Sequ G check
<dbl> <chr> <dbl>
1 1 A 0
2 1 B 0
3 3 A 0
4 3 * 0
5 3 B 0
CodePudding user response:
We can try the following base R code using subset and ave:
subset(
  df,
  !ave(G == "*", Sequ, FUN = function(x) head(x, 1))
)
which gives
Sequ G
1 1 A
2 1 B
6 3 A
7 3 *
8 3 B
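To see why this works, it may help to look at the intermediate vector. ave applies the function within each Sequ group and spreads the single value returned by head(x, 1) over all rows of that group. A minimal sketch, where first_is_star and res are names introduced here only for illustration:

```r
df <- data.frame(
  Sequ = c(1, 1, 2, 2, 2, 3, 3, 3),
  G = c("A", "B", "*", "B", "A", "A", "*", "B")
)

# For each Sequ group, take the first element of (G == "*") and
# repeat it on every row of that group.
first_is_star <- ave(df$G == "*", df$Sequ, FUN = function(x) head(x, 1))

# Keep only the rows whose group does not start with "*".
res <- df[!first_is_star, ]
```

Here first_is_star is TRUE only on the three rows of Sequ 2, so negating it keeps groups 1 and 3.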
CodePudding user response:
Another base R option, using duplicated:
subset(df, !Sequ %in% Sequ[G == "*" & !duplicated(Sequ)])
Sequ G
1 1 A
2 1 B
6 3 A
7 3 *
8 3 B
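Broken into steps, the one-liner can be read roughly as follows. This is just a sketch; is_first, bad and res are illustrative names that are not part of the answer above:

```r
df <- data.frame(
  Sequ = c(1, 1, 2, 2, 2, 3, 3, 3),
  G = c("A", "B", "*", "B", "A", "A", "*", "B")
)

# TRUE on the first row in which each Sequ value appears.
is_first <- !duplicated(df$Sequ)

# Sequ values whose first row has G equal to "*".
bad <- df$Sequ[df$G == "*" & is_first]

# Drop every row that belongs to one of those groups.
res <- df[!df$Sequ %in% bad, ]
```

Note that duplicated looks at the first occurrence of each Sequ value in the whole data frame, so this treats "first" as first in row order.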
CodePudding user response:
Here is a direct dplyr way:
library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(!first(G == "*"))
Sequ G
<dbl> <chr>
1 1 A
2 1 B
3 3 A
4 3 *
5 3 B
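This works because first(G == "*") is a single TRUE or FALSE per group, and filter recycles it over all rows of that group. If you would rather not carry the grouping along afterwards, a possible variant, assuming dplyr 1.1.0 or later (where filter accepts a .by argument), might look like this; res is just an illustrative name:

```r
library(dplyr)

df <- data.frame(
  Sequ = c(1, 1, 2, 2, 2, 3, 3, 3),
  G = c("A", "B", "*", "B", "A", "A", "*", "B")
)

# .by groups only for this one filter() call (assumes dplyr >= 1.1.0),
# so the result comes back ungrouped.
res <- df %>%
  filter(first(G) != "*", .by = Sequ)
```

The result should contain the same five rows as above, but without the grouping attached.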