I have the following data frame called df (dput below):
  group indicator value
1     A     FALSE     2
2     A     FALSE     1
3     A     FALSE     2
4     A      TRUE     4
5     B     FALSE     5
6     B     FALSE     1
7     B      TRUE     3
Within each group, I would like to remove every row with indicator == FALSE except the last one. In df this means rows 1, 2 and 5 should be removed, because they are not the last FALSE row in their group. Here is the desired output:
  group indicator value
1     A     FALSE     2
2     A      TRUE     4
3     B     FALSE     1
4     B      TRUE     3
So I was wondering if anyone knows how to remove the non-last rows that match a certain condition per group in R?
dput of df:
df <- structure(list(group = c("A", "A", "A", "A", "B", "B", "B"),
indicator = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE
), value = c(2, 1, 2, 4, 5, 1, 3)), class = "data.frame", row.names = c(NA,
-7L))
CodePudding user response:
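You can do this with lead(), checking whether the next indicator within the group is TRUE. Rows followed by a FALSE are dropped; the last row of each group, where lead() returns NA, is kept.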
library(tidyverse)
df <- structure(list(group = c("A", "A", "A", "A", "B", "B", "B"),
indicator = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE
), value = c(2, 1, 2, 4, 5, 1, 3)), class = "data.frame", row.names = c(NA,
-7L))
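df |>
  group_by(group) |>
  # flag rows whose next row in the group is FALSE (NA for the last row)
  mutate(slicer = if_else(lead(indicator) == FALSE, 1, 0)) |>
  # the last row of a group has no next row, so it is never flagged
  mutate(slicer = if_else(is.na(slicer), 0, slicer)) |>
  filter(slicer == 0) |>
  select(-slicer)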
#> # A tibble: 4 × 3
#> # Groups:   group [2]
#>   group indicator value
#>   <chr> <lgl>     <dbl>
#> 1 A     FALSE         2
#> 2 A     TRUE          4
#> 3 B     FALSE         1
#> 4 B     TRUE          3
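The lead() approach above works when each group's last FALSE is directly followed by a TRUE or is the group's last row, but it would also drop a TRUE row that happens to be followed by a FALSE. As a minimal sketch of a variant that only ever drops FALSE rows, assuming dplyr is installed, you could count how many FALSE values remain from each row to the end of the group. The helper name keep_last_false is hypothetical and not part of dplyr:

```r
library(dplyr)

df <- data.frame(
  group = c("A", "A", "A", "A", "B", "B", "B"),
  indicator = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  value = c(2, 1, 2, 4, 5, 1, 3)
)

# Hypothetical helper (not a dplyr function): TRUE for the rows to keep.
keep_last_false <- function(indicator) {
  # number of FALSE values from each position to the end of the vector
  false_remaining <- rev(cumsum(rev(!indicator)))
  # keep every TRUE row, plus the FALSE row with no later FALSE
  indicator | false_remaining == 1
}

result <- df |>
  group_by(group) |>
  filter(keep_last_false(indicator)) |>
  ungroup()
```

Because the helper takes a plain logical vector, it can be checked on its own before being used inside filter().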
CodePudding user response:
This may help:
library(dplyr)
df %>%
  group_by(group) %>%
  # keep TRUE rows, plus the second-to-last row of each group if it is FALSE
  filter(indicator | ((row_number() == n() - 1) & !indicator))
# A tibble: 4 × 3
# Groups:   group [2]
  group indicator value
  <chr> <lgl>     <dbl>
1 A     FALSE         2
2 A     TRUE          4
3 B     FALSE         1
4 B     TRUE          3
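Note that the filter above assumes the last FALSE of each group sits in the second-to-last row (n() - 1), which holds for this df. For data where that might not be the case, one option is to look up the position of the last FALSE explicitly. The following is only a sketch in base R; is_last_false is a hypothetical name introduced for illustration:

```r
df <- data.frame(
  group = c("A", "A", "A", "A", "B", "B", "B"),
  indicator = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
  value = c(2, 1, 2, 4, 5, 1, 3)
)

# Hypothetical helper: marks the position of the last FALSE in a logical vector.
is_last_false <- function(indicator) {
  false_pos <- which(!indicator)
  # c(0, ...) guards against groups that contain no FALSE at all
  seq_along(indicator) == max(c(0, false_pos))
}

# ave() applies the helper within each group and returns values in row order
last_false <- ave(df$indicator, df$group, FUN = is_last_false)

# keep all TRUE rows and the last FALSE row of each group
result <- df[df$indicator | last_false, ]
rownames(result) <- NULL
```

Since ave() puts the per-group results back in the original row order, this should also work when the rows of a group are not adjacent.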