I found this: Remove subsequent rows of a group after first occurence of 0 in a column
and this: Remove rows after first occurrence of a certain value
But it's not quite what I am looking for. Here's a reprex:
df <- data.frame(
id = c(1001, 1001, 1002, 1002, 1002, 1005, 1005, 1005, 1005, 1005),
name = c("monkey", "gorilla", "chimp", "monkey", "giraffe", "tarzan", "whale", "princess", "phone", "kindle"),
char = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0))
df
#> id name char
#> 1 1001 monkey 0
#> 2 1001 gorilla 1
#> 3 1002 chimp 0
#> 4 1002 monkey 1
#> 5 1002 giraffe 0
#> 6 1005 tarzan 0
#> 7 1005 whale 0
#> 8 1005 princess 1
#> 9 1005 phone 0
#> 10 1005 kindle 0
df_desired <- data.frame(
id = c(1001, 1002, 1005, 1005),
name = c("monkey", "chimp", "tarzan", "whale"),
char = c(0, 0, 0, 0))
df_desired
#> id name char
#> 1 1001 monkey 0
#> 2 1002 chimp 0
#> 3 1005 tarzan 0
#> 4 1005 whale 0
Created on 2022-08-10 by the reprex package (v2.0.1)
Within each id group, I'm trying to remove the first row where char hits 1 and every row after it, keeping the remaining rows in the order shown above.
CodePudding user response:
Thanks for updating the details in your question @taimishu; if I've understood you correctly, here is a potential solution:
library(tidyverse)
df <- data.frame(
id = c(1001, 1001, 1002, 1002, 1002, 1005, 1005, 1005, 1005, 1005),
name = c("monkey", "gorilla", "chimp", "monkey", "giraffe", "tarzan", "whale", "princess", "phone", "kindle"),
char = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0))
df
#> id name char
#> 1 1001 monkey 0
#> 2 1001 gorilla 1
#> 3 1002 chimp 0
#> 4 1002 monkey 1
#> 5 1002 giraffe 0
#> 6 1005 tarzan 0
#> 7 1005 whale 0
#> 8 1005 princess 1
#> 9 1005 phone 0
#> 10 1005 kindle 0
df_desired <- data.frame(
id = c(1001, 1002, 1005, 1005),
name = c("monkey", "chimp", "tarzan", "whale"),
char = c(0, 0, 0, 0))
df_desired
#> id name char
#> 1 1001 monkey 0
#> 2 1002 chimp 0
#> 3 1005 tarzan 0
#> 4 1005 whale 0
df_filtered <- df %>%
  group_by(id) %>%
  filter(cummax(char) < 1)
df_filtered
#> # A tibble: 4 × 3
#> # Groups: id [3]
#> id name char
#> <dbl> <chr> <dbl>
#> 1 1001 monkey 0
#> 2 1002 chimp 0
#> 3 1005 tarzan 0
#> 4 1005 whale 0
all_equal(df_desired, df_filtered)
#> [1] TRUE
Created on 2022-08-10 by the reprex package (v2.0.1)
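To see why the cummax() filter works, here is a minimal base R sketch. It assumes, as in the reprex, that char only ever holds 0 or 1; the names running_max, keep and result are just illustrative. The running maximum stays at 0 until the first 1 appears and never drops back, so `cummax(char) < 1` is TRUE only for the rows before the first 1. The grouped step uses base R's ave() in place of group_by(), which is a swapped-in technique rather than what the answer above uses.

```r
# Toy vector mirroring the char column for id 1005
char <- c(0, 0, 1, 0, 0)
running_max <- cummax(char)  # 0 0 1 1 1
running_max < 1              # TRUE TRUE FALSE FALSE FALSE

# Same idea per group in base R (assumption: char is only ever 0 or 1)
df <- data.frame(
  id = c(1001, 1001, 1002, 1002, 1002, 1005, 1005, 1005, 1005, 1005),
  name = c("monkey", "gorilla", "chimp", "monkey", "giraffe",
           "tarzan", "whale", "princess", "phone", "kindle"),
  char = c(0, 1, 0, 1, 0, 0, 0, 1, 0, 0))

# ave() applies cummax() within each id and returns a vector aligned with df
keep <- ave(df$char, df$id, FUN = cummax) < 1
result <- df[keep, ]
result  # rows 1, 3, 6 and 7 of the original df
```

If char could take values other than 0 and 1, comparing explicitly (for example `cummax(char == 1) < 1`) would probably be the safer form.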
CodePudding user response:
You could use cumany(), cumall(), or cumsum():
library(dplyr)
df %>%
  group_by(id) %>%
  filter( ... ) %>%
  ungroup()
The ... part can be filled with any of:
!cumany(char == 1)
cumall(char != 1)
!cumsum(char == 1)
All give
# A tibble: 4 × 3
id name char
<dbl> <chr> <dbl>
1 1001 monkey 0
2 1002 chimp 0
3 1005 tarzan 0
4 1005 whale 0
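As a rough illustration of why the three conditions agree, the sketch below evaluates each one on a plain vector shaped like the char column for id 1005. The variable names (char, via_cumany, via_cumall, via_cumsum) are made up for the example; cumany() and cumall() come from dplyr, while cumsum() is base R.

```r
library(dplyr)

# Toy vector mirroring the char column for id 1005
char <- c(0, 0, 1, 0, 0)

# cumany(): TRUE from the first TRUE onwards, so negate it to keep earlier rows
via_cumany <- !cumany(char == 1)

# cumall(): TRUE only until the first FALSE
via_cumall <- cumall(char != 1)

# cumsum(): running count of 1s; !0 is TRUE, any positive count becomes FALSE
via_cumsum <- !cumsum(char == 1)

via_cumany  # TRUE TRUE FALSE FALSE FALSE
identical(via_cumany, via_cumall)  # TRUE
identical(via_cumany, via_cumsum)  # TRUE
```

The three forms may behave differently if char contains NA, so it is worth checking that case on your own data before picking one.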