select all the rows after specific rows in each group in R

Time:12-16

For each user, I want to select the rows starting at the first V (in the action column), including that V row itself.

df<-read.table(text="
user   action
1        D
1        D
1        P
1        E
1        V
1        D
1        D
2        E
2        V
2        V
2        P", header = TRUE, stringsAsFactors = FALSE)

Expected result:
user   action
1       V
1       D
1       D
2       V
2       V
2       P

CodePudding user response:

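Using cumsum() inside a grouped filter() you could do the following. Within each user, cumsum(action == "V") counts the V rows seen so far, so it is at least 1 from the first V onward: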

library(dplyr)

df |> 
  group_by(user) |> 
  filter(cumsum(action == "V") >= 1) |> 
  ungroup()
#> # A tibble: 6 × 2
#>    user action
#>   <int> <chr> 
#> 1     1 V     
#> 2     1 D     
#> 3     1 D     
#> 4     2 V     
#> 5     2 V     
#> 6     2 P
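
To see why the filter condition works, it can help to look at the running count before filtering. The following is only an illustrative sketch: the column name seen_v and the object name steps are made up for this example and are not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = "
user action
1 D
1 D
1 P
1 E
1 V
1 D
1 D
2 E
2 V
2 V
2 P", header = TRUE, stringsAsFactors = FALSE)

# seen_v is a hypothetical helper column: the number of V rows
# encountered so far within each user
steps <- df |>
  group_by(user) |>
  mutate(seen_v = cumsum(action == "V")) |>
  ungroup()

steps$seen_v
#> [1] 0 0 0 0 1 1 1 0 1 2 2
```

Rows where seen_v is 0 come before the first V of their user and are the ones that filter() drops.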

Thanks to a comment by @r2evans, this can be simplified with dplyr's cumany():

df |> 
  group_by(user) |> 
  filter(cumany(action == "V")) |> 
  ungroup()
#> # A tibble: 6 × 2
#>    user action
#>   <int> <chr> 
#> 1     1 V     
#> 2     1 D     
#> 3     1 D     
#> 4     2 V     
#> 5     2 V     
#> 6     2 P
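
If adding dplyr is not an option, the same idea can probably be expressed in base R with ave(), which applies a function within groups. This is a swapped-in alternative rather than the approach from the answer above, and the names keep and res_base are hypothetical.

```r
# Same example data as in the question
df <- read.table(text = "
user action
1 D
1 D
1 P
1 E
1 V
1 D
1 D
2 E
2 V
2 V
2 P", header = TRUE, stringsAsFactors = FALSE)

# Running count of V rows within each user, computed by ave();
# a row is kept once that count has reached at least 1
keep <- ave(as.integer(df$action == "V"), df$user, FUN = cumsum) >= 1

# res_base is a hypothetical name for the filtered data frame
res_base <- df[keep, ]
res_base
```

Note that with either approach a user with no V at all has every row dropped, since the running count never reaches 1.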