Selecting rows after a specific value occurs in R

Time:06-24

I want to select all the rows of my data frame between the values "openwebsite" and "closewebsite" in the variable "activity", including those two rows themselves. Do I need to use the select() or the filter() function?

Thanks a lot!

Data frame:

Person activity duration
1 write 9
1 openwebsite 8
1 paint 9
1 write 2
1 write 4
1 closewebsite 9
1 write 4

Desired output:

Person activity duration
1 openwebsite 8
1 paint 9
1 write 2
1 write 4
1 closewebsite 9

CodePudding user response:

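# Row numbers of the first "openwebsite" and the first "closewebsite"
start_row <- which(df$activity == "openwebsite")[1]
end_row <- which(df$activity == "closewebsite")[1]
# Keep everything from the start row through the end row
df[start_row:end_row, ]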

  Person     activity duration
2      1  openwebsite        8
3      1        paint        9
4      1        write        2
5      1        write        4
6      1 closewebsite        9

You can also get the start and end row numbers with grep, e.g.

grep("openwebsite", df$activity)
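To make this reproducible, here is a minimal sketch that rebuilds the example data by hand and uses grep for both boundaries. The data.frame() call is an assumption, since the question only shows the printed table. Because grep treats its first argument as a regular expression, the sketch anchors the pattern with ^ and $ so that a longer value containing "openwebsite" would not match; it also assumes each marker occurs exactly once.

```r
# Rebuild the example data from the question (assumed construction)
df <- data.frame(
  Person   = 1,
  activity = c("write", "openwebsite", "paint", "write",
               "write", "closewebsite", "write"),
  duration = c(9, 8, 9, 2, 4, 9, 4)
)

# Anchored patterns so only the exact values match
start_row <- grep("^openwebsite$", df$activity)
end_row   <- grep("^closewebsite$", df$activity)

# Keep the rows from the start marker through the end marker, inclusive
result <- df[start_row:end_row, ]
result
```

If a marker could appear more than once, you would need to decide which occurrence to use (for example the first one) before building the row range.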

CodePudding user response:

You may try

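library(dplyr)
df %>%
  # Running count: +1 at "openwebsite", -1 on the row after "closewebsite";
  # rows inside the open/close window have a count of exactly 1
  filter(1 == cumsum((activity == "openwebsite") -
                       lag(activity == "closewebsite", default = FALSE)))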

  Person     activity duration
1      1  openwebsite        8
2      1        paint        9
3      1        write        2
4      1        write        4
5      1 closewebsite        9

or

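df %>%
  # Keep rows once "openwebsite" has been seen, and only while no
  # "closewebsite" has occurred on an earlier row
  filter(1 <= cumsum(activity == "openwebsite"),
         lag(cumsum(activity == "closewebsite"), default = 0) < 1)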