Index all rows between two specific character values in a data frame in R

Time:08-17

I have created a data frame z.

a = c(1,1,1)
b = c("start",2,2)
c = c(3,3,3)
d = c("end",4,4)

z = data.frame(rbind(a,b,c,d))
z

How can I index to get the row numbers between "start" and "end" if I don't know how many rows are going to be between "start" and "end"? In this instance there is only one row between "start" and "end" but for the next dataset there may be 50 rows between "start" and "end". Is there a way to use "start" and "end" to extract all the rows between them?

CodePudding user response:

If you want the row numbers strictly between "start" and "end", excluding the marker rows themselves, you could do

seq(which(z$X1 == 'start') + 1, which(z$X1 == 'end') - 1)
#> [1] 3

Or, to get the rows themselves

z[seq(which(z$X1 == 'start') + 1, which(z$X1 == 'end') - 1),]
#>   X1 X2 X3
#> c  3  3  3
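
One edge case worth guarding against: if "start" and "end" sit on adjacent rows, seq(start + 1, end - 1) counts downward (for example 3, 2) instead of returning nothing. The sketch below wraps the same idea in a small helper; rows_between is a hypothetical name, not part of base R or of the answer above, and it assumes you want the first "start" and the first "end" that follows it.

```r
# Hypothetical helper (an assumption, not from the original answer):
# row numbers strictly between the first `start` marker and the
# first `end` marker that comes after it.
rows_between <- function(x, start = "start", end = "end") {
  s <- which(x == start)[1]      # position of the first start marker
  e <- which(x == end)
  e <- e[e > s][1]               # first end marker after the start
  if (is.na(s) || is.na(e)) stop("start/end markers not found in order")
  # seq_len() returns an empty vector when nothing lies in between,
  # whereas seq(s + 1, e - 1) would count down in that case
  s + seq_len(e - s - 1)
}

# Same data as in the question
z <- data.frame(rbind(a = c(1, 1, 1),
                      b = c("start", 2, 2),
                      c = c(3, 3, 3),
                      d = c("end", 4, 4)))

idx <- rows_between(z$X1)   # row numbers between the markers
z[idx, ]                    # the rows themselves
```

Because the helper returns an empty index when the markers are adjacent, z[idx, ] then yields a zero-row data frame rather than the marker rows in reverse order.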

CodePudding user response:

  • We can use slice() from dplyr
library(dplyr)

z |> slice((which(X1 == "start") + 1):(which(X1 == "end") - 1))
  • Or we can use grepl, which matches "start" and "end" as patterns rather than as exact values
z |> slice((which(grepl("start", X1)) + 1):
           (which(grepl("end", X1)) - 1))