Get specific text row and then row after that of an R dataframe

I am looking for a way to get all rows of an (ordered) R dataframe that contain a specific text phrase and then always the row afterward.

E.g.:

      A   
    1 TEXT
    2 ABC
    3 CCC
    4 TEXT
    5 AAA
    6 GGG

In this example, I would want all the rows with A=="TEXT" and then also the value of the next row, no matter what text is there (so in this case one time "ABC" and one time "AAA"). Is there a way to do this in R?

Thanks!

CodePudding user response:

With dplyr you could keep a row if either its own value or its lag() value (the value in the previous row) is "TEXT":

library(dplyr)
df <- data.frame(id = 1:6, A = c("TEXT", "ABC", "CCC", "TEXT", "AAA", "GGG"))

# dplyr:
df %>% filter(A == "TEXT" | lag(A) == "TEXT")
#>   id    A
#> 1  1 TEXT
#> 2  2  ABC
#> 3  4 TEXT
#> 4  5  AAA
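
The filter above uses `==`, so it only matches cells that are exactly "TEXT". If the phrase may appear inside a longer string, as "contain a specific text phrase" in the question suggests, a variant along these lines might help. This is only a sketch: `df_sub`, `hit`, and `res_sub` are made-up names for illustration, and the extra rows are invented test data, not part of the original example.

```r
library(dplyr)

# Hypothetical data: row 4 only *contains* the phrase, row 7 is a match in the last row
df_sub <- data.frame(
  id = 1:7,
  A  = c("TEXT", "ABC", "CCC", "some TEXT here", "AAA", "GGG", "TEXT")
)

res_sub <- df_sub %>%
  # TRUE where the phrase occurs anywhere in the cell (literal match, no regex)
  mutate(hit = grepl("TEXT", A, fixed = TRUE)) %>%
  # keep matching rows and rows whose previous row matched;
  # default = FALSE replaces the NA that lag() produces for the first row
  filter(hit | lag(hit, default = FALSE)) %>%
  select(-hit)

res_sub
```

With this data, rows 1, 2, 4, 5 and 7 are kept; the match in the last row simply has no following row to bring along.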

With base R you could add 1 to the vector of TEXT locations:

# fixed = TRUE matches "TEXT" literally (no regex); note it also matches substrings
text_idx <- grep("TEXT", df$A, fixed = TRUE)

# "TEXT" locations:
text_idx
#> [1] 1 4
# "TEXT" rows + the rows right after them, unsorted:
c(text_idx, text_idx + 1)
#> [1] 1 4 2 5

df[sort(c(text_idx, text_idx + 1)),]
#>   id    A
#> 1  1 TEXT
#> 2  2  ABC
#> 4  4 TEXT
#> 5  5  AAA

Created on 2023-02-01 with reprex v2.0.2
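
One caveat with the index approach: if two "TEXT" rows are adjacent, the second one is selected twice, and if "TEXT" is in the last row, the index one past the end yields a row of NAs. A possible way around both, sketched here in base R with a shifted logical vector instead of row indices (`df_edge`, `hit`, `after_hit`, and `res_edge` are hypothetical names, and the data is invented to trigger both cases):

```r
# Hypothetical data: adjacent matches in rows 1-2 and a match in the last row
df_edge <- data.frame(
  id = 1:6,
  A  = c("TEXT", "TEXT", "CCC", "AAA", "GGG", "TEXT")
)

# TRUE where the cell contains the phrase
hit <- grepl("TEXT", df_edge$A, fixed = TRUE)

# Shift the flags down by one position: TRUE where the previous row matched
after_hit <- c(FALSE, head(hit, -1))

# Logical subsetting keeps each row at most once and never indexes past the end
res_edge <- df_edge[hit | after_hit, ]
res_edge
```

Here rows 1, 2, 3 and 6 are returned, each exactly once, and no NA row is appended for the match in the final row.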