Home > Software engineering >  How to filter certain rows based on the preceding row's value using dplyr?
How to filter certain rows based on the preceding row's value using dplyr?

Time:11-30

I have a dataframe like this:

df1 <- data.frame(
  Group = c('A', 'B', 'C', 'D', 'Total: CD', 'E', 'F', 'G', 'Total: FG', 'H'),
  Value1 = c(12, 88, 10, 90, 100, 8, 4, 11, 15, 77)
)

I would like to filter out any rows that come after a row containing the word Total with the result looking like this:

df_desired <- data.frame(
  Group = c('A', 'B', 'C', 'D', 'Total: CD', 'F', 'G', 'Total: FG'),
  Value1 = c(12, 88, 10, 90, 100, 4, 11, 15)
)

How would I achieve this in R, ideally using the dplyr package?

CodePudding user response:

Using lag in dplyr -

library(dplyr)

df1 %>% filter(!lag(grepl('Total', Group), default = FALSE))

#      Group Value1
#1         A     12
#2         B     88
#3         C     10
#4         D     90
#5 Total: CD    100
#6         F      4
#7         G     11
#8 Total: FG     15

CodePudding user response:

Maybe something like this:

library(tidyverse)

df1[-c(which(str_detect(df1$Group, 'Total'))   1), ]
#>       Group Value1
#> 1         A     12
#> 2         B     88
#> 3         C     10
#> 4         D     90
#> 5 Total: CD    100
#> 7         F      4
#> 8         G     11
#> 9 Total: FG     15

Created on 2021-11-30 by the reprex package (v2.0.1)

  • Related