How to remove pairs of rows corresponding to same value in R dataframe?-CodePudding

For the unique pair of ID, if both corresponding rows are 0, I need to remove them. In this case, remove row #5 and #6 but not row #7 and #8.

tmt.pair <- c("A","A","A","A","B","B","B","B")
tmt <- c("1000 C","4000 C","1000 C","4000 C","1000 C","4000 C","1000 C","4000 C")
year <- c("2021","2021","2021","2021","2021","2021","2020","2020")
month <- c("A","A","A","A","J","J","O","O")
level <- c("Low","Low","Up","Up","Low","Low","Low","Low")
site <- c(1,1,2,2,1,1,1,1)
val <- c(100,2,10,9,0,0,1,0)

df <- data.frame(tmt.pair, year,month, level,tmt,val)

df$ID <- cumsum(!duplicated(df[1:4]))

   tmt.pair year month level    tmt val ID
1        A 2021     A   Low 1000 C 100  1
2        A 2021     A   Low 4000 C   2  1
3        A 2021     A    Up 1000 C  10  2
4        A 2021     A    Up 1000 C  10  2
5        B 2021     J   Low 1000 C   0  3
6        B 2021     J   Low 4000 C   0  3
7        B 2020     O   Low 1000 C   1  4
8        B 2020     O   Low 4000 C   0  4

CodePudding user response：

You can use the following base R option:

df[df$ID %in% df$ID[df$val!=0], ]

Output:

  tmt.pair year month level    tmt val ID
1        A 2021     A   Low 1000 C 100  1
2        A 2021     A   Low 4000 C   2  1
3        A 2021     A    Up 1000 C  10  2
4        A 2021     A    Up 4000 C   9  2
7        B 2020     O   Low 1000 C   1  4
8        B 2020     O   Low 4000 C   0  4

CodePudding user response：

With dplyr, we can first group_by the ID column, then use filter to check if all val are "0".

library(dplyr)

df %>% group_by(ID) %>% filter(!all(val == 0)) %>% ungroup()

# A tibble: 6 × 7
  tmt.pair year  month level tmt      val    ID
  <chr>    <chr> <chr> <chr> <chr>  <dbl> <int>
1 A        2021  A     Low   1000 C   100     1
2 A        2021  A     Low   4000 C     2     1
3 A        2021  A     Up    1000 C    10     2
4 A        2021  A     Up    4000 C     9     2
5 B        2020  O     Low   1000 C     1     4
6 B        2020  O     Low   4000 C     0     4

CodePudding user response：

df[as.logical(with(df, ave(val, ID, FUN = \(x) !all(x == 0)))), ]

  tmt.pair year month level    tmt val ID
1        A 2021     A   Low 1000 C 100  1
2        A 2021     A   Low 4000 C   2  1
3        A 2021     A    Up 1000 C  10  2
4        A 2021     A    Up 4000 C   9  2
7        B 2020     O   Low 1000 C   1  4
8        B 2020     O   Low 4000 C   0  4