Remove rows of a certain value, before values change in R

Time:06-24

I have a data frame like the following:

dat <- data.frame(
  Target = c(rep("01", times = 8), rep("02", times = 5), rep("03", times = 4)),
  targ2clicks = c(1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1)
)

    Target targ2clicks
1      01           1
2      01           1
3      01           1
4      01           1
5      01           0
6      01           0
7      01           0
8      01           1
9      02           1
10     02           0
11     02           0
12     02           0
13     02           1
14     03           0
15     03           0
16     03           0
17     03           1

For each Target whose first targ2clicks value is 1, I want to remove every row that comes before that Target's first 0. Where the first value for a Target is already 0, I want to keep all of its rows.

What I want to end up with is:

   Target targ2clicks
     01           0
     01           0
     01           0
     01           1
     02           0
     02           0
     02           0
     02           1
     03           0
     03           0
     03           0
     03           1

If every value for a Target is 1, with no 0s (not the case in the example df, but worth considering in any solution), all rows for that Target should be removed.

I have tried coding this in various different ways with no success! Any help hugely appreciated.

CodePudding user response:

You could use ave() with cumsum(), which keeps each row once at least one 0 has been seen within its Target:

dat[with(dat, ave(targ2clicks == 0, Target, FUN = cumsum)) > 0, ]

#    Target targ2clicks
# 5      01           0
# 6      01           0
# 7      01           0
# 8      01           1
# 10     02           0
# 11     02           0
# 12     02           0
# 13     02           1
# 14     03           0
# 15     03           0
# 16     03           0
# 17     03           1
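To see why the one-liner works, it can be unpacked into steps. This is only a sketch of the same idea; the intermediate names is_zero, zeros_so_far and keep are made up for illustration and do not appear in the answer above.

```r
dat <- data.frame(
  Target = c(rep("01", times = 8), rep("02", times = 5), rep("03", times = 4)),
  targ2clicks = c(1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1)
)

# TRUE wherever the value is 0
is_zero <- dat$targ2clicks == 0

# Running count of zeros seen so far, restarted for each Target
zeros_so_far <- ave(as.integer(is_zero), dat$Target, FUN = cumsum)

# A row is kept once at least one zero has been seen in its group
keep <- zeros_so_far > 0
result <- dat[keep, ]
```

For Target 01 the running count is 0 for the first four rows and only becomes positive at row 5, so rows 1 to 4 are dropped and everything from the first 0 onward is kept.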

Its dplyr equivalent is

library(dplyr)

dat %>%
  group_by(Target) %>%
  filter(cumany(targ2clicks == 0)) %>%
  ungroup()
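As for the all-1s case raised in the question, a quick base R check suggests the cumulative approach already drops such a group, because its running count of zeros never rises above 0. The small data frame dat2 and the Target "04" are made up for this check and are not part of the original data.

```r
# Hypothetical data: Target "03" starts with 0, Target "04" contains only 1s
dat2 <- data.frame(
  Target = c("03", "03", "04", "04", "04"),
  targ2clicks = c(0, 1, 1, 1, 1)
)

# Same expression as in the answer, applied to dat2
res2 <- dat2[with(dat2, ave(targ2clicks == 0, Target, FUN = cumsum)) > 0, ]

# Only the two rows for Target "03" should remain
print(res2)
```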

CodePudding user response:

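We could use slice() with match(). Since match() returns NA for a Target with no 0s, that case is guarded so the group yields no rows instead of an error:

library(dplyr)
dat %>%
  group_by(Target) %>%
  # match() gives the position of the first 0 within each group
  slice(if (any(targ2clicks == 0)) match(0, targ2clicks):n() else integer(0)) %>%
  ungroup()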

Output:

# A tibble: 12 × 2
   Target targ2clicks
   <chr>        <dbl>
 1 01               0
 2 01               0
 3 01               0
 4 01               1
 5 02               0
 6 02               0
 7 02               0
 8 02               1
 9 03               0
10 03               0
11 03               0
12 03               1
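The same first-zero idea can also be sketched without dplyr. The helper from_first_zero and the data frame dat3 (with an all-1s Target "04") are hypothetical names used only for illustration. The point is that match(0, x) returns NA when x has no 0, so that case needs to be handled explicitly before building a row range.

```r
# Keep rows from the first 0 onward; return zero rows if the group has no 0
from_first_zero <- function(d) {
  first <- match(0, d$targ2clicks)  # NA when there is no 0 in this group
  if (is.na(first)) {
    d[0, , drop = FALSE]
  } else {
    d[first:nrow(d), , drop = FALSE]
  }
}

# Hypothetical data: Target "04" contains only 1s
dat3 <- data.frame(
  Target = c("01", "01", "01", "04", "04"),
  targ2clicks = c(1, 0, 1, 1, 1)
)

# Split by Target, trim each piece, and stack the pieces back together
res3 <- do.call(rbind, lapply(split(dat3, dat3$Target), from_first_zero))
```

Here res3 should contain only the last two rows of Target "01", and Target "04" should disappear entirely.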