Replace certain values across multiple columns, before column values change, with NA in R

I have a dataframe like this:

dat <- data.frame(
  Target = c(rep("01", times = 8), rep("02", times = 5), rep("03", times = 4)),
  targ2clicks    = c(1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1),
  targ2midclicks = c(1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1),
  targ2rClicks   = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1)
)

    Target targ2clicks targ2midclicks targ2rClicks
1      01           1              1            0
2      01           1              1            0
3      01           1              1            0
4      01           1              1            1
5      01           0              0            0
6      01           0              0            0
7      01           0              0            0
8      01           1              1            1
9      02           1              1            0
10     02           0              0            0
11     02           0              0            0
12     02           0              0            0
13     02           1              1            1
14     03           0              1            1
15     03           0              0            1
16     03           0              0            1
17     03           1              1            1

I want to write some code that detects whether the first row for each target in the remaining 3 columns is a 1, and if so, replaces all occurrences of '1' before the first 0 with 'NA'. However, I want it to do this individually for all 3 columns, i.e. using OR logic instead of AND. Here is what I would like to end up with:

    Target targ2clicks targ2midclicks targ2rClicks
1      01          NA             NA            0
2      01          NA             NA            0
3      01          NA             NA            0
4      01          NA             NA            1
5      01           0              0            0
6      01           0              0            0
7      01           0              0            0
8      01           1              1            1
9      02          NA             NA            0
10     02           0              0            0
11     02           0              0            0
12     02           0              0            0
13     02           1              1            1
14     03           0             NA           NA
15     03           0              0           NA
16     03           0              0           NA
17     03           1              1           NA

This post is related to, and expands upon, a previous one: "Remove rows of a certain value, before values change in R". There is also a similar post that I am struggling to adapt to my own needs: "replace values with NA across multiple columns if a condition is met in R".

Here is what I have attempted so far, but it is not quite right:

library(naniar)
dat1 <- dat %>%
  group_by(Target) %>%
  replace_with_na_if(.predicate = c(dat$targ2clicks[1]==1 | dat$targ2midclicks[1]==1 | dat$targ2rClicks[1]==1),
                   condition = cumany(~.x) == 0)

This gives me an error: Error in probe(.x, .p) : length(.p) == length(.x) is not TRUE

Any help much appreciated!

Answer:

You can try this approach using cumall. Within each Target group, it converts the non-zero values at the start of each column to NA, up to the first zero. For the data above, this gives exactly the desired result. However, if your actual needs differ, or you want to check explicitly that the first row for a given Target equals 1, you can use first(.) == 1 & cumall(. != 0) as the condition instead.
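To see what cumall does before it is applied per group, here is a minimal sketch on a single vector. The vector x and the names lead_run and masked are made up for illustration and are not taken from the question.

```r
library(dplyr)

# One column of a single group: leading 1s, then a 0, then a later 1
# (hypothetical values, for illustration only).
x <- c(1, 1, 0, 0, 1)

# cumall() stays TRUE only while every value so far has met the condition,
# so it marks the run of non-zero values before the first 0.
lead_run <- cumall(x != 0)
lead_run
# TRUE TRUE FALSE FALSE FALSE

# replace() sets exactly those leading positions to NA and leaves the rest.
masked <- replace(x, lead_run, NA)
masked
# NA NA 0 0 1
```

The solution applies the same idea to every targ2 column within each Target group: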

library(tidyverse)

dat %>%
  group_by(Target) %>%
  mutate(across(.cols = starts_with("targ2"), 
                ~replace(., cumall(. != 0), NA)))

Output

   Target targ2clicks targ2midclicks targ2rClicks
   <chr>        <dbl>          <dbl>        <dbl>
 1 01              NA             NA            0
 2 01              NA             NA            0
 3 01              NA             NA            0
 4 01              NA             NA            1
 5 01               0              0            0
 6 01               0              0            0
 7 01               0              0            0
 8 01               1              1            1
 9 02              NA             NA            0
10 02               0              0            0
11 02               0              0            0
12 02               0              0            0
13 02               1              1            1
14 03               0             NA           NA
15 03               0              0           NA
16 03               0              0           NA
17 03               1              1           NA
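With 0/1 data such as dat, the stricter condition mentioned above, first(.) == 1 & cumall(. != 0), returns the same result as cumall(. != 0) alone. The two only differ when a group can start with a non-zero value other than 1. The sketch below illustrates this on a small made-up data frame (toy and strict are hypothetical names, and the value 2 does not occur in the question's data):

```r
library(dplyr)

# Hypothetical data: group "a" starts with 1, group "b" starts with 2.
toy <- data.frame(Target = c("a", "a", "a", "b", "b", "b"),
                  targ2clicks = c(1, 1, 0, 2, 0, 1))

strict <- toy %>%
  group_by(Target) %>%
  # Only blank the leading non-zero run when the group's first value is 1.
  mutate(across(starts_with("targ2"),
                ~ replace(.x, first(.x) == 1 & cumall(.x != 0), NA))) %>%
  ungroup()

strict$targ2clicks
# NA NA 0 2 0 1
```

Group "b" is left untouched here because its first value is 2. With cumall(.x != 0) alone, its leading 2 would also become NA.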