Fill Column with 1 Until Value Found, Then Repeat Fill with 0 Until Value Found Again

Time:12-07

I have the following data frame:

miniDF1 <- data.frame(Pred = c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B"))

    Pred
1     A
2     A
3     B
4     A
5     B
6     B
7     C
8     A
9     B
10    C
11    A
12    A
13    A
14    A
15    B
16    A
17    C
18    B

I am trying to create a new column that is filled with 1s up to and including the first "C" in Pred, then with 0s up to and including the next "C", alternating like this until the end of the data frame. I have tried the following:

library(dplyr)    # mutate(), %>%
library(stringr)  # str_detect()
library(tidyr)    # fill()
miniDF1 <- miniDF1 %>%
  mutate(Outcome = ifelse(str_detect(Pred, "C"), 1, 0)) %>%
  fill(Outcome, .direction = 'up')

     Pred Outcome
1     A       0
2     A       0
3     B       0
4     A       0
5     B       0
6     B       0
7     C       1
8     A       0
9     B       0
10    C       1
11    A       0
12    A       0
13    A       0
14    A       0
15    B       0
16    A       0
17    C       1
18    B       0

But this only puts 1s in the rows where Pred is "C".

This is the expected result:

miniDF2 <- data.frame(Pred = c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B"),
                     Outcome = c(1,1,1,1,1,1,1,0,0,0,1,1,1,1,1,1,1,0))

     Pred Outcome
1     A       1
2     A       1
3     B       1
4     A       1
5     B       1
6     B       1
7     C       1
8     A       0
9     B       0
10    C       0
11    A       1
12    A       1
13    A       1
14    A       1
15    B       1
16    A       1
17    C       1
18    B       0

I can't figure out how to get the values to flip accordingly. I thought that was what the fill(Outcome, .direction = 'up') part of my code would do.

CodePudding user response:

It looks cumbersome, but it does what is needed:

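i1 <- which(miniDF1$Pred == 'C')
# Run lengths: rows up to the first "C", gaps between "C"s, rows after the last "C"
v2 <- c(i1[1], diff(i1), nrow(miniDF1) - max(i1))
# rep_len() recycles 1, 0 to one value per run, for an odd or even number of runs
dd <- data.frame(v1 = rep_len(c(1, 0), length(v2)), v2 = v2)

rep(dd$v1, dd$v2)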
#[1] 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0

You could also wrap it in a function:

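fun1 <- function(x, val){
  i1 <- which(x == val)
  # No match at all: the whole column stays 1
  if (length(i1) == 0) return(rep(1, length(x)))
  # length(x) keeps the function independent of any global data frame
  v2 <- c(i1[1], diff(i1), length(x) - max(i1))
  # One fill value per run, alternating 1, 0, 1, ...
  dd <- data.frame(v1 = rep_len(c(1, 0), length(v2)), v2 = v2)
  return(rep(dd$v1, dd$v2))
}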

miniDF1$outcome <- fun1(miniDF1$Pred, 'C')

#   Pred outcome
#1     A       1
#2     A       1
#3     B       1
#4     A       1
#5     B       1
#6     B       1
#7     C       1
#8     A       0
#9     B       0
#10    C       0
#11    A       1
#12    A       1
#13    A       1
#14    A       1
#15    B       1
#16    A       1
#17    C       1
#18    B       0
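
The idea behind this answer is to work out how long each run is and then repeat 1 or 0 that many times. Below is a minimal, self-contained sketch of that idea. The helper name run_fill is hypothetical (it is not from the answer), and the short input with an even number of "C"s is made up to show that the alternation still lines up.

```r
# run_fill is a hypothetical name used only for this illustration
run_fill <- function(x, val) {
  i1 <- which(x == val)
  # No match at all: nothing ever flips, so everything stays 1
  if (length(i1) == 0) return(rep(1, length(x)))
  # Run lengths: up to the first match, between matches, after the last match
  lens <- c(i1[1], diff(i1), length(x) - i1[length(i1)])
  # Alternate 1, 0, 1, ... with exactly one value per run
  rep(rep_len(c(1, 0), length(lens)), lens)
}

x <- c("A", "C", "B", "B", "C", "A")
run_fill(x, "C")
# [1] 1 1 0 0 0 1
```

Here the run lengths are 2, 3 and 1, so the fill values 1, 0, 1 are repeated 2, 3 and 1 times.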

CodePudding user response:

miniDF1$Outcome <- cumsum(c(1, head(miniDF1$Pred == 'C', -1))) %% 2

miniDF1
   Pred Outcome
1     A       1
2     A       1
3     B       1
4     A       1
5     B       1
6     B       1
7     C       1
8     A       0
9     B       0
10    C       0
11    A       1
12    A       1
13    A       1
14    A       1
15    B       1
16    A       1
17    C       1
18    B       0

In the tidyverse:

library(dplyr)
miniDF1 %>%
  mutate(Outcome = cumsum(lag(Pred == 'C', default = TRUE)) %% 2)
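
Both versions above rely on the same idea: count how many "C"s have been seen before the current row, and take that count modulo 2. The sketch below splits the base R one-liner into steps. The intermediate names is_c, shifted and block are introduced here only for illustration.

```r
Pred <- c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B")

# TRUE on the rows that hold "C"
is_c <- Pred == "C"

# Shift one row down so the flip takes effect on the row AFTER each "C";
# the leading 1 makes the first block odd
shifted <- c(1, head(is_c, -1))

# Block number: 1 up to and including the first "C", 2 up to the second, ...
block <- cumsum(shifted)

# Odd blocks map to 1, even blocks map to 0
Outcome <- block %% 2

data.frame(Pred, is_c, shifted, block, Outcome)
```

Because each "C" increments the counter individually, this approach should also handle consecutive "C"s or a "C" in the first row.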

CodePudding user response:

You could try data.table::rleid, like this:

miniDF1$Outcome <- ifelse(data.table::rleid(miniDF1$Pred == "C") %% 4 %in% c(1,2), 1, 0)
miniDF1

   Pred Outcome
1     A       1
2     A       1
3     B       1
4     A       1
5     B       1
6     B       1
7     C       1
8     A       0
9     B       0
10    C       0
11    A       1
12    A       1
13    A       1
14    A       1
15    B       1
16    A       1
17    C       1
18    B       0

Explanation

For easier comparison, let's add the run ids as a column: miniDF1$x <- data.table::rleid(miniDF1$Pred == "C").

   Pred Outcome x
1     A       1 1
2     A       1 1
3     B       1 1
4     A       1 1
5     B       1 1
6     B       1 1
7     C       1 2
8     A       0 3
9     B       0 3
10    C       0 4
11    A       1 5
12    A       1 5
13    A       1 5
14    A       1 5
15    B       1 5
16    A       1 5
17    C       1 6
18    B       0 7

You can see that the rleid value changes on each "C" row and again on the row after it. The outcome, however, only needs to switch on the row after each "C".

That means x values 1 and 2 need a 1, 3 and 4 need a 0, 5 and 6 need a 1, and so on. Taking x modulo 4 gives the repeating remainders 1, 2, 3, 0, so remainders 1 and 2 map to 1, and remainders 3 and 0 map to 0.
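
The mapping can be checked step by step. This is a minimal sketch that uses a base R stand-in for data.table::rleid (a cumulative count of value changes), so it runs without data.table. The stand-in is an assumption made for this illustration, not part of the answer.

```r
Pred <- c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B")
is_c <- Pred == "C"

# Base R stand-in for data.table::rleid(): a new id starts whenever the value changes
x <- cumsum(c(TRUE, is_c[-1] != is_c[-length(is_c)]))

# Remainders cycle through 1, 2, 3, 0; remainders 1 and 2 become 1, 3 and 0 become 0
Outcome <- as.integer(x %% 4 %in% c(1, 2))

data.frame(Pred, x, x_mod_4 = x %% 4, Outcome)
```

Note that this mapping relies on every "C" being an isolated row and on the data not starting with "C". Two consecutive "C"s form a single run, so the ids would no longer line up with the flips.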
