I have the following data frame:
miniDF1 <- data.frame(Pred = c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B"))
Pred
1 A
2 A
3 B
4 A
5 B
6 B
7 C
8 A
9 B
10 C
11 A
12 A
13 A
14 A
15 B
16 A
17 C
18 B
I am trying to make a new column filled with 1's until "C" is found in Pred, and then fill the new column with 0's until the next "C" is found, and repeat as such until the end of the DF. I have tried the following:
miniDF1 <- miniDF1 %>%
mutate(Outcome = ifelse(str_detect(Pred, "C"), 1, 0)) %>%
fill(Outcome, .direction = 'up')
Pred Outcome
1 A 0
2 A 0
3 B 0
4 A 0
5 B 0
6 B 0
7 C 1
8 A 0
9 B 0
10 C 1
11 A 0
12 A 0
13 A 0
14 A 0
15 B 0
16 A 0
17 C 1
18 B 0
but this only puts 1's in the rows where a "C" is located.
This is what the expected output looks like:
miniDF2 <- data.frame(Pred = c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B"),
Outcome = c(1,1,1,1,1,1,1,0,0,0,1,1,1,1,1,1,1,0))
Pred Outcome
1 A 1
2 A 1
3 B 1
4 A 1
5 B 1
6 B 1
7 C 1
8 A 0
9 B 0
10 C 0
11 A 1
12 A 1
13 A 1
14 A 1
15 B 1
16 A 1
17 C 1
18 B 0
I can't figure out how to get the values to flip accordingly, but I thought that was what the fill(Outcome, .direction = 'up') part of my code was intended to do.
CodePudding user response:
It looks cumbersome, but it does what is needed:
i1 <- which(miniDF1$Pred == 'C')
dd <- data.frame(v1 = c(1, 0),v2 = c(i1[1], abs(diff(i1)), nrow(miniDF1)-max(i1)))
rep(dd$v1, dd$v2)
#[1] 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0
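You could also wrap it in a function:
fun1 <- function(x, val){
  # positions of the marker value in x
  i1 <- which(x == val)
  # run lengths: up to the first match, between matches, after the last match
  # (length(x) is used so the function does not depend on the global miniDF1)
  lens <- c(i1[1], diff(i1), length(x) - max(i1))
  # alternate 1, 0, 1, ... with one value per run
  return(rep(rep_len(c(1, 0), length(lens)), lens))
}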
miniDF1$outcome <- fun1(miniDF1$Pred, 'C')
# Pred outcome
#1 A 1
#2 A 1
#3 B 1
#4 A 1
#5 B 1
#6 B 1
#7 C 1
#8 A 0
#9 B 0
#10 C 0
#11 A 1
#12 A 1
#13 A 1
#14 A 1
#15 B 1
#16 A 1
#17 C 1
#18 B 0
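To see why the run-length idea works, here is a minimal sketch of the intermediate values on the same data. The names `pred`, `idx`, `lens`, `vals`, and `outcome` are illustrative and not part of the answer above, and the sketch assumes at least one "C" is present.

```r
pred <- c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B")

# Positions of "C" in the vector
idx <- which(pred == "C")

# Run lengths: rows up to and including the first "C", gaps between
# consecutive "C"s, and the rows left after the last "C"
lens <- c(idx[1], diff(idx), length(pred) - max(idx))

# One value per run, alternating 1, 0, 1, ...
vals <- rep_len(c(1, 0), length(lens))

# Expand each value by its run length
outcome <- rep(vals, lens)
```

Each run ends on a "C" row, which is why the "C" row itself still carries the old value and the flip shows up on the following row.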
CodePudding user response:
miniDF1$Outcome <- cumsum(c(1, head(miniDF1$Pred == 'C', -1))) %% 2
miniDF1
Pred Outcome
1 A 1
2 A 1
3 B 1
4 A 1
5 B 1
6 B 1
7 C 1
8 A 0
9 B 0
10 C 0
11 A 1
12 A 1
13 A 1
14 A 1
15 B 1
16 A 1
17 C 1
18 B 0
In the tidyverse:
library(dplyr)
miniDF1 %>%
mutate(Outcome = cumsum(lag(Pred == 'C', default = TRUE)) %% 2)
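As a rough illustration of how the one-liner works, the steps can be split apart in base R. The intermediate names `is_c`, `shifted`, and `counter` are hypothetical and only used here for explanation.

```r
pred <- c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B")

# TRUE on the rows that contain "C"
is_c <- pred == "C"

# Shift the flags down by one row so the flip happens on the row after
# each "C"; the leading TRUE makes the counter start at 1
shifted <- c(TRUE, head(is_c, -1))

# Running count that increases by one on every row following a "C"
counter <- cumsum(shifted)

# Odd counts become 1, even counts become 0
outcome <- counter %% 2
```

The `lag(Pred == 'C', default = TRUE)` call in the dplyr version plays the same role as `shifted` here.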
CodePudding user response:
You may try using rleid, like this:
miniDF1$Outcome <- ifelse(data.table::rleid(miniDF1$Pred == "C") %% 4 %in% c(1,2), 1, 0)
miniDF1
Pred Outcome
1 A 1
2 A 1
3 B 1
4 A 1
5 B 1
6 B 1
7 C 1
8 A 0
9 B 0
10 C 0
11 A 1
12 A 1
13 A 1
14 A 1
15 B 1
16 A 1
17 C 1
18 B 0
Explanation
For easier comparison, let's add miniDF1$x <- data.table::rleid(miniDF1$Pred == "C").
Pred Outcome x
1 A 1 1
2 A 1 1
3 B 1 1
4 A 1 1
5 B 1 1
6 B 1 1
7 C 1 2
8 A 0 3
9 B 0 3
10 C 0 4
11 A 1 5
12 A 1 5
13 A 1 5
14 A 1 5
15 B 1 5
16 A 1 5
17 C 1 6
18 B 0 7
You can see that rleid's result changes on the row where C appears and again on the next row. You also need to switch between 1 and 0 after each C appears.
That means Outcome should be 1 when x is 1 or 2, 0 when x is 3 or 4, 1 again when x is 5 or 6, and so on. So I take x modulo 4: the remainders 1 and 2 map to 1, and the remainders 3 and 0 map to 0.
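The same mapping can be sketched without data.table by building the run ids from base R's `rle`. This is only an illustration under the assumption that "C" rows are never adjacent (adjacent "C"s would fall into a single run and throw the pattern off); the names `runs`, `x`, and `remainder` are hypothetical.

```r
pred <- c("A","A","B","A","B","B","C","A","B","C","A","A","A","A","B","A","C","B")

# Run-length encoding of the TRUE/FALSE flags
runs <- rle(pred == "C")

# Number the runs 1, 2, 3, ... and repeat each id over its run
x <- rep(seq_along(runs$lengths), runs$lengths)

# Remainders cycle through 1, 2, 3, 0 as the run id increases
remainder <- x %% 4

# Remainders 1 and 2 map to 1; remainders 3 and 0 map to 0
outcome <- as.integer(remainder %in% c(1, 2))
```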