I'm essentially after a "next" statement which I can use within a dplyr ifelse statement, although other R alternatives are also welcome.
Here's the code so far:
df1 <- data %>%
  arrange(Var1, Var2, Var3, Var4, Var5) %>%
  group_by(Var1) %>%
  distinct(Var1, Var2, Var3, Var4, Var5) %>%
  mutate(Var6 = ifelse(Var4 == "COMPLETE", row_number(), row_number() + 1))
The output is (relevant columns only):
| Var4 | Var6 |
| ------------ | -------------|
| COMPLETE | 1 |
| **INCOMPLETE** | **3** |
| COMPLETE | 3 |
| COMPLETE | 4 |
| COMPLETE | 5 |
| **INCOMPLETE** | **7** |
| COMPLETE | 7 |
| COMPLETE | 8 |
| COMPLETE | 9 |
The intended output is:
| Var4 | Var6 |
| ------------ | -------------|
| COMPLETE | 1 |
| **INCOMPLETE** | **2** |
| COMPLETE | 2 |
| COMPLETE | 3 |
| COMPLETE | 4 |
| **INCOMPLETE** | **5** |
| COMPLETE | 5 |
| COMPLETE | 6 |
| COMPLETE | 7 |
In summary, my goal is that when Var4 == INCOMPLETE I am able to ignore that row and continue with row_number().
CodePudding user response:
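Here is one way: number only the COMPLETE rows, then fill the remaining NA values from the neighbouring rows.
library(data.table)
library(dplyr)
library(tidyr)
# assign Var6 by reference, only on the COMPLETE rows
setDT(df1)[Var4 == "COMPLETE", Var6 := .I]
df1 %>%
  fill(Var6, .direction = "updown")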
Output:
Var4 Var6
1: COMPLETE 1
2: INCOMPLETE 2
3: COMPLETE 2
4: COMPLETE 3
5: COMPLETE 4
6: INCOMPLETE 5
7: COMPLETE 5
8: COMPLETE 6
9: COMPLETE 7
Or with tidyverse:
# note: Var6 is built from Var4, so it comes out as a character column
df1 %>%
  mutate(Var6 = na_if(replace(Var4, Var4 == "COMPLETE",
                              seq_len(sum(Var4 == "COMPLETE"))), "INCOMPLETE")) %>%
  fill(Var6, .direction = "updown")
Var4 Var6
1 COMPLETE 1
2 INCOMPLETE 2
3 COMPLETE 2
4 COMPLETE 3
5 COMPLETE 4
6 INCOMPLETE 5
7 COMPLETE 5
8 COMPLETE 6
9 COMPLETE 7
Data:
df1 <- structure(list(Var4 = c("COMPLETE", "INCOMPLETE", "COMPLETE",
"COMPLETE", "COMPLETE", "INCOMPLETE", "COMPLETE", "COMPLETE",
"COMPLETE")), class = "data.frame", row.names = c(NA, -9L))
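To see why the fill step is needed, it may help to look at the two steps separately. The following is only a minimal sketch, assuming dplyr and tidyr are installed; it rebuilds the same sample data, and `step1` and `result` are illustrative names that do not appear in the question.

```r
library(dplyr)
library(tidyr)

# same sample data as above
df1 <- data.frame(Var4 = c("COMPLETE", "INCOMPLETE", "COMPLETE",
                           "COMPLETE", "COMPLETE", "INCOMPLETE",
                           "COMPLETE", "COMPLETE", "COMPLETE"))

# Step 1: number only the COMPLETE rows, leave NA on the INCOMPLETE ones
step1 <- df1 %>%
  mutate(Var6 = if_else(Var4 == "COMPLETE",
                        cumsum(Var4 == "COMPLETE"),
                        NA_integer_))

# Step 2: each NA takes the value of the next non-missing row
result <- step1 %>%
  fill(Var6, .direction = "up")

print(result)
```

With `.direction = "up"` a trailing INCOMPLETE row would stay NA, which is presumably why the answer uses `"updown"`: it fills upwards first and then downwards for anything left over.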
CodePudding user response:
We can use cumsum with replace or 'case_when':
df1 %>% mutate(Var6 = cumsum(Var4 == "COMPLETE") %>%
    replace(., Var4 == "INCOMPLETE", .[Var4 == "INCOMPLETE"] + 1L))
# OR
df1 %>% mutate(Var6 = cumsum(Var4 == "COMPLETE"),
    Var6 = case_when(Var4 == "INCOMPLETE" ~ Var6 + 1L, TRUE ~ Var6))
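The cumsum idea can also be written in base R without any packages. This is a minimal sketch rather than part of the answer above; `is_complete` is an illustrative name, and the data is the sample from the first answer.

```r
Var4 <- c("COMPLETE", "INCOMPLETE", "COMPLETE",
          "COMPLETE", "COMPLETE", "INCOMPLETE",
          "COMPLETE", "COMPLETE", "COMPLETE")

is_complete <- Var4 == "COMPLETE"

# cumsum() counts the COMPLETE rows seen so far;
# adding 1 on INCOMPLETE rows gives them the number of the next COMPLETE row
Var6 <- cumsum(is_complete) + as.integer(!is_complete)

print(data.frame(Var4, Var6))
```

Because an INCOMPLETE row does not increase the running count, the numbering simply continues on the next COMPLETE row, which is the "skip this row" behaviour the question asks for. In the grouped pipeline from the question, the same expression inside `mutate()` after `group_by(Var1)` should restart the count within each group.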