Essentially, I need to alter every row that occurs after a certain condition has been met, and the operation also needs to respect a grouping variable. A simplified version of my data has the grouping variable (Groups), followed by a value (N) and then the conditional variable (R). You can create it as follows:
Groups <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C")
N <- c(1,1,1,1,1,1,1,1,1,1)
R <- c("N", "N", "Y", "N", "N", "N", "Y", "N", "N", "N")
Dat <- as.data.frame(cbind(Groups, N, R))
What I need is that, when R == "Y", that row and every later row in the same group has 1 added to N. So the result should look like this:
Groups N R
1 A 1 N
2 A 1 N
3 A 2 Y
4 A 2 N
5 B 1 N
6 B 1 N
7 B 2 Y
8 B 2 N
9 C 1 N
10 C 1 N
So the loop needs to restart with each new group. A dplyr solution would be ideal, but I have not been able to find one yet.
Any help or guidance would be much appreciated!
CodePudding user response:
Do a grouped cumsum on a logical vector (R == "Y") and add the result to 'N':
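As a small sketch of why this works, consider a standalone vector (the names `r` and `flag` are only for illustration and are not part of the question's data). Comparing against "Y" gives a logical vector, and `cumsum` treats TRUE as 1, so the running total is 0 before the first "Y" and 1 from that row onward:

```r
# Illustrative vector for a single group
r <- c("N", "N", "Y", "N")

# Logical vector: TRUE only where the condition is met
flag <- r == "Y"

# Running total of TRUE values: 0 before the first "Y", 1 from there on
running <- cumsum(flag)
print(running)
```

Doing this inside `group_by()` restarts the running total for each group.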
library(dplyr)
Dat %>%
group_by(Groups) %>%
mutate(N = cumsum(R == "Y") + N) %>%
ungroup()
-output
# A tibble: 10 × 3
Groups N R
<chr> <dbl> <chr>
1 A 1 N
2 A 1 N
3 A 2 Y
4 A 2 N
5 B 1 N
6 B 1 N
7 B 2 Y
8 B 2 N
9 C 1 N
10 C 1 N
data
Dat <- data.frame(Groups, N, R)
# NOTE: `cbind` first builds a `matrix`, and a matrix can hold only a single class, so the numeric `N` is coerced to character. Build the data with `data.frame` directly instead.
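One caveat, offered as a hedged sketch rather than part of the original answer: `cumsum` adds 1 for every "Y" in a group, so a group with two "Y" rows would end up with 2 added. If the intent is to add exactly 1 once the condition has been met, dplyr's `cumany()` can be used instead. The data frame `Dat2` and the column names `N_cumsum` and `N_capped` below are made up for illustration:

```r
library(dplyr)

# Hypothetical data: group A contains two "Y" rows
Dat2 <- data.frame(
  Groups = c("A", "A", "A", "A", "B", "B"),
  N      = c(1, 1, 1, 1, 1, 1),
  R      = c("N", "Y", "N", "Y", "N", "N")
)

res <- Dat2 %>%
  group_by(Groups) %>%
  mutate(
    # Adds 1 for each "Y" seen so far in the group
    N_cumsum = N + cumsum(R == "Y"),
    # Adds 1 at most once: TRUE from the first "Y" onward
    N_capped = N + cumany(R == "Y")
  ) %>%
  ungroup()

print(res)
```

With a single "Y" per group, as in the question's data, both columns give the same result.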