I am trying to achieve a result in R as follows: I have a dataset, like below
and I'm trying to achieve result as follows:
Conditions are:
- Ignore value of first row of every group in column 3, based on value of next row determine how many FALSE are in sequence. Input the number in the previous row.
- If the next value is TRUE input 0 and move to next value of the row and repeat the process.
- The last row of each group should have 0
Constraints: Do not want to use nested for loops. Is there a better way to achieve the result in R.
Thank you in advance for the suggestions/solutions.
CodePudding user response:
It's much easier to help if your data is reproducible. The data in the images doesn't have column names, but the following code reproduces the data frame and names the columns col1
, col2
and col3
:
df <- data.frame(col1 = rep(1:17, 2),
col2 = rep(c("A", "B"), each = 17),
col3 = rep(rep(c(FALSE, TRUE), 4),
times = c(2, 10, 3, 4, 5, 3, 3, 4)))
To do what you need, we can group the data according to each run of TRUE or FALSE in column 3 using rleid
from data.table
, then insert a rev
erse integer seq
uence for the FALSE segments. Finally, ungroup
, then shift the new column up one position using lead
.
library(tidyverse)
df <- df %>%
group_by(grp = data.table::rleid(col3)) %>%
mutate(col4 = rev(seq(n())) * (1 - col3)) %>%
group_by(col2) %>%
mutate(col4 = lead(col4, default = 0)) %>%
select(-grp)
The result matches your expected output:
print(df, n = 34)
#> # A tibble: 34 x 4
#> col1 col2 col3 col4
#> <int> <chr> <lgl> <dbl>
#> 1 1 A FALSE 1
#> 2 2 A FALSE 0
#> 3 3 A TRUE 0
#> 4 4 A TRUE 0
#> 5 5 A TRUE 0
#> 6 6 A TRUE 0
#> 7 7 A TRUE 0
#> 8 8 A TRUE 0
#> 9 9 A TRUE 0
#> 10 10 A TRUE 0
#> 11 11 A TRUE 0
#> 12 12 A TRUE 3
#> 13 13 A FALSE 2
#> 14 14 A FALSE 1
#> 15 15 A FALSE 0
#> 16 16 A TRUE 0
#> 17 17 A TRUE 0
#> 18 1 B TRUE 0
#> 19 2 B TRUE 5
#> 20 3 B FALSE 4
#> 21 4 B FALSE 3
#> 22 5 B FALSE 2
#> 23 6 B FALSE 1
#> 24 7 B FALSE 0
#> 25 8 B TRUE 0
#> 26 9 B TRUE 0
#> 27 10 B TRUE 3
#> 28 11 B FALSE 2
#> 29 12 B FALSE 1
#> 30 13 B FALSE 0
#> 31 14 B TRUE 0
#> 32 15 B TRUE 0
#> 33 16 B TRUE 0
#> 34 17 B TRUE 0
Created on 2022-09-07 with reprex v2.0.2