increment column if FALSE, stop incrementing if TRUE (tidyverse, dplyr, R)

Time:04-25

I have a table that looks like this:

data <- structure(list(group = c(0L, 0L, 1L, 2L), id = c("1", "2", "3", 
"4"), m = c("ac1", "ac1", "ac1", "me0"), together = c(FALSE, 
FALSE, TRUE, TRUE)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-4L))

I would like to make a column group that increments when together == FALSE and stays at the current number when together == TRUE.

This is my desired output:

# A tibble: 4 x 4
  id    m     together group
  <chr> <chr> <lgl>    <dbl>
1 1     ac1   FALSE        0
2 2     ac1   FALSE        1
3 3     ac1   TRUE         2
4 4     me0   TRUE         2

I've already tried solutions like "R increment by 1 for every change in value column and restart the counter", but they don't give me exactly what I want:

# A tibble: 4 x 4
# Groups:   group [3]
  id    m     together group
  <chr> <chr> <lgl>    <int>
1 1     ac1   FALSE        0
2 2     ac1   FALSE        0
3 3     ac1   TRUE         1
4 4     me0   TRUE         2

As you can see, I want group to read 0, 1, 2, 2 in this case. Any ideas? Thank you so much.

CodePudding user response:

Here's a tidy solution:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
data <- structure(list(id = c("1", "2", "3", "4"), 
                       m = c("ac1", "ac1", "ac1", "me0"), 
                       together = c(FALSE, FALSE, TRUE, TRUE)), 
                  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L))
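# Flag rows whose previous row has together == FALSE, then take a running total.
# Inside mutate() the column can be referenced directly, without data$.
data <- data %>% mutate(
  group = cumsum(c(0, na.omit(lag(!together)))))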
data
#> # A tibble: 4 × 4
#>   id    m     together group
#>   <chr> <chr> <lgl>    <dbl>
#> 1 1     ac1   FALSE        0
#> 2 2     ac1   FALSE        1
#> 3 3     ac1   TRUE         2
#> 4 4     me0   TRUE         2

Created on 2022-04-24 by the reprex package (v2.0.1)
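
To see why this works, it may help to trace the intermediate vectors. The following is a minimal sketch, not the answer's exact code: it assumes the `default` argument of `dplyr::lag()` can stand in for the `c(0, na.omit(...))` padding, and the name `prev_was_false` is introduced here purely for illustration.

```r
library(dplyr)

together <- c(FALSE, FALSE, TRUE, TRUE)

# TRUE wherever the previous row had together == FALSE. The first row has no
# previous row, so default = FALSE keeps the counter at 0 there.
prev_was_false <- lag(!together, default = FALSE)
prev_was_false
#> [1] FALSE  TRUE  TRUE FALSE

# A running total of those flags gives the group number.
group <- cumsum(prev_was_false)
group
#> [1] 0 1 2 2
```

Note that this variant returns an integer vector, whereas the version above returns a double because of the leading `0`.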

CodePudding user response:

Another way to solve this:

data %>%
  mutate(grp2 = c(0, cumsum(!na.omit(together * lag(together)))))
    # A tibble: 4 x 5
      group id    m     together  grp2
      <int> <chr> <chr> <lgl>    <dbl>
    1     0 1     ac1   FALSE        0
    2     0 2     ac1   FALSE        1
    3     1 3     ac1   TRUE         2
    4     2 4     me0   TRUE         2

Using the data given in the comments and running the same code:

data1 <- data.frame(together=c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE))
data1 %>%
  mutate(grp2 = c(0, cumsum(!na.omit(together * lag(together)))))
  together grp2
1    FALSE    0
2    FALSE    1
3     TRUE    2
4     TRUE    2
5    FALSE    3
6     TRUE    4
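
The two answers agree on the four-row example but are not equivalent in general. As a hedged comparison, the sketch below runs both expressions on the same `data1`; the column names `grp_first` and `grp_second` and the object `compared` are made up here for illustration. Which result is the intended one depends on how "increment" should behave when a FALSE follows a TRUE, which the question does not spell out.

```r
library(dplyr)

data1 <- data.frame(together = c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE))

compared <- data1 %>%
  mutate(
    # First answer: increment when the previous row is FALSE.
    grp_first = cumsum(c(0, na.omit(lag(!together)))),
    # Second answer: increment unless this row and the previous row are both TRUE.
    grp_second = c(0, cumsum(!na.omit(together * lag(together))))
  )

compared$grp_first
#> [1] 0 1 2 2 2 3
compared$grp_second
#> [1] 0 1 2 2 3 4
```

The results diverge at row 5: the previous row is TRUE, so the first expression holds the counter, while the second increments because the current row is FALSE.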