Home > Net >  How to create a sequency by group from a specific string in R?
How to create a sequency by group from a specific string in R?

Time:03-31

I would like to create a sequence of numbers within a group but starting from a specific string.

In this example, If the string matches UNP then sequence(seq coloumn) should start from the next row.

ColA Colb Seq
A HM 0
A RES 0
A UNP 0
A RES 1
A RES 2
A HM 3
B HM 0
B RES 0
B UNP 0
B RES 1
B UNP 2
C UNP 0

Only 1st instance of UNP should be considered not every instance on UNP for each group

CodePudding user response:

You can first create a column specifying the first occurrence of "UNP", then use cumsum() and lag() to calculate the Seq column.

library(dplyr)

df <- read.table(header = T, text = "
ColA    Colb    Seq
A   HM  0
A   RES 0
A   UNP 0
A   RES 1
A   RES 2
A   HM  3
B   HM  0
B   RES 0
B   UNP 0
B   RES 1
B   UNP 2
C   UNP 0") %>% 
  select(-Seq)

df %>% 
  group_by(ColA, Colb) %>% 
  mutate(seq_count = ifelse(first(Colb) == "UNP" & !duplicated(Colb), 1, 0)) %>% 
  group_by(ColA) %>% 
  mutate(Seq = lag(cumsum(cumsum(seq_count)), default = 0), .keep = "unused")
#> # A tibble: 12 × 3
#> # Groups:   ColA [3]
#>    ColA  Colb    Seq
#>    <chr> <chr> <dbl>
#>  1 A     HM        0
#>  2 A     RES       0
#>  3 A     UNP       0
#>  4 A     RES       1
#>  5 A     RES       2
#>  6 A     HM        3
#>  7 B     HM        0
#>  8 B     RES       0
#>  9 B     UNP       0
#> 10 B     RES       1
#> 11 B     UNP       2
#> 12 C     UNP       0

Created on 2022-03-31 by the reprex package (v2.0.1)

  • Related