In R, how can I label the first instance of a repeated value run within a column while grouping by a

Take the following data frame for example:

Group<-c("AGroup", "AGroup", "AGroup", "AGroup", "BGroup", "BGroup", "BGroup", "BGroup", "CGroup", "CGroup", "CGroup", "CGroup")
Status<-c("Low", "Low", "High", "High", "High", "Low", "High", "Low", "Low", "Low", "High", "High")

df<-data.frame(Group, Status)

df$FirstHighRun<-c(0,0,1,1,1,0,0,0,0,0,1,1)

This creates the following with "FirstHighRun" being the column I'm trying to create:

Group   Status  FirstHighRun
AGroup  Low     0 
AGroup  Low     0
AGroup  High    1
AGroup  High    1
BGroup  High    1
BGroup  Low     0
BGroup  High    0
BGroup  Low     0
CGroup  Low     0
CGroup  Low     0
CGroup  High    1
CGroup  High    1

As you can see, I'm trying to label the first run of "High" in the Status column within each Group: the first "High" entry and any "High" entries that directly follow it.

In "BGroup", there are two "High" entries. However, since the second one does not directly follow the first, it is not labeled with a 1.

CodePudding user response:

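Try rle (run-length encoding) within each group: encode Status == "High" as runs, set every TRUE run after the first one to FALSE, and expand the runs back with inverse.rle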

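library(dplyr)

df %>%
  group_by(Group) %>%
  # rle() encodes runs of Status == "High"; every TRUE run after the first
  # is set to FALSE, and inverse.rle() expands the runs back to one value per row
  mutate(FirstHighRun2 = as.integer(inverse.rle(within.list(
    rle(Status == "High"), { values[which(values)[-1]] <- FALSE })))) %>%
  ungroup()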

-output

# A tibble: 12 × 4
   Group  Status FirstHighRun FirstHighRun2
   <chr>  <chr>         <dbl>         <int>
 1 AGroup Low               0             0
 2 AGroup Low               0             0
 3 AGroup High              1             1
 4 AGroup High              1             1
 5 BGroup High              1             1
 6 BGroup Low               0             0
 7 BGroup High              0             0
 8 BGroup Low               0             0
 9 CGroup Low               0             0
10 CGroup Low               0             0
11 CGroup High              1             1
12 CGroup High              1             1
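
To see what the rle step does on its own, here is a minimal base R sketch of the same idea. The helper name first_run_flag and its target argument are made up for illustration and are not part of the answer above; the per-group step uses split/unsplit instead of dplyr, which is just one possible way to do it.

```r
# Hypothetical helper (the name is illustrative, not from the answer):
# flag only the first run of `target` in a vector.
first_run_flag <- function(x, target = "High") {
  r <- rle(x == target)            # runs of TRUE/FALSE
  later <- which(r$values)[-1]     # indices of TRUE runs after the first one
  r$values[later] <- FALSE         # switch those later runs off
  as.integer(inverse.rle(r))       # expand back to one 0/1 flag per element
}

# Same data as in the question
Group  <- rep(c("AGroup", "BGroup", "CGroup"), each = 4)
Status <- c("Low", "Low", "High", "High", "High", "Low", "High", "Low",
            "Low", "Low", "High", "High")

# The BGroup rows on their own: only the first "High" run is kept
first_run_flag(c("High", "Low", "High", "Low"))   # 1 0 0 0

# Apply per group without dplyr: split by Group, flag, put back in row order
flags <- unsplit(lapply(split(Status, Group), first_run_flag), Group)
flags
```

For the data in the question, flags should match the FirstHighRun column. A group with no "High" at all simply gets all zeros, because which() then returns an empty index and nothing is changed.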