Assign values to a column depending on condition in another column in R

Time:07-31

I have a dataframe with columns seq (sequence) and num, e.g.:

seq  num
1    0.1  
2    0.1
3    0.2
1    0
2    0
3    0
1    0.5
2    2
3    6
4    9
5    12
1    0
2    0
3    0

I need to create a new binary column state that is 1 for every row of any sequence containing a value with num > 7.5, and 0 otherwise.

So state=1 has to start at the closest seq=1 prior to the num > 7.5 value and cover the whole sequence:

seq  num  state
1    0.1  0
2    0.1  0
3    0.2  0
1    0    0
2    0    0
3    0    0
1    0.5  1
2    2    1
3    6    1
4    9    1
5    12   1
1    0    0
2    0    0
3    0    0

This seems like it should be simple, but I've been stuck on it for a few days.

To state the obvious: if I just use a conditional that flags rows with num over 7.5, I don't get state=1 for the full sequence:

df$state <- 0                      # start with every row unflagged
for (i in seq_along(df$num)) {
    if (df$num[i] > 7.5) {         # flags only the rows that exceed 7.5
      df$state[i] <- 1
    }
}

seq  num  state
1    0.1  0
2    0.1  0
3    0.2  0
1    0    0
2    0    0
3    0    0
1    0.5  0
2    2    0
3    6    0
4    9    1
5    12   1
1    0    0
2    0    0
3    0    0

Thank you!

Answer:

We can define a grouping variable as the cumulative count of 1s in the seq column, so that each seq == 1 starts a new group, and then set state to 1 for a whole group if any num in it exceeds 7.5:

library(dplyr)
df %>%
  group_by(grp = cumsum(seq == 1)) %>%
  mutate(state = as.integer(any(num > 7.5))) %>%
  ungroup()
# # A tibble: 14 × 4
#      seq   num   grp state
#    <int> <dbl> <int> <int>
#  1     1   0.1     1     0
#  2     2   0.1     1     0
#  3     3   0.2     1     0
#  4     1   0       2     0
#  5     2   0       2     0
#  6     3   0       2     0
#  7     1   0.5     3     1
#  8     2   2       3     1
#  9     3   6       3     1
# 10     4   9       3     1
# 11     5  12       3     1
# 12     1   0       4     0
# 13     2   0       4     0
# 14     3   0       4     0
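
If adding dplyr is not an option, the same grouping idea can probably be written in base R. The following is only a sketch: it rebuilds the example data from the question and swaps any() for a per-group maximum computed with ave(), which gives the same result here because a group contains a value above 7.5 exactly when its maximum is above 7.5. The helper variable grp is introduced only for illustration.

```r
# Example data from the question
df <- data.frame(
  seq = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2, 3),
  num = c(0.1, 0.1, 0.2, 0, 0, 0, 0.5, 2, 6, 9, 12, 0, 0, 0)
)

# A new group starts at every row where seq == 1
grp <- cumsum(df$seq == 1)

# ave() computes the maximum of num within each group and returns it
# aligned with the original rows; comparing with 7.5 flags whole groups
df$state <- as.integer(ave(df$num, grp, FUN = max) > 7.5)

print(df)
```

This assumes, as in the question, that every sequence begins with seq == 1 and that the rows are already in order.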