Label all following rows once value is greater than xy

Time:09-23

I have a dataframe in the following format:

time parameter TimeDelta
1 123 -
2 456 1
4 122 2
7 344 3
8 344 1

How can I build an additional label column whose value increases by one each time TimeDelta is greater than a threshold, e.g. 1.5? Each label should also apply to the following rows until TimeDelta is greater than 1.5 again:

time parameter TimeDelta Label
1 123 - 1
2 456 1 1
4 122 2 2
7 344 3 3
8 344 1 3

I do not want to loop over every row, because that is extremely slow. Maybe it is possible to use cumsum() to flag all the following rows up to the next value above the threshold?

CodePudding user response:

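You can use part of the solution from the previous answer: convert TimeDelta to numeric, compare it with the threshold, take the cumulative sum, add 1, and assign the result to a new column: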

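import pandas as pd

# '-' is coerced to NaN, which never compares greater than 1.5; cumsum counts the crossings so far
df['Label'] = pd.to_numeric(df['TimeDelta'], errors='coerce').gt(1.5).cumsum().add(1)
print(df)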
   time  parameter TimeDelta  Label
0     1        123         -      1
1     2        456         1      1
2     4        122         2      2
3     7        344         3      3
4     8        344         1      3
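
To see why the one-liner works, here is a minimal sketch that splits it into separate steps. It assumes the sample data from the question, with '-' stored as a string in TimeDelta; the names threshold, delta and above are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question; '-' marks the missing first delta.
df = pd.DataFrame({
    "time": [1, 2, 4, 7, 8],
    "parameter": [123, 456, 122, 344, 344],
    "TimeDelta": ["-", 1, 2, 3, 1],
})

threshold = 1.5  # illustrative name for the cutoff used in the question

# Step 1: coerce to numbers; the '-' placeholder becomes NaN.
delta = pd.to_numeric(df["TimeDelta"], errors="coerce")

# Step 2: True where the delta exceeds the threshold (NaN compares as False).
above = delta.gt(threshold)

# Step 3: running count of True values, shifted so the labels start at 1.
df["Label"] = above.cumsum().add(1)

print(df)
```

Because each True in the boolean mask adds one to the running sum, every row keeps the label of the most recent crossing until the next value above the threshold appears, so no explicit loop over the rows is needed.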