How to make derivative variables according to the sequence of data in pandas

Time:02-25

I have a dataframe:

df =

No.  Scenario  Exe Seq  Action
1    A         1        a
2    A         2        b
3    A         3        c
4    A         1        a
5    A         2        b
6    A         1        a

These rows all belong to the same scenario, but some runs reach step three while others stop at step two or one. I want to tell these runs apart.

The "Scenario" column may also contain values other than "A".

So I want to get:

No.  Scenario  Exe Seq  Action  New_Scenario
1    A         1        a       A_1
2    A         2        b       A_1
3    A         3        c       A_1
4    A         1        a       A_2
5    A         2        b       A_2
6    A         1        a       A_3

CodePudding user response:

IIUC use:

# a new sequence starts wherever the difference from the previous row is not 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].diff().ne(1).cumsum().astype(str)
print(df)
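
To see why this works, here is a minimal sketch that rebuilds the sample frame from the question and keeps each intermediate step. The names `step`, `is_start` and `run_id` are introduced here for illustration only and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "No.": [1, 2, 3, 4, 5, 6],
    "Scenario": ["A"] * 6,
    "Exe Seq": [1, 2, 3, 1, 2, 1],
    "Action": ["a", "b", "c", "a", "b", "a"],
})

# Difference from the previous row; the first row has no predecessor, so it is NaN
step = df["Exe Seq"].diff()

# True wherever the step is not exactly +1 (NaN != 1 is also True),
# which marks the first row of every run
is_start = step.ne(1)

# Cumulative sum of the boolean flags gives a running run number
run_id = is_start.cumsum()

df["New_Scenario"] = df["Scenario"] + "_" + run_id.astype(str)
print(df)
```

Because the first `diff()` value is NaN, and NaN is not equal to 1, the first row is always flagged as the start of a run.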

Or:

# a new sequence starts wherever Exe Seq equals 1
df['New_Scenario'] = df['Scenario'] + '_' + df['Exe Seq'].eq(1).cumsum().astype(str)

Or maybe:

# a new sequence starts wherever the difference from the previous row is less than or equal to 0
df['New_Scenario'] = (df['Scenario'] + '_' +
                      df['Exe Seq'].diff().fillna(-1).le(0).cumsum().astype(str))
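
Note that all three variants count runs across the whole frame, so with several scenarios the counter keeps increasing from one scenario to the next. If the numbering should instead restart for each "Scenario" value (an assumption, since the question does not say), a grouped cumulative sum is one possible sketch. The two-scenario data below and the column names `Global` and `Per_Scenario` are hypothetical and serve only to show the difference.

```python
import pandas as pd

# Hypothetical data with two scenarios (not from the original question)
df2 = pd.DataFrame({
    "Scenario": ["A", "A", "A", "B", "B", "B"],
    "Exe Seq": [1, 2, 1, 1, 2, 1],
})

# Flag the first row of every run: here a run starts wherever Exe Seq equals 1
is_start = df2["Exe Seq"].eq(1).astype(int)

# Global counter, as in the answer above: numbering continues across scenarios
df2["Global"] = df2["Scenario"] + "_" + is_start.cumsum().astype(str)

# Per-scenario counter: the cumulative sum restarts within each Scenario group
per_scenario_id = is_start.groupby(df2["Scenario"]).cumsum()
df2["Per_Scenario"] = df2["Scenario"] + "_" + per_scenario_id.astype(str)

print(df2)
```

Both forms give unique labels; the grouped version only changes whether scenario "B" starts again at `B_1`.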