Home > database >  create a derivative variable that distinguishes scenarios that vary according to Sequence (Exe Seq)
create a derivative variable that distinguishes scenarios that vary according to Sequence (Exe Seq)

Time:02-25

I have a dataframe:

df =

No. Scenario Exe Seq Action
1 A 1 a
2 A 2 b
3 A 3 c
4 A 1 a
5 A 2 b
6 A 1 a
7 A 1 a
8 A 2 b
9 B 1 z
10 B 1 z
11 B 2 c
12 B 3 z
13 B 1 z
14 B 1 z
15 B 2 c
16 B 3 z
17 B 4 f
18 B 1 z
19 B 2 c

Some are same scenarios, but some reach three, but some stop at two or one. I want to distinguish these.

The "Scenario" values may have values other than "A"

start sequence always by 1

So I will get:

No. Scenario Exe Seq Action New_Scenario
1 A 1 a A_1
2 A 2 b A_1
3 A 3 c A_1
4 A 1 a A_2
5 A 2 b A_2
6 A 1 a A_3
7 A 1 a A_2
8 A 2 b A_2
9 B 1 z B_1
10 B 1 z B_2
11 B 2 c B_2
12 B 3 z B_2
13 B 1 z B_1
14 B 1 z B_3
15 B 2 c B_3
16 B 3 z B_3
17 B 4 f B_3
18 B 1 z B_4
19 B 2 c B_4

CodePudding user response:

First join values of Exe Seq to strings and then call factorize per secenario groups:

df['g'] = df['Exe Seq'].eq(1).cumsum()

df['New_Scenario'] = (df.groupby(['Scenario', 'g'])['Exe Seq']
                       .transform(lambda x: ','.join(x.astype(str))))

df['New_Scenario'] = (df['Scenario']   '_'   
                      df.groupby('Scenario')['New_Scenario']
                        .transform(lambda x: pd.factorize(x)[0]   1)).astype(str)
print (df)
    No. Scenario  Exe Seq Action  g New_Scenario
0     1        A        1      a  1          A_1
1     2        A        2      b  1          A_1
2     3        A        3      c  1          A_1
3     4        A        1      a  2          A_2
4     5        A        2      b  2          A_2
5     6        A        1      a  3          A_3
6     7        A        1      a  4          A_2
7     8        A        2      b  4          A_2
8     9        B        1      z  5          B_1
9    10        B        1      z  6          B_2
10   11        B        2      c  6          B_2
11   12        B        3      z  6          B_2
12   13        B        1      z  7          B_1
13   14        B        1      z  8          B_3
14   15        B        2      c  8          B_3
15   16        B        3      z  8          B_3
16   17        B        4      f  8          B_3
17   18        B        1      z  9          B_4
18   19        B        2      c  9          B_4
  • Related