Home > other >  State Transition count in Pandas dataframe
State Transition count in Pandas dataframe

Time:09-29

I am working on a panda data frame like the below:

Items    Run    Status
T1       R1      OK
T2       R1      OK
T3       R1      OK
T4       R1      OK
T1       R2      NOK
T2       R2      OK
T3       R2      NOK
T4       R2      OK
T1       R3      OK
T2       R3      NOK
T3       R3      OK
T4       R3      OK

I want to capture the state transition that occurred per Item(OK to NOK to OK and vice-versa) and also the number of times the state transition occurred like below:

Items    Run    Status   Transition   Count of Transition per Items
T1       R1      OK      0             1
T1       R2      NOK     1 
T1       R3      OK      0
T2       R1      OK      0             1
T2       R2      OK      0
T2       R3      NOK     1
T3       R1      OK      0             2
T3       R2      NOK     1
T3       R3      OK      1
T4       R1      OK      0             0
T4       R2      OK      0
T4       R3      OK      0

CodePudding user response:

IIUC, you can use:

# sort values by Items and Run
df = df.sort_values(by=['Items', 'Run'])

# identify changes in Status
m = df['Status'].ne(df.groupby('Items')['Status'].shift().bfill())

# convert boolean to integer
df['Transition'] = m.astype(int)

# count per group masking the non-first values with empty string
df['Count'] = (m.groupby(df['Items']).transform('sum')
                .mask(df['Items'].duplicated(), '') # optional
               )

output:

   Items Run Status  Transition Count
0     T1  R1     OK           0     2
4     T1  R2    NOK           1      
8     T1  R3     OK           1      
1     T2  R1     OK           0     1
5     T2  R2     OK           0      
9     T2  R3    NOK           1      
2     T3  R1     OK           0     2
6     T3  R2    NOK           1      
10    T3  R3     OK           1      
3     T4  R1     OK           0     0
7     T4  R2     OK           0      
11    T4  R3     OK           0      
  • Related