Home > Back-end >  concatenating multiple values present a single column to a new column based on a condition in Pandas
concatenating multiple values present a single column to a new column based on a condition in Pandas

Time:09-29

First, I should say I have gone through probable solutions for this kind of problem but couldn't find a close match. My data frame looks like below

Type    Run    Status    Message
P1      R1     OK 
P1      R2     NOK       Unable to connect
P1      R3     OK
P1      R4     NOK       Unable to fetch
P2      R1     OK
P2      R2     OK
P2      R3     NOK       Entry not present
P2      R4     NOK       Entry not present

I want to concatenate all the unique error messages per 'Type' == NOK to a new column

Type    Run    Status    Message                Error
P1      R1     OK 
P1      R2     NOK       Unable to connect      Unable to connect, Unable to fetch
P1      R3     OK
P1      R4     NOK       Unable to fetch
P2      R1     OK
P2      R2     OK
P2      R3     NOK       Entry not present      Entry not present
P2      R4     NOK       Entry not present

Any leads will be helpful

CodePudding user response:

You can use groupby operations:

# identify NOK rows
m = df['Status'].eq('NOK')
# get index of first NOK per group
idx = m[m].groupby(df['Type']).cumcount().loc[lambda x: x==0].index

# initialize Error to empty string (optional)
df['Error'] = ''

# concatenate unique error messages per group
# and assign to first NOK per group
df.loc[idx, 'Error'] = (df.groupby('Type')['Message']
                          .agg(lambda g: ', '.join(dict.fromkeys(g.replace('', pd.NA).dropna())))
                          .tolist()
                       )

output:

  Type Run Status            Message                               Error
0   P1  R1     OK               None                                    
1   P1  R2    NOK  Unable to connect  Unable to connect, Unable to fetch
2   P1  R3     OK               None                                    
3   P1  R4    NOK    Unable to fetch                                    
4   P2  R1     OK               None                                    
5   P2  R2     OK               None                                    
6   P2  R3    NOK  Entry not present                   Entry not present
7   P2  R4    NOK  Entry not present                                    
  • Related