Home > other >  How can I merge column value in one cell in pandas with respect to another column
How can I merge column value in one cell in pandas with respect to another column

Time:11-03

I have one data frame.

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','ccc','ccc','ddd','eee','eee','fff'],
                    'module':['ABS','ABS','IPMA','BCCM','HPOC','ABS','ABS','HPOC','ABS','ABS']})

I want to concatenate all values of column module with respect to column vin.

Below is my expected output.

df1 = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','ccc','ccc','ddd','eee','eee','fff'],
                    'module':['ABS','ABS','IPMA','BCCM','HPOC','ABS','ABS','HPOC','ABS','ABS'],
                   'New_module':['ABS-ABS-IPMA','ABS-ABS-IPMA','ABS-ABS-IPMA','BCCM','HPOC-ABS','HPOC-ABS','ABS','HPOC-ABS','HPOC-ABS','ABS']})

I tried one method, in which I have to duplicate the data frame and apply below code.

df_merge = pd.merge(df2, df1.groupby(['vin'])['module'].apply(list), on ='vin', how ='left')
df_merge['module'] = df_merge['module'].astype('str').str.replace("\[|\]|\'| ","")
df_merge

Any simple code to get my desired output ?

CodePudding user response:

You can use groupby.transform with join:

df1['New_module'] = df1.groupby('vin')['module'].transform('-'.join)

output:

   vin module    New_module
0  aaa    ABS  ABS-ABS-IPMA
1  aaa    ABS  ABS-ABS-IPMA
2  aaa   IPMA  ABS-ABS-IPMA
3  bbb   BCCM          BCCM
4  ccc   HPOC      HPOC-ABS
5  ccc    ABS      HPOC-ABS
6  ddd    ABS           ABS
7  eee   HPOC      HPOC-ABS
8  eee    ABS      HPOC-ABS
9  fff    ABS           ABS
  • Related