I have one data frame.
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','ccc','ccc','ddd','eee','eee','fff'],
'module':['ABS','ABS','IPMA','BCCM','HPOC','ABS','ABS','HPOC','ABS','ABS']})
I want to concatenate all values of column module with respect to column vin.
Below is my expected output.
df1 = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','ccc','ccc','ddd','eee','eee','fff'],
'module':['ABS','ABS','IPMA','BCCM','HPOC','ABS','ABS','HPOC','ABS','ABS'],
'New_module':['ABS-ABS-IPMA','ABS-ABS-IPMA','ABS-ABS-IPMA','BCCM','HPOC-ABS','HPOC-ABS','ABS','HPOC-ABS','HPOC-ABS','ABS']})
I tried one method, in which I have to duplicate the data frame and apply below code.
df_merge = pd.merge(df2, df1.groupby(['vin'])['module'].apply(list), on ='vin', how ='left')
df_merge['module'] = df_merge['module'].astype('str').str.replace("\[|\]|\'| ","")
df_merge
Any simple code to get my desired output ?
CodePudding user response:
You can use groupby.transform
with join
:
df1['New_module'] = df1.groupby('vin')['module'].transform('-'.join)
output:
vin module New_module
0 aaa ABS ABS-ABS-IPMA
1 aaa ABS ABS-ABS-IPMA
2 aaa IPMA ABS-ABS-IPMA
3 bbb BCCM BCCM
4 ccc HPOC HPOC-ABS
5 ccc ABS HPOC-ABS
6 ddd ABS ABS
7 eee HPOC HPOC-ABS
8 eee ABS HPOC-ABS
9 fff ABS ABS