Stuck writing Python code to pull data from one DataFrame into another

Time:07-06

I have two data frames.

df1 = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','ccc','ccc','ddd','eee','eee','fff'],'module':['ABS','ABS','IPMA','BCCM','HPOC','ABS','ABS','HPOC','ABS','ABS']})


df2 = pd.DataFrame({'vin':['aaa','bbb','ccc','ddd','eee','fff']})

In df2, I want to pull in the values of the 'module' column from df1 for each matching 'vin'. The challenge is that I want all of a vin's values in a single cell, separated by commas. I tried the command below.

df_merge = pd.merge(df2, df1[['module','vin']], on ='vin', how ='left')

The problem with this line is that it returns one row per match, so each vin is spread across multiple rows, which I don't want.

My expected output looks like this:

df2 = pd.DataFrame({'vin':['aaa','bbb','ccc','ddd'],'module':['ABS,ABS,IPMA','BCCM','HPOC,ABS','ABS']})

CodePudding user response:

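Try the code below:

# Collect each vin's modules into a list, then left-merge that onto df2
df_merge = pd.merge(df2, df1.groupby(['vin'])['module'].apply(list), on ='vin', how ='left')
# Strip list brackets, quotes, and spaces; regex=True is needed in pandas 2.0+, where str.replace defaults to a literal match
df_merge['module'] = df_merge['module'].astype('str').str.replace(r"\[|\]|'| ", "", regex=True)
df_merge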
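
As a possible variation on the same idea, the string cleanup can be avoided by joining the modules during the groupby instead of building a list first. This is only a sketch using the question's sample data; the intermediate name `modules` is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'vin': ['aaa', 'aaa', 'aaa', 'bbb', 'ccc', 'ccc', 'ddd', 'eee', 'eee', 'fff'],
    'module': ['ABS', 'ABS', 'IPMA', 'BCCM', 'HPOC', 'ABS', 'ABS', 'HPOC', 'ABS', 'ABS'],
})
df2 = pd.DataFrame({'vin': ['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff']})

# Collapse each vin's modules into one comma-separated string
# ('modules' is an illustrative name, not from the original answer).
modules = df1.groupby('vin')['module'].agg(','.join).reset_index()

# modules has one row per vin, so the left merge keeps one row per vin in df2.
df_merge = df2.merge(modules, on='vin', how='left')
print(df_merge)
```

Because the aggregation happens before the merge, df2 keeps exactly one row per vin and no regex replacement is needed.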

CodePudding user response:

You can simply do:

df2.merge(df1, how='left').groupby('vin').agg({'module': lambda x: ', '.join(x)})

It gives you:

vin module
aaa ABS, ABS, IPMA
bbb BCCM
ccc HPOC, ABS
ddd ABS
eee HPOC, ABS
fff ABS
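
One case the sample data does not cover: if df2 contains a vin with no rows in df1, the left merge leaves NaN in 'module', and joining NaN with a string separator should raise a TypeError. A minimal sketch of one way to guard against that, assuming an empty string is an acceptable result for unmatched vins (the vin 'zzz' and the smaller frames below are hypothetical):

```python
import pandas as pd

df1 = pd.DataFrame({'vin': ['aaa', 'aaa', 'bbb'], 'module': ['ABS', 'IPMA', 'BCCM']})
# 'zzz' is a hypothetical vin that has no rows in df1.
df2 = pd.DataFrame({'vin': ['aaa', 'bbb', 'zzz']})

merged = df2.merge(df1, on='vin', how='left')

result = (
    merged.groupby('vin', sort=False)['module']
          # Drop the NaN left by unmatched vins before joining.
          .agg(lambda s: ', '.join(s.dropna()))
          # Turn 'vin' back into a regular column.
          .reset_index()
)
print(result)
```

Here `reset_index()` also makes 'vin' an ordinary column again, which matches the shape of the expected output in the question.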