i have two dataframe. df1:
import pandas as pd
values=[[1,[1,2]],[2,[2,2]],[3,[2,3]]]
df=pd.DataFrame(values,columns=['idx','value'])
print(df)
'''
idx value
1 [1,2]
2 [2,2]
3 [2,3]
4 []
'''
df2:
values=[[1,'json'],[2,'csv'],[3,'xml']]
df2=pd.DataFrame(values,columns=['id2','type'])
print(df2)
'''
id2 type
1 json
2 csv
3 xml
'''
i want to merge this two dataframes. But the values column in the first df consists of lists. Expected output:
idx value type
1 [1,2] [json,csv]
2 [2,2] [csv,csv]
3 [2,3] [csv,xml]
4 [] []
I tried the code below but got an error.. Is there a way I can merge for each element in the list?
final=df.merge(df2,how='left',left_on='value',right_on='id2')
#returns
TypeError: unhashable type: 'list'
CodePudding user response:
here is one way to do it
df.explode('value').merge(df2, left_on = 'idx',
right_on='id2').drop(columns='id2').pivot_table(index='idx', aggfunc=list).reset_index()
idx type value
0 1 [json, json] [1, 2]
1 2 [csv, csv] [2, 2]
2 3 [xml, xml] [2, 3]
CodePudding user response:
Explode the value column then map the type using common id then groupby and aggregate back to list
d = df2.set_index('id2')['type']
df['type'] = df['value'].explode().map(d).groupby(level=0).agg(list)
Alternative approach with list comp and dict lookup
d = df2.set_index('id2')['type']
df['type'] = df['value'].map(lambda l: [d.get(i) for i in l])
idx value type
0 1 [1, 2] [json, csv]
1 2 [2, 2] [csv, csv]
2 3 [2, 3] [csv, xml]