Home > Back-end >  pandas merge on list type rows
pandas merge on list type rows

Time:07-27

i have two dataframe. df1:

import pandas as pd
values=[[1,[1,2]],[2,[2,2]],[3,[2,3]]]
df=pd.DataFrame(values,columns=['idx','value'])
print(df)

'''
idx   value
1    [1,2]
2    [2,2]
3    [2,3]
4    []
'''

df2:

values=[[1,'json'],[2,'csv'],[3,'xml']]
df2=pd.DataFrame(values,columns=['id2','type'])
print(df2)

'''
id2   type
1    json
2    csv
3    xml
'''

i want to merge this two dataframes. But the values ​​column in the first df consists of lists. Expected output:

idx   value  type
1    [1,2]   [json,csv]
2    [2,2]   [csv,csv] 
3    [2,3]   [csv,xml]
4    []      []

I tried the code below but got an error.. Is there a way I can merge for each element in the list?

final=df.merge(df2,how='left',left_on='value',right_on='id2')

#returns
TypeError: unhashable type: 'list'

CodePudding user response:

here is one way to do it

df.explode('value').merge(df2, left_on = 'idx', 
          right_on='id2').drop(columns='id2').pivot_table(index='idx',  aggfunc=list).reset_index()


    idx     type            value
0     1     [json, json]    [1, 2]
1     2     [csv, csv]      [2, 2]
2     3     [xml, xml]      [2, 3]

CodePudding user response:

Explode the value column then map the type using common id then groupby and aggregate back to list

d = df2.set_index('id2')['type']
df['type'] = df['value'].explode().map(d).groupby(level=0).agg(list)

Alternative approach with list comp and dict lookup

d = df2.set_index('id2')['type']
df['type']  = df['value'].map(lambda l: [d.get(i) for i in l])

   idx   value         type
0    1  [1, 2]  [json, csv]
1    2  [2, 2]   [csv, csv]
2    3  [2, 3]   [csv, xml]
  • Related