Home > OS >  Python: Join/Merge 2 dfs on Approximate Key Match
Python: Join/Merge 2 dfs on Approximate Key Match

Time:02-11

I have two DataFrames:

data = [['B100',30], ['C200',33], ['C201',11]]

data2 = [['B99/B100/B105','Yes'], ['C150/C200/C201','Yes'], ['D56/D500/D501','Yes']]

df_1 = pd.DataFrame(data, columns = ['code', 'value']) 

df_2 = pd.DataFrame(data2, columns = ['code_agg', 'rating']) 

I need to pull in the rating from df_2 into df_1 using a partial match from the 'code' columns in each dataframe (df_1 only has a partial key/code). The result should look like this:

enter image description here

I have tried several methods, the most common error I get is "TypeError: 'Series' objects are mutable, thus they cannot be hashed"

I would greatly appreciate any help on this. thank you!

CodePudding user response:

df_1.merge(df_2.assign(code=df_2.code_agg.str.split('/')).explode('code'))
Out[]: 
   code  value        code_agg rating
0  B100     30   B99/B100/B105    Yes
1  C200     33  C150/C200/C201    Yes
2  C201     11  C150/C200/C201    Yes
  • Related