Home > database >  merge/lookup to another df
merge/lookup to another df

Time:09-21

I have two dataframes which I need to merge/lookup values:

df1

Key Value
123
456
789
101 0.3

df2

Key Value
123 0.1
456 0.2

key = distinct value. I need to look up df2 based on the Key column.

Expected result

Key Value
123 0.1
456 0.2
789
101 0.3

I have tried following:

merged_df = pd.merge(
left = df1,
right = df2,
on = 'Key',
how = 'outer',
)

This creates an additional column 'Value_y' in merged df, while I need to get the data in the original df1 'Value' column.

How could I tweak the code, or perhaps there is another function for this task? Also currently 'Key' value is 3 combined values (product code and two dates), perhaps it's possible to merge on several columns, instead of creating an additional 'Key' column for merging?

CodePudding user response:

Concatenate the two dataframes, then groupby Key and call first on the DataFrame Groupby object:

>>> pd.concat([df1, df2]).groupby('Key', sort=False).first().reset_index()

   Key  Value
0  123    0.1
1  456    0.2
2  789    NaN
3  101    0.3

CodePudding user response:

here is one way to do it using mask and map

map the key in the reference DF, and if value is not null, assign it to value in first DF


(df['Value']=df['Value'].mask(  
                             (df['Key'].map(df2.set_index('Key').Value)).notna(),
                              df['Key'].map(df2.set_index('Key').Value)
                             )) 
df

OR using loc

df.loc[df['Value'].isna(), 'Value'] = df['Key'].map(df2.set_index('Key').Value)
df
Key     Value
0   123     0.1
1   456     0.2
2   789     NaN
3   101     0.3

CodePudding user response:

Try with update

df1.set_index('Key', inplace=True)
df1.update(df2.set_index('Key'))
df1.reset_index()
  • Related