I have two DataFrame A and B as-
table_a = pd.DataFrame({
'unique_id':[3823762,3976695,4199277,4201777,4202712],
'full_name':['A','B','C','D','E'],
'freq_match_it':[np.NaN,np.NaN,np.NaN,np.NaN,np.NaN],
'address':['hyd','jgl','krmr','wl','ktk']
})
table_b = pd.DataFrame({
'unique_id':[419434,4201777,784744,4202712,10000],
'freq_match_it':[12,15,8,5,100]
})
Here I would like to fill in freq_match_it
column in table_a
whose unique_id
s exist in table_b
using a left join in pandas as
pd.merge(table_a,table_b,how='left',on='unique_id')
it shows as-
Here it has created an extra column freq_match_it_y
and I want to fill in the already existing column freq_match_it
rather than create an additional column.
The expected output is
CodePudding user response:
Replace missing values by mapping Series from second DataFrame by Series.fillna
and Series.map
:
s = table_b.set_index('unique_id')['freq_match_it']
table_a['freq_match_it'] = table_a['freq_match_it'].fillna(table_a['unique_id'].map(s))
CodePudding user response:
You could map
it:
table_a['freq_match_it'] = table_a['unique_id'].map(table_b.set_index('unique_id')['freq_match_it'])
Output:
unique_id full_name freq_match_it address
0 3823762 A NaN hyd
1 3976695 B NaN jgl
2 4199277 C NaN krmr
3 4201777 D 15.0 wl
4 4202712 E 5.0 ktk
CodePudding user response:
You were almost there, merge and use suffixes and then drop either based on suffixes names or on which column has all NaNs. No need of multiple computations proposed elsewhere
pd.merge(table_a,table_b,how='left',on='unique_id', suffixes=('_x','')).dropna(axis=1,how='all')
unique_id full_name address freq_match_it
0 3823762 A hyd NaN
1 3976695 B jgl NaN
2 4199277 C krmr NaN
3 4201777 D wl 15.0
4 4202712 E ktk 5.0
CodePudding user response:
Use:
In [2276]: x = pd.merge(table_a.drop('freq_match_it', 1),table_b,how='left',on='unique_id')
In [2288]: cols = x.columns.tolist()
In [2294]: x = x[cols[:2] [cols[-1]] [cols[2]]]
In [2295]: x
Out[2295]:
unique_id full_name freq_match_it address
0 3823762 A NaN hyd
1 3976695 B NaN jgl
2 4199277 C NaN krmr
3 4201777 D 15.0 wl
4 4202712 E 5.0 ktk