Home > Enterprise >  How to fill in existing column in table-A with values pulled out from table-B after doing left join
How to fill in existing column in table-A with values pulled out from table-B after doing left join

Time:03-11

I have two DataFrame A and B as-

table_a = pd.DataFrame({
    'unique_id':[3823762,3976695,4199277,4201777,4202712],
    'full_name':['A','B','C','D','E'],
    'freq_match_it':[np.NaN,np.NaN,np.NaN,np.NaN,np.NaN],
    'address':['hyd','jgl','krmr','wl','ktk']
})
table_b = pd.DataFrame({
    'unique_id':[419434,4201777,784744,4202712,10000],
    'freq_match_it':[12,15,8,5,100]
    
})

Here I would like to fill in freq_match_it column in table_a whose unique_ids exist in table_b using a left join in pandas as

pd.merge(table_a,table_b,how='left',on='unique_id')

it shows as-

enter image description here

Here it has created an extra column freq_match_it_y and I want to fill in the already existing column freq_match_it rather than create an additional column.

The expected output is

enter image description here

CodePudding user response:

Replace missing values by mapping Series from second DataFrame by Series.fillna and Series.map:

s = table_b.set_index('unique_id')['freq_match_it']
table_a['freq_match_it'] = table_a['freq_match_it'].fillna(table_a['unique_id'].map(s))

CodePudding user response:

You could map it:

table_a['freq_match_it'] = table_a['unique_id'].map(table_b.set_index('unique_id')['freq_match_it'])

Output:

   unique_id full_name  freq_match_it address
0    3823762         A            NaN     hyd
1    3976695         B            NaN     jgl
2    4199277         C            NaN    krmr
3    4201777         D           15.0      wl
4    4202712         E            5.0     ktk

CodePudding user response:

You were almost there, merge and use suffixes and then drop either based on suffixes names or on which column has all NaNs. No need of multiple computations proposed elsewhere

pd.merge(table_a,table_b,how='left',on='unique_id', suffixes=('_x','')).dropna(axis=1,how='all')



 unique_id full_name address  freq_match_it
0    3823762         A     hyd            NaN
1    3976695         B     jgl            NaN
2    4199277         C    krmr            NaN
3    4201777         D      wl           15.0
4    4202712         E     ktk            5.0

CodePudding user response:

Use:

In [2276]: x = pd.merge(table_a.drop('freq_match_it', 1),table_b,how='left',on='unique_id')

In [2288]: cols = x.columns.tolist()

In [2294]: x = x[cols[:2]   [cols[-1]]   [cols[2]]]

In [2295]: x
Out[2295]: 
   unique_id full_name  freq_match_it address
0    3823762         A            NaN     hyd
1    3976695         B            NaN     jgl
2    4199277         C            NaN    krmr
3    4201777         D           15.0      wl
4    4202712         E            5.0     ktk
  • Related