Home > Software engineering >  How to do map of two columns of different data frame and if get some null values then again mapping
How to do map of two columns of different data frame and if get some null values then again mapping

Time:03-30

df1              df2               df3
t_cell t_psc     cell1  psc       cell2  psc2
fh4               dh2   2          fh4    5
dh2               jk2   3          ik5    7
jk2               po6   2          it7    9
ik5                                po6    3
po6
it7

output should be

t_cell t_psc
fh4     5
dh2     2
jk2     3
ik5     7
it7     9
po6     2

CodePudding user response:

First, rename your columns of df2 and df3 as df1 then concat your 2 dataframes to create a dict mapping. Finally, map to the t_cell column:

df2.columns = ['t_cell', 't_psc']
df3.columns = ['t_cell', 't_psc']
df1['t_psc'] = df1['t_cell'].map(pd.concat([df2, df3]).set_index('t_cell')['t_psc'])
print(df1)

# Output
  t_cell  t_psc
0    fh4      5
1    dh2      2
2    jk2      3
3    ik5      7
4    po6      2
5    it7      9

CodePudding user response:

Use Series.map with rename columnsnames in df2 for same column like in df3 with remove duplicates by cell2 column:

s = (df2.rename(columns={'cell1':'cell2', 'psc':'psc2'})
        .append(df3)
        .drop_duplicates(['cell2'])
        .set_index('cell2')['psc2'])

df1['t_psc'] = df1['t_cell'].map(s)
print (df1)
  t_cell  t_psc
0    fh4      5
1    dh2      2
2    jk2      3
3    ik5      7
4    po6      2
5    it7      9

Another idea is use:

s1 = df1['t_cell'].map(df2.set_index('cell1')['psc'])
s2 = df1['t_cell'].map(df3.set_index('cell2')['psc2'])
df1['t_psc'] = s1.fillna(s2)
print (df1)
  t_cell  t_psc
0    fh4    5.0
1    dh2    2.0
2    jk2    3.0
3    ik5    7.0
4    po6    2.0
5    it7    9.0
  • Related