df1 df2 df3
t_cell t_psc cell1 psc cell2 psc2
fh4 dh2 2 fh4 5
dh2 jk2 3 ik5 7
jk2 po6 2 it7 9
ik5 po6 3
po6
it7
output should be
t_cell t_psc
fh4 5
dh2 2
jk2 3
ik5 7
it7 9
po6 2
CodePudding user response:
First, rename your columns of df2
and df3
as df1
then concat
your 2 dataframes to create a dict mapping. Finally, map to the t_cell
column:
df2.columns = ['t_cell', 't_psc']
df3.columns = ['t_cell', 't_psc']
df1['t_psc'] = df1['t_cell'].map(pd.concat([df2, df3]).set_index('t_cell')['t_psc'])
print(df1)
# Output
t_cell t_psc
0 fh4 5
1 dh2 2
2 jk2 3
3 ik5 7
4 po6 2
5 it7 9
CodePudding user response:
Use Series.map
with rename columnsnames in df2
for same column like in df3
with remove duplicates by cell2
column:
s = (df2.rename(columns={'cell1':'cell2', 'psc':'psc2'})
.append(df3)
.drop_duplicates(['cell2'])
.set_index('cell2')['psc2'])
df1['t_psc'] = df1['t_cell'].map(s)
print (df1)
t_cell t_psc
0 fh4 5
1 dh2 2
2 jk2 3
3 ik5 7
4 po6 2
5 it7 9
Another idea is use:
s1 = df1['t_cell'].map(df2.set_index('cell1')['psc'])
s2 = df1['t_cell'].map(df3.set_index('cell2')['psc2'])
df1['t_psc'] = s1.fillna(s2)
print (df1)
t_cell t_psc
0 fh4 5.0
1 dh2 2.0
2 jk2 3.0
3 ik5 7.0
4 po6 2.0
5 it7 9.0