I have 2 dataframes
db1 = pd.DataFrame(
columns=["value", "type",'type2'], data=[[1, "A","ca"],[2, "B","cb"],[3, "C","cc"],[4, "D",'dd'],[5, "E",'ee'],["Nan","F",'ff']]
)
print(db1)
db2 = pd.DataFrame(
columns=["value", "type"], data=[["Nan", "A"],["Nan", "F"]]
)
db1
value type type2
0 1 A ca
1 2 B cb
2 3 C cc
3 4 D dd
4 5 E ee
5 Nan F ff
db2
value type
0 Nan A
1 Nan F
I want to have in a new column of db2 the value of "type2" if (db1['type'] == db2['type']) & (db1['value'] == db2['value'])
like that:
db2
value type new
Nan A
Nan F ff
I try something but it doesn't work
db2["new"] =np.nan
db2["new"]= np.where((db1['type'] == db2['type']) & (db1['value'] == db2['value']), db2['type2'], '')
Thanks!
CodePudding user response:
try this (kind of ugly but work :)
import pandas as pd
db1 = pd.DataFrame(
columns=["value", "type",'type2'], data=[[1, "A","ca"],[2, "B","cb"],[3, "C","cc"],[4, "D",'dd'],[5, "E",'ee'],["Nan","F",'ff']]
)
print(db1)
db2 = pd.DataFrame(
columns=["value", "type"], data=[["Nan", "A"],["Nan", "F"]]
)
for index,row in db2.iterrows():
type_=db2.loc[index,'type']
value_=db2.loc[index,'value']
df_tempo=db1.loc[(db1['type']==type_)&(db1['value']==value_),:]
if df_tempo.shape[0]==1 :
db2.loc[index,'new']=df_tempo['type2'].values[0]
else :
db2.loc[index,'new']='no_val'
Output :
db2
value type new
0 nan A no_val
1 nan F ff
2 3 F no_val
If you only to keep the match between both of them :
db2=db2.loc[db2['new']!='no_val',:]
db2
value type new
1 Nan F ff
CodePudding user response:
IIUC you can just do a merge and that should give you the results you are expecting
pd.merge(db1, db2, on = ['value', 'type'])
And if you wanted to keep data from the db2 like you have in your example you could add a how = 'right' to ensure that no matter what all db2 rows are present
pd.merge(db1, db2, on = ['value', 'type'], how = 'right')