Home > Software engineering >  Attribute a value based on different dataframe
Attribute a value based on different dataframe

Time:06-02

I have 2 dataframes

db1 = pd.DataFrame(
    columns=["value", "type",'type2'], data=[[1, "A","ca"],[2, "B","cb"],[3, "C","cc"],[4, "D",'dd'],[5, "E",'ee'],["Nan","F",'ff']]
)
print(db1)

db2 = pd.DataFrame(
    columns=["value", "type"], data=[["Nan", "A"],["Nan", "F"]]
)

db1

    value   type    type2
0   1       A       ca
1   2       B       cb
2   3       C       cc
3   4       D       dd
4   5       E       ee
5   Nan     F       ff

db2

    value   type
0   Nan     A
1   Nan     F

I want to have in a new column of db2 the value of "type2" if (db1['type'] == db2['type']) & (db1['value'] == db2['value'])

like that:

db2

value   type    new
Nan     A   
Nan     F       ff

I try something but it doesn't work

db2["new"] =np.nan

db2["new"]= np.where((db1['type'] == db2['type']) & (db1['value'] == db2['value']), db2['type2'], '')

Thanks!

CodePudding user response:

try this (kind of ugly but work :)

import pandas as pd 
db1 = pd.DataFrame(
    columns=["value", "type",'type2'], data=[[1, "A","ca"],[2, "B","cb"],[3, "C","cc"],[4, "D",'dd'],[5, "E",'ee'],["Nan","F",'ff']]
)
print(db1)

db2 = pd.DataFrame(
    columns=["value", "type"], data=[["Nan", "A"],["Nan", "F"]]
)
for index,row in db2.iterrows():
    type_=db2.loc[index,'type']
    value_=db2.loc[index,'value']
    
    df_tempo=db1.loc[(db1['type']==type_)&(db1['value']==value_),:]
    if df_tempo.shape[0]==1 : 
        db2.loc[index,'new']=df_tempo['type2'].values[0]
    else : 
        db2.loc[index,'new']='no_val'

Output :

db2
    value   type    new
0   nan     A      no_val
1   nan     F      ff
2   3       F      no_val

If you only to keep the match between both of them :

db2=db2.loc[db2['new']!='no_val',:]
db2
    value   type    new
1   Nan     F       ff

CodePudding user response:

IIUC you can just do a merge and that should give you the results you are expecting

pd.merge(db1, db2, on = ['value', 'type'])

And if you wanted to keep data from the db2 like you have in your example you could add a how = 'right' to ensure that no matter what all db2 rows are present

pd.merge(db1, db2, on = ['value', 'type'], how = 'right')
  • Related