I have a DataFrame as shown below:
   Slno  Name_x  Age_x   Sex_x  Name_y  Age_y   Sex_y
0     1       A     27    Male       A     32    Male
1     2       B     28  Female       B     28  Female
2     3       C      8  Female       C      1  Female
3     4       D     28    Male       D     72    Male
4     5       E     25  Female       E     64  Female
I need to create calculated columns: the difference between the ages and a check for whether the genders match. To do this in one go, I am using:
DF3.loc[:,["Gendermatch","Agematch"]]= pd.DataFrame([np.where(DF3["Name_x"]==DF3["Name_y"],True,False),np.where(DF3["Age_x"]-DF3["Age_y"]==0,True,False)])
and the resulting DataFrame looks like this:
   Slno  Name_x  Age_x   Sex_x  Name_y  Age_y   Sex_y  Gendermatch  Agematch
0     1       A     27    Male       A     32    Male          NaN       NaN
1     2       B     28  Female       B     28  Female          NaN       NaN
2     3       C      8  Female       C      1  Female          NaN       NaN
3     4       D     28    Male       D     72    Male          NaN       NaN
4     5       E     25  Female       E     64  Female          NaN       NaN
The resulting columns show NaN (not a number). What am I doing wrong here?
CodePudding user response:
DF3[["Gendermatch","Agematch"]]= np.where(DF3["Name_x"]==DF3["Name_y"],True,False),np.where(DF3["Age_x"]-DF3["Age_y"]==0,True,False)
CodePudding user response:
DF3[["Gendermatch","Agematch"]] = pd.DataFrame([np.where(DF3["Name_x"]==DF3["Name_y"],True,False),np.where(DF3["Age_x"]-DF3["Age_y"]==0,True,False)]).T
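For context on why the transpose matters, here is a minimal sketch that rebuilds the sample data from the question and compares the shapes. Building a DataFrame from a list of two arrays yields one row per array, so the result is 2 rows by 5 columns rather than 5 rows by 2. As far as I understand, `.loc` then aligns that frame on its labels (rows 0-1, columns 0-4), none of which match the new column names, which is probably where the NaN values come from. The variable names `wide` and `tall` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
DF3 = pd.DataFrame({
    "Slno": [1, 2, 3, 4, 5],
    "Name_x": ["A", "B", "C", "D", "E"],
    "Age_x": [27, 28, 8, 28, 25],
    "Sex_x": ["Male", "Female", "Female", "Male", "Female"],
    "Name_y": ["A", "B", "C", "D", "E"],
    "Age_y": [32, 28, 1, 72, 64],
    "Sex_y": ["Male", "Female", "Female", "Male", "Female"],
})

# Same comparisons as in the question, as plain boolean arrays
name_match = np.where(DF3["Name_x"] == DF3["Name_y"], True, False)
age_match = np.where(DF3["Age_x"] - DF3["Age_y"] == 0, True, False)

# A list of two arrays becomes two ROWS, not two columns
wide = pd.DataFrame([name_match, age_match])
print(wide.shape)

# Transposing gives one row per record and one column per check
tall = wide.T
print(tall.shape)

# The 5x2 frame shares the row index 0..4 with DF3, so the values line up
DF3[["Gendermatch", "Agematch"]] = tall
print(DF3)
```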
CodePudding user response:
np.where is unnecessary here; a Series comparison already returns a boolean Series.
DF3["Gendermatch"] = DF3["Name_x"]==DF3["Name_y"]
DF3["Agematch"] = DF3["Age_x"]==DF3["Age_y"]
# or in one line
DF3["Gendermatch"], DF3["Agematch"] = (DF3["Name_x"]==DF3["Name_y"]), (DF3["Age_x"]==DF3["Age_y"])
print(DF3)
   Slno  Name_x  Age_x   Sex_x  Name_y  Age_y   Sex_y  Gendermatch  Agematch
0     1       A     27    Male       A     32    Male         True     False
1     2       B     28  Female       B     28  Female         True      True
2     3       C      8  Female       C      1  Female         True     False
3     4       D     28    Male       D     72    Male         True     False
4     5       E     25  Female       E     64  Female         True     False
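Another way to add several calculated columns in one statement is `DataFrame.assign`. The sketch below rests on two assumptions that are not in the original code: the gender check is meant to compare `Sex_x` with `Sex_y` (the snippets above compare the `Name` columns), and the age difference is stored in a column named `Agediff`, a hypothetical name chosen for illustration.

```python
import pandas as pd

# Sample data copied from the question
DF3 = pd.DataFrame({
    "Slno": [1, 2, 3, 4, 5],
    "Name_x": ["A", "B", "C", "D", "E"],
    "Age_x": [27, 28, 8, 28, 25],
    "Sex_x": ["Male", "Female", "Female", "Male", "Female"],
    "Name_y": ["A", "B", "C", "D", "E"],
    "Age_y": [32, 28, 1, 72, 64],
    "Sex_y": ["Male", "Female", "Female", "Male", "Female"],
})

# assign returns a new DataFrame with all three columns added at once
DF3 = DF3.assign(
    Gendermatch=DF3["Sex_x"] == DF3["Sex_y"],  # assumption: compare Sex, not Name
    Agediff=DF3["Age_x"] - DF3["Age_y"],       # hypothetical column name
    Agematch=DF3["Age_x"] == DF3["Age_y"],
)
print(DF3[["Slno", "Gendermatch", "Agediff", "Agematch"]])
```

Because `assign` returns a new DataFrame instead of modifying the original, the result has to be bound back to `DF3` (or to another name) as shown.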