I'm following the answer from this question
I have a df like this:
score_1 score_2
1.11 NaN
2.22 3.33
NaN 3.33
NaN NaN
........
The rule for calculating final_score is that at least one of the scores must be non-null. If one of the scores is null, final_score equals the other score (which then carries all the weight).
This is the code to replicate:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'score_1': [1.11, 2.22, np.nan],
    'score_2': [np.nan, 3.33, 3.33]
})
def final_score(df):
    if (df['score_1'] != np.nan) and (df['score_2'] != np.nan):
        print('I am condition one')
        return df['score_1'] * 0.2 + df['score_2'] * 0.8
    elif (df['score_1'] == np.nan) and (df['score_2'] != np.nan):
        print('I am the condition two')
        return df['score_2']
    elif (df['score_1'] != np.nan) and (df['score_2'] == np.nan):
        print('I am the condition three')
        return df['score_1']
    elif (df['score_1'] == np.nan) and (df['score_2'] == np.nan):
        print('I am the condition four')
        return np.nan
df['final_score'] = df.apply(final_score, axis=1)
print(df)
This gave me the following output:
score_1 score_2 final_score
1.11 NaN NaN
2.22 3.33 3.108
NaN 3.33 NaN
NaN NaN NaN
........
But my expected output is below:
score_1 score_2 final_score
1.11 NaN 1.11
2.22 3.33 3.108
NaN 3.33 3.33
NaN NaN NaN
........
The first and third rows are not the result I'm expecting. Can someone tell me what's wrong with my code? Thanks a lot.
CodePudding user response:
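Let's apply your conditions using np.where. When both scores are present, use the weighted sum; otherwise the row mean (which skips NaN) returns whichever score exists:
df['final_score'] = np.where(df.notna().all(axis=1), df['score_1'] * 0.2 + df['score_2'] * 0.8, df.mean(axis=1))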
score_1 score_2 final_score
0 1.11 NaN 1.110
1 2.22 3.33 3.108
2 NaN 3.33 3.330
3 NaN NaN NaN
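For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the frame might already contain other columns (for example an earlier final_score), so it selects the two score columns explicitly; the names scores and weighted are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'score_1': [1.11, 2.22, np.nan, np.nan],
    'score_2': [np.nan, 3.33, 3.33, np.nan],
})

# Restrict the null checks and the mean to the two score columns only,
# so any other column in the frame cannot affect the result.
scores = df[['score_1', 'score_2']]

# Weighted sum; this is NaN whenever either score is missing.
weighted = df['score_1'] * 0.2 + df['score_2'] * 0.8

# Both present -> weighted sum; otherwise the row mean, which skips NaN
# and therefore returns the single available score (or NaN if none).
df['final_score'] = np.where(scores.notna().all(axis=1), weighted, scores.mean(axis=1))
print(df)
```

The row mean works as a fallback only because there are exactly two scores: with one of them missing, the mean of the remaining value is that value itself.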
CodePudding user response:
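Using np.isnan() for the comparison should solve the problem. NaN never compares equal to anything, including itself, so df['score_1'] != np.nan is always True and df['score_1'] == np.nan is always False. That is why every row falls into condition one.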
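As a rough sketch of that fix, the row-wise function from the question could be rewritten as below. The helper names has_1 and has_2 are made up for illustration, and the sample frame includes the all-NaN row from the expected output; pd.isna() would work in place of np.isnan() as well.

```python
import numpy as np
import pandas as pd

# NaN is never equal to anything, itself included.
print(np.nan == np.nan)  # False
print(np.nan != np.nan)  # True

def final_score(row):
    # Test for missing values with np.isnan instead of == / != np.nan.
    has_1 = not np.isnan(row['score_1'])
    has_2 = not np.isnan(row['score_2'])
    if has_1 and has_2:
        return row['score_1'] * 0.2 + row['score_2'] * 0.8
    if has_2:
        return row['score_2']
    if has_1:
        return row['score_1']
    return np.nan

df = pd.DataFrame({
    'score_1': [1.11, 2.22, np.nan, np.nan],
    'score_2': [np.nan, 3.33, 3.33, np.nan],
})
df['final_score'] = df.apply(final_score, axis=1)
print(df)
```

This keeps the original apply-based structure; on large frames the vectorized np.where approach is usually faster.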