Am having dataframe ,need to implement multiple ifelse conditions to filter value column
& multiply expo_value column
with cal_value column
Input Dataframe
client_id expo_value value cal_value
1 126 30 27.06
2 135 60 36.18
3 144 120 45
4 162 30 54.09
5 153 90 63.63
6 181 120 72.9
4 207 30 99.09
5 315 90 126.63
6 414 120 81.9
I have written function to filter multiple conditions and apply formula to calculate for that condition
def cal_df(df):
if df[df['value'] <=30]:
df= df['expo_value'] *30 df['cal_value'] * 45
return df
elif df[(df['value'] <=60 and (df['value'] >=100)]:
df= df['expo_value'] *60 df['cal_value'] * 90
return df
elif df[(df['value'] <=100 and (df['value'] >=150)]:
df= df['expo_value'] *100 df['cal_value'] * 120
return df
else df[df['value'] <=10]:
return np.nan
data = data.groupby('client_id').apply(lambda x:cal_df(x)).reset_index()
results should be stored as new column
### if am applying groupby condition am getting following error
Valueerror: The truth value of a series is ambiguous. Use a.empty ,a.bool(),a.item(),a.any() or a.all()
if am applying groupby condition am getting following error
#Valueerror: The truth value of a series is ambiguous. Use a.empty ,a.bool(),a.item(),a.any() or a.all()
CodePudding user response:
If I understand what you want to do, you don't need to do a groupby. You can do this:
data.loc[data['value'] <= 30, "new_column"] = data['expo_value']*30 data['cal_value']*45
data.loc[(data['value'] <= 60) & (data['value'] >= 100), "new_column"] = data['expo_value']*60 data['cal_value']*90
data.loc[(data['value'] <= 100) & (data['value'] >= 150), "new_column"] = data['expo_value']*100 data['cal_value']*120
The values for "new_column" in any rows not captured by the above will be NaN by default. But there is another problem, which is those last two lines won't match anything, because something can't be both below 60 and above 100, or below 100 and above 150. Should it be <= for both of them?
CodePudding user response:
Try this
def func(value, expo_value, cal_value):
if value <= 30: return expo_value * 30 cal_value * 45
elif value <= 60 and value >= 100: return expo_value * 60 cal_value * 90
elif value <= 100 and value >= 150: return expo_value * 100 cal_value * 120
else: return np.nan
df['new_values'] = df[['value', 'expo_value', 'cal_value']].apply(lambda x: func(*x.values.tolist()), axis = 1)