I'm trying to calculate the Bonus Pay given to our Courier's based on their Success Rate
which is based off the number of Eligible
orders they've successfully delivered.
Here's my code:
from openpyxl import load_workbook
import pandas as pd
df = pd.read_excel(r'path\filename.xlsx')
df['DeliveredAt'] = pd.to_datetime(df['DeliveredAt'].astype(str))
df['Date'] = df['DeliveredAt'].dt.strftime('%d/%m/%y')
df['Total Orders'] = df['OrderNumber']
df['Total Orders'] = 1
df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])
df['Success Rate'] = (df['Eligible']/df['Total Orders'])*100
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
per_day = df.groupby(['Date', 'Courier']).agg({'Success Rate': 'mean', 'Total Orders': 'count', 'Eligible': 'sum', 'Incentive': 'sum'})
per_courier = df.groupby('Courier').agg({'Total Orders': 'count', 'Incentive': 'sum'})
I'm having issues with
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
It seems to ignore the condition that Success Rate
has to be over 95 for it to qualify for a Bonus. Right now, each row has a Bonus pay which is calculated from the number of Eligible
* 1.2 regardless of the success rate value.
Sample Data:
data = {'ID': [1, 1, 1, 2, 2, 3, 4, 5, 5],
'DeliveryOnTime': ["On-time", "Late", "Early", "On-time", "On-time", "Late", "Early", "Early", "Late"],
}
df = pd.DataFrame(data)
#For the sake of example data, the count of `DeliveryOnTime` will be the total number of orders.
df['Total Orders'] = df['DeliveryOnTime'].count()
df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])
df['Success Rate'] = (df['Eligible']/df['Total Orders'])*100
df['Bonus'] = df['Eligible'].mul(1.2).where(df['Success Rate'] >= 95)
Current Output
ID Eligible Total Orders Success Rate Bonus
1 2 3 66.66 2.4
2 2 2 100 2.4
3 0 1 0 0
4 1 1 100 1.2
5 1 2 50 1.2
This is wrong as ID 1 and 5 should not have any Bonus
at all with the Success Rate
being lower than 95.
Expected Output
ID Eligible Total Orders Success Rate Bonus
1 2 3 66.66 0
2 2 2 100 2.4
3 0 1 0 0
4 1 1 100 1.2
5 1 2 50 0
I'm not sure if it's something I have to change in the Per_day
or Per_courier
section of the .groupby()
as that seems to be where most of my issues have occurred when writing this code.
Thanks
CodePudding user response:
As per what I understood, you need to find Bonus based on the success rate of the delivery boy. For that first you need to filter out with the eligible one.
Considering your own example
data = {'ID': [1, 1, 1, 2, 2, 3, 4, 5, 5],
'DeliveryOnTime': ["On-time", "Late", "Early", "On-time", "On-time", "Late", "Early", "Early", "Late"],
}
df = pd.DataFrame(data)
df['Eligible'] = df['DeliveryOnTime'].isin(["On-time", "Early"])
df['Total Orders'] = df['DeliveryOnTime'].count()
df.replace([True,False], [1,0],inplace=True)
per_courier_eligible = df.groupby('ID').agg({'Eligible': 'sum','Total Orders':'count'})
per_courier_eligible['Success Rate'] = per_courier_eligible['Eligible']/per_courier_eligible['Total Orders']
per_courier_eligible['Bonus'] = per_courier_eligible['Eligible']
per_courier_eligible['Bonus'].loc[(per_courier_eligible['Success Rate'] >= 0.95)] = per_courier_eligible['Eligible'] 1.2
print(per_courier_eligible)
CodePudding user response:
Is the problem with the order of .mul and .where?
This worked when I tried to reproduce:
df['Bonus'] = df['Eligible'].where(df['Success Rate'] >= 95).mul(1.2)
CodePudding user response:
I used an element of what @Rupal Shah said by filtering it per ID for the Success Rate
and Bonus
pay to work.
All I had to do was create a .groupby()
:
ID = df.groupby(['Urber', 'Date']).agg({'Total Orders':'sum','Eligible': 'sum'})
After this, I changed any following syntaxes with calculations from df.
to ID.
e.g.
ID['Success Rate'] = (ID['Eligible']/ID['Total Orders'])*100