I have a DataFrame with Employees and their hours for different categories.
I need to recalculate only specific categories (OT, MILE and REST Categories SHOULD NOT Be Updated, ALL Other Should be updated) ONLY if OT category is present under Empl_Id.
data = {'Empl_Id': [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3],
'Category': ["MILE", "REST", "OT", "TRVL", "REG", "ADMIN", "REST", "REG", "MILE", "OT", "TRVL", "REST", "MAT", "REG"],
'Value': [43, 0.7, 6.33, 2.67, 52, 22, 1.17, 16.5, 73.6, 4.75, 1.33, 2.5, 5.5, 52.25]}
df = pd.DataFrame(data=data)
df
Empl_Id | Category | Value |
---|---|---|
1 | MILE | 43 |
1 | REST | 0.7 |
1 | OT | 6.33 |
1 | TRVL | 2.67 |
1 | REG | 52 |
2 | ADMIN | 22 |
2 | REST | 1.17 |
2 | REG | 16.5 |
3 | MILE | 73.6 |
3 | OT | 4.75 |
3 | TRVL | 1.33 |
3 | REST | 2.5 |
3 | MAT | 5.5 |
3 | REG | 52.25 |
The Logic is to:
1) Find % of OT Hours from Total Hours (OT, REST and MILE don't count):
- 1st Empl_Id: 6.33 (OT) / 2.67 (TRVL) 52 (REG) = 6.33 / 54.67 = 11.58 %
- 2nd Empl_Id: OT Hours Not present, nothing should be updated
- 3rd Empl_Id: 4.75 (OT) / 1.33 (TRVL) 5.5 (MAT) 52.25 (REG) = 4.75 / 59.08 = 8.04 %
2) Substract % of OT from each category (OT, REST and MILE don't count):
Empl_Id | Category | Value |
---|---|---|
1 | MILE | 43 |
1 | REST | 0.7 |
1 | OT | 6.33 |
1 | TRVL | 2.67 - 11.58 % (0.31) = 2.36 |
1 | REG | 52 - 11.58 % (6.02) = 45.98 |
2 | ADMIN | 22 |
2 | REST | 1.17 |
2 | REG | 16.5 |
3 | MILE | 73.6 |
3 | OT | 4.75 |
3 | TRVL | 1.33 - 8.04 % (0.11) = 1.22 |
3 | REST | 2.5 |
3 | MAT | 5.5 - 8.04 % (0.44) = 5.06 |
3 | REG | 52.25 - 8.04 % (4.2) = 48.05 |
CodePudding user response:
You can use:
keep = ['OT', 'MILE', 'REST']
# get factor
factor = (df.groupby(df['Empl_Id'])
.apply(lambda g: g.loc[g['Category'].eq('OT'),'Value'].sum()
/g.loc[~g['Category'].isin(keep),'Value'].sum()
)
.rsub(1)
)
# update
df.loc[~df['Category'].isin(keep), 'Value'] *= df['Empl_Id'].map(factor)
output:
Empl_Id Category Value
0 1 MILE 43.000000
1 1 REST 0.700000
2 1 OT 6.330000
3 1 TRVL 2.360852
4 1 REG 45.979148
5 2 ADMIN 22.000000
6 2 REST 1.170000
7 2 REG 16.500000
8 3 MILE 73.600000
9 3 OT 1.750000
10 3 TRVL 1.290604
11 3 REST 2.500000
12 3 MAT 5.337085
13 3 REG 50.702310