I have dataframe where i'm trying to create a new column showing the value with conditional groupby.
conditions:
- tag == 1, profit - cost
- tag > 1, -(cost)
- net is summed after every iteration
original df:
╔══════╦═════╦══════╦════════╗ ║ rep ║ tag ║ cost ║ profit ║ ╠══════╬═════╬══════╬════════╣ ║ john ║ 1 ║ 1 ║ 5 ║ ║ pete ║ 2 ║ 1 ║ 3 ║ ║ pete ║ 3 ║ 1 ║ 4 ║ ║ pete ║ 4 ║ 1 ║ 5 ║ ║ john ║ 5 ║ 1 ║ 7 ║ ║ john ║ 1 ║ 1 ║ 9 ║ ║ pete ║ 1 ║ 1 ║ 3 ║ ║ john ║ 3 ║ 1 ║ 5 ║ ╚══════╩═════╩══════╩════════╝
output hope to get:
╔══════╦═════╦══════╦════════╦═════╗ ║ rep ║ tag ║ cost ║ profit ║ net ║ ╠══════╬═════╬══════╬════════╬═════╣ ║ john ║ 1 ║ 1 ║ 5 ║ 4 ║ ║ pete ║ 2 ║ 1 ║ 3 ║ -1 ║ ║ pete ║ 3 ║ 1 ║ 4 ║ -2 ║ ║ pete ║ 4 ║ 1 ║ 5 ║ -3 ║ ║ john ║ 5 ║ 1 ║ 7 ║ 3 ║ ║ john ║ 1 ║ 1 ║ 9 ║ 11 ║ ║ pete ║ 1 ║ 1 ║ 3 ║ -1 ║ ║ john ║ 3 ║ 1 ║ 5 ║ 15 ║ ╚══════╩═════╩══════╩════════╩═════╝
I'm able to use loc to derives 'if' conditions, however, i'm not able to figure out or find examples of groupby/if/cumsum for this.
sample df code:
data = {
"rep": ['john','pete','pete','pete','john','john','pete','john'],
"tag": [1,2,3,4,5,1,1,3],
"cost": [1,1,1,1,1,1,1,1],
"profit": [5,3,4,5,7,9,3,5]}
df = pd.DataFrame(data)
kindly advise
CodePudding user response:
cond1 = df['tag'].eq(1)
s = df['cost'].mul(-1).mask(cond1, df['profit'] - df['cost'])
s
0 4
1 -1
2 -1
3 -1
4 -1
5 8
6 2
7 -1
Name: cost, dtype: int64
df['net'] = s.groupby(df['rep']).cumsum()
df
rep tag cost profit net
0 john 1 1 5 4
1 pete 2 1 3 -1
2 pete 3 1 4 -2
3 pete 4 1 5 -3
4 john 5 1 7 3
5 john 1 1 9 11
6 pete 1 1 3 -1
7 john 3 1 5 10
Other Way
if you want 1-line code use following
out = (df.assign(net=df['cost'].mul(-1)
.mask(df['tag'].eq(1), df['profit'].sub(df['cost']))
.groupby(df['rep']).cumsum()))
same result