We have data representing a worker's billing history: payments and penalties recorded after work shifts. Sometimes a penalty is wrong, because the worker had technical problems with the mobile app and in reality attended the job. The penalty is later refunded in a row with the description 'balance_correction'. The goal is to show the n rows that precede the correction, to find the pattern behind the penalty. Here is the data:
import pandas as pd
d = {'balance_id': [70775, 70775, 70775, 70775, 70775],
     'amount': [2500, 2450, -500, 500, 2700],
     'description': ['payment_for_job_order_080ecd', 'payment_for_job_order_180eca',
                     'penalty_for_being_absent_at_job', 'balance_correction',
                     'payment_for_job_order_120ecq']}
df1 = pd.DataFrame(data=d)
df1
   balance_id  amount  description
0       70775    2500  payment_for_job_order_080ecd
1       70775    2450  payment_for_job_order_180eca
2       70775    -500  penalty_for_being_absent_at_job
3       70775     500  balance_correction
4       70775    2700  payment_for_job_order_120ecq
I tried this:
df1.loc[df1['description']=='balance_correction'].iloc[:-2]
and get an empty DataFrame. Using shift doesn't help either. If we need to show 2 rows, the result should be:
   balance_id  amount  description
1       70775    2450  payment_for_job_order_180eca
2       70775    -500  penalty_for_being_absent_at_job
What can solve the issue?
CodePudding user response:
If the index on your data frame is sequential (0, 1, 2, 3, ...), you can filter by the index:
idx = df1.loc[df1['description'] == 'balance_correction'].index
df1.loc[(idx - 2).append(idx - 1)]
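If you need an arbitrary number of rows, or the index is not sequential, the same idea can be expressed with positions instead of index labels. The following is only a sketch: the helper name `rows_before` and its parameters are made up for illustration, and it assumes the rows are already in chronological order and belong to a single `balance_id`.

```python
import numpy as np
import pandas as pd


def rows_before(df, n, label="balance_correction", column="description"):
    """Return the n rows that immediately precede each row where column == label.

    Hypothetical helper, not part of pandas. Works on row positions, so it
    does not depend on the index being 0, 1, 2, ...
    """
    # 0-based positions of the matching rows.
    hits = np.flatnonzero((df[column] == label).to_numpy())
    # For each hit, the n preceding positions: hit - n, ..., hit - 1.
    offsets = np.arange(-n, 0)
    positions = (hits[:, None] + offsets).ravel()
    # Discard positions before the start of the frame; np.unique also
    # removes duplicates and sorts, which keeps the original row order.
    positions = np.unique(positions[positions >= 0])
    return df.iloc[positions]


d = {'balance_id': [70775, 70775, 70775, 70775, 70775],
     'amount': [2500, 2450, -500, 500, 2700],
     'description': ['payment_for_job_order_080ecd', 'payment_for_job_order_180eca',
                     'penalty_for_being_absent_at_job', 'balance_correction',
                     'payment_for_job_order_120ecq']}
df1 = pd.DataFrame(data=d)

result = rows_before(df1, 2)
print(result)
```

For the sample data this selects the rows at index 1 and 2. If the frame holds several balances, one option would be to apply the helper per group, e.g. via `df.groupby('balance_id')`, so that the preceding rows never cross from one worker to another.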