I have dataset with columns user_id and type:
user_id | type | ordinal_number |
---|---|---|
1 | request | 1 |
1 | request | 1 |
1 | request | 1 |
1 | request | 1 |
1 | payment | 1 |
2 | request | 1 |
2 | request | 1 |
2 | payment | 1 |
2 | request | 2 |
2 | payment | 2 |
I want to fill column ordinal_number values likes in tables. If type == payment then assign a ordinal number and fill all previous row (type==request) on user_id values ordinal number.
for some users it may only be request and may be several payments in a row.
CodePudding user response:
You can identify "payments"; groupby
"user_id" and in each group, reverse the Series, find cumsum
, then reverse it back.
def assign_num(x):
s = x[::-1].cumsum()
# must subtract from max value to get an ascending Series
return s.iat[-1] 1 - s[::-1]
df['ordinal_number'] = df['type'].eq('payment').groupby(df['user_id']).apply(assign_num)
Output:
user_id type ordinal_number
0 1 request 1
1 1 request 1
2 1 request 1
3 1 request 1
4 1 payment 1
5 2 request 1
6 2 request 1
7 2 payment 1
8 2 request 2
9 2 payment 2
CodePudding user response:
IIUC, you want to set a counter on "payment" and backfill it per group:
m = df['type'].eq('payment')
df['ordinal_number'] = (m.cumsum().where(m)
.groupby(df['user_id'])
.bfill().astype(int)
)
Or to use the value or "user_id" as start value:
df['ordinal_number'] = (df['user_id'].where(df['type'].eq('payment'))
.groupby(df['user_id'])
.bfill().astype(int)
)
output:
user_id type ordinal_number
0 1 request 1
1 1 request 1
2 1 request 1
3 1 request 1
4 1 payment 1
5 2 request 2
6 2 request 2
7 2 payment 2