For example, my dataset has a column that specifies a user's sex, I want to create two separate columns that each cumulatively count the occurrence of each sex.
User - Sex
1 0
2 1
3 1
4 0
I want to turn the above into the table below where males are 0 and females are 1.
User - Sex - Male Count - Female Count
1 0 0 0
2 1 1 0
3 1 1 1
4 0 1 2
CodePudding user response:
IIUC, you rather have a shifted cumulated count (i.e. count for the previous rows):
s = df['Sex'].eq(0)
df['Male count'] = s.cumsum().shift(fill_value=0)
df['Female count'] = (~s).cumsum().shift(fill_value=0)
output:
User Sex Male count Female count
0 1 0 0 0
1 2 1 1 0
2 3 1 1 1
3 4 0 1 2