Home > Blockchain >  How can I count cumulative variables in pandas, but split the results into different columns?
How can I count cumulative variables in pandas, but split the results into different columns?

Time:02-16

For example, my dataset has a column that specifies a user's sex, I want to create two separate columns that each cumulatively count the occurrence of each sex.

User - Sex 
1      0
2      1
3      1
4      0

I want to turn the above into the table below where males are 0 and females are 1.

User - Sex  - Male Count - Female Count
1      0         0              0
2      1         1              0
3      1         1              1
4      0         1              2

CodePudding user response:

IIUC, you rather have a shifted cumulated count (i.e. count for the previous rows):

s = df['Sex'].eq(0)

df['Male count'] = s.cumsum().shift(fill_value=0)
df['Female count'] = (~s).cumsum().shift(fill_value=0)

output:

   User  Sex  Male count  Female count
0     1    0           0             0
1     2    1           1             0
2     3    1           1             1
3     4    0           1             2
  • Related