I want to count the number of occurrences of one specific value (string) in one column and write it down in another column cumulatively.
For example, counting the cumulative number of Y
values here:
col_1 new_col
Y 1
Y 2
N 2
Y 3
N 3
I wrote this code but it gives me the final number instead of cumulative frequencies.
df['new_col'] = 0
df['new_col'] = df.loc[df.col_1 == 'Y'].count()
CodePudding user response:
To count both values cumulatively you can use:
df['new_col'] = (df
.groupby('col_1')
.cumcount().add(1)
.cummax()
)
If you want to focus on 'Y':
df['new_col'] = (df
.groupby('col_1')
.cumcount().add(1)
.where(df['col_1'].eq('Y'))
.ffill()
.fillna(0, downcast='infer')
)
Output:
col_1 new_col
0 Y 1
1 Y 2
2 N 2
3 Y 3
4 N 3