I´m stupid and new to everything,
I want to create a new column with values that depends on the cumulative sum with a condition from an existing column and when the condition is reached the cumulative sum should start over from zero on the existing line.
For example, from this existing dataframe I want to create a new column that "tags" each line with the same tag depending on that the cumulative sum of the line with previous rows is less or equal to 80 (condition). When the condition is met the counter should start over from zero again.:
Ops | Values |
---|---|
First | 20 |
Second | 10 |
Third | 35 |
Fourth | 15 |
Fith | 70 |
Sixth | 15 |
Seventh | 32 |
Eighth | 32 |
Ninth | 32 |
... | ... |
I want to acheive this:
Ops | Values | New column |
---|---|---|
First | 20 | Stn 1 |
Second | 10 | Stn 1 |
Third | 35 | Stn 1 |
Fourth | 15 | Stn 1 |
Fith | 70 | Stn 2 |
Sixth | 15 | Stn 3 |
Seventh | 65 | Stn 3 |
Eighth | 80 | Stn 4 |
Ninth | 32 | Stn 5 |
... | ... | ... |
Unfortunately, I have no idea how to do this and I didn't find the answer somewhere else. Could someone give me a hint?
CodePudding user response:
You can divide by 80
and round up to get group.
df['New column'] = 'Stn ' np.ceil(df['Values'].cumsum().div(80)).astype(int).astype(str)
print(df)
Ops Values New column
0 First 20 Stn 1
1 Second 10 Stn 1
2 Third 35 Stn 1
3 Fourth 15 Stn 1
4 Fith 70 Stn 2
5 Sixth 15 Stn 3
6 Seventh 65 Stn 3
7 Eighth 80 Stn 4
8 Ninth 32 Stn 5