I have a dataframe looks like,
ID | SWITCH |
---|---|
A | ON |
A | ON |
A | ON |
A | OFF |
A | OFF |
A | OFF |
A | ON |
A | ON |
A | ON |
... | ... |
B | ON |
B | ON |
B | ON |
B | OFF |
B | OFF |
B | OFF |
B | ON |
B | ON |
B | ON |
Column['SWITCH'] is cyclical data and I'd like to count the number of ON and OFF for each cycle like this,
ID | SWITCH | Cum. Count |
---|---|---|
A | ON | 1 |
A | ON | 2 |
A | ON | 3 |
A | OFF | 1 |
A | OFF | 2 |
A | OFF | 3 |
A | ON | 1 |
A | ON | 2 |
A | ON | 3 |
... | ... | |
B | ON | 1 |
B | ON | 2 |
B | OFF | 1 |
B | OFF | 2 |
B | OFF | 3 |
B | ON | 1 |
B | ON | 2 |
B | ON | 3 |
I'd tried 'cumsum' or 'cumcount' but it didn't reset the count when the next 'ON' cycle has come. (it keeps counting on the number from the previous cycle)
What can I do?
Thanks in advanced!
CodePudding user response:
Try put in the cumsum of the difference as well:
switch_blocsk = df['SWITCH'].ne(df['SWITCH'].shift()).cumsum()
df['cum.count'] = df.groupby(['ID', switch_blocks]).cumcount().add(1)
CodePudding user response:
You need to create a new column which indicates the change in the 'SWITCH' column, then you can use 'groupby' to perform the cumulative count.
import pandas as pd
# Create sample data
df = pd.DataFrame({'ID': ['A'] * 9 ['B'] * 9,
'SWITCH': ['ON'] * 3 ['OFF'] * 3 ['ON'] * 3 ['ON'] * 3 ['OFF'] * 3 ['OFF'] * 3})
df['SWITCH_CHANGE'] = (df['SWITCH'] != df['SWITCH'].shift()).astype(int)
df['Cum. Count'] = df.groupby(['ID', df.SWITCH_CHANGE.cumsum()])['SWITCH'].cumcount() 1
print(df)
Result:
ID SWITCH | SWITCH_CHANGE | Cum. Count | ||
---|---|---|---|---|
0 | A | ON | 1 | 1 |
1 | A | ON | 0 | 2 |
2 | A | ON | 0 | 3 |
3 | A | OFF | 1 | 1 |
4 | A | OFF | 0 | 2 |
5 | A | OFF | 0 | 3 |
6 | A | ON | 1 | 1 |
7 | A | ON | 0 | 2 |
8 | A | ON | 0 | 3 |
9 | B | ON | 0 | 1 |
10 | B | ON | 0 | 2 |
11 | B | ON | 0 | 3 |
12 | B | OFF | 1 | 1 |
13 | B | OFF | 0 | 2 |
14 | B | OFF | 0 | 3 |
15 | B | OFF | 0 | 4 |
16 | B | OFF | 0 | 5 |
17 | B | OFF | 0 | 6 |