I have the following dataframe:
Steps
0 False
1 False
2 True
3 True
4 True
5 False
6 True
7 True
8 False
9 False
10 False
11 True
I would like to number the True
sequences in an additional column:
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
Filling the rows containing False
is secondary. Do you have any ideas?
CodePudding user response:
You can use the values that are both True and different from the previous one (using diff
) to initiate a cumsum
:
df['Numbered'] = (df['Steps']&df['Steps'].diff()).cumsum().where(df['Steps'], 0)
output:
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
CodePudding user response:
Chain shifted values by Series.shift
by &
for bitwise AND
for counter by first True
s and then set False
rows to 0
by Series.where
:
df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
.where(df['Steps'], 0))
print (df)
Steps Numbered
0 False 0
1 False 0
2 True 1
3 True 1
4 True 1
5 False 0
6 True 2
7 True 2
8 False 0
9 False 0
10 False 0
11 True 3
Solution working well if first value is True
:
df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
.where(df['Steps'], 0))
print (df)
Steps Numbered
0 True 1
1 False 0
2 True 2
3 True 2
4 True 2
5 False 0
6 True 3
7 True 3
8 False 0
9 False 0
10 False 0
11 True 4