I have the following column in my pandas dataframe named as FailureLabel
ID FailureLabel
0 1 1
1 2 1
2 3 1
3 4 0
4 5 0
5 6 0
6 7 0
7 8 1
8 9 1
9 10 0
10 11 0
11 12 1
12 13 1
I would like to assign a unique_id
to this column such that eachs 1's have a unique id whereas all zeros the next one have a common "unique id".
I tried using the following code ,
df['unique_id'] = (df['FailureLabel'] | (df['FailureLabel']!=df['FailureLabel'].shift())).cumsum()
which gives me the following output,
ID FailureLabel unique_id
0 1 1 1
1 2 1 2
2 3 1 3
3 4 0 4
4 5 0 4
5 6 0 4
6 7 0 4
7 8 1 5
8 9 1 6
9 10 0 7
10 11 0 7
11 12 1 8
12 13 1 9
But what I desire is,
ID FailureLabel unique_id
0 1 1 1
1 2 1 2
2 3 1 3
3 4 0 4
4 5 0 4
5 6 0 4
6 7 0 4
7 8 1 4
8 9 1 5
9 10 0 6
10 11 0 6
11 12 1 6
12 13 1 7
CodePudding user response:
Use Series.shift
with backfilling first value, compare by 1
and add cumulative sum:
df['unique_id'] = df['FailureLabel'].shift().bfill().eq(1).cumsum()
print (df)
ID FailureLabel unique_id
0 1 1 1
1 2 1 2
2 3 1 3
3 4 0 4
4 5 0 4
5 6 0 4
6 7 0 4
7 8 1 4
8 9 1 5
9 10 0 6
10 11 0 6
11 12 1 6
12 13 1 7