In the following dataframe, I want to create a sequence number based on the values in column 0:
0
0 A
1 A
2 A
3 D
4 D
5 A
6 D
7 A
8 D
9 A
10 D
I want to produce
0 1
0 A 1
1 A 1
2 A 1
3 D 2
4 D 2
5 A 3
6 D 4
7 A 5
8 D 6
9 A 7
10 D 8
The solution I tried:
diff = d[0].ne(d[0].shift())
d['seqn']=diff.groupby([d[0]]).cumsum()
That is, I want to assign a value that stays the same until a new value appears in the groupby
column. For example, if the first three rows are 'A', they all get the value 1; when 'D' appears next, those rows get the value 2.
CodePudding user response:
Try:
df[1] = (df[0] != df[0].shift()).cumsum()
print(df)
Prints:
0 1
0 A 1
1 A 1
2 A 1
3 D 2
4 D 2
5 A 3
6 D 4
7 A 5
8 D 6
9 A 7
10 D 8
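To check this end to end, here is a minimal self-contained sketch. It assumes the frame has a single column labeled with the integer 0, as the printout in the question suggests. The names `starts` and `per_value` are made up for illustration. The last step mimics the grouped attempt from the question (with an explicit int cast added) to show why it falls short: grouping by the value restarts the count separately for 'A' and for 'D'.

```python
import pandas as pd

# Rebuild the frame from the question; the column label 0 is an assumption
# taken from the printed header.
df = pd.DataFrame({0: list("AAADDADADAD")})

# True wherever a row differs from the row above it. The first row is
# compared against NaN, so it also counts as the start of a run.
starts = df[0] != df[0].shift()

# Running total over the whole column: every consecutive run gets its own number.
df[1] = starts.cumsum()

# For comparison only: the same flags summed within each letter group.
# The counter now runs separately for 'A' and for 'D'.
per_value = starts.astype(int).groupby(df[0]).cumsum()

print(df)
print(per_value.tolist())
```

The plain `cumsum()` works because the sequence number should advance on every change along the column, regardless of which value the run holds, so no grouping is needed.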