Pandas: how to assign a label to each group of values?

Time:10-15

Say that I have a df like this:

   Value
0  True 
1  True
2  False
3  False
4  False
5  True
6  True
7  False
8  True
9  True

And say that I want to assign a label to each group of True values: consecutive True values share the same label because they form a cluster, whereas False values always get 0:

   Value  Label
0  True   1
1  True   1
2  False  0
3  False  0
4  False  0
5  True   2
6  True   2
7  False  0
8  True   3
9  True   3

How could I do this in pandas?

CodePudding user response:

Try this:

>>> df['Label'] = df[df['Value']].index.to_series().diff().ne(1).cumsum()
>>> df
   Value  Label
0   True    1.0
1   True    1.0
2  False    NaN
3  False    NaN
4  False    NaN
5   True    2.0
6   True    2.0
7  False    NaN
8   True    3.0
9   True    3.0
>>> 
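
The output above leaves NaN in the False rows and stores the labels as floats, whereas the question asks for 0 there. A minimal sketch of one way to close that gap, assuming the default integer index shown in the question (the diff-on-index trick relies on consecutive index values); the name `labels` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Value': [True, True, False, False, False,
                             True, True, False, True, True]})

# Index values of the True rows; a step other than 1 starts a new cluster.
labels = df[df['Value']].index.to_series().diff().ne(1).cumsum()

# Align back to the full index, put 0 in the False rows, keep integers.
df['Label'] = labels.reindex(df.index, fill_value=0).astype(int)
print(df)
```

Reindexing with `fill_value=0` avoids the intermediate NaN values, so the column can stay an integer column.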

CodePudding user response:

Here is another approach that is fully independent of the index:

m = df['Value']
df['Label'] = m.ne(m.shift()).cumsum().where(m)//2 + df['Value'].iloc[0]

Explanation: start a new group whenever successive values differ, keep only the True groups, and divide the group number by two because True and False groups alternate. Finally, add the first value (True counts as 1) to correct the numbering depending on whether the first item is False or True. The False rows are left as NaN.
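
To make the steps in that explanation easier to follow, here is a sketch that spells them out one at a time and also fills the False rows with 0, as the question asks. The function name `label_true_runs` is hypothetical, and the sketch assumes a non-empty boolean Series:

```python
import pandas as pd

def label_true_runs(m: pd.Series) -> pd.Series:
    """Number each run of consecutive True values 1, 2, 3, ...; False rows get 0."""
    m = m.astype(bool)
    # A new run starts wherever the value differs from the previous row.
    run_id = m.ne(m.shift()).cumsum()
    # Keep run ids only on the True rows (False rows become NaN).
    true_runs = run_id.where(m)
    # True and False runs alternate, so halve the id, then add 1 when the
    # Series starts with True (otherwise the first True run would be 0).
    labels = true_runs // 2 + int(m.iloc[0])
    # Replace the NaN of the False rows with 0 and return integers.
    return labels.fillna(0).astype(int)

df = pd.DataFrame({'Value': [True, True, False, False, False,
                             True, True, False, True, True]})
df['Label'] = label_true_runs(df['Value'])
print(df)
```

Because it never looks at the index, this version should give the same labels after the rows are re-indexed or the index is not a plain 0..n-1 range.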
