Home > Mobile >  Python Pandas DataFrame: create a counter in the column based on start-counting condition
Python Pandas DataFrame: create a counter in the column based on start-counting condition

Time:09-15

I want to program a counter with python which counts the "1-values" in the column "values". The counter has to start counting after the first 0 value in the rows.

As it is seen in the example:

  • the counter value for the first three "1-values" is 0.
  • after having found the first 0 and the next value is 1 the count starts working
  • for each 0 value the counter is set to 0.

This is just an example. In the reality the table has more than three "1-values" at the beginning. Therefore it desired that the code should bee automated.

Values Resulted counter values
1 0
1 0
1 0
0 0
1 1
1 2
1 3
1 4
0 0
1 1
1 2
1 3

CodePudding user response:

You can use groupby.cumcount and a mask with where:

# is the value not a 1?
m = df['Values'].ne(1)
# for each group starting with "not-a-1", cumcount,
# mask the values before the first non-1
df['counter'] = df.groupby(m.cumsum()).cumcount().where(m.cummax(), 0)

output:

    Values  counter
0        1        0
1        1        0
2        1        0
3        0        0
4        1        1
5        1        2
6        1        3
7        1        4
8        0        0
9        1        1
10       1        2
11       1        3

Intermediates:

    Values  counter      m  m.cumsum()  cumcount  m.cummax()
0        1        0  False           0         0       False
1        1        0  False           0         1       False
2        1        0  False           0         2       False
3        0        0   True           1         0        True
4        1        1  False           1         1        True
5        1        2  False           1         2        True
6        1        3  False           1         3        True
7        1        4  False           1         4        True
8        0        0   True           2         0        True
9        1        1  False           2         1        True
10       1        2  False           2         2        True
11       1        3  False           2         3        True

CodePudding user response:

is_one  = df.Values.eq(1)
grouper = is_one.diff().ne(0).cumsum()
counts  = df.Values.groupby(grouper).cumcount().add(1)
places  = is_one & df.Values.eq(0).cummax()

df["new"] = counts.where(places, other=0)
  • Group by consecutive 1s (diff-ne-cumsum trick)
  • get a 1, 2, 3... counter per group (cumcount() 1)
  • put these counters only where Value is 1 and we're after first 0
    • after first 0 is via cumulative maximum of being 0 because once True, then always True
In [71]: df
Out[71]:
    Values  Resulted counter values  new
0        1                        0    0
1        1                        0    0
2        1                        0    0
3        0                        0    0
4        1                        1    1
5        1                        2    2
6        1                        3    3
7        1                        4    4
8        0                        0    0
9        1                        1    1
10       1                        2    2
11       1                        3    3
  • Related