Condition is true to start counting, until the next row is true to restart counting


Expected result table:

    bool    count
0   FALSE   
1   FALSE   
2   TRUE    0
3   FALSE   1
4   FALSE   2
5   FALSE   3
6   TRUE    0
7   FALSE   1
8   TRUE    0
9   TRUE    0

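How can I calculate the values of the 'count' column? The count should start at 0 on each row where bool is TRUE, increase by 1 on every following FALSE row, and restart at 0 at the next TRUE. Rows before the first TRUE stay empty.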

CodePudding user response:

Here you go:

import numpy as np
import pandas as pd

# create bool dataframe
df = pd.DataFrame(dict(bool_=[0, 0, 1, 0, 0, 1, 1, 0, 0, 0]), dtype=bool)
df.index = list("abcdefghij")

# create a Series of integer labels that assign each row to the group
# started by the most recent True value (NaN before the first True)
ix = pd.Series(range(df.shape[0])).where(df.bool_.values, np.nan).ffill().values

# if the first rows are False, they will be NaNs and shouldn't be 
# counted so only perform groupby and cumcount() for what is notna
notna = pd.notna(ix)
df["count"] = df[notna].groupby(ix[notna]).cumcount()

>>> df   
   bool_  count
a  False    NaN
b  False    NaN
c   True    0.0
d  False    1.0
e  False    2.0
f   True    0.0
g   True    0.0
h  False    1.0
i  False    2.0
j  False    3.0
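
To make the grouping step above easier to follow, here is a minimal sketch that prints only the intermediate labels for the same ten flags. The names `flags` and `labels` are illustrative and not part of the answer's code.

```python
import numpy as np
import pandas as pd

# Same flags as in the answer above.
flags = np.array([0, 0, 1, 0, 0, 1, 1, 0, 0, 0], dtype=bool)

# Keep the row position only where the flag is True, then forward-fill it,
# so every row carries the position of the most recent True row.
labels = pd.Series(range(len(flags))).where(flags).ffill()

print(labels.tolist())
# [nan, nan, 2.0, 2.0, 2.0, 5.0, 6.0, 6.0, 6.0, 6.0]
```

Rows that share a label belong to the same run, so numbering the rows within each label gives the count, and the leading NaN labels mark the rows that must stay empty.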

CodePudding user response:

You can group by the cumulative sum of the bool column, then use transform with a custom function that checks whether the first element of each group is True:

df['m'] = df['bool'].cumsum()
df['out'] = (df.groupby(df['bool'].cumsum())
             ['bool'].transform(lambda col: range(len(col)) if col.iloc[0] else [pd.NA]*len(col)))

    bool  count  m   out
0  False    NaN  0  <NA>
1  False    NaN  0  <NA>
2   True    0.0  1     0
3  False    1.0  1     1
4  False    2.0  1     2
5  False    3.0  1     3
6   True    0.0  2     0
7  False    1.0  2     1
8   True    0.0  3     0
9   True    0.0  4     0

CodePudding user response:

I think your question is not clear. We need a little more context and objectives to work with here.

Let's assume that you have a dataframe of Boolean values [True, False] and you wish to count how many values are True and how many are False:

import pandas as pd
import random

## Randomly generate Boolean values to populate a dataframe
choices = [True, False]
df = pd.DataFrame(index=range(10), columns=['boolean'])
df['boolean'] = df['boolean'].apply(lambda x: random.choice(choices))

Randomly generated data

0   False
1   False
2   False
3    True
4   False
5   False
6   False
7    True
8   False
9   False

## Report the count of True and False values
results = df.groupby('boolean').size()


False    8
True     2

CodePudding user response:

If you want to compute the count without pandas-specific methods, you can use a plain loop:

result = []
count = np.nan  # stays NaN until the first True is seen
for i in df['bool']:
    if i:
        count = 0
    else:
        count += 1  # NaN + 1 is still NaN
    result.append(count)

Out[4]: [nan, nan, 0, 1, 2, 3, 0, 1, 0, 0]

df['count'] = result
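
The row-by-row idea can also be written as a small function that needs neither pandas nor NumPy. This is only a sketch: `count_since_true` is a hypothetical name, and `None` stands in for NaN on the rows before the first True. The flags are copied from the question's table.

```python
def count_since_true(flags):
    """For each flag, return the number of rows since the last True.

    Rows before the first True get None.
    """
    result = []
    count = None  # nothing to count until the first True appears
    for flag in flags:
        if flag:
            count = 0  # a True row restarts the count
        elif count is not None:
            count += 1  # a False row after a True extends the run
        result.append(count)
    return result


flags = [False, False, True, False, False, False, True, False, True, True]
counts = count_since_true(flags)
print(counts)
# [None, None, 0, 1, 2, 3, 0, 1, 0, 0]
```

The returned list can be assigned to a dataframe column in the same way as `result` above.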

CodePudding user response:

Use GroupBy.cumcount on the cumulative sum of the flags, then mask the rows before the first True with where:

g = df['bool'].cumsum()
df['count'] = df['bool'].groupby(g).cumcount().where(g.gt(0))


    bool  count
0  False    NaN
1  False    NaN
2   True    0.0
3  False    1.0
4  False    2.0
5   True    0.0
6   True    0.0
7  False    1.0
8  False    2.0
9  False    3.0
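
For reference, here is a self-contained sketch of the cumsum and cumcount approach, run on the flags from the question's table rather than on the data shown in the output above. The name `group_id` and the final conversion to the nullable `Int64` dtype are additions for illustration and are not part of the answer.

```python
import pandas as pd

# Flags copied from the question's expected-result table.
df = pd.DataFrame(
    {"bool": [False, False, True, False, False, False, True, False, True, True]}
)

# Every True increases the running sum, so each True row opens a new group.
group_id = df["bool"].cumsum()

# Number the rows inside each group, then blank out group 0,
# which holds the rows before the first True.
count = df.groupby(group_id).cumcount().where(group_id.gt(0))

# Optional: store nullable integers instead of floats with NaN.
df["count"] = count.astype("Int64")
print(df)
```

Without the `astype("Int64")` step the column is float, because NaN cannot be stored in a plain integer column.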

CodePudding user response:

If you mean the sum of all the elements in count then you can do it this way:

Count_Total = df['count'].sum()