Expected result table:
bool count
0 FALSE
1 FALSE
2 TRUE 0
3 FALSE 1
4 FALSE 2
5 FALSE 3
6 TRUE 0
7 FALSE 1
8 TRUE 0
9 TRUE 0
How can I calculate the values of the 'count' column?
CodePudding user response:
Here you go:
import numpy as np
import pandas as pd
# create a boolean DataFrame
df = pd.DataFrame(dict(bool_=[0, 0, 1, 0, 0, 1, 1, 0, 0, 0]), dtype=bool)
df.index = list("abcdefghij")
# build a Series of integers that assigns a group label to each run of rows
# starting at a True value (rows before the first True become NaN)
ix = pd.Series(range(df.shape[0])).where(df.bool_.values, np.nan).ffill().values
# if the first rows are False, they will be NaNs and shouldn't be
# counted so only perform groupby and cumcount() for what is notna
notna = pd.notna(ix)
df["count"] = df[notna].groupby(ix[notna]).cumcount()
>>> df
bool_ count
a False NaN
b False NaN
c True 0.0
d False 1.0
e False 2.0
f True 0.0
g True 0.0
h False 1.0
i False 2.0
j False 3.0
CodePudding user response:
You can try grouping by the cumulative sum of the bool column, then using transform with a custom function that checks whether the first element of each group is True:
df['m'] = df['bool'].cumsum()
df['out'] = (df.groupby(df['bool'].cumsum())
['bool'].transform(lambda col: range(len(col)) if col.iloc[0] else [pd.NA]*len(col)))
print(df)
bool count m out
0 False NaN 0 <NA>
1 False NaN 0 <NA>
2 True 0.0 1 0
3 False 1.0 1 1
4 False 2.0 1 2
5 False 3.0 1 3
6 True 0.0 2 0
7 False 1.0 2 1
8 True 0.0 3 0
9 True 0.0 4 0
CodePudding user response:
I think your question is not clear. We need a little more context and objectives to work with here.
Let's assume that you have a dataframe of Boolean values (True or False) and you wish to count how many True and how many False values it contains:
import pandas as pd
import random
## Randomly generate Boolean values to populate a dataframe
choices = [True, False]
df = pd.DataFrame(index=range(10), columns=['boolean'])
df['boolean'] = df['boolean'].apply(lambda x: random.choice(choices))
Randomly generated data
boolean
0 False
1 False
2 False
3 True
4 False
5 False
6 False
7 True
8 False
9 False
## Reporting the count of True and False values
results = df.groupby('boolean').size()
print(results)
Results
boolean
False 8
True 2
CodePudding user response:
If you want to compute the count without pandas-specific methods, you can use a plain loop:
import numpy as np
result = []
count = np.nan  # stays NaN until the first True is seen
for i in df['bool']:
    if i:
        count = 0  # reset the counter at each True
    else:
        count += 1  # NaN + 1 is still NaN, so leading False rows stay NaN
    result.append(count)
result
Out[4]: [nan, nan, 0, 1, 2, 3, 0, 1, 0, 0]
df['count'] = result
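The same loop can also be wrapped in a small helper so it works on any iterable of booleans. This is only a sketch: the function name count_since_true is made up for illustration, and the sample data is assumed from the question's expected-result table.

```python
import math

def count_since_true(flags):
    """Hypothetical helper: rows since the most recent True, NaN before the first True."""
    result = []
    count = math.nan  # no True seen yet
    for flag in flags:
        if flag:
            count = 0   # reset at every True
        else:
            count += 1  # nan + 1 stays nan, so leading False rows remain NaN
        result.append(count)
    return result

# Sample data assumed from the question's expected-result table
flags = [False, False, True, False, False, False, True, False, True, True]
counts = count_since_true(flags)
print(counts)
```

The result can then be assigned with df['count'] = count_since_true(df['bool']).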
CodePudding user response:
Use GroupBy.cumcount and mask the result with where:
g = df['bool'].cumsum()
df['count'] = df['bool'].groupby(g).cumcount().where(g.gt(0))
Alternative:
g = df['bool'].cumsum()
df['count'] = (df['bool'].groupby(g).cumcount()
.where(df['bool'].cummax())
)
Output:
bool count
0 False NaN
1 False NaN
2 True 0.0
3 False 1.0
4 False 2.0
5 True 0.0
6 True 0.0
7 False 1.0
8 False 2.0
9 False 3.0
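For reference, here is a self-contained sketch of the cumsum plus cumcount approach run against the data from the question's expected-result table (the sample values are assumed from that table, not from this answer's output above):

```python
import pandas as pd

# Sample data assumed from the question's expected-result table
df = pd.DataFrame(
    {"bool": [False, False, True, False, False, False, True, False, True, True]}
)

# Each True starts a new group: the cumulative sum increases by 1 at every True
g = df["bool"].cumsum()

# Number the rows within each group, then blank out rows before the first True
df["count"] = df["bool"].groupby(g).cumcount().where(g.gt(0))
print(df)
```

Rows before the first True fall into group 0, which is why masking on g.gt(0) leaves them as NaN.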
CodePudding user response:
If you mean the sum of all the elements in count then you can do it this way:
Count_Total = df['count'].sum()
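Note that Series.sum skips NaN values by default, so the rows before the first True do not affect the total. A minimal sketch, with the count values assumed from the question's expected-result table:

```python
import numpy as np
import pandas as pd

# Count values assumed from the question's expected-result table
counts = pd.Series([np.nan, np.nan, 0, 1, 2, 3, 0, 1, 0, 0], name="count")

total = counts.sum()                     # skipna=True by default, NaN rows are ignored
total_strict = counts.sum(skipna=False)  # NaN propagates if any value is missing
print(total, total_strict)
```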