How can we count items greater than a value and less than a value?

I have this DF.

import pandas as pd
import scipy.optimize as sco 

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

data = [['ATTICA',1,2,590,680],['ATTICA',1,2,800,1080],['AVON',14,2,950,1250],['AVON',15,3,500,870],['AVON',20,4,1350,1700]]
df = pd.DataFrame(data, columns=['cities','min_workers','max_workers','min_minutes','max_minutes'])
df

df['Non_HT_Outages'] = (df['min_workers'] < 15).groupby(df['cities']).transform('count')
df['HT_Outages'] = (df['min_workers'] >= 15).groupby(df['cities']).transform('count')
df

I am trying to count, for each city, the rows where 'min_workers' is below 15 and put that count into the column 'Non_HT_Outages', and the rows where it is 15 or more and put that count into the column 'HT_Outages'. My counts seem to be off.

CodePudding user response:

If you change 'count' to 'sum' in both transform calls, it will work.

This is because (df['min_workers'] < 15).groupby(df['cities']) groups a boolean Series by city, with True entries where the condition is met and False entries where it is not.

count returns the number of non-null entries in each group, True and False alike, so it just gives the group size here. sum counts only the True entries, because True and False are treated as 1 and 0 when summed.
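
To make the difference concrete, here is a minimal sketch using the DataFrame from the question. The names is_ht and count_result are introduced here for illustration only, and the cast to int is an assumption made so the summed result is an integer column regardless of pandas version.

```python
import pandas as pd

data = [['ATTICA', 1, 2, 590, 680], ['ATTICA', 1, 2, 800, 1080],
        ['AVON', 14, 2, 950, 1250], ['AVON', 15, 3, 500, 870],
        ['AVON', 20, 4, 1350, 1700]]
df = pd.DataFrame(data, columns=['cities', 'min_workers', 'max_workers',
                                 'min_minutes', 'max_minutes'])

# Boolean mask: True where the row has 15 or more workers.
is_ht = df['min_workers'] >= 15

# 'count' ignores the True/False values and returns the group size.
count_result = is_ht.groupby(df['cities']).transform('count')

# 'sum' adds the mask as 1s and 0s, so only the True rows are counted.
df['Non_HT_Outages'] = (~is_ht).astype(int).groupby(df['cities']).transform('sum')
df['HT_Outages'] = is_ht.astype(int).groupby(df['cities']).transform('sum')

print(df[['cities', 'min_workers', 'Non_HT_Outages', 'HT_Outages']])
```

With this data, ATTICA has two rows below 15 and none at or above it, while AVON has one row below 15 and two at or above it, so each row carries its city's totals.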