I have a pandas DataFrame of the following format:
MATERIAL DATE HIGH LOW
AAA 2022-01-01 10 0
AAA 2022-01-02 0 0
AAA 2022-01-03 5 2
BBB 2022-01-01 0 0
BBB 2022-01-02 10 5
BBB 2022-01-03 8 4
I am looking to transform it so that I end up with the result below:
MATERIAL HIGH_COUNT LOW_COUNT
AAA 2 1
BBB 2 2
Essentially, for "HIGH_COUNT" and "LOW_COUNT" I want to count the number of rows in which the corresponding column ("HIGH" or "LOW") was greater than 0, grouped by "MATERIAL".
I have tried df.groupby(['MATERIAL']).agg<xxx> but I am unsure which agg function to use here.
Edit:
I used
df.groupby(['MATERIAL']).agg({'HIGH':'count', 'LOW':'count'})
but this also counts the rows where the value is 0.
CodePudding user response:
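You could create a boolean DataFrame (True where the value is greater than 0) and then use groupby with sum, since summing booleans counts the True values: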
out = df[['HIGH', 'LOW']].gt(0).groupby(df['MATERIAL']).sum().add_suffix('_COUNT').reset_index()
Output:
MATERIAL HIGH_COUNT LOW_COUNT
0 AAA 2 1
1 BBB 2 2
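
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (parsing DATE with pd.to_datetime is my assumption, the question does not state the dtype) and runs the one-liner above. The second variant, using named aggregation with a lambda, is not part of the original answer; it is just one other way that should give the same counts, and the name alt is only for illustration.

```python
import pandas as pd

# Sample data copied from the question; DATE dtype is an assumption.
df = pd.DataFrame({
    "MATERIAL": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "DATE": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"] * 2),
    "HIGH": [10, 0, 5, 0, 10, 8],
    "LOW": [0, 0, 2, 0, 5, 4],
})

# Approach from the answer: compare to 0, then sum the booleans per group.
out = (
    df[["HIGH", "LOW"]]
    .gt(0)                       # True where the value is greater than 0
    .groupby(df["MATERIAL"])     # group the boolean frame by MATERIAL
    .sum()                       # each True counts as 1
    .add_suffix("_COUNT")        # HIGH -> HIGH_COUNT, LOW -> LOW_COUNT
    .reset_index()
)

# Alternative sketch (not from the answer): named aggregation with a lambda.
alt = (
    df.groupby("MATERIAL")
    .agg(
        HIGH_COUNT=("HIGH", lambda s: (s > 0).sum()),
        LOW_COUNT=("LOW", lambda s: (s > 0).sum()),
    )
    .reset_index()
)

print(out)
print(alt)
```

The boolean-sum version stays vectorized, so it is likely the better choice on large frames; the lambda version may be easier to read if you are already thinking in terms of agg.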