I'm trying to filter the result of a pandas groupby by its aggregated count, so that I only end up with groups whose total is over a certain amount.
Example dataframe:
Name | Thing | Count |
---|---|---|
Fred | Apple | 1 |
Harry | Banana | 1 |
Sonia | Banana | 1 |
Pete | Apple | 1 |
Tracy | Apple | 1 |
df.groupby(["Thing"]).sum()['Count']
Current output:
Thing | Count |
---|---|
Apple | 3 |
Banana | 2 |
But my desired output is to only include a Thing if the sum value is more than 2:
Thing | Count |
---|---|
Apple | 3 |
I can only find examples of filtering df columns when using groupby. How do you filter based on the aggregated values?
CodePudding user response:
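Aggregate first, then filter the resulting Series by passing a callable to .loc. Selecting 'Count' before summing also avoids summing the other columns:
out = df.groupby("Thing")["Count"].sum().loc[lambda x: x > 2]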
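For reference, here is a minimal runnable sketch of that approach using the example data from the question. The variable names `totals` and `out` are just illustrative, and the frame is rebuilt by hand from the table above.

```python
import pandas as pd

# Example data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Fred", "Harry", "Sonia", "Pete", "Tracy"],
    "Thing": ["Apple", "Banana", "Banana", "Apple", "Apple"],
    "Count": [1, 1, 1, 1, 1],
})

# Step 1: aggregate. The result is a Series indexed by Thing.
totals = df.groupby("Thing")["Count"].sum()

# Step 2: keep only the groups whose total is greater than 2.
# The callable receives the Series itself and returns a boolean mask.
out = totals.loc[lambda s: s > 2]

print(out)
```

The callable form of .loc is handy here because it lets you filter the aggregated Series without first assigning it to a name; `totals[totals > 2]` should give the same result if you prefer two steps.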
CodePudding user response:
You can use .query():
df.groupby('Thing')[['Count']].sum().query('Count > 2')['Count']
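If the cutoff is not a fixed number, .query() can reference a Python variable with the `@` prefix. The sketch below wraps this in a small helper; the function name `things_over` and its `threshold` parameter are hypothetical and only there for illustration.

```python
import pandas as pd


def things_over(df, threshold):
    """Return the per-Thing totals of Count that exceed `threshold`."""
    # Double brackets keep the result a DataFrame, which .query() needs.
    totals = df.groupby("Thing")[["Count"]].sum()
    # "@threshold" refers to the local variable of that name.
    return totals.query("Count > @threshold")["Count"]


# Example data rebuilt from the table in the question.
df = pd.DataFrame({
    "Name": ["Fred", "Harry", "Sonia", "Pete", "Tracy"],
    "Thing": ["Apple", "Banana", "Banana", "Apple", "Apple"],
    "Count": [1, 1, 1, 1, 1],
})

result = things_over(df, 2)
print(result)
```

Calling `things_over(df, 1)` would keep both Apple and Banana, since both totals are above 1.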