I have a Pandas dataframe that has columns A, B, and C, and I would like to count the occurrences of the A column, grouped by B and C. DF Looks like this:
index | A | B | C |
---|---|---|---|
0 | 1 | 4 | 4 |
1 | 0 | 4 | 4 |
2 | 0 | 2 | 1 |
3 | 0 | 4 | 3 |
4 | 1 | 2 | 3 |
5 | 0 | 2 | 3 |
6 | 0 | 1 | 1 |
7 | 0 | 1 | 1 |
8 | 0 | 1 | 1 |
9 | 1 | 1 | 1 |
In MySQL I would do:
SELECT
B,
C,
COUNT(IF(A=1, 1,0)) AS CountEquals1,
COUNT(IF(A=0, 1,0)) AS CountEquals0
FROM
my_table
GROUP BY
B, C;
How do I perform this same MySQL Count/GroupBy query on my Pandas Dataframe?
edit: Expected output would be something like:
B | C | CountAEquals1 | CountAEquals0 |
---|---|---|---|
4 | 4 | 1 | 1 |
2 | 1 | 0 | 1 |
4 | 3 | 0 | 1 |
2 | 3 | 1 | 1 |
1 | 1 | 1 | 3 |
CodePudding user response:
Try this:
(df.loc[:,'A':'C'].value_counts()
.unstack(level=0)
.rename('CountAEquals{}'.format,axis=1)
.fillna(0)
.sort_index(axis=1,ascending=False)
.rename_axis(None,axis=1)
.reset_index())
Output:
B C CountAEquals1 CountAEquals0
0 1 1 1.0 3.0
1 2 1 0.0 1.0
2 2 3 1.0 1.0
3 4 3 0.0 1.0
4 4 4 1.0 1.0