**Using Pandas 1.4.2, Python 3.9.12 I have a data frame as follows:
Neighbourhood No-show
0 JARDIM DA PENHA No
1 JARDIM DA PENHA Yes
2 MATA DA PRAIA No
3 PONTAL DE CAMBURI No
4 JARDIM DA PENHA No
5 MARIA ORTIZ Yes
6 MARIA ORTIZ Yes
7 MATA DA PRAIA Yes
8 PONTAL DE CAMBURI No
9 MARIA ORTIZ No
How would I use groupby to get the total(count) of 'Yes' and total(count) of 'No' grouped by each 'Neighbourhood'? I keep getting 'NoNoYesNo' if I use .sum() and if I can get these grouped correctly by Neighbourhood I think I can graph much easier.
This data frame is truncated as there are numerous other columns but these are the only 2 I care about for this exercise.
CodePudding user response:
Use df.groupby()
as follows:
totals = df.groupby(['Neighbourhood','No-show'])['No-show'].count()
print(totals)
Neighbourhood No-show
JARDIM DA PENHA No 2
Yes 1
MARIA ORTIZ No 1
Yes 2
MATA DA PRAIA No 1
Yes 1
PONTAL DE CAMBURI No 2
Name: No-show, dtype: int64
Good point raised by @JonClements: you might want to add .unstack(fill_value=0)
to that. So:
totals_unstacked = df.groupby(['Neighbourhood','No-show'])['No-show'].count().unstack(fill_value=0)
print(totals_unstacked)
No-show No Yes
Neighbourhood
JARDIM DA PENHA 2 1
MARIA ORTIZ 1 2
MATA DA PRAIA 1 1
PONTAL DE CAMBURI 2 0
CodePudding user response:
You can use:
df[['Neighbourhood', 'No-show']].value_counts()