Home > Mobile >  Pandas Groupby Newbie Conundrum
Pandas Groupby Newbie Conundrum

Time:07-09

**Using Pandas 1.4.2, Python 3.9.12 I have a data frame as follows:

 Neighbourhood            No-show
0   JARDIM DA PENHA         No
1   JARDIM DA PENHA         Yes
2   MATA DA PRAIA           No
3   PONTAL DE CAMBURI       No
4   JARDIM DA PENHA         No
5   MARIA ORTIZ             Yes
6   MARIA ORTIZ             Yes
7   MATA DA PRAIA           Yes
8   PONTAL DE CAMBURI       No
9   MARIA ORTIZ             No

How would I use groupby to get the total(count) of 'Yes' and total(count) of 'No' grouped by each 'Neighbourhood'? I keep getting 'NoNoYesNo' if I use .sum() and if I can get these grouped correctly by Neighbourhood I think I can graph much easier.

This data frame is truncated as there are numerous other columns but these are the only 2 I care about for this exercise.

CodePudding user response:

Use df.groupby() as follows:

totals = df.groupby(['Neighbourhood','No-show'])['No-show'].count()
print(totals)

Neighbourhood      No-show
JARDIM DA PENHA    No         2
                   Yes        1
MARIA ORTIZ        No         1
                   Yes        2
MATA DA PRAIA      No         1
                   Yes        1
PONTAL DE CAMBURI  No         2
Name: No-show, dtype: int64

Good point raised by @JonClements: you might want to add .unstack(fill_value=0) to that. So:

totals_unstacked = df.groupby(['Neighbourhood','No-show'])['No-show'].count().unstack(fill_value=0)
print(totals_unstacked)

No-show            No  Yes
Neighbourhood             
JARDIM DA PENHA     2    1
MARIA ORTIZ         1    2
MATA DA PRAIA       1    1
PONTAL DE CAMBURI   2    0

CodePudding user response:

You can use:

df[['Neighbourhood', 'No-show']].value_counts()
  • Related