How do I know the age of a person using groupby in Pandas

I am working on a dataset that records why patients missed their doctor's appointments. Several conditions may contribute, and we want to know which one has the greatest effect.

The dependent variable was originally coded as "Yes" and "No", so I recoded it as 1 and 0:

# Recode in one step; .map avoids the chained assignment df.No_Show[...] = ...,
# which can fail to modify df and triggers SettingWithCopyWarning.
df['No_Show'] = df['No_Show'].map({'Yes': 1, 'No': 0})

I then defined two boolean masks:

showed = df.No_Show == 1
No_show = df.No_Show == 0

While trying to compute the mean, by age, for those who went to their appointment, using

df.groupby('Age')[showed].mean()

I got an error.

CodePudding user response:

You can try filtering the rows with the boolean mask first, then grouping by age:

df[showed].groupby('Age').mean()
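
As a minimal sketch of the difference: the square brackets after groupby('Age') select columns by name, so passing the boolean mask `showed` there does not filter rows. Filtering first and grouping afterwards does. The toy data and the `SMS_received` column below are hypothetical and are not taken from the asker's dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the real appointment dataset.
df = pd.DataFrame(
    {
        "Age": [20, 20, 30, 30, 30, 40],
        "No_Show": [1, 0, 1, 1, 0, 0],
        "SMS_received": [1, 0, 0, 1, 1, 0],  # assumed extra numeric column
    }
)

# Boolean mask, defined as in the question.
showed = df.No_Show == 1

# Filter rows with the mask, then group the remaining rows by age.
filtered_mean = df[showed].groupby("Age")["SMS_received"].mean()

# Alternative: the mean of the 0/1 column itself, per age, over all rows.
rate_by_age = df.groupby("Age")["No_Show"].mean()

print(filtered_mean)
print(rate_by_age)
```

Note that ages with no rows left after filtering (40 in this toy data) do not appear in the filtered result at all.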

CodePudding user response:

import numpy as np
import pandas as pd

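# Random example data: 100 rows with a 0/1 No_Show flag and an integer Age
df = pd.DataFrame(
    data={
        "No_Show": np.random.choice([0, 1], 100),
        "Age": np.random.randint(1, 100, 100),
    }
)
# Mean of No_Show per age, i.e. the share of rows with No_Show == 1 at each age
df.groupby("Age")["No_Show"].mean()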

Age
1     0.666667
2     1.000000
3     0.000000
4     1.000000
6     1.000000
        ...   
92    1.000000
94    0.500000
95    1.000000
96    0.000000
98    0.500000
Name: No_Show, Length: 63, dtype: float64


# Mean age within each No_Show group
df.groupby("No_Show")["Age"].mean()


No_Show
0    53.833333
1    47.346154
Name: Age, dtype: float64
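
Because the mean of a 0/1 column is just the count of ones divided by the group size, it may help to look at the group sizes next to the means, since with 100 random rows most ages contain only one or two rows. The following is a rough sketch on the same kind of random data; the seed value is an arbitrary assumption added only to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, assumed for repeatability
df = pd.DataFrame(
    {
        "No_Show": rng.integers(0, 2, 100),  # 0 or 1
        "Age": rng.integers(1, 100, 100),    # 1 to 99
    }
)

# Mean, number of rows, and number of ones for each age.
summary = df.groupby("Age")["No_Show"].agg(["mean", "count", "sum"])

# The per-age mean equals the number of ones divided by the group size.
recomputed = summary["sum"] / summary["count"]

print(summary.head())
```

A mean of 0.0 or 1.0 for an age with a count of 1 says very little, which is why the count column is worth keeping alongside the mean.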