Counting the number of entries in a dataframe that satisfy multiple criteria

Time:03-04

I have a dataframe with 9 columns, two of which are gender and smoker status. Every row in the dataframe is a person, and each column holds their value for a particular trait. I want to count the number of entries that satisfy the condition of being both a smoker and male. I have tried using a sum function:

maleSmoke = sum(1 for i in data['gender'] if i is 'm' and i in data['smoker'] if i is 1 )

but this always returns 0. This method works when I check only one criterion, however, and I can't figure out how to extend it to a second. I also tried writing a function that counts its way through every entry in the dataframe, but this also returns 0 for every count.

def countSmokeGender(df):
    maleSmoke = 0
    femaleSmoke = 0
    maleNoSmoke = 0
    femaleNoSmoke = 0
    
    for i in range(20000):
        if df['gender'][i] is 'm' and df['smoker'][i] is 1:
            maleSmoke = maleSmoke + 1
        if df['gender'][i] is 'f' and df['smoker'][i] is 1:
            femaleSmoke = femaleSmoke + 1
        if df['gender'][i] is 'm' and df['smoker'][i] is 0:
            maleNoSmoke = maleNoSmoke + 1
        if df['gender'][i] is 'f' and df['smoker'][i] is 0:
            femaleNoSmoke = femaleNoSmoke + 1
    
    return maleSmoke, femaleSmoke, maleNoSmoke, femaleNoSmoke

I've also tried pulling the data out as numpy arrays and counting those, but that didn't work either.

Answer:

Are you using pandas?

Assuming you are, you can simply do this:

# How many male smokers
len(df[(df['gender']=='m') & (df['smoker']==1)])
# How many female smokers
len(df[(df['gender']=='f') & (df['smoker']==1)])
# How many male non-smokers
len(df[(df['gender']=='m') & (df['smoker']==0)])
# How many female non-smokers
len(df[(df['gender']=='f') & (df['smoker']==0)])
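A likely reason the attempts in the question return 0 is the use of `is`, which tests object identity rather than equality. Values read from a pandas column are typically NumPy scalars (for example `numpy.int64`), which are never the same object as the Python literal `1`, so the condition is always false. Below is a minimal sketch on made-up sample data (the rows are hypothetical; only the column names come from the question) that shows the difference and counts matches by summing a boolean mask:

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "gender": ["m", "f", "m", "f", "m"],
    "smoker": [1, 1, 0, 0, 1],
})

value = df["smoker"].iloc[0]   # a NumPy integer scalar, not a Python int
one = 1

identity_match = value is one          # identity check: False
equality_match = bool(value == one)    # equality check: True

# Combine element-wise comparisons with & (not `and`), keeping the parentheses.
# Summing the boolean mask counts the rows where both conditions hold.
male_smokers = ((df["gender"] == "m") & (df["smoker"] == 1)).sum()
print(identity_match, equality_match, male_smokers)
```

Summing the mask gives the same count as `len(df[mask])` without building the filtered dataframe first.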

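Or, you can use groupby. Because smoker is coded as 0/1, summing it within each gender gives the number of smokers per gender (but not the non-smoker counts):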

df.groupby(['gender'])['smoker'].sum()
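To get all four gender/smoker counts in one pass, one option is to group by both columns and take the group sizes, or to build a contingency table with `pd.crosstab`. This is a sketch on made-up sample data (the rows are hypothetical; the column names are those from the question):

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "gender": ["m", "f", "m", "f", "m"],
    "smoker": [1, 1, 0, 0, 1],
})

# Number of rows for every (gender, smoker) combination present in the data.
counts = df.groupby(["gender", "smoker"]).size()

# The same information as a table: genders as rows, smoker values as columns.
table = pd.crosstab(df["gender"], df["smoker"])

print(counts)
print(table)
```

Individual counts can then be looked up by label, for example `counts.loc[("m", 1)]` for male smokers or `table.loc["f", 0]` for female non-smokers. Note that `groupby(...).size()` only lists combinations that occur in the data, while `crosstab` fills missing combinations with 0.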