Functions for finding the average

Time:02-26

I am new here, and English is not my native language, so please excuse any grammatical mistakes. I need to compute the average BMI per hair colour using the DataFrame df below.

# 1. Here we import pandas
import pandas as pd
# 2. Here we import numpy
import numpy as np
np.random.seed(0)
df = pd.DataFrame({'Age':[18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
                   'Hair colour':['Blonde', 'Brown', 'Black', 'Blonde', 'Blonde', 'Black','Brown', 'Brown', 'Black', 'Black'],
                   'Length (in cm)':np.random.normal(175, 10, 10).round(1),
                   'Weight (in kg)':np.random.normal(70, 5, 10).round(1)},
                index = ['Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar', 'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],)

The result should be a named vector, that is, one average per hair colour with the hair colour as its label.

First, I wrote a function for the BMI:

# function

def BMI():
    df['weight (in kg)'] / (df['Length']/100)**2

However, I don't know what my next step is.

Can you advise me on how to find the average BMI per hair colour?

CodePudding user response:

You can use df.groupby(), which is part of pandas.

For your particular case, once df has a BMI column, you can use

df.groupby('Hair colour')['BMI'].mean()

which gives the output

Hair colour
Black     23.003356
Blonde    18.806844
Brown     23.271460
Name: BMI, dtype: float64
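As a minimal end-to-end sketch of this approach: the length and weight values are copied from the table printed in the other response in this thread and hard-coded, so the result does not depend on the random generator. The variable name mean_bmi is only illustrative.

```python
import pandas as pd

# Values copied from the table printed in this thread.
df = pd.DataFrame(
    {
        "Hair colour": ["Blonde", "Brown", "Black", "Blonde", "Blonde",
                        "Black", "Brown", "Brown", "Black", "Black"],
        "Length (in cm)": [192.6, 179.0, 184.8, 197.4, 193.7,
                           165.2, 184.5, 173.5, 174.0, 179.1],
        "Weight (in kg)": [70.7, 77.3, 73.8, 70.6, 72.2,
                           71.7, 77.5, 69.0, 71.6, 65.7],
    },
    index=["Leon", "Mirta", "Nathan", "Linda", "Bandar",
           "Violeta", "Noah", "Niji", "Lucy", "Mark"],
)

# BMI = weight in kg divided by the square of height in metres.
df["BMI"] = df["Weight (in kg)"] / (df["Length (in cm)"] / 100) ** 2

# Select the BMI column before aggregating so only that column is averaged.
mean_bmi = df.groupby("Hair colour")["BMI"].mean()
print(mean_bmi)
```

The result is a Series indexed by hair colour, which matches the "vector with names" the question asks for.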

CodePudding user response:

You can either filter or use groupby.

Your BMI function does not work as written because it:

  • references columns that do not exist
  • never returns or stores the result, so the computed value is discarded
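If a function is still preferred, one possible repair of both problems is sketched here. The names bmi, frame, weight_col, length_col and people are hypothetical and chosen only for illustration; the two sample rows are taken from the data in this thread.

```python
import pandas as pd


def bmi(frame, weight_col="Weight (in kg)", length_col="Length (in cm)"):
    """Return BMI as a Series aligned with the index of `frame`."""
    height_m = frame[length_col] / 100  # centimetres to metres
    return frame[weight_col] / height_m ** 2


# Two rows from the thread's data, just to exercise the function.
people = pd.DataFrame(
    {"Length (in cm)": [179.0, 184.5], "Weight (in kg)": [77.3, 77.5]},
    index=["Mirta", "Noah"],
)

# Store the returned Series so the result is not thrown away.
people["BMI"] = bmi(people)
print(people)
```

Passing the DataFrame in as an argument, instead of reading a global df, keeps the function reusable, and the explicit return is what makes the result available to the caller.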

Filtering:

import pandas as pd
import numpy as np

np.random.seed(0) 

df = pd.DataFrame({'Age':[18, 21, 28, 19, 23, 22, 18, 24, 25, 20],
                'Hair colour':['Blonde', 'Brown', 'Black', 'Blonde', 
                               'Blonde', 'Black','Brown', 'Brown', 'Black', 
                               'Black'],
                'Length (in cm)':np.random.normal(175, 10, 10).round(1),
                'Weight (in kg)':np.random.normal(70, 5, 10).round(1)},
                index = ['Leon', 'Mirta', 'Nathan', 'Linda', 'Bandar', 
                         'Violeta', 'Noah', 'Niji', 'Lucy', 'Mark'],)

print(df)

# calculate BMI - not as function, using correct column names
df["BMI"] = df['Weight (in kg)'] / (df['Length (in cm)']/100)**2  

print(df)

# filter to brown
brown = df[df["Hair colour"] == "Brown"]
print(brown)
print(brown["BMI"].mean())

Output:

# calculated BMI
         Age Hair colour  Length (in cm)  Weight (in kg)        BMI
Leon      18      Blonde           192.6            70.7  19.059296
Mirta     21       Brown           179.0            77.3  24.125339
Nathan    28       Black           184.8            73.8  21.609884
Linda     19      Blonde           197.4            70.6  18.118006
Bandar    23      Blonde           193.7            72.2  19.243229
Violeta   22       Black           165.2            71.7  26.272359
Noah      18       Brown           184.5            77.5  22.767165
Niji      24       Brown           173.5            69.0  22.921875
Lucy      25       Black           174.0            71.6  23.649095
Mark      20       Black           179.1            65.7  20.482087

# filtered output
       Age Hair colour  Length (in cm)  Weight (in kg)        BMI
Mirta   21       Brown           179.0            77.3  24.125339
Noah    18       Brown           184.5            77.5  22.767165
Niji    24       Brown           173.5            69.0  22.921875

# avg BMI
23.271459786871446

Groupby:

# use groupby 
grouped = df.groupby('Hair colour')
print(*grouped, sep="\n\n")

# https://stackoverflow.com/questions/51091331 
print(grouped.get_group("Brown")["BMI"].mean()) 

Output:

# grouped output
('Black',          Age Hair colour  Length (in cm)  Weight (in kg)        BMI
Nathan    28       Black           184.8            73.8  21.609884
Violeta   22       Black           165.2            71.7  26.272359
Lucy      25       Black           174.0            71.6  23.649095
Mark      20       Black           179.1            65.7  20.482087)

('Blonde',         Age Hair colour  Length (in cm)  Weight (in kg)        BMI
Leon     18      Blonde           192.6            70.7  19.059296
Linda    19      Blonde           197.4            70.6  18.118006
Bandar   23      Blonde           193.7            72.2  19.243229)

('Brown',        Age Hair colour  Length (in cm)  Weight (in kg)        BMI
Mirta   21       Brown           179.0            77.3  24.125339
Noah    18       Brown           184.5            77.5  22.767165
Niji    24       Brown           173.5            69.0  22.921875)

# avg BMI
23.271459786871446
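get_group returns one group at a time. If every hair colour is needed at once, the same grouping can be aggregated in a single call. A small sketch, with the BMI values copied from the printed output above; the names summary and brown_mean are only illustrative.

```python
import pandas as pd

# Hair colours and BMI values copied from the printed table above.
df = pd.DataFrame(
    {
        "Hair colour": ["Blonde", "Brown", "Black", "Blonde", "Blonde",
                        "Black", "Brown", "Brown", "Black", "Black"],
        "BMI": [19.059296, 24.125339, 21.609884, 18.118006, 19.243229,
                26.272359, 22.767165, 22.921875, 23.649095, 20.482087],
    },
    index=["Leon", "Mirta", "Nathan", "Linda", "Bandar",
           "Violeta", "Noah", "Niji", "Lucy", "Mark"],
)

# Aggregate the BMI column per group: the mean plus the group size.
summary = df.groupby("Hair colour")["BMI"].agg(["mean", "count"])
print(summary)

# The single-group lookup agrees with the corresponding summary row.
brown_mean = df.groupby("Hair colour").get_group("Brown")["BMI"].mean()
print(brown_mean)
```

Including the count alongside the mean makes it easy to see how many people each average is based on.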