Using IF function in Pandas dataframe

Time:03-10

I've tried to find the OLDEST FEMALE in a CSV dataset, but I don't know how. I'm pretty new to Python and Pandas, and I clearly don't know how to use an if statement here.

import pandas as pd

df = pd.read_csv("people.csv", usecols=['gender', 'age'])

I tried to use something like this

print(df[df["gender"].isin(["F"])].df.age.max())

or like this

if df[df["gender"].all(["F"])] :
print(df.age.max())

even tried this

print(df.loc[df['gender'] == 'F'].max())

But this was before I found out that the oldest 'M' is the same age as the oldest 'F'.

I still can't figure out how to find the oldest female.

EDIT: I have to find the oldest female in the imported dataset, not create one. Thank you.

EDIT 2: Sorry for bothering you. I just found out that the oldest M in my CSV has the same age as the oldest F in my CSV. This is embarrassing.

CodePudding user response:

You don't actually need an if statement in this context. See below:

import pandas as pd

df = pd.DataFrame({'gender': ['M', 'F', 'F','F','M'],
      'age': [99,12,45,98,23]})

# Result
print(df[df['gender'] == 'F']['age'].max())

This should give you what you are looking for. Also, don't forget to indent the next line after an if statement.
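Applied to a file loaded with read_csv, as in the question, the same filter might look like the sketch below. The inline CSV text is a made-up stand-in for people.csv (its real contents are not shown), and the name column is an assumption added only to show usecols dropping it.

```python
import io
import pandas as pd

# Hypothetical stand-in for people.csv; the real file is not shown in the question.
csv_text = """name,gender,age
Ann,F,98
Bob,M,99
Cat,F,45
Dan,M,23
"""

df = pd.read_csv(io.StringIO(csv_text), usecols=["gender", "age"])

# Boolean mask: True for rows where gender is 'F'
is_female = df["gender"] == "F"

# Keep only the age column of those rows, then take its maximum
oldest_female_age = df.loc[is_female, "age"].max()
print(oldest_female_age)
```

The comparison does the row-by-row test that an if statement would otherwise be used for, which is why no explicit if is needed.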

CodePudding user response:

You can try this. First group by gender and take the maximum of each group, then read the age from the row for 'F'.

import pandas as pd
df = pd.DataFrame([['F',20],['F',30], ['M',20]], columns=['gender', 'age'])

df = df.groupby('gender').max().reset_index()
print(df[df['gender'] == 'F'].iloc[0]['age'])

The output is 30 in this example.
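A slightly shorter variant, offered as a sketch rather than the only way to do it: selecting the age column before calling max gives a Series indexed by gender, so the value for 'F' can be looked up directly without reset_index. The variable name max_age_by_gender is just illustrative.

```python
import pandas as pd

df = pd.DataFrame([["F", 20], ["F", 30], ["M", 20]], columns=["gender", "age"])

# Series of maximum ages, indexed by the gender labels
max_age_by_gender = df.groupby("gender")["age"].max()

# Look up the maximum age for the 'F' group
print(max_age_by_gender["F"])
```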

CodePudding user response:

import pandas as pd

df = pd.DataFrame({'gender': ['F', 'M', 'F', 'M', 'F', 'M'], 'age': [12, 33, 43, 22, 18, 16]})

# Column-wise maxima over the female rows (a Series, not a single row)
oldest_female = df.loc[df['gender'] == 'F'].max()

print(oldest_female['age'])

CodePudding user response:

To find the row of the oldest female in the dataset, you can filter your dataframe to only females, then use idxmax to find the index:

df.loc[df.query('gender == "F"')['age'].idxmax()]

This will return the first row in your dataset with gender 'F' whose age equals the maximum female age.
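Because idxmax keeps only the first match, any other females with the same maximum age are left out. A sketch of one way to get all of them, using made-up data in which two females share the top age:

```python
import pandas as pd

# Made-up data: rows 0 and 2 are both females aged 43
df = pd.DataFrame({"gender": ["F", "M", "F", "F"],
                   "age": [43, 50, 43, 18]})

females = df[df["gender"] == "F"]

# idxmax returns the index label of the first maximum only
first_oldest = df.loc[females["age"].idxmax()]

# Comparing against the maximum keeps every tied row
all_oldest = females[females["age"] == females["age"].max()]

print(first_oldest)
print(all_oldest)
```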
