Using IF function in Pandas dataframe

I've tried to find the OLDEST FEMALE in a CSV dataset, but I don't know how. I'm pretty new to Python and Pandas, and I clearly don't know how to use the if function here.

import pandas as pd

df = pd.read_csv("people.csv", usecols=['gender', 'age'])

I tried to use something like this

print(df[df["gender"].isin(["F"])].df.age.max())

or like this

if df[df["gender"].all(["F"])] :
print(df.age.max())

even tried this

print(df.loc[df['gender'] == 'F'].max())

but this was before I found out that the oldest 'M' is the same age as the oldest 'F',

and I still can't figure out how to find the oldest female.

EDIT: I have to find the oldest female in the imported dataset, not create one. Thank you.

EDIT 2: Sorry for bothering you. I just found out that the oldest M in my CSV has the same age as the oldest F in my CSV. This is embarrassing.

CodePudding user response：

You don't actually need an if statement in this context. See below:

import pandas as pd

# Sample data standing in for people.csv
df = pd.DataFrame({'gender': ['M', 'F', 'F', 'F', 'M'],
                   'age': [99, 12, 45, 98, 23]})

# Keep only the female rows, then take the largest age
print(df[df['gender'] == 'F']['age'].max())

This should give you what you are looking for. Also, don't forget to indent the next line after an if statement.
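
In case the one-liner is hard to read, here is a rough step-by-step sketch of the same filtering. The sample data and the names is_female, females and oldest_female_age are made up for illustration; with the real file, df would come from pd.read_csv instead.

```python
import pandas as pd

# Made-up sample data; in the question this would come from people.csv.
df = pd.DataFrame({'gender': ['M', 'F', 'F', 'F', 'M'],
                   'age': [99, 12, 45, 98, 23]})

# Step 1: build a boolean Series that is True where the row is female.
is_female = df['gender'] == 'F'

# Step 2: keep only the rows where the mask is True.
females = df[is_female]

# Step 3: take the maximum of the age column of those rows.
oldest_female_age = females['age'].max()
print(oldest_female_age)  # 98
```

The comparison does the job an if statement would do, but for every row at once, which is why no explicit if is needed.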

CodePudding user response：

You can try this: first group by gender and take the max values, then pick out the age for females.

import pandas as pd
df = pd.DataFrame([['F',20],['F',30], ['M',20]], columns=['gender', 'age'])

# One row per gender, holding the maximum age in that group
df = df.groupby('gender').max().reset_index()
# Select the female row and read its age
print(df[df['gender'] == 'F'].iloc[0]['age'])

Output is 30 in this example
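
A slightly shorter variant of the same idea, as a sketch: select the age column before aggregating, so the result is a Series indexed by gender. The name max_age_by_gender is just illustrative. This also makes it easy to see when the oldest M and the oldest F happen to have the same age.

```python
import pandas as pd

# Same made-up data as in the example above.
df = pd.DataFrame([['F', 20], ['F', 30], ['M', 20]], columns=['gender', 'age'])

# Maximum age per gender; the index of the result is the gender label.
max_age_by_gender = df.groupby('gender')['age'].max()

print(max_age_by_gender)           # one maximum per gender
print(max_age_by_gender.loc['F'])  # 30
```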

CodePudding user response：

import pandas as pd

df = pd.DataFrame({'gender': ['F', 'M', 'F', 'M', 'F', 'M'],
                   'age': [12, 33, 43, 22, 18, 16]})

# Column-wise maximum over the female rows
oldest_female = df.loc[df['gender'] == 'F'].max()

print(oldest_female['age'])
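
One caveat worth knowing, shown as a small sketch: calling .max() on a whole DataFrame takes the maximum of each column separately, so the result is not necessarily a real row. The name column and its values here are hypothetical, added only to make the effect visible.

```python
import pandas as pd

# Hypothetical data with an extra 'name' column to show the effect.
df = pd.DataFrame({'name': ['Zoe', 'Ann', 'Bea'],
                   'gender': ['F', 'F', 'M'],
                   'age': [18, 43, 33]})

females = df.loc[df['gender'] == 'F']

# Column-wise maxima: the largest name and the largest age,
# which come from two different rows.
col_max = females.max()
print(col_max['name'], col_max['age'])  # Zoe 43

# An actual row: the female with the largest age.
row = females.loc[females['age'].idxmax()]
print(row['name'], row['age'])  # Ann 43
```

For just the age this makes no difference, but it matters as soon as you want other columns of the oldest person.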

CodePudding user response：

To find the row of the oldest female in the data set, you can filter your dataframe to only females, then use idxmax to find the index:

df.loc[df.query('gender == "F"')['age'].idxmax()]

This will return the first row in your dataset that has the maximum age among rows where gender is 'F'.
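
If several females share the maximum age, idxmax only gives you the first of them. A possible sketch for getting all of them, using made-up data and illustrative names (females, first_oldest, all_oldest):

```python
import pandas as pd

# Made-up data where two females are tied for the maximum age.
df = pd.DataFrame({'gender': ['F', 'M', 'F', 'F'],
                   'age': [70, 70, 70, 41]})

females = df[df['gender'] == 'F']

# idxmax returns the label of the first row holding the maximum.
first_oldest = females.loc[females['age'].idxmax()]
print(first_oldest.name)  # 0

# Comparing against the maximum keeps every tied row.
all_oldest = females[females['age'] == females['age'].max()]
print(all_oldest.index.tolist())  # [0, 2]
```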