I've tried to find the oldest female in a CSV dataset, but I don't know how. I'm pretty new to Python and pandas, and I clearly don't know how to use an if statement here.
import pandas as pd
df = pd.read_csv("people.csv", usecols=['gender', 'age'])
I tried to use something like this
print(df[df["gender"].isin(["F"])].df.age.max())
or like this
if df[df["gender"].all(["F"])] :
print(df.age.max())
even tried this
print(df.loc[df['gender'] == 'F'].max())
but this was before I found out that the oldest 'M' is the same age as the oldest 'F',
and I still can't figure out how to find the oldest female.
EDIT : I have to find the oldest female in the imported dataset, not create a new one. Thank you.
EDIT 2 : Sorry for bothering you. I just found out that the oldest M in my csv has the same age as the oldest F in my csv. This is embarrassing.
CodePudding user response:
You don't actually need an if statement in this context. See below:
import pandas as pd
df = pd.DataFrame({'gender': ['M', 'F', 'F', 'F', 'M'],
                   'age': [99, 12, 45, 98, 23]})
# Keep only the rows where gender is 'F', then take the largest age
print(df[df['gender'] == 'F']['age'].max())
This should give you what you are looking for. Also, don't forget to indent the next line after an if statement.
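As a rough sketch of the same idea applied to data read with read_csv: the comparison df["gender"] == "F" produces a boolean Series (a mask) that selects rows, so no if statement is involved. The CSV text below is a made-up stand-in for people.csv, whose real contents are not shown in the question, and the names csv_text, is_female and oldest_female_age are only illustrative.

```python
import io

import pandas as pd

# Hypothetical stand-in for people.csv (the real file is not shown in the question).
csv_text = "name,gender,age\nAnn,F,98\nBob,M,99\nCat,F,45\n"
df = pd.read_csv(io.StringIO(csv_text), usecols=["gender", "age"])

# Boolean mask: one True/False value per row.
is_female = df["gender"] == "F"

# Select the age column for the masked rows, then take its maximum.
oldest_female_age = df.loc[is_female, "age"].max()
print(oldest_female_age)
```

With this sample data the oldest male (99) is older than the oldest female (98), which makes it easy to check that the filter is really being applied.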
CodePudding user response:
You can try this. First group by gender and get the max values. Then get the age for females from the result.
import pandas as pd
df = pd.DataFrame([['F',20],['F',30], ['M',20]], columns=['gender', 'age'])
df = df.groupby('gender').max().reset_index()
print(df[df['gender'] == 'F'].iloc[0]['age'])
Output is 30 in this example.
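A slightly shorter variant of the same groupby idea, as a sketch: selecting the age column before calling max() gives a Series indexed by gender, so the value for 'F' can be looked up directly. The name max_age_by_gender is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['F', 20], ['F', 30], ['M', 20]], columns=['gender', 'age'])

# Series of maximum ages, indexed by the gender labels 'F' and 'M'.
max_age_by_gender = df.groupby('gender')['age'].max()

# Look up the maximum age for females by label.
print(max_age_by_gender['F'])
```

This also leaves the original df untouched instead of overwriting it with the grouped result.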
CodePudding user response:
import pandas as pd
df = pd.DataFrame({'gender':['F', 'M', 'F', 'M','F', 'M'],'age': [12, 33, 43, 22, 18, 16]})
# max() is applied column by column to the rows where gender is 'F'
oldest_female = df.loc[df['gender'] == 'F'].max()
print(oldest_female['age'])
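One caveat with calling max() on the whole filtered frame, sketched below: the maximum is taken separately for each column, so if the dataset has more columns, the values in the result can come from different rows. The name column and its values are invented for illustration and are not part of the question's data.

```python
import pandas as pd

# 'name' is a hypothetical extra column, added only to show the effect.
df = pd.DataFrame({
    'name': ['Zoe', 'Al', 'Amy'],
    'gender': ['F', 'M', 'F'],
    'age': [12, 33, 43],
})

females = df.loc[df['gender'] == 'F']

# Column-wise maxima: the name comes from Zoe's row, the age from Amy's row.
columnwise_max = females.max()
print(columnwise_max['name'], columnwise_max['age'])

# To get one actual row, take the single row with the largest age.
oldest_row = females.nlargest(1, 'age').iloc[0]
print(oldest_row['name'], oldest_row['age'])
```

For a frame with only gender and age, as in the question, the two approaches give the same age.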
CodePudding user response:
To find the row of the oldest female in the data set, you can filter your dataframe to only females, then use idxmax to find the index:
df.loc[df.query('gender == "F"')['age'].idxmax()]
This will return the first row in your dataset whose age is the maximum among the rows with gender 'F'.
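If several females share the maximum age, idxmax only reports the first of them. A minimal sketch, using made-up data, of how to get either the first such row or all tied rows (the names females, first_oldest and all_oldest are illustrative):

```python
import pandas as pd

# Made-up data in which two females share the maximum age.
df = pd.DataFrame({'gender': ['F', 'M', 'F', 'F'],
                   'age': [43, 50, 43, 18]})

females = df.query('gender == "F"')

# idxmax returns the index label of the first occurrence of the maximum.
first_oldest = df.loc[females['age'].idxmax()]
print(first_oldest)

# Comparing against the maximum keeps every row that ties for oldest.
all_oldest = females[females['age'] == females['age'].max()]
print(all_oldest)
```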