Hello, I am doing my assignment and have run into a question that I can't answer. The question is to create another DataFrame, df_urban, containing all columns of the original dataset but only the applicants whose Property_Area is Urban (excluding Rural and Semiurban) and whose ApplicantIncome is at least S$10,000. Then reset the row index and display the last 10 rows of this DataFrame.
My code, however, does not filter for an ApplicantIncome of at least 10,000 or for Urban status only.
df_urban = df
df_urban.iloc[-10:[11]]
I was wondering what the solution to the question is.
CodePudding user response:
you can use the '&' operator to limit the data by multiple column conditions:
df_urban = df[(df[col]==<condition>) & (df[col] >= <condition>)]
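To make the template above concrete, here is a minimal sketch. The four rows are made-up sample data (not the asker's dataset); only the column names Property_Area and ApplicantIncome come from the question. Each condition yields a boolean Series, and '&' combines them element-wise. The parentheses matter if you write the conditions inline, because '&' binds more tightly than '==' and '>='.

```python
import pandas as pd

# Made-up sample rows for illustration; column names follow the question.
df = pd.DataFrame({
    "ApplicantIncome": [4583, 12000, 10000, 15000],
    "Property_Area": ["Rural", "Urban", "Urban", "Semiurban"],
})

# Each comparison produces a boolean Series, one True/False per row.
is_urban = df["Property_Area"] == "Urban"
high_income = df["ApplicantIncome"] >= 10000  # "at least" means >=

# '&' keeps only the rows where both conditions are True.
df_urban = df[is_urban & high_income]
print(df_urban)
```

Only the two Urban rows with an income of 10,000 or more remain; the Semiurban row is dropped even though its income is higher.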
CodePudding user response:
The following snippet is a simple proof of concept: it extracts from the main DataFrame a subset containing only the rows with "Urban" locations.
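import pandas as pd
# The test file here is tab-separated, hence delimiter='\t'
df = pd.read_csv('Applicants.csv', delimiter='\t')
print(df)
# Keep only the rows whose Property_Area is exactly 'Urban'
df_urban = df[df['Property_Area'] == 'Urban']
print(df_urban)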
Using a small hand-built CSV file, here is a sample of the output.
   ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History  Property_Area
0             4583               1508      128000        360               1          Rural
1             1222                  0       55000        360               1          Rural
2             8285                  0       64000        360               1          Urban
3             3988               1144       75000        360               1          Rural
4             2588                  0       84700        360               1          Urban
5             5248                  0       48550        360               1          Rural
6             7488                  0      111000        360               1      SemiUrban
7             3252               1112       14550        360               1          Rural
8             1668                  0       67500        360               1          Urban

   ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History  Property_Area
2             8285                  0       64000        360               1          Urban
4             2588                  0       84700        360               1          Urban
8             1668                  0       67500        360               1          Urban
Hope that helps.
Regards.
CodePudding user response:
See below. I leave it to you to work out how to reset the index. You might want to look at .tail() to display the last rows.
df_urban = df[(df['ApplicantIncome'] >= 10000) & (df['Property_Area'] == 'Urban')]
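For completeness, here is one possible way to put the remaining steps together. It is a sketch on made-up sample rows, not the asker's real dataset. It uses '>=' because the question asks for an income of at least 10,000, reset_index(drop=True) to renumber the rows from 0 without keeping the old index as a column, and tail(10) to show the last 10 rows (or fewer, if the result is shorter).

```python
import pandas as pd

# Made-up sample rows for illustration; column names follow the question.
df = pd.DataFrame({
    "ApplicantIncome": [4583, 12000, 10000, 15000, 20000],
    "Property_Area": ["Rural", "Urban", "Urban", "Semiurban", "Urban"],
})

# Filter: Urban applicants with an income of at least 10,000.
df_urban = df[(df["ApplicantIncome"] >= 10000) & (df["Property_Area"] == "Urban")]

# Renumber rows 0..n-1; drop=True discards the old index instead of
# adding it as a new column.
df_urban = df_urban.reset_index(drop=True)

# Display the last 10 rows (all rows here, since only 3 match).
print(df_urban.tail(10))
```

Assigning the result of reset_index back to df_urban is needed because it returns a new DataFrame by default rather than modifying the original in place.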