How do I meet specific criteria for a column in a pandas DataFrame as well as checking whether the value

Time:07-25

Hello, I am doing my assignment and have run into a question that I can't answer. The task is to create another DataFrame, df_urban, containing all columns of the original dataset but only those applicants whose Property_Area is Urban (excluding Rural and Semiurban) and whose ApplicantIncome is at least S$10,000, then reset the row index and display the last 10 rows of this DataFrame.

My code, however, filters neither on an ApplicantIncome of at least 10,000 nor on Urban status in Property_Area:

df_urban = df
df_urban.iloc[-10:[11]]

I was wondering what the solution to the question is.

CodePudding user response:

You can use the '&' operator to filter the data on conditions across multiple columns:

df_urban = df[(df[<column_a>] == <value>) & (df[<column_b>] >= <value>)]
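
As a minimal sketch of this pattern, assuming a small made-up DataFrame (the rows are illustrative, not the asker's data; only the column names come from the question): each comparison yields a boolean Series, and '&' combines them element-wise. When the comparisons are written inline, each one needs its own parentheses, because '&' binds more tightly than '==' and '>='.

```python
import pandas as pd

# Illustrative rows only; this is not the asker's dataset.
df = pd.DataFrame({
    "ApplicantIncome": [4583, 12000, 8285, 15000, 10000],
    "Property_Area": ["Rural", "Urban", "Urban", "Semiurban", "Urban"],
})

# Each comparison produces a boolean Series aligned with df's index.
is_urban = df["Property_Area"] == "Urban"
high_income = df["ApplicantIncome"] >= 10000  # "at least" means >=

# & combines the two masks element-wise; a row is kept only if both are True.
df_urban = df[is_urban & high_income]
print(df_urban)
```

Naming the masks first is optional; it mainly makes each condition easier to inspect on its own before combining them.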

CodePudding user response:

The following simple snippet is a proof of principle: it extracts from the primary DataFrame a subset containing only the "Urban" locations.

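import pandas as pd

# Read the tab-delimited sample file into a DataFrame
df = pd.read_csv('Applicants.csv', delimiter='\t')

print(df)

# Keep only the rows whose Property_Area is exactly 'Urban'
df_urban = df[df['Property_Area'] == 'Urban']

print(df_urban)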

Using a simple hand-built CSV file, here is a sample of the output: first the full DataFrame, then the Urban-only subset.

       ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History Property_Area
0             4583               1508      128000        360               1         Rural
1             1222                  0       55000        360               1         Rural
2             8285                  0       64000        360               1         Urban
3             3988               1144       75000        360               1         Rural
4             2588                  0       84700        360               1         Urban
5             5248                  0       48550        360               1         Rural
6             7488                  0      111000        360               1     SemiUrban
7             3252               1112       14550        360               1         Rural
8             1668                  0       67500        360               1         Urban
   ApplicantIncome  CoapplicantIncome  LoanAmount  Loan_Term  Credit_History Property_Area
2             8285                  0       64000        360               1         Urban
4             2588                  0       84700        360               1         Urban
8             1668                  0       67500        360               1         Urban

Hope that helps.

Regards.

CodePudding user response:

See below. I leave it to you to work out how to reset the index. You might want to look at .tail() to display the last rows.

df_urban = df[(df['ApplicantIncome'] >= 10000) & (df['Property_Area'] == 'Urban')]
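
For completeness, the remaining two steps of the assignment could be sketched as follows. The data here is generated purely for illustration (45 made-up applicants, not the asker's dataset), and the filter uses '>=' since the task says "at least" 10,000. reset_index(drop=True) renumbers the rows from 0, and tail(10) returns the last 10 rows.

```python
import pandas as pd

# Illustrative data only: 45 made-up applicants, not the asker's dataset.
df = pd.DataFrame({
    "ApplicantIncome": [9000 + 500 * i for i in range(45)],
    "Property_Area": ["Urban", "Rural", "Semiurban"] * 15,
})

# Keep Urban applicants with an income of at least 10,000.
mask = (df["ApplicantIncome"] >= 10000) & (df["Property_Area"] == "Urban")

# drop=True discards the old index instead of keeping it as a new column.
df_urban = df[mask].reset_index(drop=True)

# tail(10) returns the last 10 rows (or fewer if the frame is shorter).
last_ten = df_urban.tail(10)
print(last_ten)
```

Without drop=True, reset_index would keep the original row labels in an extra column named "index", which the assignment does not ask for.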