Grab certain values from a DataFrame column and make a new DataFrame in Python

Time:10-27

A column in my DataFrame is labeled as Occupation. In that column, Real Estate is represented in several different ways. These are the three ways it's represented:

RealEstate
REALESTATE
RealEstateDeveloper
Other occupations I don't want

I want to pull every variation of Real Estate and put it into its own DataFrame. This is what I have:

dfRealEstate = df[(df.Occupation == 'RealEstate') | (df.Occupation == 'REALESTATE') | (df.Occupation == 'RealEstateDeveloper')]

I get an empty DataFrame. My output should look like this:

col1

RealEstate
RealEstate
REALESTATE
REALESTATE
REALESTATE
REALESTATE
RealEstateDeveloper
RealEstateDeveloper
RealEstateDeveloper

CodePudding user response:

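Try cleaning the values first, since stray whitespace or inconsistent casing would make the exact comparisons fail. Then use the resulting mask to filter the rows:

dfRealEstate = df[df['Occupation'].str.strip().str.casefold().str.contains('realestate', na=False)]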
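
A minimal sketch of why an exact comparison can come back empty and how the cleaning step helps. The sample data is hypothetical (the original DataFrame was not posted), and the padded entries are only an assumption about what might be wrong with the real column:

```python
import pandas as pd

# Hypothetical sample data: note the stray whitespace and the missing value.
df = pd.DataFrame({
    "Occupation": ["RealEstate", " REALESTATE", "RealEstateDeveloper ", "Teacher", None],
})

# An exact comparison misses the padded entry, so this selection is empty.
exact = df[df["Occupation"] == "REALESTATE"]

# Strip whitespace, fold case, then match on the substring.
# na=False treats missing values as non-matches instead of propagating NaN.
mask = (
    df["Occupation"]
    .str.strip()
    .str.casefold()
    .str.contains("realestate", na=False)
)

# Boolean indexing keeps the original, uncleaned spellings in the result.
dfRealEstate = df[mask]
print(dfRealEstate)
```

Printing `df["Occupation"].unique()` on the real data is a quick way to check whether hidden whitespace or unexpected casing is actually present.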

CodePudding user response:

Try to create a mask from a list of variations:

mask_realEstate = df.loc[:,"Occupation"].isin(['RealEstate','REALESTATE','RealEstateDeveloper'])

Now use the mask with .loc to create the new DataFrame (passing the column name as a list keeps the result a DataFrame rather than a Series):

dfRealEstate = df.loc[mask_realEstate, ["Occupation"]]
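
Keep in mind that `isin` only matches exact values, so it would also return nothing if the column contains padded or differently cased strings. A rough sketch that normalises the column before calling `isin`, again using hypothetical sample data and a hypothetical `wanted` list:

```python
import pandas as pd

# Hypothetical sample data; the second entry has trailing whitespace.
df = pd.DataFrame({
    "Occupation": ["RealEstate", "REALESTATE ", "RealEstateDeveloper", "Teacher"],
})

# Compare against lower-cased targets after normalising the column.
wanted = ["realestate", "realestatedeveloper"]
normalised = df["Occupation"].str.strip().str.lower()
mask = normalised.isin(wanted)

# A scalar column label returns a Series; a list of labels returns a DataFrame.
as_series = df.loc[mask, "Occupation"]
as_frame = df.loc[mask, ["Occupation"]]
print(as_frame)
```

Unlike `str.contains`, this approach will not pick up new variations (for example a hypothetical "RealEstateAgent") unless they are added to the list.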