Home > Enterprise >  Dropping Pandas Dataframe Columns based on column name
Dropping Pandas Dataframe Columns based on column name

Time:10-19

I have a Pandas dataframe df with extraneous information. The extraneous information is stored in the columnns that have names containing "PM". I would like to remove these columns but I'm not sure how to. Below is my attempt to do this. However, I received this error message: AttributeError: 'numpy.float64' object has no attribute 'PM'. I'm not sure how to interpret this error message. I also don't understand why numpy is mentioned in the message since the dataframe df is a pandas object.

for j in range(0,len(df.columns)-1):
 df.iloc[0,j].str.contains("PM"):
   df.drop(j, axis=1)

AttributeError: 'numpy.float64' object has no attribute 'PM'

CodePudding user response:

If need remove columns with PM it is same like select columns without PM, so use:

df.loc[:, ~df.columns.str.contains("PM")]

Out of box solution:

df.drop(df.filter(like='PM').columns, axis=1)

CodePudding user response:

Use regex with filter:

df.filter(regex='^((?!PM).)*$')

This is the shortest solution here.

CodePudding user response:

Using an empty dataframe

df = pd.DataFrame(columns=['a','b','ABCPMYXZ','QWEPMQWE','c','d'])
df
df = df[[i for i in df.columns if not 'PM' in i]]
df

enter image description here

CodePudding user response:

Based on what I understood, you want to delete columns, so initially you should store all the column names in a list. Next remove elements from list if 'PM' is present, then finally drop the columns.

columns = list(df.columns.values)
[columns.remove(col) for col in columns if 'PM' in col]
df.drop(columns=columns, axis=1, inplace=True)
  • Related