Pandas filter not containing a word

Time:09-27

I want to use pandas filter to drop columns that contain the string "delta".

Example dataframe:

import pandas as pd
df = pd.DataFrame(dict(x=[1], x_delta=[2]))

I want to drop every column whose name contains the string delta. Keep in mind that the dataframe may have many more columns, so the solution has to be general. I'm thinking about using the filter method, but I can't get the negation right.

Thanks for your help!

This hasn't worked for me:

def not_delta(df):
    """Drop the columns that contain the word delta"""
    return df.filter(regex="(?!delta)")

CodePudding user response:

Try this...

df = pd.DataFrame({"delta1": [1], "delta2": [2], "sdf": [3]})

col_drop = [col for col in df.columns if "delta" in col]

df1 = df.drop(col_drop, axis=1)

# Output of df1
   sdf
0    3

Hope this Helps...

CodePudding user response:

I would do it the following way:

import pandas as pd
df = pd.DataFrame(dict(x=[1], x_delta=[2]))
todrop = [i for i in df.columns if 'delta' in i]
df.drop(columns=todrop,inplace=True)
print(df)

output

   x
0  1
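
Since the question asks for something general, the same idea can be wrapped in a function. This is only a sketch: the name drop_columns_containing and its word parameter are made up for illustration, and it returns a new dataframe instead of using inplace=True, so the original is left untouched.

```python
import pandas as pd


def drop_columns_containing(df, word="delta"):
    """Return a new DataFrame without the columns whose name contains `word`.

    Hypothetical helper, not part of pandas. Column labels are passed
    through str() so that non-string labels do not raise a TypeError.
    """
    to_drop = [col for col in df.columns if word in str(col)]
    return df.drop(columns=to_drop)


df = pd.DataFrame(dict(x=[1], x_delta=[2], delta_y=[3]))
result = drop_columns_containing(df)

print(list(result.columns))  # columns that survived
print(list(df.columns))      # the original frame still has all its columns
```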

CodePudding user response:

Yes, filter can do this. Use the following to remove the columns whose names contain 'delta':

df.filter(regex='^((?!delta).)*$', axis=1)
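
Why the pattern from the question keeps everything: filter applies the regex with re.search, and an unanchored "(?!delta)" succeeds at some position in almost any name (for example right at the start of "x_delta"). Anchoring the pattern forces the check to cover the whole name. The sketch below compares the two; the extra column delta_y and the shorter pattern "^(?!.*delta)" are my additions, not part of the original posts.

```python
import pandas as pd

df = pd.DataFrame(dict(x=[1], x_delta=[2], delta_y=[3]))

# Pattern from the question: the negative lookahead matches the empty string
# at some position in every name, so re.search succeeds and nothing is dropped.
kept_by_question = list(df.filter(regex="(?!delta)").columns)

# Anchored pattern from this answer: every character between ^ and $ must
# sit at a position where "delta" does not start.
kept_by_anchored = list(df.filter(regex=r"^((?!delta).)*$", axis=1).columns)

# Shorter variant (an assumption of mine, same idea): from the start of the
# name, "delta" must not appear anywhere ahead.
kept_by_short = list(df.filter(regex=r"^(?!.*delta)").columns)

print(kept_by_question)
print(kept_by_anchored)
print(kept_by_short)
```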

CodePudding user response:

df[  [col   for col in df   if "delta" not in col]  ]

(The extra spaces are only there to emphasize the individual parts: a list comprehension and the 3 parts in it.)


The explanation:

df itself is an iterable; it iterates over column names: [col for col in df]

Then we add the if "delta" not in col condition into this list comprehension to keep only appropriate columns.
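
A small runnable sketch of the two steps above, assuming every column name is a string (with integer labels, "delta" in col would raise a TypeError). The boolean-mask version at the end is an alternative I'm adding for comparison; it uses df.columns.str.contains and should select the same columns.

```python
import pandas as pd

df = pd.DataFrame(dict(x=[1], x_delta=[2], y=[3]))

# Iterating over a DataFrame yields its column names.
names = [col for col in df]

# Adding the condition keeps only the names without "delta";
# indexing with that list selects those columns.
kept = df[[col for col in df if "delta" not in col]]

# Alternative: build a boolean mask over the column index and invert it.
mask = ~df.columns.str.contains("delta")
kept_mask = df.loc[:, mask]

print(names)
print(list(kept.columns))
print(list(kept_mask.columns))
```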

CodePudding user response:

You can try something like this:

req_cols = [_col for _col in df.columns if 'delta' not in _col]
df = df[req_cols].copy()