How to drop pandas dataframe columns containing special characters

Time:04-06

How do I drop pandas dataframe columns that contain special characters such as @ / ] [ } { - _ etc.?

For example I have the following dataframe (called df):

[image of the example dataframe; not reproduced here]

I need to drop the columns Name and Matchkey because they contain some special characters. Also, how can I specify a list of special characters based on which the columns will be dropped?

For example: I'd like to drop the columns that contain (in any record, in any cell) any of the following special characters:

listOfSpecialCharacters: ¬,`,!,",£,$,£,#,/,\

CodePudding user response:

One option is to use a regex with str.contains and apply, then use boolean indexing to drop the columns:

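import re

# characters that disqualify a column
chars = '¬`!"£$£#/\\'
# escape each character and wrap them all in a regex character class
regex = f'[{"".join(map(re.escape, chars))}]'
# '[¬`!"£\\$£\\#/\\\\]'

# keep only the columns in which no cell matches the regex
# (.str requires string columns; cast with astype(str) first if dtypes are mixed)
df2 = df.loc[:, ~df.apply(lambda c: c.str.contains(regex).any())]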

example:

# input
     A    B    C
0  123  12!  123
1  abc  abc  a¬b

# output
     A
0  123
1  abc
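
For reference, here is a self-contained sketch of the same approach that can be run as is. The helper name drop_columns_with_chars and the extra numeric column D are hypothetical additions for illustration, not part of the answer above. The sketch assumes it is acceptable to cast every cell to a string before matching, so that numeric or missing values do not raise an error with the .str accessor:

```python
import re

import pandas as pd


def drop_columns_with_chars(df: pd.DataFrame, chars: str) -> pd.DataFrame:
    """Return df without the columns in which any cell contains one of chars.

    Hypothetical helper for illustration; the name is not a pandas API.
    """
    # Escape each character and wrap them all in a regex character class
    pattern = f"[{''.join(map(re.escape, chars))}]"
    # Assumption: cast every cell to str so non-string columns can be searched
    has_special = df.apply(
        lambda col: col.astype(str).str.contains(pattern, regex=True).any()
    )
    # Boolean indexing on the column axis keeps only the "clean" columns
    return df.loc[:, ~has_special]


df = pd.DataFrame(
    {
        "A": ["123", "abc"],
        "B": ["12!", "abc"],
        "C": ["123", "a¬b"],
        "D": [1, 2],  # hypothetical non-string column with no special characters
    }
)

df2 = drop_columns_with_chars(df, '¬`!"£$#/\\')
print(df2.columns.tolist())  # ['A', 'D']
```

Because re.escape also escapes characters such as ], ^ and -, the same helper should work for the other characters mentioned in the question, for example drop_columns_with_chars(df, '@/][}{-_').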