I have created the following pandas dataframe:
import pandas as pd
ds = {'col1':['a','/','b','c'], 'col2' : [1,2,3,4]}
df = pd.DataFrame(data=ds)
print(df)
which looks like this:
col1 col2
0 a 1
1 / 2
2 b 3
3 c 4
I have a list of special characters ¬
!"£$£#/ *><@|` defined like this:
import re
chars = '¬`!"£$£#/\ *><@|'
regex = f'[{"".join(map(re.escape, chars))}]'
From the dataframe above, I need to remove only the records for which col1
contains any of the special characters included in the regex.
From the example above, the resulting dataframe should look like this:
col1 col2
0 a 1
1 b 3
2 c 4
Does anyone know how to do it?
CodePudding user response:
You can use contains
to get all rows that contain the regex and then negate:
df[~df.col1.str.contains(regex)]
Result:
col1 col2
0 a 1
2 b 3
3 c 4