Remove records from pandas Dataframe subject to condition

Time:07-22

I have created the following pandas dataframe:

import pandas as pd

ds = {'col1':['a','/','b','c'], 'col2' : [1,2,3,4]}

df = pd.DataFrame(data=ds)
print(df)

which looks like this:

  col1  col2
0    a     1
1    /     2
2    b     3
3    c     4

I have a list of special characters ¬`!"£$£#/\ *><@| defined like this:

import re

chars = r'¬`!"£$£#/\ *><@|'  # raw string, so the backslash is literal and '\ ' is not an invalid escape
regex = f'[{"".join(map(re.escape, chars))}]'

From the dataframe above, I need to remove only the records for which col1 contains any of the special characters included in the regex.

From the example above, the resulting dataframe should look like this:

  col1  col2
0    a     1
1    b     3
2    c     4

Does anyone know how to do it?

CodePudding user response:

You can use str.contains to build a boolean mask of the rows whose col1 matches the regex, then negate the mask with ~ to keep only the other rows:

df[~df.col1.str.contains(regex)]

Result:

  col1  col2
0    a     1
2    b     3
3    c     4
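
Note that the filtered frame keeps the original index labels (0, 2, 3), whereas the expected output in the question is numbered 0, 1, 2. A minimal end-to-end sketch that also renumbers the rows might look like the following. The names mask and result are only illustrative, and na=False is an assumption that rows with missing values in col1 should be kept rather than raise an error:

```python
import re
import pandas as pd

df = pd.DataFrame({'col1': ['a', '/', 'b', 'c'], 'col2': [1, 2, 3, 4]})

# Raw string so the backslash is taken literally.
chars = r'¬`!"£$£#/\ *><@|'
# Escape each character and wrap them all in one character class.
regex = f'[{"".join(map(re.escape, chars))}]'

# True where col1 contains at least one special character.
# na=False (an assumption) treats missing values as "no match".
mask = df['col1'].str.contains(regex, regex=True, na=False)

# Keep the non-matching rows and renumber the index from 0.
result = df[~mask].reset_index(drop=True)
print(result)
```

Passing drop=True to reset_index discards the old index instead of adding it back as a new column.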