I have created a pandas dataframe called df using this code:
import numpy as np
import pandas as pd
ds = {'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']}
df = pd.DataFrame(data=ds)
The dataframe looks like this:
print(df)
  col1 col2
0    1    A
1   3/   !B
2    4   @C
The columns contain some special characters (for example / and @) that I need to replace with a blank space.
Now, I have a list of special characters:
listOfSpecialChars = '¬`!"£$£#/,. *><@|"'
How can I replace any of the special characters listed in listOfSpecialChars with a blank space, wherever they occur in the dataframe, in any column?
At the moment I am dealing with a 100K-record dataframe with 560 columns, so I can't write a piece of code for each variable.
CodePudding user response:
You can use apply with str.replace:
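import re
# Escape each character so it is matched literally inside the character class
chars = ''.join(map(re.escape, listOfSpecialChars))
# Option 1: apply str.replace column by column
# (pass ' ' instead of '' to substitute a space rather than remove the character)
df2 = df.apply(lambda c: c.str.replace(f'[{chars}]', '', regex=True))
# Option 2: stack into a single Series, replace once, then unstack
df2 = df.stack().str.replace(f'[{chars}]', '', regex=True).unstack()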
Output:
  col1 col2
0    1    A
1    3    B
2    4    C
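If you would rather not build a regex at all, one possible alternative (a sketch, not part of the answer above) is str.translate with a table from str.maketrans. It assumes every column holds strings, as in the sample data; non-string columns would need to be skipped first. The names to_space, to_none, df_spaces and df_removed are made up for illustration.

```python
import pandas as pd

# Sample data and character list copied from the question.
ds = {'col1': ['1', '3/', '4'], 'col2': ['A', '!B', '@C']}
df = pd.DataFrame(data=ds)
listOfSpecialChars = '¬`!"£$£#/,. *><@|"'

# Translation table mapping every listed character to a single space
# (hypothetical name; duplicates in the list are harmless).
to_space = str.maketrans({ch: ' ' for ch in listOfSpecialChars})

# Translation table that deletes the listed characters instead.
to_none = str.maketrans('', '', listOfSpecialChars)

# Apply column by column through the .str accessor.
# Assumption: all columns contain strings.
df_spaces = df.apply(lambda c: c.str.translate(to_space))
df_removed = df.apply(lambda c: c.str.translate(to_none))

print(df_spaces)
print(df_removed)
```

df_spaces keeps the cell lengths unchanged, so '3/' becomes '3 ', which is what the question literally asks for. df_removed should match the output shown above.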