I would like some code to remove special characters such as ? from the values in a column of a dataframe.
I would like to remove from
()?()()() hello world
the () and the ?.
However, when I use the following code, Community_final_level_2_description = Community_final_level_2_description.replace('()',' ', regex=True) or Community_final_level_2_description.replace('?',' ', regex=True), it does not work and an error is raised.
Can someone provide the proper code?
CodePudding user response:
(, ) and ? are special characters in regex. Try escaping them:
df["column"] = df["column"].replace(r"\?|\(\)", "", regex=True)
print(df)
Prints:
column
0 hello world
Dataframe used:
column
0 ()?()()() hello world
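To illustrate why the unescaped patterns from the question fail, here is a minimal sketch using only Python's standard re module. The variable names (text, tokens, pattern, cleaned) are illustrative and not taken from the question. It also shows re.escape, which builds the escaped pattern so the backslashes do not have to be written by hand.

```python
import re

text = "()?()()() hello world"

# An unescaped "?" is a quantifier with nothing before it to repeat,
# so compiling it as a regex raises re.error.
try:
    re.compile("?")
    unescaped_error = None
except re.error as exc:
    unescaped_error = exc

# re.escape backslash-escapes regex metacharacters in each literal token.
tokens = ["()", "?"]
pattern = "|".join(re.escape(token) for token in tokens)

# Remove every occurrence of the tokens, then trim the leftover space.
cleaned = re.sub(pattern, "", text).strip()
print(cleaned)
```

The resulting pattern is equivalent to the hand-escaped one above, so it should be usable in the same df["column"].replace(pattern, "", regex=True) call.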
CodePudding user response:
If one has a dataframe df that looks like the following
A B
0 ()?()()() hello world ()?()()() hello world
1 ()?()()() hello world ()?()()() hello world
2 ()?()()() hello world ()?()()() hello world
And one wants to remove the () and ? from every column, one can use pandas.DataFrame.apply with a custom lambda function as follows
df_new = df.apply(lambda x: [i.replace('()', '').replace('?', '') for i in x])
[Out]:
A B
0 hello world hello world
1 hello world hello world
2 hello world hello world
Notes:
- There are strong opinions on using .apply, so one might want to check this: When should I (not) want to use pandas apply() in my code?
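If one prefers to avoid .apply, a possible alternative is to call pandas.DataFrame.replace on the whole dataframe with an escaped regex. This is only a sketch: the dataframe below is rebuilt from the example above, and it assumes every column holds strings.

```python
import pandas as pd

# Rebuild the example dataframe (assumed to contain only string columns).
df = pd.DataFrame({
    "A": ["()?()()() hello world"] * 3,
    "B": ["()?()()() hello world"] * 3,
})

# With regex=True the pattern is applied to every string cell;
# the parentheses and the question mark are escaped to match literally.
df_new = df.replace(r"\(\)|\?", "", regex=True)
print(df_new)
```

Note that the space before "hello" is not part of the pattern, so it remains in each value; it can be trimmed afterwards if needed.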