I've been wondering how to remove the special character '^' from a Python string; it doesn't seem to be treated like the other special characters.
I was trying to remove special characters from a dataframe column using the code below:
def remove_special_characters(text, remove_digits=True):
    text = re.sub(r'[^a-zA-z0-9\s] ', '', text)
    return text

df['review'] = df['review'].apply(remove_special_characters)
But the symbol '^' still appears in my data. Do you know of some code to remove it, please?
CodePudding user response:
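You can escape special characters with a backslash. An unescaped '^' placed first inside [...] negates the character class, whereas '\^' matches a literal caret:
r'\^'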
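Two details of the pattern in the question are worth checking, since either one alone keeps '^' in the output. The range A-z spans every code point from 'A' to 'z', which includes the caret, so the negated class treats '^' as a character to keep. The pattern also ends with a space, so it only matches a special character that is immediately followed by a space. A minimal sketch of both effects, where the "fixed" pattern (A-Z, no trailing space) is an assumption about what was intended:

```python
import re

# Characters that sit between 'Z' and 'a' and are therefore inside the range A-z.
between = [chr(c) for c in range(ord('Z') + 1, ord('a'))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

original = r'[^a-zA-z0-9\s] '  # pattern from the question, trailing space included
fixed = r'[^a-zA-Z0-9\s]'      # assumed intent: A-Z range, no trailing space

result_original = re.sub(original, '', 'bat^tle')
result_fixed = re.sub(fixed, '', 'bat^tle')

print(result_original)  # bat^tle
print(result_fixed)     # battle
```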
But if you only need to delete a fixed set of characters, that use case is already covered by str.translate(), without any need to resort to power tools like regexes.
https://docs.python.org/3/library/stdtypes.html#str.maketrans
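As a rough sketch of the translate() route: the third argument of str.maketrans lists characters to delete. The names CHARS_TO_DROP, DROP_TABLE and drop_chars are hypothetical, chosen only for this illustration.

```python
# Characters to delete; extend this string as needed (hypothetical name).
CHARS_TO_DROP = '^'

# Build the table once; the third argument maps each listed character to None.
DROP_TABLE = str.maketrans('', '', CHARS_TO_DROP)

def drop_chars(text):
    """Return text with every character in CHARS_TO_DROP removed."""
    return text.translate(DROP_TABLE)

print(drop_chars('bat^tle'))  # battle
```

With the dataframe from the question, this could presumably be applied as df['review'].apply(drop_chars).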
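You're also paying a per-call cost for the pattern: re.sub() has to look the pattern up in the module's cache (and compile it on a miss) for each of the N rows, when compiling it a single time would suffice. Consider defining this:
pattern = re.compile(r'[^a-zA-Z0-9\s]')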
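One way the compiled pattern might be wired into the function from the question is sketched below. The constant name SPECIAL_CHARS is hypothetical, and the character class used here (A-Z, no trailing space) is an assumption about the intended behavior rather than a copy of the question's pattern.

```python
import re

# Compiled once at import time instead of on every call.
# Assumed intent: keep ASCII letters, digits and whitespace, drop everything else.
SPECIAL_CHARS = re.compile(r'[^a-zA-Z0-9\s]')

def remove_special_characters(text):
    """Return text with all characters outside the kept set removed."""
    return SPECIAL_CHARS.sub('', text)

cleaned = [remove_special_characters(s) for s in ['bat^tle', 'a_b [c] 1^2']]
print(cleaned)  # ['battle', 'ab c 12']
```

The function keeps the same name as in the question, so df['review'].apply(remove_special_characters) should work unchanged.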
CodePudding user response:
It's actually still not working. What I'm looking for is a way to get rid of the character '^' in the data; for example, I want the string 'bat^tle' to become 'battle'.