Regex to remove special character based on a condition

Time:08-14

I am using regex to remove special characters from a string in Python.

import re
txt = "This is a sample text(s). ANother sample line (testing)"

print(re.sub('[^A-Za-z0-9]+', ' ', txt))

Output:

'This is a sample text s ANother sample line testing '

Expected output:

'This is a sample texts ANother sample line testing '

If there is no space between the word and the special character "(", the output should not have a space there either. In the given example, the correct output is "texts" and not "text s".

Any suggestions will be helpful.

CodePudding user response:

"the output should also not have the space"

However, re.sub('[^A-Za-z0-9]+', ' ', txt) says that each run of special characters should be replaced by the "space" character ' '.
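To see where the extra space comes from, it can help to list what the pattern actually matches. This is only a small sketch, and it assumes the pattern in the question is meant to be `[^A-Za-z0-9]+` (with the `+` quantifier), since that is the form that produces the output shown in the question:

```python
import re

txt = "This is a sample text(s). ANother sample line (testing)"

# Every run of non-alphanumeric characters is one match.
# Plain spaces count as "special" here, and so does the "(" in "text(s)".
matches = re.findall(r'[^A-Za-z0-9]+', txt)
print(matches)

# Each matched run is replaced by a single space, so the "(" between
# "text" and "s" turns into a space as well.
replaced = re.sub(r'[^A-Za-z0-9]+', ' ', txt)
print(repr(replaced))
```

The "(" between "text" and "s" is a match of its own, which is why it ends up as a space in the result.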

You can replace with the empty string '' instead, and include the space itself in the character class of "non-special" characters so that existing spaces are not deleted:

>>> import re
>>> txt = "This is a sample text(s). ANother sample line (testing)"
#       add space here V     V replace with nothing
>>> re.sub('[^A-Za-z0-9 ]+', '', txt)
'This is a sample texts ANother sample line testing'
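If this is needed in more than one place, the same idea can be wrapped in a function. The sketch below is an illustration only: the names `strip_special` and `strip_special_collapsed` are hypothetical, and the second helper is an assumption about what you might want when a special character stands between two spaces (deleting it leaves two spaces next to each other).

```python
import re

def strip_special(text):
    """Delete every character that is not an ASCII letter, digit, or space."""
    # The space inside the character class keeps existing spaces intact.
    return re.sub(r'[^A-Za-z0-9 ]+', '', text)

def strip_special_collapsed(text):
    """Same as strip_special, then squeeze repeated spaces and trim the ends."""
    # str.split() with no argument splits on any run of whitespace.
    return ' '.join(strip_special(text).split())

txt = "This is a sample text(s). ANother sample line (testing)"
print(repr(strip_special(txt)))

# "a - b (c)" keeps both spaces around the deleted "-",
# so the plain version leaves a double space behind.
print(repr(strip_special("a - b (c)")))
print(repr(strip_special_collapsed("a - b (c)")))
```

Whether the collapsing step is wanted depends on the data; for the sample text in the question, both functions return the same string.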