What's the proper regex for removing "?()"

Time:07-15

I am trying to remove ?, ( ) and " from the string below. My initial matches are working, but these three are not matching. What is the issue here?

data = "(The rain - :in  ,Spain?)"

regexpat =  re.sub(r"'|,|;|.|:|/?|-|(|)", "", name)

I need all the special chars removed or replaced with blank

I tried with

-|:|\)|\?|

but substitution is breaking with this output *

*"*(*T*h*e* *r*a*i*n* ** **i*n* * *,*S*p*a*i*n***"*
*

Expected Output is

The rain in Spain

trying here - https://regex101.com/r/Ozsnzv/1

CodePudding user response:

import re

data = "(The rain - :in  ,Spain?)"

regexpat = re.sub(r"[',;.:/?()-]", "", data)
regexpat = re.sub(r"\s+", " ", regexpat)

print(regexpat)  # The rain in Spain

CodePudding user response:

You can use a character class to shorten the pattern. In your alternation the unescaped dot matches any character, so it has to be escaped (or placed in a character class) to match literally. The same goes for ( and ), which otherwise form a group.

Using /? in an alternation matches an optional /, but you can just add the / to the character class. If you also meant to match ?, you can add that one as well.

Then collapse any runs of multiple spaces into single spaces.

Note that the call to re.sub should use data instead of name.

import re

data = "(The rain - :in  ,Spain?)"
regexpat = re.sub(r"[',;.:/?()-]+", "", data)

print(' '.join(regexpat.split()))

Output

The rain in Spain
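
To see why the patterns in the question misbehave, here is a minimal sketch. The shortened test string "a-b" and the variable names are only illustrative assumptions; the patterns are trimmed versions of the ones in the question.

```python
import re

data = "(The rain - :in  ,Spain?)"

# An unescaped dot matches any character except a newline, so this
# alternation consumes the whole string.
dot_result = re.sub(r"'|,|;|.", "", data)
print(repr(dot_result))  # ''

# A trailing "|" adds an empty alternative, so the pattern also matches
# the empty string. That is why "*" shows up between the characters.
empty_matches = re.findall(r"-|:|\)|\?|", "a-b")
print(empty_matches)

# Inside a character class ".", "?", "(" and ")" are literal, and "-"
# is literal when it comes last.
class_result = re.sub(r"[',;.:/?()-]+", "", data)
print(" ".join(class_result.split()))  # The rain in Spain
```

Dropping the trailing | from the second attempt, or switching to the character class, avoids the empty matches.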


CodePudding user response:

A long shot, but you seem to want to remove all punctuation. One way is to use the punctuation constant from the standard-library string module:

import string
data = "(The rain - :in  ,Spain?)"
regexpat  = data.translate(str.maketrans('', '', string.punctuation))
print(' '.join(regexpat.split()))

Prints:

The rain in Spain

Note: string.punctuation includes !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ and may therefore be too extensive compared to your current character class. If, however, this is what you are after, it seems to be faster than regex.
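
If the full punctuation set is too broad, the same translate approach should also work with a hand-picked set. A minimal sketch, assuming you only want the characters from the character class used in the other answers (chars_to_remove is an illustrative name):

```python
data = "(The rain - :in  ,Spain?)"

# Assumed subset: the same characters as the regex character class.
chars_to_remove = "',;.:/?()-"

# The third argument of str.maketrans lists characters to delete.
table = str.maketrans("", "", chars_to_remove)

# Delete the characters, then collapse runs of whitespace.
cleaned = " ".join(data.translate(table).split())
print(cleaned)  # The rain in Spain
```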
