I am working on a text mining project in sentiment analysis. Some sentences make better sense when they keep ! and ?. I am using a regular expression technique:
text = re.sub("[^a-zA-Z]", ' ', text)
where text is a list of strings.
With this, all punctuation is removed, but I would like to keep ! and ? and remove the rest.
CodePudding user response:
You can include the ? and ! characters in the negated character class of your regular expression:
text = re.sub("[^a-zA-Z!?]", ' ', text)
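Since text is described as a list of strings, and re.sub expects a single string, the substitution probably has to be applied to each element. Here is a minimal sketch of that, assuming Python 3; the names KEEP_PATTERN and clean_texts and the sample sentences are made up for illustration and are not from the question:

```python
import re

# Negated character class: match anything that is NOT an ASCII letter, '!' or '?'.
# Inside [...] the characters '!' and '?' are literal and need no escaping.
KEEP_PATTERN = re.compile(r"[^a-zA-Z!?]")

def clean_texts(texts):
    """Return a new list where every unwanted character is replaced by a space."""
    return [KEEP_PATTERN.sub(" ", t) for t in texts]

# Hypothetical sample data.
sample = ["Great movie!!", "Really? It cost $5, sadly."]
cleaned = clean_texts(sample)
print(cleaned)
```

Each removed character becomes one space, so runs of spaces can appear where several characters were dropped in a row. If that matters, calling " ".join(s.split()) on each cleaned string collapses them.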