Let's assume I have the sentence:
sentence = "A cow runs on the grass"
If I want to replace the word cow
with "some" special token, I can do:
to_replace = "cow"
# A <SPECIAL> runs on the grass
sentence = re.sub(rf"(?!\B\w)({re.escape(to_replace)})(?<!\w\B)", "<SPECIAL>", sentence, count=1)
Additionally, if I want to replace it's plural form, I could do:
sentence = "The cows run on the grass"
to_replace = "cow"
# Influenza is one of the respiratory <SPECIAL>
sentence = re.sub(rf"(?!\B\w)({re.escape(to_replace) 's?'})(?<!\w\B)", "<SPECIAL>", sentence, count=1)
which does the replacement even if the word to replace remains in its singular form cow
, while the s?
does the job to perform the replacement.
My question is what happens if I want to apply the same in a more general way, i.e., find-and-replace words which can be singular, plural - ending with s
, and also plural - ending with es
(note that I'm intentionally ignoring many edge cases that could appear - discussed in the comments of the question). Another way to frame the question would be how can add multiple optional ending suffixes to a word, so that it works for the following examples:
to_replace = "cow"
sentence1 = "The cow runs on the grass"
sentence2 = "The cows run on the grass"
# --------------
to_replace = "gas"
sentence3 = "There are many natural gases"
CodePudding user response:
I suggest using regular python logic, remember to avoid stretching regexes too much if you don't need to:
phrase = "There are many cows in the field cowes"
for word in phrase.split():
if word == "cow" or word == "cow" "s" or word == "cow" "es":
phrase = phrase.replace(word, "replacement")
print(phrase)
Output:
There are many replacement in the field replacement
CodePudding user response:
Apparently, for the use-case I posted, I can make the suffix optional. So it could go as:
re.sub(rf"(?!\B\w)({re.escape(e_obj) '(s|es)?'})(?<!\w\B)", "<SPECIAL>", sentence, count=1)
Note that this would not work for many edge cases discussed in the comments!