I have to replace a word in a given file. The problem is that the same text may appear inside another word, where it must not be replaced, so string.replace()
can't be used here. The word may also be adjacent to punctuation such as ".,;:!?".
For example, the given file contains: Bobtail has a tail.
Every word "tail" must be replaced with "head", so the expected result here is: Bobtail has a head.
CodePudding user response:
The easiest solution, which may be good enough for you, is to pad the word you want to replace with a space in the .replace()
call.
Using your example:
text = "Bobtail has a tail."
print(text.replace("tail", "head"))    # Bobhead has a head.
print(text.replace(" tail", " head"))  # Bobtail has a head.
The next step up is to use a regular expression to find the strings to replace. This is a bit more complex and case specific, so you may want to use something like RegExr to try and build one.
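To see where the space-padding trick stops being good enough, here is a small sketch. The sample sentence is made up for illustration; it puts the word at the start of the string and directly before punctuation.

```python
text = "tail first, then a tail. Bobtail keeps its tail!"

# A leading space protects "Bobtail", but misses the word at the very start.
padded = text.replace(" tail", " head")
print(padded)  # tail first, then a head. Bobtail keeps its head!

# Padding on both sides misses every occurrence here, because each "tail"
# is followed by punctuation or sits at the start of the string.
both = text.replace(" tail ", " head ")
print(both == text)  # True
```

If your input never starts with the word and you only pad on the left, the simple approach may still be enough; otherwise a regular expression is the safer choice.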
CodePudding user response:
For searches like this you need a regular expression. In Python, this is done with the re module. Its search()
function only finds the first match in a string, so for replacing it is simpler to use sub(),
which rewrites every match. Assuming you know how to loop over the lines of a file, your solution looks like this:
import re

# "tail" preceded by start of line, whitespace or punctuation,
# and followed by whitespace, punctuation or end of line
pattern = re.compile(r"(^|[\s.,;:!?])tail(?=[\s.,;:!?]|$)")
result = []
with open("some_file.txt") as f:
    for line in f:
        # \1 puts back the delimiter that was matched in front of the word
        result.append(pattern.sub(r"\1head", line))
You can practice more with regular expressions here: https://regexr.com/
For simplicity, I didn't include every special character. Note that outside a character class, characters such as . and ? must be escaped with a backslash. Use \s
for whitespace.
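The delimiter idea can also be written with lookarounds, so that the delimiters are checked but not consumed. The sketch below is one possible way to do it, not the answer's exact code: the function name replace_whole_word and the DELIMS constant are hypothetical, and the delimiter set is just the punctuation listed in the question plus whitespace.

```python
import re

# Assumed delimiter set: whitespace plus the punctuation from the question.
DELIMS = r"\s.,;:!?"


def replace_whole_word(text, old, new):
    """Replace `old` with `new` only where it is delimited on both sides."""
    # (?<![^...]) succeeds at the start of the string or after a delimiter;
    # (?![^...]) succeeds at the end of the string or before a delimiter.
    pattern = rf"(?<![^{DELIMS}]){re.escape(old)}(?![^{DELIMS}])"
    # A function replacement inserts `new` literally, without backslash handling.
    return re.sub(pattern, lambda match: new, text)


print(replace_whole_word("Bobtail has a tail.", "tail", "head"))
# Bobtail has a head.
```

Because the lookarounds do not consume the delimiters, two occurrences separated by a single space ("tail tail") are both replaced, which a pattern that matches the delimiter on each side would not manage.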
CodePudding user response:
As Dan P mentioned, what you are looking for is Python's re module, in particular its sub function.
Take this string for example:
s = "Bobtail has a !!tail.!! and the ..tail> is just a part of Bobtails' body"
Using the regex word boundary anchor \b:
resulting_string = re.sub(r'\btail\b', 'head', s)
"Bobtail has a !!head.!! and the ..head> is just a part of Bobtails' body"
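To also eliminate the surrounding special characters, you could go for something more complex in the regex pattern, like the one below. It works for this sample, but note that \W also matches a space, so a plain " tail" would lose its leading space:
resulting_string = re.sub(r'\W\S?tail\S*', 'head', s)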
"Bobtail has a head and the head is just a part of Bobtails' body"
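Since the question is about a file, here is a minimal sketch that applies the \b pattern to a whole file and writes the result back. The function name replace_word_in_file and the UTF-8 default are assumptions made for illustration, and the sketch assumes the word starts and ends with a letter, digit or underscore, since \b only works at such boundaries.

```python
import re
from pathlib import Path


def replace_word_in_file(path, old, new, encoding="utf-8"):
    """Replace whole-word occurrences of `old` with `new` in the file at `path`.

    Returns the number of replacements made.
    """
    file_path = Path(path)
    text = file_path.read_text(encoding=encoding)
    # re.escape keeps characters in `old` from being read as regex syntax;
    # \b anchors the match at a word boundary on both sides.
    pattern = rf"\b{re.escape(old)}\b"
    # A function replacement inserts `new` literally (no backslash processing).
    new_text, count = re.subn(pattern, lambda match: new, text)
    if count:
        # Only rewrite the file when something actually changed.
        file_path.write_text(new_text, encoding=encoding)
    return count
```

A call such as replace_word_in_file("some_file.txt", "tail", "head") would then rewrite the file in place and return how many words were changed. The file is read fully into memory, which should be fine for ordinary text files.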