I am required to use the regex module (`re`). I have written this little program to replace certain regex matches, such as orange, with as many # signs as the word has letters; for example, if orange is in the string, it is replaced with ######.
If a string has been changed, " !! This string has been changed !!" is added to the end of the string.
If a string has not been changed but already has a # in it, " !! This string has been changed !!" is not added.
Is there a more efficient way of coding this, using regex functions and better Python code?
```python
import re

orange = re.compile(r'\borange\b', re.IGNORECASE)
frog = re.compile(r'\bfrog\b', re.IGNORECASE)
cat = re.compile(r'\bcat\b', re.IGNORECASE)

def replace_words(s):  # function name assumed; the original definition line is missing
    num = 0  # set to 1 once any replacement has been made
    if re.search(orange, s):
        s = re.sub(orange, "######", s)
        num = 1
    if re.search(frog, s):
        s = re.sub(frog, "####", s)
        num = 1
    if re.search(cat, s):
        s = re.sub(cat, "###", s)
        num = 1
    if num > 0:
        return s + " !! This string has been changed !!"
    else:
        return s
```
CodePudding user response:
Assuming your input line can contain 'orange', 'frog' and 'cat' at the same time, one particular solution is to create a single regex pattern that matches any of the words, iterate over the matches, replace each match with as many 'x' characters as the matched word has letters, and then print the string, whether it was modified or not.
Code is:
```python
import re

string = "orange frog cat test"
# string = "one two tree testing stackoverflow"
regex_pattern = re.compile(r"\b(orange|frog|cat)\b", re.IGNORECASE)
total_matches = regex_pattern.finditer(string)
# Did we find any of the options? If so, changes will be made
changes_done = regex_pattern.search(string)
for match in total_matches:
    element_find = match.group(0)
    # count=1 replaces only the first remaining match on each pass
    string = regex_pattern.sub("x" * len(element_find), string, count=1)
if changes_done:
    print(string + " | changes were made")
else:
    print(string + " | no changes made")
```
What really shines in this particular solution is the third parameter of `sub`, `count`, which limits how many matches are replaced. As I said, this is one particular solution to your problem.
The output generated for the replacement will be `xxxxxx xxxx xxx test | changes were made`.
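To illustrate the effect of limiting the number of replacements, here is a minimal sketch that compares `sub` with and without `count`. The sample text and the variable names `first_only` and `all_matches` are chosen only for illustration.

```python
import re

pattern = re.compile(r"\b(orange|frog|cat)\b", re.IGNORECASE)
text = "orange frog cat test"  # sample input, chosen for illustration

# count=1 replaces only the first (leftmost) match
first_only = pattern.sub("x", text, count=1)

# count=0 (the default) replaces every match
all_matches = pattern.sub("x", text)

print(first_only)   # x frog cat test
print(all_matches)  # x x x test
```

Passing `count` by keyword makes it harder to confuse with `flags` when the module-level `re.sub` is used instead of a compiled pattern.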
CodePudding user response:
I guess you're using this code inside a function, since you're returning some values.
Anyway, without the `num` counter:
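```python
import re

pattern = r"\b(orange|frog|cat)\b"
s = "an orange eaten by a frog and a cat"

# findall returns the matched words, since the pattern has a single group
rgx_matches = re.findall(pattern, s, flags=re.IGNORECASE)
for rgx_match in rgx_matches:
    # Replace each matched word with '#' characters of the same length
    s = re.sub(rf"\b{re.escape(rgx_match)}\b", "#" * len(rgx_match), s)
if rgx_matches:
    s += " !! This string has been changed !!"
print(s)
```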
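Since the question's code returns from a function, the same single-pattern idea can also be sketched as one function. This is only one possible approach, not the asker's method: it passes a replacement callable to `subn`, which returns both the new string and the number of substitutions made, so no separate search or counter is needed. The names `WORDS`, `SUFFIX` and `mask_words` are hypothetical.

```python
import re

# Hypothetical names: WORDS, SUFFIX and mask_words are not from the question.
WORDS = re.compile(r"\b(?:orange|frog|cat)\b", re.IGNORECASE)
SUFFIX = " !! This string has been changed !!"


def mask_words(s):
    """Replace each listed word with '#' repeated to the word's length."""
    # subn returns a tuple: (new_string, number_of_substitutions_made)
    masked, n_subs = WORDS.subn(lambda m: "#" * len(m.group()), s)
    # Append the suffix only if at least one replacement actually happened
    return masked + SUFFIX if n_subs else masked


print(mask_words("an Orange eaten by a frog and a cat"))
print(mask_words("nothing to hide # here"))
```

Because the decision is based on the substitution count rather than on the presence of `#`, a string that already contains `#` but none of the words is returned unchanged.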