Is there a way to replace part of the matched pattern using a single re.sub()
line?
What I would like to avoid is calling a string replace method on the output of the
current re.sub().
Input = "/J&L/LK/Tac1_1/shareloc.pdf"
Current output using re.sub("[^0-9_]", "", input): "1_1"
Desired output in a single re.sub use: "1.1"
CodePudding user response:
According to the documentation, re.sub is defined as
re.sub(pattern, repl, string, count=0, flags=0)
If repl is a function, it is called for every non-overlapping occurrence of pattern.
That said, if you pass a lambda function, you can keep the code on one line. Furthermore, remember that the matched text is available inside the function as x[0].
I removed _ from the regex so that the underscore is matched as well and can be replaced with a dot, which gives the desired output.
import re
txt = "/J&L/LK/Tac1_1/shareloc.pdf"
# Replace "_" with "." and drop every other non-digit character
x = re.sub("[^0-9]", lambda x: '.' if x[0] == '_' else '', txt)
print(x)
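The same idea can be written with a named function and a lookup table, which may be easier to extend if more characters need their own replacement. This is only a sketch: the names REPLACEMENTS and replace_char are hypothetical and not part of the answer above.

```python
import re

# Hypothetical mapping: characters that should be rewritten rather than dropped
REPLACEMENTS = {"_": "."}

def replace_char(match):
    # match[0] is the single non-digit character that was matched;
    # anything not listed in REPLACEMENTS is removed
    return REPLACEMENTS.get(match[0], "")

txt = "/J&L/LK/Tac1_1/shareloc.pdf"
result = re.sub(r"[^0-9]", replace_char, txt)
print(result)  # 1.1
```

Note that, like the lambda version, this turns every underscore in the input into a dot, not only the one between the digits.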
CodePudding user response:
There is no way to use a string replacement pattern in Python re.sub to replace with two possible strings, as Python re.sub does not support a conditional replacement construct. So, either use a callable as the replacement argument or use another workaround.
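As a rough sketch of the callable route, the pattern below captures an underscore only when it sits between two digits and lets the callable decide the replacement based on whether that group took part in the match. The function name keep_digits is hypothetical, and the assumption that only underscores between digits should become dots is mine, not the asker's.

```python
import re

def keep_digits(text):
    # Alternative 1: an underscore with a digit on both sides (captured as group 1)
    # Alternative 2: any other non-digit character (group 1 stays None)
    return re.sub(
        r"(?<=\d)(_)(?=\d)|\D",
        lambda m: "." if m.group(1) else "",
        text,
    )

print(keep_digits("/J&L/LK/Tac1_1/shareloc.pdf"))  # 1.1
```

Underscores elsewhere in the path are simply removed, so an input such as "my_dir/Tac1_1/x.pdf" would still give "1.1".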
It looks like you only expect one match of <DIGITS>_<DIGITS> in the input string. In this case, you can use
import re
text = "/J&L/LK/Tac1_1/shareloc.pdf"
print(re.sub(r'^.*?(\d+)_(\d+).*', r'\1.\2', text, flags=re.S))
# => 1.1
Details:
^ - start of string
.*? - zero or more chars as few as possible
(\d+) - Group 1: one or more digits
_ - a _ char
(\d+) - Group 2: one or more digits
.* - zero or more chars as many as possible.
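The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE. The following is a minimal sketch, with PATTERN and to_dotted as hypothetical names; it also shows that re.sub returns the input unchanged when the pattern does not match, which may or may not be what you want.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each part carries a comment
PATTERN = re.compile(
    r"""
    ^.*?      # start of string, then as few characters as possible
    (\d+)     # group 1: one or more digits
    _         # a literal underscore
    (\d+)     # group 2: one or more digits
    .*        # the rest of the string
    """,
    re.S | re.X,
)

def to_dotted(text):
    # Replace the whole string with "<group 1>.<group 2>"
    return PATTERN.sub(r"\1.\2", text)

print(to_dotted("/J&L/LK/Tac1_1/shareloc.pdf"))  # 1.1
print(to_dotted("/no/digits/here.pdf"))          # /no/digits/here.pdf
```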