I've got a list of strings.
input=['XX=BB|3|3|1|1|PLP|KLWE|9999|9999', 'XX=BB|3|3|1|1|2|PLP|KPOK|99999|99999', '999|999|999|9999|999', ....]
Strings of the form '999|999|999|9999|999' must remain unchanged.
I need to replace 9999|9999 with 12|21.
I wrote (?<=BB\|\d\|\d\|\d\|\d\|\S{3}\|\S{4}\|)9{2,9}\|9{2,9} to match 9999|9999. However, there can be 4 to 6 occurrences of \|\d in the middle. How can I match the \|\d pattern a variable number of times?
Desired result:
['XX=BB|3|3|1|1|PLP|KLWE|12|21', 'XX=BB|3|3|1|1|2|PLP|KPOK|12|21', '999|999|999|9999|999'...]
thanks
CodePudding user response:
I would just use re.sub here and search for the pattern \b9{2,9}\|9{2,9}\b:
import re

# Note the comma between the two strings; without it Python concatenates them into one.
inp = ["XX=BB|3|3|1|1|PLP|KLWE|9999|9999", "XX=BB|3|3|1|1|2|PLP|KPOK|99999|99999"]
output = [re.sub(r'\b9{2,9}\|9{2,9}\b', '12|21', i) for i in inp]
print(output)
# ['XX=BB|3|3|1|1|PLP|KLWE|12|21', 'XX=BB|3|3|1|1|2|PLP|KPOK|12|21']
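One caveat worth checking before adopting this shorter pattern: it is not tied to the BB prefix, so it also rewrites runs of 9s in strings that the question says must stay unchanged. A minimal sketch using the third sample string from the question (the variable names are illustrative):

```python
import re

# Pattern from the answer above: any two pipe-separated runs of 9s.
loose = r'\b9{2,9}\|9{2,9}\b'

# Third sample string from the question, which is supposed to stay as it is.
untouched = '999|999|999|9999|999'

result = re.sub(loose, '12|21', untouched)
print(result)
# 12|21|12|21|999
```

If such strings can occur in the input, the pattern needs some left-hand context, such as the BB prefix, to leave them alone.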
CodePudding user response:
You can use
re.sub(r'(BB(?:\|\d){4,6}\|[^\s|]{3}\|[^\s|]{4}\|)9{2,9}\|9{2,9}(?!\d)', r'\g<1>12|21', text)
See the regex demo.
Details:
- (BB(?:\|\d){4,6}\|[^\s|]{3}\|[^\s|]{4}\|) - Capturing group 1:
  - BB - a BB string
  - (?:\|\d){4,6} - four, five or six repetitions of a | followed by any digit
  - \| - a | char
  - [^\s|]{3} - three chars other than whitespace and a pipe
  - \|[^\s|]{4}\| - a |, four chars other than whitespace and a pipe, and then a pipe char
- 9{2,9}\|9{2,9} - two to nine 9 chars, a |, and again two to nine 9 chars
- (?!\d) - not followed by another digit (you may remove this if you do not need to check for the digit boundary here; you may also use (?![^|]) instead if you need to check that there is a | char or the end of string immediately on the right).
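The difference between the two right-hand boundary checks shows up on a hypothetical input where the last run of 9s is followed by a letter. The trailing X below is made up for illustration and does not appear in the question:

```python
import re

# Shared part of the pattern, without any right-hand boundary check.
prefix = r'(BB(?:\|\d){4,6}\|[^\s|]{3}\|[^\s|]{4}\|)9{2,9}\|9{2,9}'
no_digit_after = prefix + r'(?!\d)'    # only forbids a following digit
pipe_or_end = prefix + r'(?![^|])'     # requires | or end of string next

sample = 'XX=BB|3|3|1|1|PLP|KLWE|9999|9999X'  # hypothetical input

loose_result = re.sub(no_digit_after, r'\g<1>12|21', sample)
strict_result = re.sub(pipe_or_end, r'\g<1>12|21', sample)
print(loose_result)   # XX=BB|3|3|1|1|PLP|KLWE|12|21X
print(strict_result)  # XX=BB|3|3|1|1|PLP|KLWE|9999|9999X
```

Which check is appropriate depends on whether anything other than a pipe may legitimately follow the last field.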
The \g<1>12|21 replacement includes an unambiguous backreference to Group 1 (\g<1>) and a 12|21 substring appended to it.
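A capturing group is a natural substitute for the lookbehind from the question because Python's standard-library re module only accepts fixed-width lookbehinds, so a {4,6} quantifier inside (?<=...) is rejected at compile time. A small sketch of both behaviours, reusing the \S{3} and \S{4} classes from the question's own pattern:

```python
import re

# Lookbehind with a variable-width quantifier: re refuses to compile it.
variable_lookbehind = r'(?<=BB(?:\|\d){4,6}\|\S{3}\|\S{4}\|)9{2,9}\|9{2,9}'
try:
    re.compile(variable_lookbehind)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print('rejected:', exc)

# Same left-hand context as a capturing group, restored via \g<1>.
grouped = r'(BB(?:\|\d){4,6}\|\S{3}\|\S{4}\|)9{2,9}\|9{2,9}'
fixed = re.sub(grouped, r'\g<1>12|21', 'XX=BB|3|3|1|1|2|PLP|KPOK|99999|99999')
print(fixed)
# XX=BB|3|3|1|1|2|PLP|KPOK|12|21
```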
See the Python demo:
import re
texts=['XX=BB|3|3|1|1|PLP|KLWE|9999|9999', 'XX=BB|3|3|1|1|2|PLP|KPOK|99999|99999', '999|999|999|9999|999']
pattern = r'(BB(?:\|\d){4,6}\|[^\s|]{3}\|[^\s|]{4}\|)9{2,9}\|9{2,9}(?!\d)'
repl = r'\g<1>12|21'
for text in texts:
    print(re.sub(pattern, repl, text))
Output:
XX=BB|3|3|1|1|PLP|KLWE|12|21
XX=BB|3|3|1|1|2|PLP|KPOK|12|21
999|999|999|9999|999