Python RegEx substitution: representing an asterisk (repeated) expression in the replacement

Time:11-21

I'm attempting to put a "-" in front of each item in a list. For example:

Note: V here is my marker for the beginning and end of the list. (The list is part of a longer text.)

V
Apple
Banana
Orange fruit
V

to

V
- Apple
- Banana
- Orange fruit
V

I managed to match the list with this pattern:

V\n([A-Za-z]+( [A-Za-z]+)*)\r\n(([A-Za-z]+( [A-Za-z]+)*\r\n)*)\nV

But I'm having trouble with the replacement value. I was thinking of this approach:

V- \1\n- \2\nV

but that only puts the dash on the first item and basically deletes the rest of the list. Something like this:

V
- Apple
- 
V

I'm using re.sub, by the way.

CodePudding user response:

You can use a replacement function:

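import re

s = 'V\nApple\nBanana\nOrange fruit\nV'
# Match the first letter of every line after the first; a leading 'V' is returned
# unchanged (so the closing marker gets no dash), any other letter gets a '-' prefix.
r = re.sub(r'(?<=\n)[a-zA-Z]', lambda x: v if (v := x.group()) == 'V' else f'-{v}', s)
print(r)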

Output:

V
-Apple
-Banana
-Orange fruit
V
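Two things to watch with the lookbehind pattern above: it emits "-Apple" without the space asked for in the question, and any item that merely starts with "V" is skipped as if it were the marker. A minimal sketch of a variant follows, assuming "\n" line endings and that the whole string is the list. The sample item "Vanilla" is made up for illustration. It inserts "- " at the start of every non-empty line that is not exactly "V":

```python
import re

# 'Vanilla' is a hypothetical item added to show that only a bare 'V' line is skipped.
s = 'V\nApple\nVanilla\nOrange fruit\nV'

# (?m)    -> ^ and $ match at the start/end of every line
# (?!V$)  -> skip lines that consist of 'V' alone (the markers)
# (?=.)   -> skip empty lines
# The match is zero-width, so the replacement is inserted rather than substituted.
r = re.sub(r'(?m)^(?!V$)(?=.)', '- ', s)
print(r)
```

Like the original one-liner, this touches every line in the string, so it only fits when the text outside the V markers has already been cut away.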

CodePudding user response:

You can do:

import re

txt = '''\
Y
dog
Y
V
Apple
Banana
Orange fruit
V'''

def rfunc(m):
    # Keep the V markers (groups 1 and 3) and prefix each line between them with '- '.
    return m.group(1) + '\n'.join(f'- {line}' for line in m.group(2).splitlines()) + m.group(3)

print(
    re.sub(r'(^V$\n)([\s\S]*?)(\n^V$)', rfunc, txt, flags=re.M)
)

Prints:

Y
dog
Y
V
- Apple
- Banana
- Orange fruit
V
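As for why the replacement string in the question loses items: groups are numbered by their opening parenthesis, so \2 is the nested "second word" group inside the first item, not the rest of the list. A starred group also yields a single capture, so a fixed replacement string cannot add one dash per line. The sketch below illustrates this with a simplified form of the question's pattern, assuming "\n" line endings (the original used "\r\n"):

```python
import re

text = 'V\nApple\nBanana\nOrange fruit\nV'

# Simplified from the question's pattern; assumes '\n' line endings.
pattern = r'V\n([A-Za-z]+( [A-Za-z]+)*)\n(([A-Za-z]+( [A-Za-z]+)*\n)*)V'
m = re.search(pattern, text)

print(repr(m.group(1)))  # 'Apple' -> the first item
print(repr(m.group(2)))  # None -> nested group in the first item never matched
print(repr(m.group(3)))  # 'Banana\nOrange fruit\n' -> all remaining items as one chunk
```

Since the remaining items arrive as one chunk, a replacement function that splits that chunk into lines, as in the answer above, is the usual way to prefix each one.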