Replace with the extract in multiple substrings in Python

Time:08-07

I have the following string:

## This is h2
Paragraph text

### This is h3
#### This is h4

and I want this:

<h2>This is h2</h2>
Paragraph text

<h3>This is h3</h3>
<h4>This is h4</h4>

How could I do it?

With a regex it is easy to find the heading lines, but I don't know how to replace each match with text extracted from it, across multiple matches in the string.

##.*

Any help?

CodePudding user response:

You can use re.sub with a lambda as the replacement:

import re

s = """\
## This is h2
Paragraph text

### This is h3
#### This is h4"""

s = re.sub(
    r"^(#+)\s+(.*)",
    lambda g: "<h{h}>{s}</h{h}>".format(h=len(g.group(1)), s=g.group(2)),
    s,
    flags=re.M,
)

print(s)

Prints:

<h2>This is h2</h2>
Paragraph text

<h3>This is h3</h3>
<h4>This is h4</h4>

CodePudding user response:

One could do tedious substitutions like

  • re.sub(r'^####', '<h4>', line)
  • re.sub(r'^###', '<h3>', line)
  • re.sub(r'^##', '<h2>', line)
  • re.sub(r'^#', '<h1>', line)

but you're probably much better off with m = re.search(r'^(#+)', line), assigning n = len(m.group(1)), and synthesizing f'<h{n}>' from that.

You can close with e.g. </h4> if you like, or perhaps you'd prefer to let /usr/bin/tidy or bs4 soup.prettify() attend to those details.
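
A minimal sketch of the re.search approach described above, applied line by line. The function name heading_to_html is made up for illustration, and the pattern also captures the heading text so the closing tag can be written directly. It assumes every line starting with one or more # followed by whitespace is a heading, and it does not cap the level at h6:

```python
import re

def heading_to_html(line):
    """Turn a '#'-prefixed heading line into an HTML heading; return other lines unchanged."""
    m = re.search(r'^(#+)\s+(.*)', line)
    if m is None:
        return line  # not a heading, leave as-is
    n = len(m.group(1))  # the number of '#' characters is the heading level
    return f'<h{n}>{m.group(2)}</h{n}>'

text = """\
## This is h2
Paragraph text

### This is h3
#### This is h4"""

# Convert each line independently, then join the lines back together
converted = "\n".join(heading_to_html(line) for line in text.splitlines())
print(converted)
```

Working per line avoids the need for re.M, at the cost of splitting and rejoining the string.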
