I have the next string:
## This is h2
Paragraph text
### This is h3
#### This is h4
and I want this:
<h2>This is h2</h2>
Paragraph text
<h3>This is h2</h3>
<h4>This is h2</h4>
How could I do it?
With Regex is easy to find the sentence, but I don't know how to replace it with the extracting sentence and from multiple matchs in the string.
##.*
Any help?
CodePudding user response:
You can use re.sub
with lambda
:
import re
s = """\
## This is h2
Paragraph text
### This is h3
#### This is h4"""
s = re.sub(
r"^(# )\s (.*)",
lambda g: "<h{h}>{s}</h{h}>".format(h=len(g.group(1)), s=g.group(2)),
s,
flags=re.M,
)
print(s)
Prints:
<h2>This is h2</h2>
Paragraph text
<h3>This is h3</h3>
<h4>This is h4</h4>
CodePudding user response:
One could do tedious substitutions like
re.sub(r'^####', '<h4>', line)
re.sub(r'^###', '<h3>', line)
re.sub(r'^##', '<h2>', line)
re.sub(r'^#', '<h1>', line)
but you're probably much better off with m = re.search(r'^(# )', line)
,
assign n = len(m.group(1))
,
and synthesize f'<h{n}>'
from that.
You can close with e.g. </h4>
if you like,
or perhaps you'd prefer to let /usr/bin/tidy
or bs4 soup.prettify()
attend to those details.