Generate HTML/XML in Python?

Time:12-19

I am using Python to programmatically generate HTML. The HTML I want to generate is this:

<p>Hello <b>world</b> how are you?</p>

However, I do not know how to add the text "Hello" before the <b> tag and the string "how are you?" after it.

My code looks like this:

from xml.etree import ElementTree

p = ElementTree.Element('p')

b = ElementTree.Element('b')
b.text = 'world'

p.append(b)

Where would I add "Hello" and "how are you?"? The paragraph element has only one p.text field, and there does not seem to be a way to intersperse text and other HTML tags when building the document.

How can I programmatically generate an HTML document with both tags and text mixed together?

CodePudding user response:

I'm not convinced by the other comments saying you shouldn't use ElementTree for this, since none of them gives a concrete example of how else to do it.

Here's how to do it with ElementTree's TreeBuilder class. It's very straightforward, as long as you can keep track of the start/data/end hierarchy:

#!/usr/bin/env python3
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()
builder.start('p', {})  # start(tag, attrs): pass an empty dict when there are no attributes
builder.data('Hello ')
builder.start('b', {})
builder.data('world')
builder.end('b')
builder.data(' how are you?')
builder.end('p')

root = builder.close()  # close to "finalize the tree" and return an Element

ET.dump(root)  # print the Element
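
ET.dump is intended mainly for debugging. If you need the markup as a string instead, one option is ET.tostring with method='html'. The following is only a sketch; the helper name build_paragraph is made up for illustration and is not part of ElementTree:

```python
import xml.etree.ElementTree as ET


def build_paragraph():
    """Hypothetical helper: build <p>Hello <b>world</b> how are you?</p>."""
    builder = ET.TreeBuilder()
    builder.start('p', {})          # open <p> with no attributes
    builder.data('Hello ')          # text before the <b> element
    builder.start('b', {})
    builder.data('world')
    builder.end('b')
    builder.data(' how are you?')   # text after the closing </b>
    builder.end('p')
    return builder.close()          # returns the root Element


root = build_paragraph()
# encoding='unicode' makes tostring return a str rather than bytes
markup = ET.tostring(root, encoding='unicode', method='html')
print(markup)
```

The string in markup can then be written to a file or embedded in a larger template.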

And to anyone who thinks ElementTree is the wrong tool to do this, what would you recommend if OP had asked how to programmatically build DocBook XML? Because that's how I see this question: I want to build structured text, and the renderer happens to be the browser, so the structure will look like HTML.

CodePudding user response:

You CAN do this. One way is to put the pieces of text into <span> tags. In my opinion, this is just a bad idea. HTML is not XML. There are much better tools.

import sys
from xml.etree import ElementTree as ET

html = ET.Element('html')
body = ET.Element('body')
html.append(body)
para = ET.Element('p')
b1 = ET.Element('span')
b1.text = "Hello "  # trailing space keeps the words apart
b2 = ET.Element('b')
b2.text = "world"
b3 = ET.Element('span')
b3.text = " how are you?"  # leading space for the same reason
para.append(b1)
para.append(b2)
para.append(b3)
body.append(para)  # the paragraph belongs inside <body>, not directly under <html>

ET.ElementTree(html).write(sys.stdout, encoding='unicode', method='html')
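
The wrapper spans are not the only option. In ElementTree, every element has a tail attribute in addition to text; tail holds the text that follows the element's closing tag, up to the next sibling or the parent's end. A minimal sketch of the paragraph from the question built that way (variable names are illustrative):

```python
from xml.etree import ElementTree as ET

p = ET.Element('p')
p.text = 'Hello '            # text inside <p> before its first child

b = ET.SubElement(p, 'b')    # creates <b> and appends it to <p>
b.text = 'world'             # text inside <b>
b.tail = ' how are you?'     # text after </b>, still inside <p>

result = ET.tostring(p, encoding='unicode', method='html')
print(result)
```

The tail text belongs to the child element rather than the parent, so moving or removing the <b> element takes its trailing text along with it.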