I have this tree:
<TEI>
  <teiHeader/>
  <text>
    <body>
      <div type="chapter">
        <p rend="b"><pb n="1"/>lorem ipsum...</p>
        <p rend="b">lorem <pb n="2"/> ipsum2...</p>
        <p>lorem ipsum3...</p>
      </div>
      <div type="chapter">
        <p>lorem ipsum4...</p>
        <p rend="b">lorem ipsum5...</p>
        <p rend="b"><pb n="3"/> lorem ipsum6...</p>
      </div>
    </body>
  </text>
</TEI>
and I would like to change every
<p rend="b">lorem ipsum...</p>
into
<p><hi rend="b">lorem ipsum...</hi></p>
The problem is that all <pb n="X"/> tags are removed.
I tried this (root = the XML tree above):
from lxml import etree

parser = etree.XMLParser(ns_clean=True, remove_blank_text=True)
root = etree.fromstring(root, parser)
for item in root.findall(".//p[@rend='b']"):
    hi = etree.SubElement(item, "hi", rend=font_variant[variant])
    hi.text = ''.join(item.itertext())
print(etree.tostring(root, pretty_print=True, xml_declaration=True))
and I get, for instance, for the first <p/>:
<p><pb n="1"/>lorem ipsum...<hi rend="b"> lorem ipsum...</hi></p>
The <pb n="1"/> is missing from the <hi> content.
Could you help me out?
CodePudding user response:
If I understand you correctly, you are probably looking for something like this:
for p in root.xpath('//p[@rend="b"]'):
    # clone the old <p>
    old = etree.fromstring(etree.tostring(p))
    # change its name
    old.tag = "hi"
    # create a new element
    new = etree.fromstring('<p/>')
    # append the clone to the new element
    new.append(old)
    new.tail = "\n"
    # delete the old <p> and replace it with the new element
    p.getparent().replace(p, new)
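As a possible alternative, here is a minimal sketch that edits each <p> in place instead of cloning and replacing it. It moves the text and the child elements (including <pb/>) into a new <hi>, so nothing is flattened to plain text. To keep it runnable without extra installs it uses the standard library's xml.etree.ElementTree rather than lxml; the same calls should also work with lxml.etree, but treat that as an assumption. The helper name wrap_in_hi and the SAMPLE string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample based on the tree in the question (illustrative only).
SAMPLE = (
    '<div type="chapter">'
    '<p rend="b"><pb n="1"/>lorem ipsum...</p>'
    '<p rend="b">lorem <pb n="2"/> ipsum2...</p>'
    '<p>lorem ipsum3...</p>'
    '</div>'
)


def wrap_in_hi(p):
    """Move the text and children of <p> into a new <hi> child (hypothetical helper)."""
    # The rend attribute moves from <p> to the new <hi>.
    hi = ET.Element("hi", rend=p.attrib.pop("rend"))
    # The leading text of <p> becomes the leading text of <hi>.
    hi.text, p.text = p.text, None
    # Each child carries its own tail, so the text after <pb/> is preserved.
    for child in list(p):
        p.remove(child)
        hi.append(child)
    p.append(hi)


root = ET.fromstring(SAMPLE)
# findall() returns a list, so modifying the tree inside the loop is safe.
for p in root.findall(".//p[@rend='b']"):
    wrap_in_hi(p)

print(ET.tostring(root, encoding="unicode"))
```

The key point is that an element's tail (the text right after it) travels with the element, so moving <pb/> into <hi> also moves the text that follows it. Paragraphs without rend="b" are not touched.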