I have the following code in Python:
from lxml import etree
offers = etree.parse(r'prices.xml')
print("offers\n")
target = offers.xpath('//offer[./vendor/text()="Qtap"]')
length = len(target)
for i in range(length):
    print(target[i])
    etree.ElementTree(target[i]).write('output.xml', encoding='utf-8', xml_declaration=True)
I simply read the XML file, select data from it using XPath, and want to write all of the selected elements into another file. The length is reported as around 2000 elements, but the script writes only the last one. Sorry, I know my question is rather basic, but this is my first program in Python.
CodePudding user response:
That's because the write() call inside your loop overwrites output.xml on every iteration, so only the last element survives. Try it this way:
qtaps = etree.XML("""<offers/>""".encode())
targets = offers.xpath('//offer[./vendor/text()="Qtap"]')
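for target in targets:
    # append() keeps document order; insert(-1, ...) would place each
    # new element before the last one already added
    qtaps.append(target)
with open('output.xml', 'wb') as doc:
    doc.write(etree.tostring(qtaps, pretty_print=True, encoding='utf-8', xml_declaration=True))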
and see if it works.
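For reference, here is a minimal self-contained sketch of the same idea that runs without any input file. The sample XML, the <catalog> root and the collect_offers helper are assumptions made for illustration, not part of the original question. It copies each match with deepcopy, because appending an lxml element to a new parent moves it out of the source tree.

```python
from copy import deepcopy
from lxml import etree

# Hypothetical sample data standing in for prices.xml.
SAMPLE = b"""<catalog>
  <offer id="1"><vendor>Qtap</vendor></offer>
  <offer id="2"><vendor>Other</vendor></offer>
  <offer id="3"><vendor>Qtap</vendor></offer>
</catalog>"""


def collect_offers(source_root, vendor):
    """Return a new <offers> element holding copies of the matching offers."""
    collected = etree.Element("offers")
    # The XPath variable $vendor avoids building the expression by string formatting.
    for offer in source_root.xpath("//offer[vendor/text()=$vendor]", vendor=vendor):
        # Copy the element so the source tree is left unchanged.
        collected.append(deepcopy(offer))
    return collected


source_root = etree.fromstring(SAMPLE)
qtaps = collect_offers(source_root, "Qtap")

# Serialize once, after the loop, so nothing gets overwritten.
xml_bytes = etree.tostring(qtaps, pretty_print=True, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

To produce the file, write xml_bytes to output.xml opened in 'wb' mode, exactly once and outside the loop.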
CodePudding user response:
Since it sounds like you simply need to remove the <offer> nodes that do not match, consider XSLT, a sibling of XPath and a special-purpose language designed to transform XML files. Python's lxml library can run XSLT 1.0 scripts.
Specifically, an identity template combined with an empty template removes the unwanted nodes (vendor!='Qtap') without a single for loop. The stylesheet below preserves the original structure of the XML, just with fewer <offer> nodes.
XSLT (save as .xsl file, a special XML file)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- IDENTITY TRANSFORM -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- EMPTY TEMPLATE TO REMOVE CONTENT -->
  <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>
Python
import lxml.etree as lx
# PARSE XML AND XSLT
doc = lx.parse("input.xml")
style = lx.parse("style.xsl")
# CONFIGURE AND RUN TRANSFORMER
transformer = lx.XSLT(style)
result = transformer(doc)
# OUTPUT TO FILE
result.write_output("output.xml")
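To try the transform without creating any files, the same identity-plus-empty-template approach can be sketched entirely in memory. The sample XML and its <catalog> root are assumptions made for illustration; the stylesheet is a trimmed copy of the one above.

```python
from lxml import etree

# Hypothetical sample data; offer 3 deliberately has no <vendor> child.
SAMPLE = b"""<catalog>
  <offer id="1"><vendor>Qtap</vendor></offer>
  <offer id="2"><vendor>Other</vendor></offer>
  <offer id="3"/>
</catalog>"""

STYLESHEET = b"""<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@* | node()">
    <xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>"""

# Build the transformer from the parsed stylesheet and apply it to the parsed input.
transform = etree.XSLT(etree.fromstring(STYLESHEET))
result = transform(etree.fromstring(SAMPLE))

# Collect the ids of the <offer> elements that survived the transform.
kept_ids = [offer.get("id") for offer in result.getroot().iter("offer")]
print(kept_ids)
```

Note that offer 3 is kept: in XPath 1.0, vendor!='Qtap' is true only when some <vendor> child exists and differs from 'Qtap'. If offers without a vendor should be dropped as well, match on offer[not(vendor='Qtap')] instead.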