Copy one XML file to another in Python using lxml

Time:02-21

I have the following code in Python:

from lxml import etree

offers = etree.parse(r'prices.xml')

print("offers\n")

target = offers.xpath('//offer[./vendor/text()="Qtap"]')
length = len(target)
for i in  range(length):
    print(target[i])
    etree.ElementTree(target[i]).write('output.xml', encoding='utf-8', xml_declaration=True)

I simply read the XML file, select data from it using XPath, and want to write all of the selected elements into another file. There are around 2000 elements (the value of length), but the script writes only the last one. Sorry, I know my question is rather basic, but this is my first program in Python.

CodePudding user response:

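That's because each write() call in the loop writes output.xml from scratch, overwriting what the previous iteration wrote, so only the last element survives. Collect the matches under a single root element and write the file once instead. Try it this way: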

qtaps = etree.Element("offers")  # new root element that collects the matches
targets = offers.xpath('//offer[./vendor/text()="Qtap"]')
for target in targets:
    # append() keeps document order; note that it moves each element out of the original tree
    qtaps.append(target)
with open('output.xml', 'wb') as doc:
    doc.write(etree.tostring(qtaps, pretty_print=True, encoding='utf-8', xml_declaration=True))

and see if it works.
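
For reference, here is a minimal self-contained sketch of the same idea that can be run without prices.xml. The sample document, the id attributes, and the collect_offers helper are assumptions made up for illustration, not part of the question. It copies each match with deepcopy so the source tree stays intact, rather than moving the elements.

```python
from copy import deepcopy
from lxml import etree

# Hypothetical sample data standing in for prices.xml
SAMPLE = b"""<catalog>
  <offer id="1"><vendor>Qtap</vendor></offer>
  <offer id="2"><vendor>Other</vendor></offer>
  <offer id="3"><vendor>Qtap</vendor></offer>
</catalog>"""


def collect_offers(source_root, vendor):
    """Return a new <offers> element holding copies of the matching offers."""
    collected = etree.Element("offers")
    # $v is an XPath variable; lxml binds it from the keyword argument
    for offer in source_root.xpath("//offer[vendor/text()=$v]", v=vendor):
        # deepcopy leaves the source tree untouched; append keeps document order
        collected.append(deepcopy(offer))
    return collected


source = etree.fromstring(SAMPLE)
qtaps = collect_offers(source, "Qtap")

# Serialize once, after the loop, instead of once per element
output = etree.tostring(qtaps, encoding="utf-8", xml_declaration=True)
```

Writing output to a file opened in 'wb' mode then produces a single document containing every match.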

CodePudding user response:

Since it sounds like you simply need to remove <offer> nodes from your XML, consider XSLT, a sibling of XPath and a special-purpose language designed to transform XML files. Python's lxml library can run XSLT 1.0 scripts.

Specifically, an identity template combined with an empty template can remove the unwanted nodes (offers whose vendor is not 'Qtap'), all without a single for loop. The stylesheet below preserves the original structure of the XML, just with fewer <offer> nodes.

XSLT (save as .xsl file, a special XML file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    
    <!-- IDENTITY TRANSFORM -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- EMPTY TEMPLATE TO REMOVE CONTENT -->
    <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>

Python

import lxml.etree as lx

# PARSE XML AND XSLT
doc = lx.parse("input.xml")
style = lx.parse("style.xsl")

# CONFIGURE AND RUN TRANSFORMER
transformer = lx.XSLT(style)
result = transformer(doc)

# OUTPUT TO FILE
result.write_output("output.xml")
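
As a quick way to check the stylesheet's behavior without any files on disk, the following sketch runs the same identity-plus-empty-template approach on inline strings. The sample document and its id attributes are assumptions made up for illustration.

```python
from lxml import etree

# Same two templates as the stylesheet above, kept inline for the sketch
XSLT_SOURCE = b"""<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@* | node()">
    <xsl:copy><xsl:apply-templates select="@* | node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="offer[vendor!='Qtap']"/>
</xsl:stylesheet>"""

# Hypothetical sample data standing in for input.xml
SAMPLE = (
    b'<catalog>'
    b'<offer id="1"><vendor>Qtap</vendor></offer>'
    b'<offer id="2"><vendor>Other</vendor></offer>'
    b'<offer id="3"><vendor>Qtap</vendor></offer>'
    b'</catalog>'
)

transformer = etree.XSLT(etree.fromstring(XSLT_SOURCE))
result = transformer(etree.fromstring(SAMPLE))

# The result tree keeps the original root and only the Qtap offers
kept_ids = result.xpath("//offer/@id")
```

One caveat: in XPath 1.0 the test vendor!='Qtap' is false when an offer has no <vendor> child at all, so such offers would be kept. If those should be dropped too, matching on offer[not(vendor='Qtap')] is the usual alternative.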