Python XML parsing: find nested elements by name

Time:11-12

I have a nested XML document to parse with Python. Example:

<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
    <object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
    <object-attribute attribute-id="cnRelatedNo">1365</object-attribute>
    <object-attribute attribute-id="customerOwnerNo">702069175</object-attribute>
    <object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
    <object-attribute attribute-id="natg_emailAlreadySent">true</object-attribute>
</custom-object>
</custom-objects>

I need to write a for loop that extracts the elements with attribute-id="basketJson" and attribute-id="lastModifiedBasketDate". Can anyone help? Thanks.

CodePudding user response:

You can use, for example, BeautifulSoup to parse the XML:

xml_doc = """\
<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
    <object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
    <object-attribute attribute-id="cnRelatedNo">1365</object-attribute>
    <object-attribute attribute-id="customerOwnerNo">702069175</object-attribute>
    <object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
    <object-attribute attribute-id="natg_emailAlreadySent">true</object-attribute>
</custom-object>
</custom-objects>"""

from bs4 import BeautifulSoup

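# The "xml" parser is backed by lxml, which must be installed.
soup = BeautifulSoup(xml_doc, "xml")

# Visit each <custom-object> and pick its children by attribute-id.
for obj in soup.select("custom-object"):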
    print("basketJson =", obj.select_one('[attribute-id="basketJson"]').text)
    print(
        "lastModifiedBasketDate =",
        obj.select_one('[attribute-id="lastModifiedBasketDate"]').text,
    )

Prints:

basketJson = {"UUID":"b4122d6090d1d6a3f8dafd34b0"}
lastModifiedBasketDate = 2022-06-29T21:31:04.000 0000
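
If you would rather not install extra packages, the same lookup can probably be done with the standard library's xml.etree.ElementTree. This is only a sketch under a few assumptions: the document is well-formed (the root element is closed), the default namespace from the sample has to be passed explicitly, and the names NS, WANTED and extract_attributes are made up for this example. The sample is shortened to the two attributes of interest. It also decodes the basketJson text with json.loads, assuming that field always holds valid JSON.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the sample from the question, with the root element closed.
xml_doc = """\
<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
    <object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
    <object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
</custom-object>
</custom-objects>"""

# ElementTree needs the default namespace spelled out; "co" is an arbitrary prefix.
NS = {"co": "http://www.demandware.com/xml/impex/customobject/2006-10-31"}
WANTED = ("basketJson", "lastModifiedBasketDate")  # hypothetical name


def extract_attributes(xml_text, wanted=WANTED):
    """Return one dict per <custom-object>, keyed by attribute-id."""
    root = ET.fromstring(xml_text)
    rows = []
    for obj in root.findall("co:custom-object", NS):
        row = {}
        for name in wanted:
            # The predicate selects the child whose attribute-id equals name.
            node = obj.find(f"co:object-attribute[@attribute-id='{name}']", NS)
            row[name] = node.text if node is not None else None
        rows.append(row)
    return rows


rows = extract_attributes(xml_doc)
for row in rows:
    basket = json.loads(row["basketJson"])  # assumes the text is valid JSON
    print("UUID =", basket["UUID"])
    print("lastModifiedBasketDate =", row["lastModifiedBasketDate"])
```

A missing attribute comes back as None instead of raising, which may be useful if not every custom-object carries both fields.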