I have a nested XML document that I need to parse with Python. Example:
<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
<object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
<object-attribute attribute-id="cnRelatedNo">1365</object-attribute>
<object-attribute attribute-id="customerOwnerNo">702069175</object-attribute>
<object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
<object-attribute attribute-id="natg_emailAlreadySent">true</object-attribute>
</custom-object>
I need to write a for loop that extracts the elements with attribute-id="basketJson" and attribute-id="lastModifiedBasketDate". Can anyone help? Thanks.
CodePudding user response:
You can use, for example, BeautifulSoup to parse the XML:
xml_doc = """\
<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
<object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
<object-attribute attribute-id="cnRelatedNo">1365</object-attribute>
<object-attribute attribute-id="customerOwnerNo">702069175</object-attribute>
<object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
<object-attribute attribute-id="natg_emailAlreadySent">true</object-attribute>
</custom-object>"""
from bs4 import BeautifulSoup

# The "xml" parser requires the lxml package to be installed.
soup = BeautifulSoup(xml_doc, "xml")

# Print the two wanted attributes for every <custom-object> element.
for obj in soup.select("custom-object"):
    print("basketJson =", obj.select_one('[attribute-id="basketJson"]').text)
    print(
        "lastModifiedBasketDate =",
        obj.select_one('[attribute-id="lastModifiedBasketDate"]').text,
    )
Prints:
basketJson = {"UUID":"b4122d6090d1d6a3f8dafd34b0"}
lastModifiedBasketDate = 2022-06-29T21:31:04.000 0000
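
If you would rather avoid a third-party dependency, the same extraction can probably be done with the standard library's xml.etree.ElementTree. This is only a sketch under a few assumptions: the document is well-formed (so the closing </custom-objects> tag, missing from the sample, is added here), the prefix "co" and the helper name extract_baskets are made up for illustration, and the sample is shortened to three attributes. It also decodes the basketJson text with json.loads, since that value looks like JSON.

```python
import json
import xml.etree.ElementTree as ET

# Shortened sample with the closing </custom-objects> tag added,
# because ElementTree rejects XML that is not well-formed.
xml_doc = """\
<custom-objects xmlns="http://www.demandware.com/xml/impex/customobject/2006-10-31">
<custom-object type-id="AbandonedBaskets" object-id="b4122d6090d1d6a3f8dafd34b0">
<object-attribute attribute-id="basketJson">{"UUID":"b4122d6090d1d6a3f8dafd34b0"}</object-attribute>
<object-attribute attribute-id="cnRelatedNo">1365</object-attribute>
<object-attribute attribute-id="lastModifiedBasketDate">2022-06-29T21:31:04.000 0000</object-attribute>
</custom-object>
</custom-objects>"""

# "co" is an arbitrary prefix (an assumption of this sketch) mapped to the
# document's default namespace; ElementTree needs it to match element names.
NS = {"co": "http://www.demandware.com/xml/impex/customobject/2006-10-31"}
WANTED = ("basketJson", "lastModifiedBasketDate")


def extract_baskets(xml_text):
    """Return one dict per <custom-object>, holding only the WANTED attributes."""
    root = ET.fromstring(xml_text)
    rows = []
    for obj in root.findall("co:custom-object", NS):
        row = {}
        for attr_id in WANTED:
            node = obj.find(f"co:object-attribute[@attribute-id='{attr_id}']", NS)
            # A missing attribute is stored as None instead of raising.
            row[attr_id] = node.text if node is not None else None
        rows.append(row)
    return rows


rows = extract_baskets(xml_doc)
for row in rows:
    if row["basketJson"] is not None:
        basket = json.loads(row["basketJson"])  # decode the embedded JSON text
        print("UUID =", basket["UUID"])
    print("lastModifiedBasketDate =", row["lastModifiedBasketDate"])
```

For the sample above this prints the UUID b4122d6090d1d6a3f8dafd34b0 followed by the lastModifiedBasketDate value. Collecting the results into a list of dicts, rather than printing inside the loop, makes it easier to pass them on to something like csv.DictWriter later.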