Using pydantic with xml

Time:07-26

I am working on a project that uses a lot of XML, and I would like to use pydantic to model the objects. The XML below is simplified, but it shows a representative object.

<ns:SomeType name="NameType" shortDescription="some data">
  <ns:Bar
    thingOne="alpha"
    thingTwo="beta"
    thingThree="foobar"/>
</ns:SomeType>

Code

from pydantic import BaseModel
from typing import Optional, List
from xml.etree import ElementTree as ET


class Bar(BaseModel):
  thing_one: str
  thing_two: str
  thing_three: str


class SomeType(BaseModel):
  name: str
  short_description: str
  bar: Optional[Bar] = None


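def main() -> List[SomeType]:
  # Assumes the real file wraps the <ns:SomeType> elements in a root element
  # that also declares the "ns" namespace prefix.
  with open("path/to/file.xml") as fp:
    source = fp.read()
  root = ET.fromstring(source)
  some_type_list = []
  for child in root:
    st = SomeType(
      name=child.attrib["name"],
      short_description=child.attrib["shortDescription"],
    )
    for sub in child:
      st.bar = Bar(
        thing_one=sub.attrib["thingOne"],
        thing_two=sub.attrib["thingTwo"],
        thing_three=sub.attrib["thingThree"],
      )
    # Collect each parsed object; the original loop discarded it.
    some_type_list.append(st)
  return some_type_list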

I looked into BaseModel.parse_obj and BaseModel.parse_raw, but I don't think either will solve the problem. I also thought about using xmltodict to convert the XML, but the namespace prefixes and the @ prefixes on attributes get even more in the way:

>>> import xmltodict
>>> xmltodict.parse(input_xml)
{'ns:SomeType': {'@name': 'NameType', '@shortDescription': 'some data', ... }}

Answer:

xmltodict can help in your example if you combine it with field aliases:

from typing import Optional

import xmltodict
from pydantic import BaseModel, Field


class Bar(BaseModel):
    thing_one: str = Field(alias="@thingOne")
    thing_two: str = Field(alias="@thingTwo")
    thing_three: str = Field(alias="@thingThree")


class SomeType(BaseModel):
    name: str = Field(alias="@name")
    short_description: str = Field(alias="@shortDescription")
    bar: Optional[Bar] = Field(None, alias="ns:Bar")


class Root(BaseModel):
    some_type: SomeType = Field(alias="ns:SomeType")


print(
    Root.parse_obj(
        xmltodict.parse(
            """<ns:SomeType name="NameType" shortDescription="some data">
  <ns:Bar
    thingOne="alpha"
    thingTwo="beta"
    thingThree="foobar"/>
</ns:SomeType>""")).some_type)

Output:

name='NameType' short_description='some data' bar=Bar(thing_one='alpha', thing_two='beta', thing_three='foobar')

As the example shows, a Root model is needed because the dict returned by xmltodict.parse has a single top-level ns:SomeType key.
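If you would rather avoid the extra dependency and the "@" and "ns:" aliases, here is a minimal standard-library sketch that turns an element into a plain dict with snake_case keys. The helper names (snake, local_name, element_to_dict) and the urn:example namespace URI are made up for illustration; ElementTree needs the ns prefix to be declared, so the sketch assumes your real document declares it somewhere.

```python
import re
from xml.etree import ElementTree as ET

# Assumption: the "ns" prefix is declared; "urn:example" is a placeholder URI.
XML = """<ns:SomeType xmlns:ns="urn:example" name="NameType" shortDescription="some data">
  <ns:Bar
    thingOne="alpha"
    thingTwo="beta"
    thingThree="foobar"/>
</ns:SomeType>"""


def snake(name: str) -> str:
    """Convert camelCase or PascalCase (e.g. shortDescription) to snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def local_name(tag: str) -> str:
    """Drop the '{namespace-uri}' part ElementTree puts in front of qualified tags."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(elem: ET.Element) -> dict:
    """Hypothetical helper: attributes and child elements become snake_case keys.

    Limitation of this sketch: if a child tag repeats, only the last one is kept.
    """
    data = {snake(key): value for key, value in elem.attrib.items()}
    for child in elem:
        data[snake(local_name(child.tag))] = element_to_dict(child)
    return data


data = element_to_dict(ET.fromstring(XML))
print(data)
```

With keys shaped like this, the models from the question should accept the result directly, e.g. SomeType(**data), without field aliases or a Root wrapper.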
