I am working on a project that uses a lot of XML, and I would like to use pydantic to model the objects. I have simplified the XML here, but this is an example object:
<ns:SomeType name="NameType" shortDescription="some data">
    <ns:Bar
        thingOne="alpha"
        thingTwo="beta"
        thingThree="foobar"/>
</ns:SomeType>
Code
from typing import List, Optional
from xml.etree import ElementTree as ET

from pydantic import BaseModel


class Bar(BaseModel):
    thing_one: str
    thing_two: str
    thing_three: str


class SomeType(BaseModel):
    name: str
    short_description: str
    bar: Optional[Bar] = None  # explicit default so the field may be omitted


def main():
    with open("path/to/file.xml") as fp:
        source = fp.read()
    root = ET.fromstring(source)
    some_type_list: List[SomeType] = []
    for child in root:
        st = SomeType(
            name=child.attrib["name"],
            short_description=child.attrib["shortDescription"],
        )
        for sub in child:
            st.bar = Bar(
                thing_one=sub.attrib["thingOne"],
                thing_two=sub.attrib["thingTwo"],
                thing_three=sub.attrib["thingThree"],
            )
        some_type_list.append(st)  # keep each parsed object
    return some_type_list
I looked into BaseModel.parse_obj and BaseModel.parse_raw, but I don't think either will solve the problem. I also thought about using xmltodict to convert the XML, but the namespaces and the @ attribute prefixes get even more in the way...
>>> import xmltodict
>>> xmltodict.parse(input_xml)
{'ns:SomeType': {'@name': 'NameType', '@shortDescription': 'some data', ... }}
CodePudding user response:
xmltodict can help in your example if you combine it with field aliases:
from typing import Optional

import xmltodict
from pydantic import BaseModel, Field


class Bar(BaseModel):
    thing_one: str = Field(alias="@thingOne")
    thing_two: str = Field(alias="@thingTwo")
    thing_three: str = Field(alias="@thingThree")


class SomeType(BaseModel):
    name: str = Field(alias="@name")
    short_description: str = Field(alias="@shortDescription")
    # Default of None keeps the field optional when ns:Bar is absent.
    bar: Optional[Bar] = Field(None, alias="ns:Bar")


class Root(BaseModel):
    some_type: SomeType = Field(alias="ns:SomeType")


print(
    Root.parse_obj(
        xmltodict.parse(
            """<ns:SomeType name="NameType" shortDescription="some data">
    <ns:Bar
        thingOne="alpha"
        thingTwo="beta"
        thingThree="foobar"/>
</ns:SomeType>"""
        )
    ).some_type
)
Output:
name='NameType' short_description='some data' bar=Bar(thing_one='alpha', thing_two='beta', thing_three='foobar')
You can see in the example above that a Root model is needed because the dict has an ns:SomeType key.
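If the wrapper key and the "@" and "ns:" prefixes are unwanted, one possible alternative is to build the dict yourself from ElementTree, keyed by local names. This is only a rough sketch using the standard library: the helper names local_name and element_to_dict are made up for illustration, and the xmlns:ns declaration with its "urn:example" URI is an assumption, since the XML in the question omits it and ElementTree will not parse an unbound "ns:" prefix.

```python
from xml.etree import ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace-uri}' part ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(element: ET.Element) -> dict:
    """Map attributes and child elements to a plain dict keyed by local names."""
    data = dict(element.attrib)
    for child in element:
        # Assumes at most one child per tag name, as in the example XML;
        # repeated children would overwrite each other here.
        data[local_name(child.tag)] = element_to_dict(child)
    return data


# Hypothetical namespace declaration added so that ElementTree can parse it.
xml = """<ns:SomeType xmlns:ns="urn:example" name="NameType" shortDescription="some data">
    <ns:Bar thingOne="alpha" thingTwo="beta" thingThree="foobar"/>
</ns:SomeType>"""

result = element_to_dict(ET.fromstring(xml))
print(result)
```

With a dict shaped like this, the aliases would presumably become the bare names ("name", "shortDescription", "Bar", "thingOne", ...), and SomeType.parse_obj(result) could be called directly without a Root model. Text content and repeated elements are not handled, so xmltodict is likely still the better fit for richer documents.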