Use Pydantic child model to manage sets of default values for the parent model

Time:07-27

I am using pydantic to manage settings for an app that supports different datasets. Each dataset has a set of overridable defaults, and those defaults differ from one dataset to the next. Currently, I have all of the logic implemented via validators:

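from typing import Optional

from pydantic import BaseModel, validator

class DatasetSettings(BaseModel):
    dataset_name: str
    # Optional with a None default, so the validator below can fill it in.
    table_name: Optional[str] = None

    @validator("table_name", always=True)
    def validate_table_name(cls, v, values):
        # An explicit value from the user always wins.
        if isinstance(v, str):
            return v
        # dataset_name is missing from values if it failed its own validation.
        dataset_name = values.get("dataset_name")
        if dataset_name == "DATASET_1":
            return "special_dataset_1_default_table"
        if dataset_name == "DATASET_2":
            return "special_dataset_2_default_table"
        return "default_table"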

class AppSettings(BaseModel):
    dataset_settings: DatasetSettings
    app_url: str

This way, I get different defaults based on dataset_name, but the user can override them if necessary. This is the desired behavior. The trouble is that once there are more than a handful of such fields and names, it gets to be a mess to read and to maintain. It seems like inheritance/polymorphism would solve this problem but the pydantic factory logic seems too hardcoded to make it feasible, especially with nested models.

class Dataset1Settings(DatasetSettings):
    dataset_name: str = "DATASET_1"
    table_name: str = "special_dataset_1_default_table"

class Dataset2Settings(DatasetSettings):
    dataset_name: str = "DATASET_2"
    table_name: str = "special_dataset_2_default_table"

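def dataset_settings_factory(dataset_name, table_name=None):
    # Pydantic models accept keyword arguments only. Leave table_name out
    # when it is None so that the subclass default applies.
    kwargs = {"dataset_name": dataset_name}
    if table_name is not None:
        kwargs["table_name"] = table_name
    if dataset_name == "DATASET_1":
        return Dataset1Settings(**kwargs)
    if dataset_name == "DATASET_2":
        return Dataset2Settings(**kwargs)
    return DatasetSettings(**kwargs)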

class AppSettings(BaseModel):
    dataset_settings: DatasetSettings
    app_url: str

Options I've considered:

  • Create a new set of default dataset settings models, override __init__ of DatasetSettings, instantiate the subclass and copy its attributes into the parent class. Kind of clunky.
  • Override __init__ of AppSettings using the dataset_settings_factory to set the dataset_settings attribute of AppSettings. Not so good because the default behavior doesn't work in the DatasetSettings at all, only when instantiated as a nested model in AppSettings.

I was hoping Field(default_factory=dataset_settings_factory) would work, but the default_factory is only for actual defaults so it has zero args. Is there some other way to intercept the args of a particular pydantic field and use a custom factory?
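For reference, one way to intercept the raw input of a single field is a pre-validator on the parent model. The sketch below is only an illustration, not a confirmed answer from this thread: SETTINGS_BY_NAME and pick_dataset_model are hypothetical names, and the base model is given a "default_table" default so that unknown dataset names still validate. It shares the drawback of the second option above, since the per-dataset defaults only apply when the settings are nested in AppSettings.

```python
from typing import Any

from pydantic import BaseModel, validator


class DatasetSettings(BaseModel):
    dataset_name: str
    table_name: str = "default_table"  # assumed generic fallback


class Dataset1Settings(DatasetSettings):
    dataset_name: str = "DATASET_1"
    table_name: str = "special_dataset_1_default_table"


class Dataset2Settings(DatasetSettings):
    dataset_name: str = "DATASET_2"
    table_name: str = "special_dataset_2_default_table"


# Hypothetical lookup table: dataset name -> model that carries its defaults.
SETTINGS_BY_NAME = {"DATASET_1": Dataset1Settings, "DATASET_2": Dataset2Settings}


class AppSettings(BaseModel):
    dataset_settings: DatasetSettings
    app_url: str

    @validator("dataset_settings", pre=True)
    def pick_dataset_model(cls, v: Any) -> Any:
        # pre=True means v is the raw input, before pydantic builds the model.
        if isinstance(v, dict):
            model = SETTINGS_BY_NAME.get(v.get("dataset_name"), DatasetSettings)
            return model(**v)
        # Anything else (e.g. an existing model instance) is passed through.
        return v
```

Values supplied by the user still take precedence, because the chosen subclass only fills in fields that are absent from the input.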

CodePudding user response:

Another option would be to use discriminated (tagged) unions.

But your solution (without looking in detail) looks fine too.
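A minimal sketch of what that could look like, assuming pydantic 1.9 or later (discriminated unions are not available in 1.8.x). The model names are borrowed from the question; the Literal annotations are what let pydantic choose the right model from the dataset_name value.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class Dataset1Settings(BaseModel):
    # The Literal type doubles as the tag that identifies this model.
    dataset_name: Literal["DATASET_1"] = "DATASET_1"
    table_name: str = "special_dataset_1_default_table"


class Dataset2Settings(BaseModel):
    dataset_name: Literal["DATASET_2"] = "DATASET_2"
    table_name: str = "special_dataset_2_default_table"


class AppSettings(BaseModel):
    # pydantic reads dataset_name from the input and validates against the
    # union member whose Literal matches it.
    dataset_settings: Union[Dataset1Settings, Dataset2Settings] = Field(
        ..., discriminator="dataset_name"
    )
    app_url: str
```

Note that with this layout a dataset_name outside the listed literals is rejected with a validation error, so there is no generic "default_table" fallback unless another union member is added for it.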

CodePudding user response:

I ended up solving the problem following the first option, as follows. Code is runnable with pydantic 1.8.2 and pydantic 1.9.1.

from typing import Optional
from pydantic import BaseModel, Field


class DatasetSettings(BaseModel):
    dataset_name: Optional[str] = Field(default="DATASET_1")
    table_name: Optional[str] = None

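    def __init__(self, **data):
        # Dataset1Settings and Dataset2Settings are defined further down; the
        # names are only looked up when __init__ runs, so the order is fine.
        factory_dict = {"DATASET_1": Dataset1Settings, "DATASET_2": Dataset2Settings}
        # Use the name the caller passed, else the field's declared default.
        dataset_name = (
            data["dataset_name"]
            if "dataset_name" in data
            else self.__fields__["dataset_name"].default
        )
        if dataset_name in factory_dict:
            # Build the per-dataset model to fill in its defaults, then copy
            # the resulting values back into this model's input.
            data = factory_dict[dataset_name](**data).dict()
        super().__init__(**data)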


class Dataset1Settings(BaseModel):
    dataset_name: str = "DATASET_1"
    table_name: str = "special_dataset_1_default_table"


class Dataset2Settings(BaseModel):
    dataset_name: str = "DATASET_2"
    table_name: str = "special_dataset_2_default_table"


class AppSettings(BaseModel):
    dataset_settings: DatasetSettings = Field(default_factory=DatasetSettings)
    app_url: Optional[str]


app_settings = AppSettings(dataset_settings={"dataset_name": "DATASET_1"})
assert app_settings.dataset_settings.table_name == "special_dataset_1_default_table"
app_settings = AppSettings(dataset_settings={"dataset_name": "DATASET_2"})
assert app_settings.dataset_settings.table_name == "special_dataset_2_default_table"

# bonus: no args mode
app_settings = AppSettings()
assert app_settings.dataset_settings.table_name == "special_dataset_1_default_table"

A couple of gotchas I discovered along the way:

  1. If Dataset1Settings inherits from DatasetSettings, it enters a recursive loop, with __init__ calling __init__ ad infinitum. This could be broken with some introspection, but I opted for the duck-typing approach.
  2. The current solution destroys any validators on DatasetSettings. I'm sure there's a way to call the validation logic anyway but the current solution effectively sidesteps whatever class-level validation you have by only initing with super().__init__
  3. The same thing works for BaseSettings objects, but you have to carry along their cumbersome __init__ arguments:
    def __init__(
        self,
        _env_file: Union[Path, str, None] = None,
        _env_file_encoding: Optional[str] = None,
        _secrets_dir: Union[Path, str, None] = None,
        **values: Any
    ):
        ...
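As a sketch of the introspection mentioned in the first gotcha (an illustration under assumptions, not the code I actually ran), the dispatch can be limited to the case where the base class itself is being constructed. The subclasses can then inherit from DatasetSettings without recursing. SUBCLASS_BY_NAME is a hypothetical helper name.

```python
from typing import Optional

from pydantic import BaseModel


class DatasetSettings(BaseModel):
    dataset_name: Optional[str] = "DATASET_1"
    table_name: Optional[str] = None

    def __init__(self, **data):
        # Dispatch only when the base class itself is being built. When a
        # subclass is built, type(self) is that subclass, so this branch is
        # skipped and the recursion stops after one level.
        if type(self) is DatasetSettings:
            name = data.get("dataset_name", "DATASET_1")
            subclass = SUBCLASS_BY_NAME.get(name)
            if subclass is not None:
                # Let the subclass fill in its defaults, then reuse its values.
                data = subclass(**data).dict()
        super().__init__(**data)


class Dataset1Settings(DatasetSettings):
    dataset_name: str = "DATASET_1"
    table_name: str = "special_dataset_1_default_table"


class Dataset2Settings(DatasetSettings):
    dataset_name: str = "DATASET_2"
    table_name: str = "special_dataset_2_default_table"


# Hypothetical lookup table, defined after the subclasses exist. It is only
# read when DatasetSettings.__init__ runs, so the ordering is not a problem.
SUBCLASS_BY_NAME = {"DATASET_1": Dataset1Settings, "DATASET_2": Dataset2Settings}
```

The resulting object is still a plain DatasetSettings; the subclass is used only as a source of default values.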