I'm trying to isolate a field and a method from classes to work with mongodb.
Example of the working class:
@dataclass
class Article(Mongodata):
    name: str
    quantity: int
    description: str
    _id: Optional[int] = None

    def __getdict__(self):
        result = asdict(self)
        result.pop("_id")
        return result
How can I move _id and __getdict__ into an abstract base class so that everything still works? This is my attempt:
@dataclass
class Article(Mongodata):
    name: str
    quantity: int
    description: str

@dataclass
class Mongodata(ABCMeta):
    @property
    @abstractmethod
    def _id(self) -> Optional[int]:
        return None

    def __getdict__(self):
        result = asdict(self)
        result.pop("_id")
        return result
Can you also explain how abstract classes and metaclasses differ? I come from Java, and after reading about them I still don't understand the difference.
Answer:
Since you mentioned you're on Python 3.9, you can set it up much the same way you had it above. However, if you declare the fields in Article as above and add a field definition to the superclass like this:
@dataclass
class Mongodata(ABC):
    _id: Optional[int] = None
Then if you actually try to run the code, you'll run into a TypeError:
TypeError: non-default argument 'name' follows default argument
The reason for this is the order in which dataclasses resolves the fields when inheritance is involved. It adds the _id field from the superclass first, and then all the fields declared in the Article dataclass. Since the first parameter has a default value but the parameters that follow it don't, it raises a TypeError.
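As a rough illustration of the field ordering described above, here is a minimal sketch that reproduces the error and inspects the inherited field with dataclasses.fields(). The variable names base_fields and error are only for illustration, and the exact wording of the error message may vary between Python versions.

```python
from abc import ABC
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Mongodata(ABC):
    _id: Optional[int] = None


# Fields of the superclass come first in any subclass.
base_fields = [f.name for f in fields(Mongodata)]
print(base_fields)

error = None
try:
    # The inherited `_id` (with a default) is placed before `name`,
    # `quantity` and `description` (without defaults), so the
    # @dataclass decorator refuses to build the class.
    @dataclass
    class Article(Mongodata):
        name: str
        quantity: int
        description: str
except TypeError as exc:
    error = exc
    print(f"TypeError: {exc}")
```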
Note that you'd run into a similar problem if you wrote the __init__ method for the Article class by hand with the same parameter order. In that case Python rejects the signature outright with a SyntaxError:
def __init__(self, _id: Optional[int] = None, name: str, quantity: int, description: str):
^
SyntaxError: non-default argument follows default argument
The best approach in Python 3.9 seems to be to declare the dataclasses this way, so that all fields in the subclass have default values:
from abc import ABC
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Mongodata(ABC):
    _id: Optional[int] = None

    def __getdict__(self):
        result = asdict(self)
        result.pop("_id")
        return result

@dataclass
class Article(Mongodata):
    name: str = None
    quantity: int = None
    description: str = None
But then passing positional arguments when creating an Article object becomes a problem, because the first argument passed to the constructor is assigned to _id:
a = Article('123', 321, 'desc')
So you could instead pass None as the first positional argument, and that will get assigned to _id. Another approach that works is to pass keyword arguments to the constructor instead:
a = Article(name='123', quantity=321, description='desc')
This actually feels more natural with the kw_only parameter that was introduced to dataclasses in Python 3.10 as a means to resolve this same issue, but more on that below.
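To make the positional pitfall above concrete, here is a small sketch assuming the all-defaults layout from this section. The helper method is left out for brevity, and the Optional[...] annotations are my own tidy-up rather than part of the original snippet.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mongodata(ABC):
    _id: Optional[int] = None


@dataclass
class Article(Mongodata):
    name: Optional[str] = None
    quantity: Optional[int] = None
    description: Optional[str] = None


# Positional arguments are consumed in field order, starting with `_id`,
# so every value ends up shifted by one field.
positional = Article('123', 321, 'desc')
print(positional)

# Keyword arguments land where intended and leave `_id` at its default.
keyword = Article(name='123', quantity=321, description='desc')
print(keyword)
```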
A Metaclass Approach
Another option is to declare a function which can be used as a metaclass, as below:
from dataclasses import asdict
from typing import Optional

def add_id_and_get_dict(name: str, bases: tuple[type, ...], cls_dict: dict):
    """Metaclass to add an `_id` field and a `get_dict` method."""
    # Get class annotations
    cls_annotations = cls_dict['__annotations__']
    # This assigns the `_id: Optional[int]` annotation
    cls_annotations['_id'] = Optional[int]
    # This assigns the `_id = None` assignment
    cls_dict['_id'] = None

    def get_dict(self):
        result = asdict(self)
        result.pop('_id')
        return result

    # add get_dict() method to the class
    cls_dict['get_dict'] = get_dict
    # create and return a new class
    cls = type(name, bases, cls_dict)
    return cls
Then you can simplify your dataclass definition a little. You technically don't need to define a get_dict method on the class itself, since the metaclass function replaces it with its own implementation, but a stub is useful so that an IDE knows that such a method exists on the class.
from dataclasses import dataclass
from typing import Any

@dataclass
class Article(metaclass=add_id_and_get_dict):
    name: str
    quantity: int
    description: str

    # Add for type hinting, so the IDE knows such a method exists.
    def get_dict(self) -> dict[str, Any]:
        ...
And now it's a bit more intuitive when you want to create new Article objects:
a = Article('abc', 123, 'desc')
print(a) # Article(name='abc', quantity=123, description='desc', _id=None)
print(a._id) # None
print(a.get_dict()) # {'name': 'abc', 'quantity': 123, 'description': 'desc'}
a2 = Article('abc', 321, 'desc', _id=12345)
print(a2) # Article(name='abc', quantity=321, description='desc', _id=12345)
print(a2._id) # 12345
print(a2.get_dict()) # {'name': 'abc', 'quantity': 321, 'description': 'desc'}
Keyword-only Arguments
In Python 3.10 and later, if you don't want to assign default values to all the fields in a subclass, another option is to decorate the superclass with @dataclass(kw_only=True), so that the fields defined in that class become keyword-only arguments.
You can also use the KW_ONLY sentinel, which dataclasses provides in Python 3.10, as a type annotation, as shown below. This should also make things much simpler and more intuitive to work with.
from abc import ABC
from dataclasses import dataclass, asdict, KW_ONLY
from typing import Optional

@dataclass
class Mongodata(ABC):
    _: KW_ONLY
    _id: Optional[int] = None

    @property
    def dict(self):
        result = asdict(self)
        result.pop("_id")
        return result

# noinspection PyDataclass
@dataclass
class Article(Mongodata):
    name: str
    quantity: int
    description: str
Essentially, any fields defined after the _: KW_ONLY line in that class become keyword-only arguments to the constructor.
Now the usage should be exactly as desired. You can pass both keyword and positional arguments to the constructor, and it appears to work as intended:
a = Article(name='123', quantity=123, description='desc')
print(a) # Article(_id=None, name='123', quantity=123, description='desc')
print(a._id) # None
print(a.dict) # {'name': '123', 'quantity': 123, 'description': 'desc'}
a2 = Article('123', 321, 'desc', _id=112233)
print(a2) # Article(_id=112233, name='123', quantity=321, description='desc')
print(a2._id) # 112233
print(a2.dict) # {'name': '123', 'quantity': 321, 'description': 'desc'}
Also, here's a quick explanation of why this appears to work as it does. Since only the fields in the superclass are marked as keyword-only, all this accomplishes is making _id a keyword-only argument to the constructor. The fields in the subclass are allowed as either keyword or positional arguments, since we didn't specify kw_only for them.
An easier way to think about this is to imagine that the signature of the __init__() method that dataclasses generates actually looks like this:
def __init__(self, name: str, quantity: int, description: str, *, _id: Optional[int] = None):
In Python (not only in 3.10), a bare * in a function signature signifies that all the parameters that follow it are keyword-only. Note that _id is added as a keyword-only argument after all the positional arguments from the subclass. This means the method signature is valid, since keyword-only parameters are allowed to have default values regardless of the parameters before them, as _id does here.