I have a class like:
class Pathology:
"""
Represents a pathology, which is initialized with a name and description.
"""
def __init__(self: str, name: str, description: str):
self.id = str(uuid.uuid4())
self.name = name
self.description = description
self.phases = []
def to_json(self):
return jsonpickle.encode(self, make_refs=False, unpicklable=False)
In this class, I do not ever want a user to pass in a value for id
, I always wish to generate it upon construction.
When deserializing from JSON, I wish to do something like:
with open('data/test_case_1.json', 'r') as test_case_1_file:
test_case_1 = test_case_1_file.read()
# parse file
obj = jsonpickle.decode(test_case_1)
assert pathology == Pathology(**obj)
However, I run into an error TypeError: __init__() got an unexpected keyword argument 'id'
I suspect this is because the init constructor does not have the field id
available.
What is the pythonic way to support this behavior?
CodePudding user response:
In this class, I do not ever want a user to pass in a value for id, I always wish to generated it upon construction.
Based on the above desired result, my recommendation is to define id as a (read-only) property
. The benefits of defining it as a property is that it won't be treated as an instance attribute, and coincidentally it won't accept a value via the constructor; the main drawback is that it won't show in the class's __repr__
value (assuming we use the generated one we get from dataclasses) or in the dataclasses.asdict
helper function.
I've also taken added a few additional changes in the implementation as well (hopefully for the better):
Re-declare the class as a dataclass, which I personally prefer as it reduces a bit of boilerplate code such as an
__init__
constructor, or the need to define an__eq__
method for example (the latter to check if two class objects are equal via==
). The dataclasses module also provides a helpfulasdict
function which we can make use of in the serialization process.Use built-in JSON (de)serialization via the json module. Part of the reason for this decision is I have personally never used the
jsonpickle
module, and I only have a rudimentary understanding of how pickling works in general. I feel that converting class objects to/from JSON is more natural, and likely also performs better in any case.Add a
from_json_file
helper method, which we can use to load a new class object from a local file path.
import json
import uuid
from dataclasses import dataclass, asdict, field, fields
from functools import cached_property
from typing import List
@dataclass
class Pathology:
"""
Represents a pathology, which is initialized with a name and description.
"""
name: str
description: str
phases: List[str] = field(init=False, default_factory=list)
@cached_property
def id(self) -> str:
return str(uuid.uuid4())
def to_json(self):
return json.dumps(asdict(self))
@classmethod
def from_json_file(cls, file_name: str):
# A list of only the fields that can be passed in to the constructor.
# Note: maybe it's worth caching this for repeated runs.
init_fields = tuple(f.name for f in fields(cls) if f.init)
if not file_name.endswith('.json'):
file_name = '.json'
with open(file_name, 'r') as in_file:
test_case_1 = json.load(in_file)
# parse file
return cls(**{k: v for k, v in test_case_1.items() if k in init_fields})
And here's some quick code I put together, to confirm that everything is as expected:
def main():
p1 = Pathology('my-name', 'my test description.')
print('P1:', p1)
p_id = p1.id
print('P1 -> id:', p_id)
assert p1.id == p_id, 'expected id value to be cached'
print('Serialized JSON:', p1.to_json())
# Save JSON to file
with open('my_file.json', 'w') as out_file:
out_file.write(p1.to_json())
# De-serialize object from file
p2 = Pathology.from_json_file('my_file')
print('P2:', p2)
# assert both objects are same
assert p2 == p1
# IDs should be unique, since it's automatically generated each time (we
# don't pass in an ID to the constructor or store it in JSON file)
assert p1.id != p2.id, 'expected IDs to be unique'
if __name__ == '__main__':
main()