Define a dataclass having an attribute as List of itself

Time:08-05

I am working with Python 3 and am just starting to learn about dataclasses.

I am trying to create a dataclass with an attribute that is a list of instances of the class itself.

Something like:

@dataclass
class Directory:
    name: str = field(default_factory=generate_randomly)
    num_of_files: int = 0
    ...
    subdirectories: List[Directory] = []

What I am struggling with is how to define the subdirectories attribute, which is a list of Directory objects.

If I try this:

dir1 = Directory('folder1')
dir2 = Directory('folder2')
dir = Directory(subfolders=[dir1, dir2])

Traceback (most recent call last):
  File "main.py", line 14, in <module>
    class Directory:
  File "main.py", line 17, in Directory
    subfolders: List(Directory) = []
NameError: name 'Directory' is not defined

I saw one post here, but that doesn't look like what I need.

CodePudding user response:

Seems like a good start to me so far, though you have a few minor typos:

  1. Change def to class, since you're creating a class - dataclasses are just regular Python classes.
  2. For forward references (here Directory is not yet defined while its own class body is being executed), wrap the type name in single or double quotes so it becomes a string and is evaluated lazily.
  3. Use dataclasses.field() with a default_factory argument for mutable defaults such as list, dict, and set.
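
Point 3 matters even once the forward reference is quoted: the dataclass decorator itself rejects a bare [] default. A minimal sketch of that failure, where the class name Broken and the helper define_with_list_default are made up purely for illustration:

```python
from dataclasses import dataclass
from typing import List


def define_with_list_default():
    """Try to define a dataclass whose list field defaults to a bare []."""
    @dataclass
    class Broken:  # hypothetical class, only used to trigger the error
        # Quoted forward reference is fine; the [] default is the problem.
        children: List["Broken"] = []

    return Broken


try:
    define_with_list_default()
    error_message = None
except ValueError as exc:
    # Raised by @dataclass when the class is created, not when it is instantiated.
    error_message = str(exc)

print(error_message)
```

The error is raised when the class is created, so it shows up before any instance is built, much like the NameError in the question.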

Example code putting it all together:

import random
import string
from dataclasses import field, dataclass
from typing import List


def generate_randomly():
    return ''.join(random.choice(string.ascii_letters) for _ in range(15))


@dataclass
class Directory:
    name: str = field(default_factory=generate_randomly)
    num_of_files: int = 0
    subdirectories: List['Directory'] = field(default_factory=list)


print(Directory())
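
As a usage sketch, the same class can be instantiated the way the question attempted, passing the nested directories by keyword. The variable name root is chosen here only to avoid shadowing the built-in dir:

```python
import random
import string
from dataclasses import dataclass, field
from typing import List


def generate_randomly():
    return ''.join(random.choice(string.ascii_letters) for _ in range(15))


@dataclass
class Directory:
    name: str = field(default_factory=generate_randomly)
    num_of_files: int = 0
    subdirectories: List['Directory'] = field(default_factory=list)


dir1 = Directory('folder1')
dir2 = Directory('folder2')
root = Directory(subdirectories=[dir1, dir2])

# default_factory gives every instance its own list, so appending to
# dir1's subdirectories does not affect dir2.
dir1.subdirectories.append(Directory('nested'))

print([d.name for d in root.subdirectories])  # ['folder1', 'folder2']
print(dir2.subdirectories)                    # []
```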

In Python 3.7+, you can use a __future__ import so that all annotations are stored as strings and are not evaluated when the class is defined. This means you don't need the quotes, and since the annotation is never evaluated you can write list[Directory] without importing List from the typing module.

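from __future__ import annotations

import random
import string
from dataclasses import field, dataclass


def generate_randomly():
    return ''.join(random.choice(string.ascii_letters) for _ in range(15))


@dataclass
class Directory:
    name: str = field(default_factory=generate_randomly)
    num_of_files: int = 0
    # Not evaluated at definition time, so no quotes are needed
    subdirectories: list[Directory] = field(default_factory=list)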

Dataclasses do not check types at runtime, so to validate that each element of subdirectories is actually a Directory, you can add a check in __post_init__():

    def __post_init__(self):
        # subdir rather than dir, to avoid shadowing the built-in dir()
        for subdir in self.subdirectories:
            if not isinstance(subdir, Directory):
                raise TypeError(f'{subdir}: invalid type ({type(subdir)}) for subdirectory')
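
For completeness, here is a sketch of that check in a full class. The fixed "root" default for name is an assumption made only to keep the example short; it stands in for the generate_randomly factory used earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Directory:
    name: str = "root"  # assumed placeholder default, not from the original code
    num_of_files: int = 0
    subdirectories: List["Directory"] = field(default_factory=list)

    def __post_init__(self):
        # Runs right after the generated __init__ has set all fields.
        for subdir in self.subdirectories:
            if not isinstance(subdir, Directory):
                raise TypeError(f"{subdir}: invalid type ({type(subdir)}) for subdirectory")


ok = Directory("parent", subdirectories=[Directory("child")])

try:
    Directory("parent", subdirectories=["not-a-directory"])
    rejected = False
except TypeError:
    rejected = True

print(ok.subdirectories[0].name, rejected)  # child True
```

Note that this only validates what is passed to the constructor; items appended to subdirectories afterwards are not checked.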