Dataclasses - AttributeError: type object 'Arguments' has no attribute 'column_names'

I am trying to use a dataclass to hold a list of strings that I can pass to a function. Accessing the list raises an AttributeError, even though the same kind of access works for a field declared like my_int: Optional[int] = field(default=1000).

For example:

from typing import Optional
from dataclasses import dataclass, field

@dataclass
class Arguments:
    """
    Configuration for data loader.
    """
    
    column_names: Optional[list[str]] = field(
        default_factory=lambda: [
            'copies', 'path', 'repo_name', 'size', 'license', 'hash',
            'line_mean', 'line_max', 'alpha_frac', 'autogenerated',
        ]
    )

def build_dl(args: Arguments):
    load_train_data = args.column_names
    return load_train_data

build_dl(Arguments)

Error: AttributeError: type object 'Arguments' has no attribute 'column_names'

I expect args.column_names to be ['copies', 'path', 'repo_name', 'size', 'license', 'hash', 'line_mean', 'line_max', 'alpha_frac', 'autogenerated'].

I have not used dataclasses before. Any help would be appreciated.

Thank you.

Answer:

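The problem is that you are reading column_names from the class itself, as if it were a class variable, but it is a dataclass field whose default comes from default_factory. For such a field the dataclass decorator removes the class-level attribute, and the factory only runs when an instance is created, so the value exists on instances only. A field declared with field(default=1000) behaves differently: its default is also left on the class, which is why that kind of access worked for you. You therefore need to create an instance first, such as Arguments() or Arguments(column_names=...), before accessing the column_names attribute.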
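
To make the difference between the two kinds of default concrete, here is a minimal sketch. The class Demo and its field names are hypothetical and exist only for illustration; the built-in generic list[str] assumes Python 3.9 or later, as in the question.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Demo:  # hypothetical class, for illustration only
    my_int: Optional[int] = field(default=1000)
    names: list[str] = field(default_factory=lambda: ['copies', 'path'])


# A plain default is left on the class, so class-level access works.
print(Demo.my_int)             # 1000

# A default_factory leaves no class attribute behind.
print(hasattr(Demo, 'names'))  # False

# An instance has both values; the factory runs once per instance.
demo = Demo()
print(demo.my_int, demo.names)  # 1000 ['copies', 'path']
```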

The quickest fix in your case is to pass an instance instead of the class:

build_dl(Arguments())
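
Put together, the instance-based version might look like the sketch below. The column list is shortened here for brevity, and the names default_cols, custom_cols, first and second are only illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Arguments:
    """Configuration for data loader (column list shortened for brevity)."""

    column_names: Optional[list[str]] = field(
        default_factory=lambda: ['copies', 'path', 'repo_name']
    )


def build_dl(args: Arguments):
    return args.column_names


# Passing an instance: the default_factory supplies the list.
default_cols = build_dl(Arguments())
print(default_cols)  # ['copies', 'path', 'repo_name']

# The default can still be overridden per instance.
custom_cols = build_dl(Arguments(column_names=['path', 'size']))
print(custom_cols)   # ['path', 'size']

# Each instance gets its own list, so mutating one does not affect another.
first = Arguments()
second = Arguments()
first.column_names.append('size')
print(second.column_names)  # ['copies', 'path', 'repo_name']
```

Keeping column_names as a regular field is probably the better choice if different loaders may need different columns, since each instance owns an independent copy.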

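Alternatively, if you really want to read the value from the class, you can declare it as a class variable with ClassVar, as mentioned in the dataclasses docs; such attributes are not treated as fields at all. Removing the annotation (such as list[str]) entirely has the same effect, because dataclasses only consider annotated attributes. Note that in both cases the list is shared by the class and all of its instances, and it can no longer be passed to the constructor.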

In the example below I've also changed the annotation for args to Type[Arguments], to indicate that the function receives the class itself (Arguments) rather than an instance of it.

from typing import Type, ClassVar
from dataclasses import dataclass


@dataclass
class Arguments:
    """
    Configuration for data loader.
    """

    column_names: ClassVar[list[str]] = [
        'copies', 'path', 'repo_name', 'size', 'license', 'hash', 'line_mean',
        'line_max', 'alpha_frac', 'autogenerated'
    ]


def build_dl(args: Type[Arguments]):
    load_train_data = args.column_names
    return load_train_data


build_dl(Arguments)
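
For comparison, the un-annotated variant mentioned above could be sketched as follows. The batch_size field is a hypothetical addition, included only to show which attributes dataclasses still pick up as fields; the column list is shortened for brevity.

```python
from dataclasses import dataclass, fields


@dataclass
class Arguments:
    # No annotation: a plain class attribute, ignored by dataclasses.
    column_names = ['copies', 'path', 'repo_name']

    # Hypothetical annotated attribute, which is still a regular field.
    batch_size: int = 32


# Class-level access works without creating an instance.
print(Arguments.column_names)               # ['copies', 'path', 'repo_name']

# Only the annotated attribute is registered as a field.
print([f.name for f in fields(Arguments)])  # ['batch_size']

# The generated __init__ does not accept column_names.
try:
    Arguments(column_names=['path'])
except TypeError:
    print('column_names is not a constructor argument')
```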