I am trying to use dataclasses to create a list of strings that can be used in a function. I am getting an attribute error when trying to access the information normally, as I would with something like my_int: Optional[int] = field(default=1000).
For example:
from typing import Optional, List
from dataclasses import dataclass, field

@dataclass
class Arguments:
    """
    Configuration for data loader.
    """
    column_names: Optional[list[str]] = field(
        default_factory=lambda: ['copies', 'path', 'repo_name', 'size', 'license', 'hash', 'line_mean', 'line_max', 'alpha_frac', 'autogenerated']
    )

def build_dl(args: Arguments):
    load_train_data = args.column_names
    return load_train_data

build_dl(Arguments)
Error: AttributeError: type object 'Arguments' has no attribute 'column_names'
I am attempting to get args.column_names to be ['copies', 'path', 'repo_name', 'size', 'license', 'hash', 'line_mean', 'line_max', 'alpha_frac', 'autogenerated']
I have not used dataclasses before. Any help would be appreciated.
Thank you.
CodePudding user response:
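It seems like you are trying to use column_names as a class variable rather than as a dataclass field. A field declared with default_factory is not stored on the class: the factory only runs when an instance is created, so Arguments.column_names raises AttributeError. (A field with a plain default, such as field(default=1000), is also left on the class as an attribute, which is why that pattern appeared to work.) You need to create an instance first, for example Arguments() or Arguments(column_names=...), before you can access the column_names attribute.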
So to clarify, the quickest and easiest fix in your case would be to update the call like:
build_dl(Arguments())
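As a minimal sketch of the difference described above (the my_int field is borrowed from the question, and the column list is shortened purely for illustration), the snippet below shows that a plain default is visible on the class, a default_factory field is not, and each instance gets its own freshly built list:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Arguments:
    """Configuration for data loader."""

    # Plain default: dataclasses also leaves this value on the class itself.
    my_int: Optional[int] = field(default=1000)
    # default_factory: the list is only built when an instance is created.
    # (Column list shortened here for illustration.)
    column_names: Optional[List[str]] = field(
        default_factory=lambda: ['copies', 'path', 'repo_name']
    )


def build_dl(args: Arguments):
    return args.column_names


# The class exposes the plain default, but not the factory-backed field.
class_has_my_int = hasattr(Arguments, 'my_int')
class_has_columns = hasattr(Arguments, 'column_names')

# Passing an instance works, and every instance gets an independent list.
first = build_dl(Arguments())
second = build_dl(Arguments())
first.append('extra')  # does not affect `second`

# The default can still be overridden per instance.
custom = build_dl(Arguments(column_names=['path']))
```

Because the factory runs once per instance, mutating one instance's list does not leak into another, which is usually what you want for configuration objects.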
Another approach is to make column_names a class variable by annotating it with ClassVar, as mentioned in the docs; dataclasses exclude such attributes from consideration as fields altogether. A further option is to remove the annotation (here list[str]) entirely, which achieves the same result, because dataclasses only treat annotated attributes as fields.
In the example below I've also updated the annotation for args to Type[Arguments], to indicate that we're passing the actual type (e.g. Arguments) rather than an instance of that type.
from typing import Type, ClassVar
from dataclasses import dataclass

@dataclass
class Arguments:
    """
    Configuration for data loader.
    """
    column_names: ClassVar[list[str]] = [
        'copies', 'path', 'repo_name', 'size', 'license', 'hash', 'line_mean',
        'line_max', 'alpha_frac', 'autogenerated'
    ]

def build_dl(args: Type[Arguments]):
    load_train_data = args.column_names
    return load_train_data

build_dl(Arguments)
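For comparison, here is a rough sketch of the un-annotated option next to the ClassVar one. The class names PlainArguments and ClassVarArguments are made up for this example, and the column list is shortened. In both cases dataclasses.fields() reports no fields, and the single list lives on the class and is shared by every instance:

```python
from dataclasses import dataclass, fields
from typing import ClassVar, List


@dataclass
class PlainArguments:
    # No annotation: an ordinary class attribute that @dataclass ignores.
    column_names = ['copies', 'path', 'repo_name']


@dataclass
class ClassVarArguments:
    # ClassVar annotation: explicitly excluded from the dataclass fields.
    column_names: ClassVar[List[str]] = ['copies', 'path', 'repo_name']


# Neither class has any dataclass fields.
plain_fields = [f.name for f in fields(PlainArguments)]
classvar_fields = [f.name for f in fields(ClassVarArguments)]

# The attribute is readable straight from the class, no instance needed.
from_class = PlainArguments.column_names

# Caveat: instances share the very same list object as the class.
shared = PlainArguments().column_names is PlainArguments.column_names
```

Since the list is shared, appending to it through one instance changes it for all of them; if each configuration needs its own copy, the default_factory field with Arguments() is likely the safer choice.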