Filter list of dataclass instances based on specified attribute

Time:09-27

Suppose I have a dataclass

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    birthday: str

And I create some instances with these lists:

import numpy as np

names = np.array(['John', 'Max', 'Alice', 'Bob', 'Max', 'Alice'])
birthdays = np.array(['June', 'August', 'June', 'August', 'December', 'January'])
persons = np.column_stack((names, birthdays))

# Each row is a (name, birthday) pair
persons = [Person(*person) for person in persons]

Now I want to filter my list based on a specific attribute. Say, I want to get all persons with the same name. But another time I might want to get the persons with the same birthday. Or I might even want to filter on a specific attribute and a specific value, so maybe get all persons who were born in June.

My idea is to somehow override the __eq__ and __hash__ methods to provide such flexibility. I guess this is the right approach. Since I have no experience with this I would appreciate some hints or ideas on how to create such a flexible filter elegantly.

CodePudding user response:

The simplest version of what you want doesn't need any magic methods. It comes down to filtering on two parameters: the field name and the field value. For example:

field_name = 'name'
field_value = 'Alice'

# getattr looks up the attribute by its name at runtime
filtered_persons = [
    person
    for person in persons
    if getattr(person, field_name) == field_value
]

You can wrap that into a utility function if you wish.
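As a minimal sketch of such a utility function: the name filter_by_attribute and its signature are hypothetical, and the persons list is rebuilt here without NumPy so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class Person:
    name: str
    birthday: str


def filter_by_attribute(items: Iterable[Any], field_name: str, field_value: Any) -> List[Any]:
    """Return the items whose attribute `field_name` equals `field_value`."""
    # getattr raises AttributeError if an item has no such attribute
    return [item for item in items if getattr(item, field_name) == field_value]


persons = [
    Person("John", "June"),
    Person("Max", "August"),
    Person("Alice", "June"),
    Person("Bob", "August"),
    Person("Max", "December"),
    Person("Alice", "January"),
]

born_in_june = filter_by_attribute(persons, "birthday", "June")
named_max = filter_by_attribute(persons, "name", "Max")
```

The same function covers both cases from the question, since the attribute to compare is just a string argument.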

I am not sure how you intended to use those two magic methods to filter a list of persons, though.
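The question also mentions getting all persons who share a name or a birthday without fixing a value in advance. One possible reading of that is grouping by an attribute, which again needs no magic methods. The sketch below assumes that reading; group_by_attribute is a hypothetical helper name.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class Person:
    name: str
    birthday: str


def group_by_attribute(items: Iterable[Any], field_name: str) -> Dict[Any, List[Any]]:
    """Map each distinct value of `field_name` to the items that have it."""
    groups = defaultdict(list)
    for item in items:
        groups[getattr(item, field_name)].append(item)
    return dict(groups)


persons = [
    Person("John", "June"),
    Person("Max", "August"),
    Person("Alice", "June"),
    Person("Bob", "August"),
    Person("Max", "December"),
    Person("Alice", "January"),
]

by_name = group_by_attribute(persons, "name")
# Keep only the names that occur more than once
shared_names = {name: group for name, group in by_name.items() if len(group) > 1}
```

Calling group_by_attribute(persons, "birthday") groups by birthday in the same way.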

CodePudding user response:

There are two Pythonic ways to achieve this. The first is to use the built-in filter() function:

same_names = filter(lambda p: p.name == "Alice", persons)

The second is to use a list comprehension or a generator expression:

same_names = [p for p in persons if p.name == "Alice"]  # list
# or
same_names = (p for p in persons if p.name == "Alice")  # generator

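The advantage of filter() and the generator expression is that neither creates a new list. Each returns a lazy iterator that walks the existing list and applies the condition on the fly. The trade-off is that such an iterator can be consumed only once; wrap it in list() if you need to reuse the result.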
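A small sketch of that lazy behaviour, assuming Python 3 (where filter() returns an iterator) and a shortened persons list made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    birthday: str


persons = [
    Person("Alice", "June"),
    Person("Bob", "August"),
    Person("Alice", "January"),
]

# Nothing is filtered yet; this only builds an iterator over persons
lazy_alices = filter(lambda p: p.name == "Alice", persons)

first_pass = list(lazy_alices)   # the condition is applied here
second_pass = list(lazy_alices)  # the iterator is already exhausted
```

Here first_pass holds the two matching persons, while second_pass is empty because the iterator was used up by the first call to list().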
