A common problem I run into is keeping track of a bunch of collections in a dictionary. Let's say I want to keep track of which items I borrowed from my friends. The defaultdict class is quite useful for this:
from collections import defaultdict
d = defaultdict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')
# defaultdict(<class 'set'>, {'Peter': {'salt'}, 'Eric': {'jacket', 'car'}})
This lets me add items to the respective sets without worrying about whether a key is already in the dictionary. Now suppose I return the salt to Peter. I then owe him nothing, so he can be removed from the dictionary. Doing this is slightly more cumbersome:
d['Peter'].remove('salt')
if not d['Peter']:
    del d['Peter']
I know I could put this in some function, but for readability I would like a class that removes the key automatically if the corresponding set is empty. Is there some way to do this?
Edit
Okay, I realized a pretty major problem with this idea when trying to solve it using inheritance and overriding the indexing method. When calling d[index], the value has already been returned before .remove(something) is called, which makes it impossible for the dictionary to know that the set has been emptied. I'm guessing there's not really a way around using something different.
CodePudding user response:
The problem with using a defaultdict to do what you want is that even accessing a key sets that key using the factory function. Consider:
from collections import defaultdict
d = defaultdict(set)
if d["Peter"]:
    print("I owe something to Peter")
print(d)
# defaultdict(<class 'set'>, {'Peter': set()})
Also, as you've realized, the problem with creating a subclass is that the __getitem__() method is called before the set is ever emptied, so you'd have to call another function that checks whether the set is empty and removes the key.
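One way to sketch that extra function is as a method on a subclass, so the emptiness check happens in the same call as the removal. The class name `BorrowDict` and the method `remove_item()` are made up for illustration; they are not part of `collections`.

```python
from collections import defaultdict


class BorrowDict(defaultdict):
    """Hypothetical defaultdict(set) subclass; the names are illustrative only."""

    def remove_item(self, key, item):
        # Use .get() so that looking up an unknown key does not create it.
        items = self.get(key)
        if items is None:
            return
        items.discard(item)  # discard() does not raise if the item is absent
        if not items:
            del self[key]    # drop the key once its set is empty


d = BorrowDict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')

d.remove_item('Peter', 'salt')   # Peter's set is now empty, so the key is removed
d.remove_item('Eric', 'car')     # Eric still has the jacket, so the key stays
print(dict(d))                   # {'Eric': {'jacket'}}
```

This keeps the storage itself clean, at the cost of having to go through `remove_item()` instead of calling `.remove()` on the set directly.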
A better idea might be to just not include keys with empty sets when you're creating the string representation.
class NewDefaultDict(defaultdict):
    def __repr__(self):
        # Show only the keys whose value is non-empty
        return (f"NewDefaultDict({repr(self.default_factory)}, {{"
                + ", ".join(f"{repr(k)}: {repr(v)}" for k, v in self.items() if v)
                + "})")
nd = NewDefaultDict(set)
nd["Peter"].add("salt")
nd["Paul"].add("pepper")
nd["Paul"].remove("pepper")
print(nd)
# NewDefaultDict(<class 'set'>, {'Peter': {'salt'}})
You would also need to redefine __contains__() to check if the value is empty, so that e.g. "Paul" in nd returns False:
    def __contains__(self, key):
        return defaultdict.__contains__(self, key) and self[key]
To make it compatible with for ... in nd constructs and dict-unpacking, you can redefine __iter__():
    def __iter__(self):
        for key in defaultdict.__iter__(self):
            if self[key]:
                yield key
Then,
for k in nd:
    print(k)
gives:
Peter
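As a rough check of which operations these overrides actually affect, here is a small sketch that combines `__contains__()` and `__iter__()` and compares them with `len()` and `.keys()`, which are not overridden and so still see the empty entry. The `bool()` call is an addition of this sketch, not part of the snippet above.

```python
from collections import defaultdict


class NewDefaultDict(defaultdict):
    def __contains__(self, key):
        # Count a key as present only if its value is non-empty.
        return defaultdict.__contains__(self, key) and bool(self[key])

    def __iter__(self):
        # Skip keys whose value is empty.
        for key in defaultdict.__iter__(self):
            if self[key]:
                yield key


nd = NewDefaultDict(set)
nd["Peter"].add("salt")
nd["Paul"].add("pepper")
nd["Paul"].remove("pepper")

print("Paul" in nd)      # False: __contains__ hides the empty set
print(list(nd))          # ['Peter']: __iter__ skips "Paul"
print(len(nd))           # 2: __len__ is inherited, "Paul" is still stored
print(list(nd.keys()))   # ['Peter', 'Paul']: keys() is inherited too
```

So this approach hides empty entries rather than removing them; anything that goes through the inherited dict methods still sees them.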
CodePudding user response:
A dictionary comprehension might be useful.
from collections import defaultdict
d = defaultdict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')
d['Peter'].remove('salt')
d2 = {k: v for k, v in d.items() if len(v) > 0}
The d2 dictionary is now:
{'Eric': {'car', 'jacket'}}
Alternatively, use the fact that an empty set is considered false in Python:
d2 = {k: v for k, v in d.items() if v}
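Both comprehensions build a new plain dict and leave d untouched. If the original defaultdict should be pruned in place instead, one possible sketch is to collect the empty keys first and then delete them. The helper name `prune_empty` is invented for this example.

```python
from collections import defaultdict


def prune_empty(mapping):
    """Hypothetical helper: delete every key whose value is falsy (e.g. an empty set)."""
    # Build the list first; changing a dict's size while iterating over it raises RuntimeError.
    empty_keys = [k for k, v in mapping.items() if not v]
    for k in empty_keys:
        del mapping[k]


d = defaultdict(set)
d['Peter'].add('salt')
d['Eric'].add('car')
d['Eric'].add('jacket')
d['Peter'].remove('salt')

prune_empty(d)
print(sorted(d))  # ['Eric']
```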
We can also define a class that implements this logic, similar to the other answer, by simply ignoring keys/values where the value meets a criterion. A function passed as the ignore parameter defines that criterion.
from collections import defaultdict
class default_ignore_dict(defaultdict):
    def __init__(self, factory, ignore, *args, **kwargs):
        defaultdict.__init__(self, factory, *args, **kwargs)
        self.ignore = ignore

    def __contains__(self, key):
        return defaultdict.__contains__(self, key) and not self.ignore(self[key])

    def items(self):
        return ((k, v) for k, v in defaultdict.items(self) if not self.ignore(v))

    def keys(self):
        return (k for k, _ in self.items())

    def values(self):
        return (v for _, v in self.items())
Testing this:
>>> d = default_ignore_dict(set, lambda s: not s)
>>> d['Peter'].add('salt')
>>> d['Peter'].remove('salt')
>>> d['Eric'].add('car')
>>> d['Eric'].add('jacket')
>>>
>>> 'Peter' in d
False
>>> list(d.items())
[('Eric', {'car', 'jacket'})]
>>>
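Note that the class above does not override __iter__() or __len__(), so for k in d and len(d) would still count ignored entries. A possible extension could look like the sketch below. The subclass name `strict_ignore_dict` is made up for the example, and the base class is repeated in reduced form only so the snippet runs on its own.

```python
from collections import defaultdict


class default_ignore_dict(defaultdict):
    # Reduced copy of the class above, repeated so the example is self-contained.
    def __init__(self, factory, ignore, *args, **kwargs):
        defaultdict.__init__(self, factory, *args, **kwargs)
        self.ignore = ignore

    def items(self):
        return ((k, v) for k, v in defaultdict.items(self) if not self.ignore(v))


class strict_ignore_dict(default_ignore_dict):
    # Hypothetical extension: iteration and len() also skip ignored values.
    def __iter__(self):
        return (k for k, _ in self.items())

    def __len__(self):
        return sum(1 for _ in self.items())


d = strict_ignore_dict(set, lambda s: not s)
d['Peter'].add('salt')
d['Peter'].remove('salt')
d['Eric'].add('car')

print(list(d))  # ['Eric']
print(len(d))   # 1
```

The ignored entries are still stored in the underlying dictionary; they are only filtered out of these views.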