I'd like to group emails by their domain and convert the result into a dictionary. So far I have figured out that itertools.groupby with a custom key function will do that. It correctly assigns a key to each value, but when I build a dictionary from the result, only the last group for each key is kept whenever the values to be grouped are not contiguous.
import re
from itertools import groupby
{k: list(v) for k, v in groupby(["bar", "foo", "baz"], key=lambda x: "to" if re.search(r"^b", x) else "cc")}
This produces {'to': ['baz'], 'cc': ['foo']} instead of {'to': ['bar', 'baz'], 'cc': ['foo']}.
How can I fix that?
CodePudding user response:
Sort the input by the grouping key first to get the correct result (itertools.groupby only groups consecutive items):
import re
from itertools import groupby
out = {
    k: list(v)
    for k, v in groupby(
        sorted(
            ["awol", "bar", "foo", "baz"],
            key=lambda x: bool(re.search(r"^b", x)),
        ),
        key=lambda x: "to" if re.search(r"^b", x) else "cc",
    )
}
print(out)
Prints:
{'cc': ['awol', 'foo'], 'to': ['bar', 'baz']}
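To see why the sort is needed, it may help to look at the runs that groupby actually yields on the unsorted input from the question. This is only a minimal sketch; the names `key` and `runs` are made up for illustration:

```python
import re
from itertools import groupby


def key(x):
    # Same rule as in the question: items starting with "b" go to "to".
    return "to" if re.search(r"^b", x) else "cc"


# Materialize each run so it can be inspected.
runs = [(k, list(v)) for k, v in groupby(["bar", "foo", "baz"], key=key)]
print(runs)
# [('to', ['bar']), ('cc', ['foo']), ('to', ['baz'])]
```

The key 'to' shows up in two separate runs, so a dict comprehension overwrites the first 'to' entry with the second one.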
CodePudding user response:
You can use dict.setdefault or collections.defaultdict(list) and extend the list stored under each key, like below.
# from collections import defaultdict
# dct = defaultdict(list)
from itertools import groupby
import re
dct = {}
for k, v in groupby(["awol", "bar", "foo", "baz"],
                    key=lambda x: "to" if re.search(r"^b", x) else "cc"):
    # Append this run to whatever was already collected for the key.
    dct.setdefault(k, []).extend(v)
    # If you use 'dct = defaultdict(list)' instead, add the items like this:
    # dct[k].extend(v)
print(dct)
Prints:
{'cc': ['awol', 'foo'], 'to': ['bar', 'baz']}
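If the goal is just the dictionary, groupby may not be needed at all: a single pass with defaultdict groups non-contiguous items without sorting. A rough sketch for the email-by-domain case mentioned in the question, assuming the domain is everything after the last "@"; the sample addresses and the names `emails` and `by_domain` are hypothetical:

```python
from collections import defaultdict

# Hypothetical sample addresses, only for illustration.
emails = ["a@example.com", "b@example.org", "c@example.com"]

by_domain = defaultdict(list)
for email in emails:
    # Assumption: the domain is whatever follows the last "@".
    domain = email.rpartition("@")[2]
    by_domain[domain].append(email)

result = dict(by_domain)
print(result)
# {'example.com': ['a@example.com', 'c@example.com'], 'example.org': ['b@example.org']}
```

The order of the input is preserved inside each list, and the keys appear in the order they are first seen.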