I am having issues creating a dictionary that assigns a list of multiple values to each key. Currently the data is in a list of list of 2 items:Category and Value, for example:
sample_data = [["January", 9],["Februrary", 10], ["June", 12], ["March", 15], ["January", 10],["June", 14], ["March", 16]]
It has to be transformed into a dicitonary like this:
d = {"January" : [9,10], "February":[10], "June":[12,14], "March": [15,16]}
This is my current code:
d = defaultdict(list)
for category, value in sample_data:
d[category].append(value)
This works for small samples but with very large samples of data it raises a ValueError saying too much values to unpack. Is there any way I could improve on this code or is there another way of doing this?
CodePudding user response:
One way to solve this is prob. change the code to accept more than one values
. Like this way:
Note - *value means that it can take multiple values (more than one)
Not sure this can get rid of the problem, because the sample data is not complete enough to show the exact error point. But it could help you eliminate the "error"...or narrow down to it.
data = [["January", 9],["Februrary", 10], ["June", 12],
["March", 15], ["January", 10],["June", 14], ["March", 16],
['April', 20, 21, 22]] # <--- add April & 3 values (to handle the earlier error)
from collections import defaultdict
# d = {"January" : [9,10], "February":[10], "June":[12,14],
# "March": [15,16]}
# This is my current code:
dc = defaultdict(list)
for category, *value in data: # *value to accept multiple values
dc[category].append(value)
print(dc)
output:
defaultdict(<class 'list'>, {'January': [[9], [10]], 'Februrary': [[10]], 'June': [[12], [14]], 'March': [[15], [16]], 'April': [[20, 21, 22]]})
CodePudding user response:
So, the setdefault
method creates a list as the value for a key. This also works if the key is not in the dictionary.
d = defaultdict(list)
for k, v in sample_data:
d.setdefault(k, []).append(v)
print(d)
Output:
defaultdict(<class 'list'>, {'January': [9, 10], 'Februrary': [10], 'June': [12, 14], 'March': [15, 16]})
Note: I do not have a larger data set to work with but the setdefault method could possibly help out with that.