Suppose I have a list of lists:
list_l = [ ['txt1', 'val1'], ['txt2', 'val2'], ['txt1', 'val3'], ['txt1', 'val5'] ]
I want to transform it into the dictionary below:
dict_result = {'txt1': ['val1', 'val3', 'val5'], 'txt2': ['val2']}
Performance also matters, since the original list comes from ~800 MB of file contents.
CodePudding user response:
It's a simple for-loop:
dictionary = {}
for i in list_l:
    # Create the list the first time a key is seen
    if i[0] not in dictionary:
        dictionary[i[0]] = []
    dictionary[i[0]].append(i[1])
That said, for an 800 MB list you should take a look at the libraries @9769953 mentioned.
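If the list is only an intermediate step, one option is to group while reading the file, so the full list of pairs never has to exist. This is only a sketch: the question doesn't say what the file looks like, so the format here (one whitespace-separated key and value per line) and the name `group_lines` are assumptions for illustration.

```python
from collections import defaultdict
import io

def group_lines(lines):
    """Group 'key value' lines into {key: [values]} in a single pass."""
    out = defaultdict(list)
    for line in lines:
        parts = line.split(None, 1)  # assumed format: key, whitespace, value
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        out[key].append(value.strip())
    return dict(out)

# Stand-in for an open file, which is also an iterable of lines.
sample = io.StringIO("txt1 val1\ntxt2 val2\ntxt1 val3\ntxt1 val5\n")
result = group_lines(sample)
```

With a real file you would pass the file object itself, since iterating over it yields one line at a time.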
CodePudding user response:
Same as @AxelRozental suggests, but using the dictionary's pop method:
dictionary = {}
for i in list_l:
    # Take the existing list for this key (or a new one), extend it, store it back
    k = dictionary.pop(i[0], [])
    k.append(i[1])
    dictionary[i[0]] = k
CodePudding user response:
A defaultdict is the simplest way.
You can scan the list once and collect the second element of each pair incrementally:
from collections import defaultdict
list_l = [ ['txt1', 'val1'], ['txt2', 'val2'], ['txt1', 'val3'], ['txt1', 'val5'] ]
out = defaultdict(list)
for k, v in list_l:
    out[k].append(v)
# defaultdict(list, {'txt1': ['val1', 'val3', 'val5'], 'txt2': ['val2']})
If the amount of data is critical, consider 'consuming' the original list with pop as you go (as already mentioned).
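Consuming the list could look like the sketch below. The function name `group_consuming` is made up for illustration, and reversing the list first is my own choice: it lets each pair be taken from the end with `pop()`, which is cheap, while still visiting the pairs in their original order.

```python
from collections import defaultdict

def group_consuming(pairs):
    """Group [key, value] pairs into a dict of lists, emptying `pairs` as it goes."""
    out = defaultdict(list)
    pairs.reverse()  # in place, so popping from the end yields the original order
    while pairs:
        k, v = pairs.pop()  # O(1); pop(0) would shift the whole list on every call
        out[k].append(v)
    return dict(out)

list_l = [['txt1', 'val1'], ['txt2', 'val2'], ['txt1', 'val3'], ['txt1', 'val5']]
dict_result = group_consuming(list_l)
```

Note that this only releases the small inner lists as they are processed. The strings themselves stay in memory because the result still refers to them, so how much this saves depends on the data.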