My data structure is set up similar to this:
[[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}], [{'proj': 'XDEFG'}, {'test': 1}]]
I'd like to split the main list based on the values of 'proj', so that the result is one list per unique project, along these lines:
[[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}]]
[[{'proj': 'XDEFG'}, {'test': 1}]]
I do not know how many different projects will be present or what their names will be, so I can't hardcode any sorting.
I was thinking of looping through the main list, using each unique project as a dictionary key, and appending the sublist to the value for that project's key. My code and its result look like this:
projects = {}
for sample in contaminated_samples:
    proj = sample[0]['proj']
    if proj in projects.keys():
        projects[proj].append(sample)
    else:
        projects[proj] = [sample]
{'XABCD': [[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}]], 'XDEFG': [[{'proj': 'XDEFG'}, {'test': 1}]]}
While this works I was wondering if there's a more efficient way or some sort of list/dictionary comprehension that would allow me to get the same/similar results.
CodePudding user response:
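I would do basically the same as you, but I would simplify it slightly with setdefault(), which returns the existing list for a key or inserts and returns a new empty list if the key is missing: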
data = [
[{'proj': 'XABCD'}, {'test': 1}],
[{'proj': 'XABCD'}, {'test': 2}],
[{'proj': 'XDEFG'}, {'test': 1}]
]
data2 = {}
for row in data:
    data2.setdefault(row[0]["proj"], []).append(row)
data2 = list(data2.values())
print(data2)
This results in (formatted here for readability):
[
[
[{'proj': 'XABCD'}, {'test': 1}],
[{'proj': 'XABCD'}, {'test': 2}]
],
[
[{'proj': 'XDEFG'}, {'test': 1}]
]
]
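As a variation on the same idea, here is a minimal sketch using collections.defaultdict from the standard library, which creates the empty list automatically the first time a key is seen. The function name group_by_project is made up for this example, and it assumes every row has a dict with a 'proj' key as its first element, as in the sample data. It is not faster in any meaningful way than setdefault(); it is mostly a matter of taste.

```python
from collections import defaultdict

data = [
    [{'proj': 'XABCD'}, {'test': 1}],
    [{'proj': 'XABCD'}, {'test': 2}],
    [{'proj': 'XDEFG'}, {'test': 1}],
]


def group_by_project(rows):
    """Group rows by the 'proj' value of their first element.

    Hypothetical helper for illustration; assumes row[0] is a dict
    containing a 'proj' key.
    """
    groups = defaultdict(list)  # missing keys start out as an empty list
    for row in rows:
        groups[row[0]['proj']].append(row)
    return dict(groups)  # convert back to a plain dict for the caller


grouped = group_by_project(data)

# One list per project, in the order each project was first seen
print(list(grouped.values()))
```

If you need the project names as well, iterate over grouped.items() instead of taking only the values.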