Split list based on nested dictionary value

My data structure is set up similar to this:

[[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}], [{'proj': 'XDEFG'}, {'test': 1}]]

I'd like to split the main list based on the values of 'proj', so that my result would be a separate list for each unique project:

[[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}]] 
[[{'proj': 'XDEFG'}, {'test': 1}]]

I do not know how many different projects will actually be present or what their names will be, so I can't hardcode any sorting.

I was thinking of looping through the main list, assigning each unique project as a key in a dictionary, then appending the sublist to the value for that project's key. My code and its result come out like this:

 projects = {}
 for sample in contaminated_samples:
     # The first dict in each sublist holds the project name.
     proj = sample[0]['proj']
     if proj in projects:
         projects[proj].append(sample)
     else:
         projects[proj] = [sample]


{'XABCD': [[{'proj': 'XABCD'}, {'test': 1}], [{'proj': 'XABCD'}, {'test': 2}]], 'XDEFG': [[{'proj': 'XDEFG'}, {'test': 1}]]}

While this works, I was wondering if there's a more efficient way, or some sort of list/dictionary comprehension, that would give me the same or similar results.

CodePudding user response:

I would basically do the same as you, but I would simplify it slightly with setdefault(), which returns the existing list for a key or inserts an empty one first:

data = [
    [{'proj': 'XABCD'}, {'test': 1}],
    [{'proj': 'XABCD'}, {'test': 2}],
    [{'proj': 'XDEFG'}, {'test': 1}]
]
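data2 = {}
for row in data:
    # Create the project's list on first sight, then append the row to it.
    data2.setdefault(row[0]["proj"], []).append(row)
# Keep only the grouped lists and drop the project-name keys.
data2 = list(data2.values())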
print(data2)

This results in (reformatted here for readability):

[
    [
        [{'proj': 'XABCD'}, {'test': 1}],
        [{'proj': 'XABCD'}, {'test': 2}]
    ],
    [
        [{'proj': 'XDEFG'}, {'test': 1}]
    ]
]
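
Since the question also asks about comprehensions, here is a rough sketch of two other ways to get the same grouping: collections.defaultdict, and a list comprehension over itertools.groupby. The helper name project_of is made up for this example, and it assumes the first dict in every row has a 'proj' key.

```python
from collections import defaultdict
from itertools import groupby

data = [
    [{'proj': 'XABCD'}, {'test': 1}],
    [{'proj': 'XABCD'}, {'test': 2}],
    [{'proj': 'XDEFG'}, {'test': 1}],
]


def project_of(row):
    # Hypothetical helper: assumes the first dict in each row holds 'proj'.
    return row[0]['proj']


# Variant 1: defaultdict(list) creates the empty list on first access,
# so no membership test or setdefault() call is needed.
by_project = defaultdict(list)
for row in data:
    by_project[project_of(row)].append(row)
groups = list(by_project.values())

# Variant 2: a list comprehension over groupby. groupby only merges
# adjacent items, so the rows have to be sorted by the same key first.
groups_sorted = [
    list(group)
    for _, group in groupby(sorted(data, key=project_of), key=project_of)
]

print(groups)
print(groups_sorted)
```

The dictionary-based loop makes a single pass over the data and keeps projects in the order they first appear. The groupby version fits in one expression, but it needs the extra sort and returns the groups ordered by project name, so for this task the plain loop is probably still the better choice.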