How can I safely merge a nested list in Python by a unique value?

I have a nested list (a list of dictionaries, each holding a list of interfaces) and I want to merge the entries whose id is duplicated. For example:

[
  {'id': 2404, 'interfaces': [{'port': 78, 'module': 1 }]},
  {'id': 2404, 'interfaces': [{'port': 79, 'module': 1 }]},
  {'id': 1234, 'interfaces': [{'port': 79, 'module': 1 }]}
]

The final result should be:

[
  {'id': 2404, 'interfaces': [{'port': 78, 'module': 1 }, {'port': 79, 'module': 1 }]},
  {'id': 1234, 'interfaces': [{'port': 79, 'module': 1 }]}
]

CodePudding user response：

Use a dictionary to accumulate the interfaces with the same id:

data = [
  {'id': 2404, 'interfaces': [{'port': 78, 'module': 1 }]},
  {'id': 2404, 'interfaces': [{'port': 79, 'module': 1 }]},
  {'id': 1234, 'interfaces': [{'port': 79, 'module': 1 }]}
]

lookup = {}
for d in data:
    iid = d["id"]
    if iid not in lookup:
        lookup[iid] = []
    lookup[iid].extend(d["interfaces"])

res = [{ "id" : iid, "interfaces" : interfaces } for iid, interfaces in lookup.items()]
print(res)

Output

[{'id': 2404, 'interfaces': [{'port': 78, 'module': 1}, {'port': 79, 'module': 1}]}, {'id': 1234, 'interfaces': [{'port': 79, 'module': 1}]}]

Alternative solution using collections.defaultdict:

from collections import defaultdict
lookup = defaultdict(list)
for d in data:
    iid = d["id"]
    lookup[iid].extend(d["interfaces"])

res = [{ "id" : iid, "interfaces" : interfaces } for iid, interfaces in lookup.items()]
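Since the question asks for a safe merge, the same idea can be wrapped in a function that builds new lists and leaves the input untouched. This is only a sketch: the name merge_by_id and the sample data are made up for illustration, and the inner interface dictionaries are still shared with the input (the copy is shallow).

```python
from collections import defaultdict


def merge_by_id(records):
    """Return a new list in which the interfaces of equal ids are concatenated."""
    lookup = defaultdict(list)
    for record in records:
        # extend() copies the references into a fresh list, so the
        # original 'interfaces' lists are never modified.
        lookup[record["id"]].extend(record["interfaces"])
    # Dictionaries keep insertion order (Python 3.7+), so the ids come
    # out in the order in which they were first seen.
    return [{"id": iid, "interfaces": interfaces} for iid, interfaces in lookup.items()]


# Hypothetical sample in which the two 2404 entries are not adjacent.
sample = [
    {'id': 2404, 'interfaces': [{'port': 78, 'module': 1}]},
    {'id': 1234, 'interfaces': [{'port': 79, 'module': 1}]},
    {'id': 2404, 'interfaces': [{'port': 79, 'module': 1}]},
]

merged_sample = merge_by_id(sample)
print(merged_sample)
```

Because the lookup is keyed by id, this works whether or not equal ids are next to each other.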

Special Case (the groups are contiguous)

If, and only if, the dictionaries that share an id are contiguous (adjacent in the list), you could use itertools.groupby, as below:

from itertools import groupby, chain
from operator import itemgetter

res = []
for iid, vs in groupby(data, key=itemgetter("id")):
    interfaces = chain.from_iterable(v["interfaces"] for v in vs)
    res.append({"id": iid, "interfaces" : list(interfaces) })

print(res)

Output

[{'id': 2404, 'interfaces': [{'port': 78, 'module': 1}, {'port': 79, 'module': 1}]}, {'id': 1234, 'interfaces': [{'port': 79, 'module': 1}]}]

If the groups are not contiguous you could sort your data by id first, but the sorting makes the approach less efficient (O(n log n)) than the dictionary alternatives (O(n)).
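To illustrate why contiguity matters, here is a small sketch of what groupby does with non-contiguous ids, and of the sort-then-group variant. The unsorted_data list is a made-up example, not data from the question.

```python
from itertools import chain, groupby
from operator import itemgetter

# Hypothetical input in which the two 2404 entries are not adjacent.
unsorted_data = [
    {'id': 2404, 'interfaces': [{'port': 78, 'module': 1}]},
    {'id': 1234, 'interfaces': [{'port': 79, 'module': 1}]},
    {'id': 2404, 'interfaces': [{'port': 79, 'module': 1}]},
]

get_id = itemgetter("id")

# Without sorting, groupby starts a new group every time the key changes,
# so id 2404 shows up twice.
naive_ids = [iid for iid, _ in groupby(unsorted_data, key=get_id)]
print(naive_ids)  # [2404, 1234, 2404]

# sorted() is stable and returns a new list, so the input is left untouched.
res_sorted = []
for iid, vs in groupby(sorted(unsorted_data, key=get_id), key=get_id):
    interfaces = chain.from_iterable(v["interfaces"] for v in vs)
    res_sorted.append({"id": iid, "interfaces": list(interfaces)})

print(res_sorted)
```

Note that the sorted variant returns the groups ordered by id (1234 before 2404 here), not in the order in which the ids first appeared.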

CodePudding user response：

I defined your original nested list as the variable unmerged.
The variable merged will contain the output.

unmerged = [
  {'id': 2404, 'interfaces': [{'port': 78, 'module': 1 }]},
  {'id': 2404, 'interfaces': [{'port': 79, 'module': 1 }]},
  {'id': 1234, 'interfaces': [{'port': 79, 'module': 1 }]},
]
merged = []

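for unmerged_item in unmerged:
    # Look for an already merged entry with the same id
    match = next((item for item in merged if item['id'] == unmerged_item['id']), None)

    if match is not None:
        match['interfaces'].extend(unmerged_item['interfaces'])
    else:
        # Append a copy so that later extend() calls do not modify the input
        merged.append({'id': unmerged_item['id'], 'interfaces': list(unmerged_item['interfaces'])})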

The output of the code will be as follows (merged):

[
    {'id': 2404, 'interfaces': [{'port': 78, 'module': 1}, {'port': 79, 'module': 1}]}, 
    {'id': 1234, 'interfaces': [{'port': 79, 'module': 1}]}
]