How to merge/append new items of dictionaries containing duplicate items in Python 3?

Time:08-11

I have a list of dictionaries that looks something like this:

list_of_dicts = [{'ID': 'a1', 'fruit': 'apple', 'trait': [{'colour': 'green'}]},
                 {'ID': 'a2', 'fruit': 'apple', 'trait': [{'colour': 'red'}]},
                 {'ID': 'a3', 'fruit': 'apple', 'trait': [{'colour': 'yellow'}]},
                 {'ID': 'a4', 'fruit': 'melon', 'trait': [{'colour': 'red'}]},
                 {'ID': 'a5', 'fruit': 'banana', 'trait': [{'colour': 'yellow'}]}]

As you can see, every dictionary has its own unique ID. However, some of the 'fruit' values are equal: in this case there are three apples. Other fruits, such as 'melon' and 'banana', occur only once. How can I merge the dictionaries whose 'fruit' value is 'apple' into a single dictionary that combines their nested traits?

Desired output:

desired_list_of_dicts = [
    {'ID': 'a1',
     'fruit': 'apple',
     'trait': [{'colour': 'green'}, {'colour': 'red'}, {'colour': 'yellow'}]},
    {'ID': 'a4', 'fruit': 'melon', 'trait': [{'colour': 'red'}]},
    {'ID': 'a5', 'fruit': 'banana', 'trait': [{'colour': 'yellow'}]}]

So far I have only managed to count and keep track of the duplicates using Counter, but I am not sure how to go on from there:

from collections import Counter

key_counts = Counter(d['fruit'] for d in list_of_dicts)

uniqueValues = []
duplicateValues = []
for d in list_of_dicts:
    if key_counts[d['fruit']] == 1:
        uniqueValues.append(d)
    else:
        duplicateValues.append(d)
print(len(duplicateValues))  # 3
print(duplicateValues)

How should I iteratively create new dictionaries with the correct info and append dictionary traits to the nested list?

CodePudding user response:

Try:

list_of_dicts = [
    {"ID": "a1", "fruit": "apple", "trait": [{"colour": "green"}]},
    {"ID": "a2", "fruit": "apple", "trait": [{"colour": "red"}]},
    {"ID": "a3", "fruit": "apple", "trait": [{"colour": "yellow"}]},
    {"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
    {"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]

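# Group the dictionaries by their "fruit" value (dicts keep insertion order).
groups = {}
for d in list_of_dicts:
    groups.setdefault(d["fruit"], []).append(d)

# Build one merged dictionary per fruit: keep the ID of the first member
# and flatten the "trait" lists of every member of the group.
out = [
    {
        "ID": v[0]["ID"],
        "fruit": k,
        "trait": [t for d in v for t in d["trait"]],
    }
    for k, v in groups.items()
]
print(out)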

Prints:

[
    {
        "ID": "a1",
        "fruit": "apple",
        "trait": [{"colour": "green"}, {"colour": "red"}, {"colour": "yellow"}],
    },
    {"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
    {"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]
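
The same grouping idea can also be written as a single pass wrapped in a reusable function. This is only a sketch under some assumptions: the function name merge_by_key and its parameters key and list_field are made up for illustration, the first dictionary seen for each fruit supplies the ID (as in the desired output), and the trait lists are copied so the input list is left unchanged.

```python
def merge_by_key(records, key="fruit", list_field="trait"):
    """Merge records that share the same `key` value.

    Hypothetical helper: keeps the first record seen for each key and
    concatenates the `list_field` lists of all later records onto it.
    """
    merged = {}  # regular dicts preserve insertion order in Python 3.7+
    for record in records:
        group = merged.get(record[key])
        if group is None:
            # First record for this key: shallow-copy it and copy its list,
            # so extending the list later does not touch the input data.
            merged[record[key]] = {**record, list_field: list(record[list_field])}
        else:
            # Later record with the same key: only its list items are kept.
            group[list_field].extend(record[list_field])
    return list(merged.values())


list_of_dicts = [
    {"ID": "a1", "fruit": "apple", "trait": [{"colour": "green"}]},
    {"ID": "a2", "fruit": "apple", "trait": [{"colour": "red"}]},
    {"ID": "a3", "fruit": "apple", "trait": [{"colour": "yellow"}]},
    {"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
    {"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]

result = merge_by_key(list_of_dicts)
print(result)
```

Note that the IDs of the later duplicates ('a2' and 'a3' here) are dropped, just as in the desired output; if they matter, they would have to be collected separately.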