I have a list of dictionaries that looks something like this:
list_of_dicts = [{'ID': 'a1', 'fruit': 'apple', 'trait': [{'colour': 'green'}]},
{'ID': 'a2', 'fruit': 'apple', 'trait': [{'colour': 'red'}]},
{'ID': 'a3', 'fruit': 'apple', 'trait': [{'colour': 'yellow'}]},
{'ID': 'a4', 'fruit': 'melon', 'trait': [{'colour': 'red'}]},
{'ID': 'a5', 'fruit': 'banana', 'trait': [{'colour': 'yellow'}]}]
As you can see, every dictionary has its own unique ID. However, some of the 'fruit' values are repeated: here there are three apples, while 'melon' and 'banana' occur only once. How can I merge the nested traits of the dictionaries whose 'fruit' is 'apple' into a single dictionary?
Desired output:
desired_list_of_dicts = [
    {'ID': 'a1',
     'fruit': 'apple',
     'trait': [{'colour': 'green'}, {'colour': 'red'}, {'colour': 'yellow'}]},
    {'ID': 'a4', 'fruit': 'melon', 'trait': [{'colour': 'red'}]},
    {'ID': 'a5', 'fruit': 'banana', 'trait': [{'colour': 'yellow'}]}]
So far I have only managed to count and keep track of the duplicates using Counter, and I am not sure how to go on from there:
from collections import Counter

# Count how many times each fruit occurs.
key_counts = Counter(d['fruit'] for d in list_of_dicts)
uniqueValues = []
duplicateValues = []
for d in list_of_dicts:
    if key_counts[d['fruit']] == 1:
        uniqueValues.append(d)
    else:
        duplicateValues.append(d)
print(len(duplicateValues))  # 3
print(duplicateValues)
How should I iteratively create new dictionaries with the correct info and append dictionary traits to the nested list?
Answer:
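Try grouping the dictionaries by their "fruit" value with dict.setdefault, then building one merged dictionary per group: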
list_of_dicts = [
{"ID": "a1", "fruit": "apple", "trait": [{"colour": "green"}]},
{"ID": "a2", "fruit": "apple", "trait": [{"colour": "red"}]},
{"ID": "a3", "fruit": "apple", "trait": [{"colour": "yellow"}]},
{"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
{"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]
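# Group the dictionaries by their "fruit" value (insertion order is kept).
out = {}
for d in list_of_dicts:
    out.setdefault(d["fruit"], []).append(d)

# Build one dictionary per fruit: keep the first ID and flatten all traits.
out = [
    {
        "ID": v[0]["ID"],
        "fruit": k,
        "trait": [t for d in v for t in d["trait"]],
    }
    for k, v in out.items()
]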
print(out)
Prints:
[
{
"ID": "a1",
"fruit": "apple",
"trait": [{"colour": "green"}, {"colour": "red"}, {"colour": "yellow"}],
},
{"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
{"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]
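If you need this in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: the name merge_by_key and its key and list_field parameters are made up for illustration, and it assumes every dictionary has both fields and that you want to keep the first ID seen for each fruit, as in the desired output. It copies the records with copy.deepcopy so the original list_of_dicts is left unchanged.

```python
from copy import deepcopy


def merge_by_key(records, key="fruit", list_field="trait"):
    """Merge records that share the same `key` value.

    The first record seen for each key is kept (including its ID), and the
    `list_field` lists of later records are appended to it.
    """
    merged = {}  # regular dicts keep insertion order (Python 3.7+)
    for record in records:
        group = merged.get(record[key])
        if group is None:
            # First record for this key: copy it so the input is not mutated.
            merged[record[key]] = deepcopy(record)
        else:
            # Later record: append copies of its traits to the first record.
            group[list_field].extend(deepcopy(record[list_field]))
    return list(merged.values())


list_of_dicts = [
    {"ID": "a1", "fruit": "apple", "trait": [{"colour": "green"}]},
    {"ID": "a2", "fruit": "apple", "trait": [{"colour": "red"}]},
    {"ID": "a3", "fruit": "apple", "trait": [{"colour": "yellow"}]},
    {"ID": "a4", "fruit": "melon", "trait": [{"colour": "red"}]},
    {"ID": "a5", "fruit": "banana", "trait": [{"colour": "yellow"}]},
]

result = merge_by_key(list_of_dicts)
print(result)
```

Unlike the comprehension above, which rebuilds each dictionary from three hard-coded keys, this version keeps any other keys the first record of each group happens to have.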