I'm trying to remove duplicates from the following structure:
data = [
    {
        'author': 'Isav Tuco',
        'authorId': 62,
        'tags': ['fire', 'works']
    },
    {
        'author': 'Sham Isa',
        'authorId': 23,
        'tags': ['badminton', 'game']
    },
    {
        'author': 'Isav Tuco',
        'authorId': 62,
        'tags': ['fire', 'works']
    }
]
I've tried the method below, which is available here. Code to remove duplicates from a list of dictionaries:
seen = set()
new_list = []
for d in data:
    t = tuple(d.items())
    if d not in seen:
        seen.add(d)
        new_list.append(d)
CodePudding user response:
The solution from Remove duplicate dict in list in Python is not fully applicable because your dictionaries contain inner lists, and lists are not hashable.
You would need to convert them to tuples as well before the items can be used in a set.
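To see why, here is a minimal check using one record from the question. The variable names are just for illustration. A tuple is only hashable if everything inside it is hashable, so the `tags` list breaks the plain `tuple(d.items())` approach:

```python
record = {'author': 'Isav Tuco', 'authorId': 62, 'tags': ['fire', 'works']}

# Hashing the raw items fails because the 'tags' value is a list.
as_items = tuple(record.items())
try:
    hash(as_items)
    hashable = True
except TypeError:
    hashable = False

# Converting the inner list to a tuple makes the whole key hashable.
fixed = tuple(
    (k, tuple(v) if isinstance(v, list) else v) for k, v in record.items()
)
fixed_hashable = isinstance(hash(fixed), int)

print(hashable, fixed_hashable)
```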
One way to do it would be:
data = [{'author': 'Isav Tuco',
         'authorId': 62,
         'tags': ['fire', 'works']},
        {'author': 'Sham Isa',
         'authorId': 23,
         'tags': ['badminton', 'game']},
        {'author': 'Isav Tuco',
         'authorId': 62,
         'tags': ['fire', 'works']}]
seen = set()
new_list = []
for d in data:
    l = []
    # sort the items so that {"a": 1, "b": 2} and {"b": 2, "a": 1}
    # produce the same key regardless of insertion order
    for (a, b) in sorted(d.items()):
        if isinstance(b, list):
            l.append((a, tuple(b)))  # convert lists to tuples
        else:
            l.append((a, b))
    # convert the list to a tuple so you can put it into a set
    t = tuple(l)
    if t not in seen:
        seen.add(t)         # add the modified value
        new_list.append(d)  # add the original value
print(new_list)
Output:
[{'author': 'Isav Tuco', 'authorId': 62, 'tags': ['fire', 'works']},
{'author': 'Sham Isa', 'authorId': 23, 'tags': ['badminton', 'game']}]
This is a bit of a hack, though: it only converts lists that sit directly in the top-level dict, so you may want to write a more robust solution of your own.
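If your data can nest deeper (lists of dicts, dicts inside dicts), one possible generalization is to build the hashable key recursively. This is only a sketch, not a definitive implementation: `freeze` and `dedupe` are hypothetical helper names, and it assumes every leaf value is already hashable. It also treats a list and a tuple with the same elements as equal, which may or may not be what you want.

```python
def freeze(value):
    """Recursively convert dicts, lists, and sets into hashable equivalents."""
    if isinstance(value, dict):
        # a frozenset of (key, value) pairs ignores key order, so no sorting needed
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    if isinstance(value, set):
        return frozenset(freeze(v) for v in value)
    return value  # assumed to be hashable already


def dedupe(dicts):
    """Return the dicts in first-seen order with duplicates removed."""
    seen = set()
    result = []
    for d in dicts:
        key = freeze(d)
        if key not in seen:
            seen.add(key)
            result.append(d)  # keep the original, unmodified dict
    return result


data = [
    {'author': 'Isav Tuco', 'authorId': 62, 'tags': ['fire', 'works']},
    {'author': 'Sham Isa', 'authorId': 23, 'tags': ['badminton', 'game']},
    {'author': 'Isav Tuco', 'authorId': 62, 'tags': ['fire', 'works']},
]
print(dedupe(data))
```

Using a frozenset for dicts avoids the `sorted()` call, which would fail if the keys were of types that cannot be compared with each other.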