I have a list of dicts:
[{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'2','id':'x2'},
{'name':'A','flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]
I want output like this:
[{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]
Remove duplicate dicts from the list, where two dicts count as duplicates if their name and flag fields are the same.
For example, the second dict is removed because it is a semi-duplicate of the first one: same name, same flag. The ids differ, but we don't care about ids. The idea is to delete all dicts that share the same name and flag and keep only one of them.
I can use nested loops, but I don't know whether I can use a list comprehension instead.
CodePudding user response:
One approach:
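# data is the list of dicts from the question
# build a dict keyed on the fields that define a duplicate; iterating in
# reverse means the first occurrence of each key overwrites the later ones
lookup = {(d["name"], d["flag"]): d for d in data[::-1]}
# create a list from the lookup values
res = list(lookup.values())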
print(res)
Output
[{'name': 'B', 'flag': '2', 'id': 'x4'}, {'name': 'A', 'flag': '1', 'id': 'x3'}, {'name': 'A', 'flag': '2', 'id': 'x1'}]
If order is important, just reverse the list:
print(res[::-1])
Output
[{'name': 'A', 'flag': '2', 'id': 'x1'}, {'name': 'A', 'flag': '1', 'id': 'x3'}, {'name': 'B', 'flag': '2', 'id': 'x4'}]
Using a dictionary to remove duplicates by key is a well-known Python idiom: keys are unique, so assigning to an existing key replaces the earlier value.
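A variation on the same idea that avoids reversing the list twice. This is only a minimal sketch, assuming Python 3.7+ (where dicts preserve insertion order) and the same data list as in the question. dict.setdefault stores a value only when the key is not present yet, so the first dict seen for each (name, flag) pair is the one that is kept:

```python
data = [{'name': 'A', 'flag': '2', 'id': 'x1'},
        {'name': 'A', 'flag': '2', 'id': 'x2'},
        {'name': 'A', 'flag': '1', 'id': 'x3'},
        {'name': 'B', 'flag': '2', 'id': 'x4'}]

lookup = {}
for d in data:
    # setdefault leaves an existing entry alone, so the first occurrence wins
    lookup.setdefault((d["name"], d["flag"]), d)

# dicts keep insertion order, so the original order is preserved
res = list(lookup.values())
print(res)
```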
CodePudding user response:
You can create a pandas DataFrame, drop the rows that are duplicated on name and flag while keeping the first occurrence, and then convert the DataFrame back to a list of dictionaries, as below:
>>> import pandas as pd
>>> lst = [{'name':'A', 'flag':'2','id':'x1'},{'name':'A', 'flag':'2','id':'x2'},{'name':'A','flag':'1','id':'x3'},{'name':'B', 'flag':'2','id':'x4'}]
>>> df = pd.DataFrame(lst)
>>> df
  name flag  id
0    A    2  x1
1    A    2  x2
2    A    1  x3
3    B    2  x4
>>> df.drop_duplicates(subset=['name','flag'], keep='first').to_dict('records')
[{'name': 'A', 'flag': '2', 'id': 'x1'},
{'name': 'A', 'flag': '1', 'id': 'x3'},
{'name': 'B', 'flag': '2', 'id': 'x4'}]
CodePudding user response:
Using a list comprehension (run here only for its side effect of appending to new_list):
my_list = [{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'2','id':'x2'},
{'name':'A','flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]
new_list = []
[new_list.append(item) for item in my_list if {"name":item["name"], "flag":item["flag"]} not in [{"name":element["name"], "flag":element["flag"]} for element in new_list]]
print(new_list)
Output
[{'name': 'A', 'flag': '2', 'id': 'x1'},
{'name': 'A', 'flag': '1', 'id': 'x3'},
{'name': 'B', 'flag': '2', 'id': 'x4'}]
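The comprehension above is evaluated only for its side effects and rebuilds the list of kept keys on every iteration. The same logic can be written as a plain loop, which is arguably easier to read. This is just a sketch of the same approach, not a faster one: it still scans new_list for every item, so it stays quadratic in the worst case.

```python
my_list = [{'name': 'A', 'flag': '2', 'id': 'x1'},
           {'name': 'A', 'flag': '2', 'id': 'x2'},
           {'name': 'A', 'flag': '1', 'id': 'x3'},
           {'name': 'B', 'flag': '2', 'id': 'x4'}]

new_list = []
for item in my_list:
    key = (item["name"], item["flag"])
    # keep the item only if no already-kept dict has the same (name, flag)
    if all((kept["name"], kept["flag"]) != key for kept in new_list):
        new_list.append(item)

print(new_list)
```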
CodePudding user response:
To preserve order, you can use a set to remember the keys you have already seen and keep a dict only if its key is not in the set yet (the := operator requires Python 3.8+):
data = [{'name':'A', 'flag':'2','id':'x1'}, {'name':'A', 'flag':'2','id':'x2'}, {'name':'A','flag':'1','id':'x3'}, {'name':'B', 'flag':'2','id':'x4'}]
seen = set()
seen_add = seen.add  # set.add always returns None, so "or seen_add(x)" is falsy
res = [d for d in data if not ((x := (d['name'], d['flag'])) in seen or seen_add(x))]
print(res)
Output
[{'name': 'A', 'flag': '2', 'id': 'x1'},
{'name': 'A', 'flag': '1', 'id': 'x3'},
{'name': 'B', 'flag': '2', 'id': 'x4'}]
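If the walrus expression feels too dense, the same seen-set technique can be wrapped in a small generator. This is a sketch only: unique_by is a hypothetical helper name, not a standard-library function, and it assumes the key values are hashable. operator.itemgetter with two field names returns a (name, flag) tuple for each dict.

```python
from operator import itemgetter


def unique_by(items, key):
    """Yield items in order, skipping any whose key(item) was already seen."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


data = [{'name': 'A', 'flag': '2', 'id': 'x1'},
        {'name': 'A', 'flag': '2', 'id': 'x2'},
        {'name': 'A', 'flag': '1', 'id': 'x3'},
        {'name': 'B', 'flag': '2', 'id': 'x4'}]

# itemgetter("name", "flag") maps each dict to the tuple (name, flag)
res = list(unique_by(data, key=itemgetter("name", "flag")))
print(res)
```

Because each membership check against the set takes constant time on average, this runs in roughly linear time, unlike the nested-list check in the previous answer.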