List of dictionaries remove semi-occurrences

Time:10-07

I have a list of dicts:

[{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'2','id':'x2'},
{'name':'A','flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]

I want an output like :

[{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]

I want to remove duplicate dicts from the list, where two dicts count as duplicates if their name and flag fields are the same.
For example, the second dict is deleted because it is a semi-duplicate of the first one: same name, same flag. The ids differ, but we don't care about ids. The idea is to delete all dicts that share the same name and flag and keep only one of them.
I can use nested loops, but I don't know whether I can use a list comprehension instead.

CodePudding user response:

One approach:

# data is the list of dicts from the question
# build a dict keyed by (name, flag); iterating in reverse lets the first occurrence overwrite later ones
lookup = {(d["name"], d["flag"]): d for d in data[::-1]}

# create list from the lookup values
res = list(lookup.values())

print(res)

Output

[{'name': 'B', 'flag': '2', 'id': 'x4'}, {'name': 'A', 'flag': '1', 'id': 'x3'}, {'name': 'A', 'flag': '2', 'id': 'x1'}]

If order is important, just reverse the list:

print(res[::-1])

Output

[{'name': 'A', 'flag': '2', 'id': 'x1'}, {'name': 'A', 'flag': '1', 'id': 'x3'}, {'name': 'B', 'flag': '2', 'id': 'x4'}]

Using a dictionary to remove duplicates by key is a well-known Python idiom. It relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.
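
As a variation on the same idea, the double reversal can be avoided with dict.setdefault, which stores a value only if the key is not present yet. This is a minimal sketch rather than the answerer's code; it assumes Python 3.7+ for ordered dicts and that data holds the list from the question.

```python
data = [{'name': 'A', 'flag': '2', 'id': 'x1'},
        {'name': 'A', 'flag': '2', 'id': 'x2'},
        {'name': 'A', 'flag': '1', 'id': 'x3'},
        {'name': 'B', 'flag': '2', 'id': 'x4'}]

lookup = {}
for d in data:
    # setdefault keeps the existing value, so the first dict for each key wins
    lookup.setdefault((d["name"], d["flag"]), d)

# values come out in first-seen order, so no reversal is needed
res = list(lookup.values())
print(res)
```

Because keys are inserted in the order they are first seen, res is already in the original order.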

CodePudding user response:

You can create a DataFrame, drop the duplicates on name and flag while keeping the first occurrence, and then convert the result back to a list of dictionaries, as shown below:

>>> import pandas as pd
>>> lst = [{'name':'A', 'flag':'2','id':'x1'},{'name':'A', 'flag':'2','id':'x2'},{'name':'A','flag':'1','id':'x3'},{'name':'B', 'flag':'2','id':'x4'}]

>>> df = pd.DataFrame(lst)
>>> df
    name    flag    id
0      A       2    x1
1      A       2    x2
2      A       1    x3
3      B       2    x4

>>> df.drop_duplicates(subset=['name','flag'], keep='first').to_dict('records')
[{'name': 'A', 'flag': '2', 'id': 'x1'},
 {'name': 'A', 'flag': '1', 'id': 'x3'},
 {'name': 'B', 'flag': '2', 'id': 'x4'}]

CodePudding user response:

Using list comprehension:

my_list = [{'name':'A', 'flag':'2','id':'x1'},
{'name':'A', 'flag':'2','id':'x2'},
{'name':'A','flag':'1','id':'x3'},
{'name':'B', 'flag':'2','id':'x4'}]

new_list = []

[new_list.append(item) for item in my_list if {"name":item["name"], "flag":item["flag"]} not in [{"name":element["name"], "flag":element["flag"]} for element in new_list]]

new_list
>>> [{'name': 'A', 'flag': '2', 'id': 'x1'},
 {'name': 'A', 'flag': '1', 'id': 'x3'},
 {'name': 'B', 'flag': '2', 'id': 'x4'}]
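
The comprehension above is evaluated only for its side effect (new_list.append) and rebuilds the list of already-kept pairs on every iteration. A plain loop states the same logic more directly. The following is one possible rewrite, not the answerer's code; seen_pairs is a name introduced here for illustration.

```python
my_list = [{'name': 'A', 'flag': '2', 'id': 'x1'},
           {'name': 'A', 'flag': '2', 'id': 'x2'},
           {'name': 'A', 'flag': '1', 'id': 'x3'},
           {'name': 'B', 'flag': '2', 'id': 'x4'}]

new_list = []
seen_pairs = set()  # (name, flag) pairs that are already in new_list

for item in my_list:
    pair = (item["name"], item["flag"])
    if pair not in seen_pairs:
        seen_pairs.add(pair)
        new_list.append(item)

print(new_list)
```

Membership tests on a set take constant time on average, so this version avoids the quadratic cost of rebuilding and scanning a list for every item.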

CodePudding user response:

To preserve order, you can use a set to remember the (name, flag) keys you have already seen and keep only the first dict for each key:

data = [{'name':'A', 'flag':'2','id':'x1'}, {'name':'A', 'flag':'2','id':'x2'}, {'name':'A','flag':'1','id':'x3'}, {'name':'B', 'flag':'2','id':'x4'}]
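seen = set()
seen_add = seen.add  # bound method; always returns None

# keep d only the first time its (name, flag) key appears (the := operator needs Python 3.8+)
res = [d for d in data if not ((x := (d['name'], d['flag'])) in seen or seen_add(x))]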

[{'name': 'A', 'flag': '2', 'id': 'x1'},
 {'name': 'A', 'flag': '1', 'id': 'x3'},
 {'name': 'B', 'flag': '2', 'id': 'x4'}]
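
If the same de-duplication is needed for other field combinations, the seen-set logic can be wrapped in a small helper. The sketch below is an assumption-laden illustration: unique_by is a hypothetical name, not a standard library function, and it assumes at least one field is given and that the selected values are hashable.

```python
from operator import itemgetter


def unique_by(dicts, *fields):
    """Yield the first dict seen for each combination of the given fields."""
    key_of = itemgetter(*fields)  # returns a tuple when several fields are given
    seen = set()
    for d in dicts:
        key = key_of(d)
        if key not in seen:
            seen.add(key)
            yield d


data = [{'name': 'A', 'flag': '2', 'id': 'x1'},
        {'name': 'A', 'flag': '2', 'id': 'x2'},
        {'name': 'A', 'flag': '1', 'id': 'x3'},
        {'name': 'B', 'flag': '2', 'id': 'x4'}]

res = list(unique_by(data, "name", "flag"))
print(res)
```

Writing it as a generator keeps the input order and lets the caller decide whether to materialize the result as a list.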