I would like to deduplicate the dictionaries that contain the same "id" value.
List of dicts:
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
Desired output:
example = [{'term': 'potato', 'id': 10}, {'term': 'apple', 'id': 7}]
So far I can only do one of two things: remove every dictionary that has a duplicate (instead of keeping one of them), or remove only dictionaries that are fully identical. What I want is to keep a single dictionary for each id value.
Example code (attempt):
import ast
new_list = []
seen_keys = set()
for term in example:
    # had to convert a string-dict to a dict first because the dictionaries
    # were transformed to a string in a Solr database
    d = ast.literal_eval(term)
    if d['id'] not in seen_keys:
        new_list.append(d)
        seen_keys.add(d['id'])
CodePudding user response:
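Or use a one-liner list comprehension with enumerate. Note that it keeps the last dictionary for each id rather than the first, and it rescans the rest of the list for every element, so it is O(n^2):
>>> [d for i, d in enumerate(example) if d['id'] not in [x['id'] for x in example[i + 1:]]]
[{'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]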
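The comprehension above keeps the last dictionary per id, while the question asks for the first. A minimal sketch of the same idea, looking backwards instead of forwards (it is still quadratic; `first_only` is just an illustrative name):

```python
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]

# Keep d only if no earlier element (example[:i]) already has the same id
first_only = [d for i, d in enumerate(example)
              if d['id'] not in [x['id'] for x in example[:i]]]
print(first_only)
```

For long lists a set-based loop is likely the better choice, since this rebuilds a list of ids for every element.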
CodePudding user response:
You can try this:
example = [
    {"term": "potato", "id": 10},
    {"term": "potatoes", "id": 10},
    {"term": "apple", "id": 7},
]
# Collect every id that occurs in the list
ids = set()
for item in example:
    ids.add(item["id"])
# Keep the first item for each id, then drop the id so later duplicates are skipped
results = []
for item in example:
    if item["id"] in ids:
        results.append(item)
        ids.remove(item["id"])
print(results)
CodePudding user response:
It can be done as easily as:
test_list = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
res = []
for x in test_list:
    # keep x only if no dict already in res has the same id
    if x['id'] not in [y['id'] for y in res]:
        res.append(x)
print(res)
CodePudding user response:
No need to use ast.literal_eval if the items are already dicts, as they are in your example:
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
seen_keys = set()
new_list = []
for d in example:
    if d["id"] not in seen_keys:
        seen_keys.add(d["id"])
        new_list.append(d)
print(new_list)
Output
[{'term': 'potato', 'id': 10}, {'term': 'apple', 'id': 7}]
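If the entries really do arrive as strings (as the comment in the question suggests), here is a rough sketch that parses only when needed. The `raw` list is made-up input for illustration, not data from the question:

```python
import ast

# Hypothetical input: each entry is the string form of a dict
raw = [
    "{'term': 'potato', 'id': 10}",
    "{'term': 'potatoes', 'id': 10}",
    "{'term': 'apple', 'id': 7}",
]

seen_ids = set()
parsed_list = []
for entry in raw:
    # Parse strings; pass real dicts through unchanged
    d = ast.literal_eval(entry) if isinstance(entry, str) else entry
    if d["id"] not in seen_ids:
        seen_ids.add(d["id"])
        parsed_list.append(d)
print(parsed_list)
```

Calling ast.literal_eval on something that is already a dict raises a ValueError, which is why the isinstance check is there.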
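If you are interested in an O(n) one-liner, use the following. It relies on dicts preserving insertion order (guaranteed since Python 3.7); the list is reversed first so that the first occurrence of each id is the one that survives:
new_list = list({d["id"]: d for d in example[::-1]}.values())[::-1]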
print(new_list)
Output (from one-liner)
[{'term': 'potato', 'id': 10}, {'term': 'apple', 'id': 7}]
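If the double reversal feels hard to read, dict.setdefault is one possible alternative. This is only a sketch; it assumes Python 3.7+ so that the dict keeps insertion order, and `first_by_id` is an illustrative name:

```python
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]

first_by_id = {}
for d in example:
    # setdefault stores d only if this id has no entry yet, so the first one wins
    first_by_id.setdefault(d["id"], d)

deduped = list(first_by_id.values())
print(deduped)
```

It is still O(n), and the result stays in the order in which each id first appeared.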
CodePudding user response:
After slight editing of your code:
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
new_list = []
seen_keys = set()
for i in example:
    if i['id'] not in seen_keys:
        new_list.append(i)
        seen_keys.add(i['id'])
print(new_list)
Output:
[{'term': 'potato', 'id': 10}, {'term': 'apple', 'id': 7}]
CodePudding user response:
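I kinda like making a generic uniqueBy function for this sort of problem. Note that this version keeps the last item seen for each key, because later items overwrite earlier ones in the dict:
example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
def uniqueBy(f):
    # Returns a function that maps each key f(x) to the last x with that key
    return lambda a: {f(x): x for x in a}
uniqueById = uniqueBy(lambda x: x['id'])
print(list(uniqueById(example).values()))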
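The dict comprehension above lets later items overwrite earlier ones, so for the question's data it returns 'potatoes' rather than 'potato'. Here is a sketch of a generic helper that keeps the first item per key instead. The name `unique_by_first` and its signature are my own invention, not from any library:

```python
from typing import Callable, Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T")

def unique_by_first(key: Callable[[T], Hashable], items: Iterable[T]) -> Iterator[T]:
    """Yield each item whose key has not been seen before (hypothetical helper)."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item

example = [{'term': 'potato', 'id': 10}, {'term': 'potatoes', 'id': 10}, {'term': 'apple', 'id': 7}]
result = list(unique_by_first(lambda x: x['id'], example))
print(result)
```

Because it is a generator, it also works on iterables that are not lists, and it only needs the keys to be hashable.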