Problem
Say I have the following list of dictionaries:
givenValues=[
{'id': '0001', 'name': 'me'},
{'id': '0002', 'name': 'me'},
{'id': '0001', 'name': 'you'},
{'id': '0003', 'name': 'hi'},
{'id': '0001', 'name': 'they'},
{'id': '0002', 'name': 'me'},
{'id': '0002', 'name': 'me'}
]
Required result
I want to keep the first of each unique id and remove all other dictionaries from the list such that the result is
[
{'id': '0001', 'name': 'me'},
{'id': '0002', 'name': 'me'},
{'id': '0003', 'name': 'hi'}
]
So far I have tried the following. Some of the attempts do work if the dictionaries in the list are arranged differently but not always:
Attempt 1
tempList=[]
for i in range(len(givenValues)):
    for j in range(i + 1, len(givenValues)):
        if givenValues[i]['id']==givenValues[j]['id']:
            tempList.append(givenValues[j])
for item in tempList:
    if item in givenValues:
        givenValues.remove(item)
Result:
[
{'id': '0001', 'name': 'me'},
{'id': '0003', 'name': 'hi'}
]
Attempt 2
for i in range(len(givenValues)):
    if i<len(givenValues):
        for j in range(i + 1, len(givenValues)):
            if i<len(givenValues) and givenValues[i]['id']==givenValues[j]['id']:
                givenValues.remove(givenValues[j])
Result
[
{'id': '0001', 'name': 'me'},
{'id': '0003', 'name': 'hi'},
{'id': '0001', 'name': 'they'},
{'id': '0002', 'name': 'me'}
]
Please help me solve this problem.
CodePudding user response:
Here's one possible solution:
data = [
{"id": "0001", "name": "me"},
{"id": "0002", "name": "me"},
{"id": "0001", "name": "you"},
{"id": "0003", "name": "hi"},
{"id": "0001", "name": "they"},
{"id": "0002", "name": "me"},
{"id": "0002", "name": "me"},
]
selected = {}
for item in data:
    if item["id"] not in selected:
        selected[item["id"]] = item
output = list(selected.values())
print(output)
We use a dictionary, keyed by id, to keep track of the first item seen for each id. We also take advantage of the fact that dictionaries preserve insertion order (guaranteed since Python 3.7), so the .values()
method returns the items in the order in which they first appeared in the original list.
The above code outputs:
[{'id': '0001', 'name': 'me'}, {'id': '0002', 'name': 'me'}, {'id': '0003', 'name': 'hi'}]
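If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name first_per_key and its key parameter are made up for illustration, and it uses a set plus a result list instead of a dictionary, which behaves the same way here.

```python
def first_per_key(items, key):
    """Return the first item seen for each distinct key(item), in input order.

    Hypothetical helper, not part of any library.
    """
    seen = set()      # key values already encountered
    result = []       # first item for each key value, in input order
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


data = [
    {"id": "0001", "name": "me"},
    {"id": "0002", "name": "me"},
    {"id": "0001", "name": "you"},
    {"id": "0003", "name": "hi"},
    {"id": "0001", "name": "they"},
    {"id": "0002", "name": "me"},
    {"id": "0002", "name": "me"},
]

output = first_per_key(data, key=lambda item: item["id"])
print(output)
```

Unlike the attempts in the question, this builds a new list rather than removing items from the list being iterated over, so the indexes never shift underneath the loop.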
Here's another way of approaching this, using itertools.groupby:
import itertools
data = [
{"id": "0001", "name": "me"},
{"id": "0002", "name": "me"},
{"id": "0001", "name": "you"},
{"id": "0003", "name": "hi"},
{"id": "0001", "name": "they"},
{"id": "0002", "name": "me"},
{"id": "0002", "name": "me"},
]
output = []
for k, g in itertools.groupby(sorted(data, key=lambda item: item['id']), lambda item: item['id']):
    output.append(next(g))
print(output)
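This produces the same output. It works by sorting the list by id, grouping the sorted items by id, and then taking the first item from each group. Because Python's sort is stable, the first item in each group is the first one with that id in the original list. Note that the result comes out ordered by id, which only matches the original order when the ids first appear in sorted order, as they do here.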
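To see where the two approaches can differ, here is a small sketch with made-up sample data (not from the question) in which the ids do not first appear in sorted order. Both approaches keep the same items, but they return them in a different order.

```python
import itertools

# Made-up data: id "0003" appears before id "0001"
data = [
    {"id": "0003", "name": "hi"},
    {"id": "0001", "name": "me"},
    {"id": "0003", "name": "again"},
    {"id": "0001", "name": "you"},
]


def get_id(item):
    return item["id"]


# groupby approach: sort by id, then take the first item of each group
by_groupby = [
    next(group)
    for _, group in itertools.groupby(sorted(data, key=get_id), key=get_id)
]

# dictionary approach: keep the first item seen for each id
selected = {}
for item in data:
    selected.setdefault(item["id"], item)
by_dict = list(selected.values())

print(by_groupby)  # ordered by id: "0001" first, then "0003"
print(by_dict)     # ordered by first appearance: "0003" first, then "0001"
```

If the original order matters, the dictionary approach is probably the safer choice. It also avoids the sort, so it makes a single pass over the list.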